Here are three great ways to rile up Twitter users: Comment on the president. Say something about immigration. Review speakers.
That’s what I discovered when I posted the results of a blind listening test I conducted last week, in which five panelists were asked to rank the sound quality of an Apple HomePod, Sonos One, Google Home Max, and Amazon Echo Plus. (Go ahead and read it; I’ll wait right here.)
Nobody ranked the HomePod as No. 1 on most songs. For three panelists, the Sonos One was the overall winner; for the other two, the Google Home Max.
I was surprised, because most tech critics ranked the HomePod as the best. So did I, when Apple set up the blind listening test. (Here’s my full review of the HomePod.)
Soon enough, my little test started bouncing around the Twitterverse; here are some of the most popular theories and critiques of the test.
The curtain blocked the sound
The most common complaint about my test setup is about the curtain, which I used to hide the speakers from the panelists’ view. “I would think the curtain will block the highs, which is what HomePod excels at,” tweeted @toxicpath and others.
I don’t think that’s it. First, the cloth I used is extremely sheer — you can see right through it. It’s a single ply of very thin fabric.
Second, if the cloth affected the sound of the HomePod, wouldn’t it have affected the sound of its rivals in the same way?
Third, I used the same cloth the night before, during the dress rehearsal, when both panelists ranked the HomePod as the winner.
Some readers suggested repeating the test with the panelists’ backs turned to the speakers. That setup wouldn’t pass muster, either — our ears are naturally scooped to collect sound from the front.
Other readers proposed blindfolding the panelists. Well, OK, but how would they take their notes and record their rankings during the five musical tests of four speakers?
The curtain prevented the HomePod from adapting
The HomePod contains six microphones. They’re supposed to sample the proximity of the walls and ceilings around it, and reconfigure what’s coming out of it so that the important stuff is blasted “forward,” seven tweeters noted.
“Since the HomePod adjusts its sound to the acoustics of the room, you should not have used a piece of fabric to hide the speakers,” wrote @markbooth and others. “The fabric may have affected the HomePod’s sound.”
Well, no. The HomePod re-samples its listening position each time it’s moved, during the first few seconds of music playback. We let the HomePod do its room listening before hanging the curtain, so it had already had the chance to adjust its sound.
The listener positions affected their perceptions
One of the most interesting observations came from people like @JazzStevo. He noticed that the five listeners sat in a row, with the speakers in a parallel row. And when the results were tallied, the listeners closest to the Sonos One end all preferred the Sonos One!
Similarly, the listeners closest to the Google Home Max end both preferred the Google Home Max! (He made this cool diagram to make the point.)
It’s well established that loudness affects our perception of speakers. Before the test, we did volume-match the four speakers using a meter — but if you’re at the same end of the row as a certain speaker, of course it’s closer to you, and therefore louder!
There’s one huge problem with this theory, though: Why wouldn’t the center speaker, the Apple HomePod (B in the diagram), therefore sound best to the center listeners directly in front of it?
It doesn’t make sense that the center panelist would perceive speaker D as being the loudest.
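For a rough sense of how much loudness a seat position could add or subtract, here is a minimal sketch of the textbook inverse-square rule. It assumes each speaker behaves like a point source in open space, which a real room with reflecting walls does not, and the distances are hypothetical, not measurements from my test. The function name is my own.

```python
import math

def level_difference_db(near_m: float, far_m: float) -> float:
    """Drop in sound level (dB) going from near_m to far_m from a point source.

    Assumes free-field, inverse-square behavior. Room reflections usually
    make the real difference smaller than this figure.
    """
    if near_m <= 0 or far_m <= 0:
        raise ValueError("distances must be positive")
    return 20 * math.log10(far_m / near_m)

# Hypothetical seat: 2.0 m from the nearest speaker, 2.5 m from the farthest.
print(round(level_difference_db(2.0, 2.5), 2))  # prints 1.94
```

Under those assumptions, an end seat would hear the far speaker about 2 dB quieter than the near one. That is small, but loudness differences of that size are the kind that volume-matching is meant to remove.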
Spotify over AirPlay has lower quality
One of the most compelling theories came from people like @osaddict: “I wonder if you would have gotten different results had the HomePod streamed directly from Apple Music. I have noticed some difference vs AirPlay.”
“Spotify over AirPlay is very underdriven,” adds @jaydisc. “AirPlay Spotify sounds shockingly worse than Apple Music streaming,” says @ErikVeland. “There’s definitely magic EQ sauce being applied to Apple Music that’s not in AirPlay.”
In other words, they’re saying, streaming Spotify over AirPlay (Apple’s Wi-Fi-based wireless streaming protocol) may not sound as good as streaming Apple Music to the HomePod. (To keep everything equal, my testing involved streaming the same songs from the same Spotify playlist to all four speakers.)
If true, that would easily explain why the HomePod won Apple’s listening test, and lost mine.
I checked with Apple; the reply was that there should be no such degradation as long as the Wi-Fi network is strong.
I did some testing of my own, playing the same song directly from Apple Music and then streaming it from Spotify over AirPlay. @jaydisc is quite correct: In general, the songs from Apple Music come in at a higher volume than they do from Spotify. To make the songs sound identical, you have to boost the Spotify volume by a couple of notches.
Once you do that, though, there’s really no perceptible difference in the sound quality. I spent a whole morning doing comparisons — 30 different songs in different genres, “A/B”-ing them between Apple Music and volume-adjusted Spotify, over and over again, at different volume levels and listening positions. I made myself crazy trying to hear a difference.
If there is one, I swear that it’s too small to identify in regular use. It couldn’t have made a difference to my listening panel.
In any case, it would be impossible to control for this variable in a speaker comparison test, since none of the other speakers can stream from Apple Music.
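I matched the volumes by ear and by notches, but the same idea can be sketched in code: measure the average (RMS) level of two recordings and compute the gain that brings the quieter one up to the louder one. This is only an illustration. The helper names are hypothetical, and the two sine waves stand in for real recordings, which I did not capture.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of audio samples."""
    if not samples:
        raise ValueError("need at least one sample")
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def gain_to_match_db(reference, quieter):
    """Gain in dB to apply to `quieter` so its RMS level equals `reference`."""
    quiet_level = rms(quieter)
    if quiet_level == 0:
        raise ValueError("cannot level-match a silent signal")
    return 20 * math.log10(rms(reference) / quiet_level)

def apply_gain(samples, gain_db):
    """Scale samples by a gain expressed in dB."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]

# Stand-in signals: the same 440 Hz tone at full and at half amplitude.
rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
louder = tone
softer = [0.5 * s for s in tone]

gain = gain_to_match_db(louder, softer)
matched = apply_gain(softer, gain)
print(round(gain, 2))  # prints 6.02
```

RMS is a crude proxy for perceived loudness, so a real comparison would still need a final check by ear or with a meter.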
In the meantime, I’m not the only one who’s been doing HomePod listening tests.
Consumer Reports finished up their testing of the HomePod, and concluded that “it’s not the best-sounding wireless speaker in our ratings — or even the best-sounding smart speaker.”
Maria Rerecich, the magazine’s director of electronics testing, was kind enough to share the details of her team’s testing process.
“It’s not a bad test, what you did,” she told me. “Having people come in and listen to it is a good thing to do.”
But her team’s goal was different. “We’re looking for fidelity and accuracy to the original tracks, more than somebody saying ‘Hey, that sounds good,’” she explained. Because, for example, “some people like bassy music, some don’t.”
So her team plays various kinds of tracks. Some are instrumental (“Are the instruments clear? Are they located in space properly? Are you getting a sense of the room they were in?”), some are vocal (“Are the treble, midrange, and bass clear? Is anything out of balance? Is the midrange muddy? Is anything oddly sizzly or peaky? Are there good dynamics?”), and so on. “We use the same tracks all the time,” Rerecich says, “so we know what they sound like.”
They compared the HomePod and other smart speakers against high-end reference speakers, all of which the magazine has rated Excellent. “We can flip back and forth from the reference speakers to the test speakers; everything’s synced to the same track,” she says.
The bottom line? “Overall, the sound of the HomePod was a bit muddy compared with what the Sonos One and Google Home Max delivered,” says the resulting article.
Metered tests of static
FastCompany.com did some tests, too — not by listening, but by having the HomePod play white noise (static) and measuring its acoustical properties.
They enlisted NTi Audio AG, a manufacturer of acoustics testing equipment. “The company was kind enough to loan us a testing device, software, and a special microphone so that we could test the HomePod in a real-life natural habitat–my living room,” writes Mark Sullivan. “The company’s Brian MacMillan coached me on how to do the tests, then he and some other NTi people analyzed (and helped me understand) the results.”
The results? “‘The developers have done an excellent job of having the HomePod adjust to the room; (it has) Impressive consistency in overall level and frequency response,’ said NTi’s MacMillan.”
On Reddit, an audio fan, WinterCharm, spent 8.5 hours testing the HomePod, using audiophile equipment: a calibrated microphone, Room EQ Wizard software, and a lot of technical knowledge. He measured the frequency response — again, not of music, but of a sine wave.
“What apple has achieved here is incredibly impressive — such tight control on bass from within a speaker is unheard of in the audio industry,” he writes. “What Apple has managed to do here is so crazy, that If you told me they had chalk, candles, and a pentagram on the floor of their Anechoic chambers, I would believe you. This is witchcraft. I have no other word for it.”
But he, too, then got enough methodology pushback from readers — over 1,400 comments, including this reply from Redditor edechamps, which calls his analysis “hilarious” and “garbage” — that he wound up backing off from his original conclusions.
Recipe for the perfect test
Many readers found fault with my testing protocol, and had suggestions to improve it. “I would suggest someone other than Grandpaw Pogue devise the test,” wrote @Dayv. “Pogue is bad at this and he should feel bad.”
@Dayv, and others, offered a simple prescription for a better test:
“Repeat the test with the speaker samples in a more randomized order,” he says. “Doing A, B, C, D or D, C, B, A gives undue weight to D and A. Need truly randomized ordering, and a lot more tests.” (My test was fairly random. I sometimes started with speaker A, and sometimes with speaker D. I also offered panelists the opportunity to hear any speakers again in any combination or order, which they often requested.)
“Blindfold them and have them take voice notes on an app, so that they can speak freely without influencing the other panelists,” adds @jocrz. “This would require the test to be given to one person at a time.”
“The people are sitting really close to all of the speakers — these are room devices, not desk speakers,” @Dayv goes on.
“I would sit each speaker alone on a table, blindfold listeners, and seat them around it,” suggests @jdmuccigrosso.
“You need an anechoic chamber if you want good measurement accuracy,” writes Redditor edechamps. “It is impossible to accurately measure a speaker in a normal room.”
@cribasoft proposes “an all black media room.” “I promise if you put the speakers on a table at the front, and only provided directional lighting where testers are sitting, they could see their paper but not any of the speakers.”
“Need a lot more opinions. Test people separately in the same seat,” says @Dayv.
And, of course, I’d have to persuade Google, Amazon, and Sonos to add Apple Music to their speakers, so that they’re all streaming from the identical servers.
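The “truly randomized ordering” that @Dayv asks for is at least the easy part. A minimal sketch, assuming a fresh shuffle of the four speakers for every song; the function name and the seed are my own inventions, and this is not how my test was actually run:

```python
import random

SPEAKERS = ["HomePod", "Sonos One", "Google Home Max", "Echo Plus"]

def playback_orders(speakers, n_songs, seed=None):
    """Return one independently shuffled speaker order per song.

    Passing a seed makes the schedule reproducible, so it can be
    written down before the panel arrives.
    """
    rng = random.Random(seed)
    orders = []
    for _ in range(n_songs):
        order = list(speakers)  # copy, so the master list stays untouched
        rng.shuffle(order)
        orders.append(order)
    return orders

# Five songs, as in the original test; the seed value is arbitrary.
for song_number, order in enumerate(playback_orders(SPEAKERS, 5, seed=1), start=1):
    print(song_number, order)
```

With only five songs, pure shuffling can still leave one speaker in the last slot more often than the others. A stricter design would rotate positions so that each speaker lands in each slot equally often, which takes more songs or more sessions.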
Wow. That would be quite a test.
But you know what? If you need that much effort to hear that one speaker is obviously superior… well, then it probably isn’t.
I suspect, in the end, that my original conclusions are correct: Different pieces of music are different, and the people listening to them are different. There is no right answer.
David Pogue, tech columnist for Yahoo Finance, welcomes non-toxic comments in the Comments below. On the Web, he’s davidpogue.com. On Twitter, he’s @pogue. On email, he’s email@example.com. You can sign up to get his stuff by email, here.