Amir and Blind Testing


Let me start by saying I like watching Amir from ASR, so please let’s not get harsh or the thread will be deleted. Many times, Amir has noted that when we’re inserting a new component in our system, our brains go into (to paraphrase) “analytical mode” and we start hearing imaginary improvements. He has reiterated this many times, saying that when he switched to an expensive cable he heard improvements, but when he switched back to the cheap one, he also heard improvements because the brain switches from “music enjoyment mode” to “analytical mode.” Following this logic, which I agree with, wouldn’t blind testing, or any A/B testing, be compromised because our brains are always in analytical mode and therefore feeding us inaccurate data? Seems to me you need to relax for a few hours at least and listen to a variety of music before your brain can accurately assess whether something is an actual improvement. Perhaps A/B testing is a strawman argument, because the human brain is not a spectrum analyzer. We are too affected by our biases to come up with any valid data. Maybe.

chayro

@adambennette not much discussion here about statistics... well, it’s not an exciting subject really.

Until an appreciation of stats demonstrates how often they may be used incorrectly to draw conclusions that have no basis. That’s just a general observation, so moving right along....

If in fact there is a difference and 40% say there is none, this is saying that those 40% of people have less than optimal hearing. Isn’t that conceding that it is a poor test audience, not to be relied upon?

If we are to trust our ears (whatever that means, despite it being some kind of mantra), ought not 100% agree that there is a difference?

And if prior to the test an unknown portion of the audience cannot trust their ears, on what basis can it then be said after the test that A and B are in fact different? These hard of hearing people may be saying there is a difference when none exists.

The good thing about the one-person test is that the variability in people's hearing is removed - if you have "poor" hearing, any AB test will be subject to a query and be discounted.

If "good" hearing, then multiple tests would need to be done in accordance with good practice (not so easy, as it happens) - finding some measure of confidence would involve discussing probability theory, but there do exist common-sense rules of thumb, which I don’t really like much.
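
For anyone who wants to put a rough number on "some measure of confidence" for a one-person test, here is a minimal sketch. It assumes forced-choice trials (ABX style), that each trial is independent, and that a listener who hears no difference guesses right 50% of the time. The function name abx_p_value is made up for illustration and is not from any audio testing tool.

```python
from math import comb


def abx_p_value(correct: int, trials: int, chance: float = 0.5) -> float:
    """Probability of scoring at least `correct` out of `trials` by guessing alone.

    Assumes independent trials with a fixed guessing rate `chance`
    (0.5 for a two-way forced choice). This is a one-sided binomial tail.
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Example: one listener, 10 trials.
for score in (8, 9):
    print(f"{score}/10 correct: p = {abx_p_value(score, 10):.4f}")
```

With 10 trials, 9 correct gives p of about 0.011 and 8 correct gives about 0.055, so the usual "9 out of 10" rule of thumb is just a choice of cutoff. The same arithmetic says that in a big group of pure guessers, roughly 1 in 100 would still hit 9 out of 10 by luck.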

Clear as mud?

edit - by the way, there is no requirement that the individual/s tested be into music/audio gear, whatever.  The best test subjects would be teenagers or even slightly younger.  Just sayin'

The only thing your test would reveal is that under those test conditions, 40 percent of the subjects heard no difference. There’s no data to support your conclusion.

Under strict test conditions. This is a given. I do say that this isn’t easy.

Common sense suggests that something may be learnt from a test when there is a difference and 40% says there is none.  It is not a "dead" number - to an analyst it speaks information.

What would be more interesting (to some, anyway) is where there is no difference and 40% (or 60% or even just 5%) said there is a difference. Could any conclusions be drawn from this? Perhaps the test wasn’t blind or conducted properly (this includes a person who is partial to the outcome conducting the test)? Hmmm. Correctly, this aspect is conceded in the comment.

Given that ears are apparently to be trusted. As is often advocated by many good folk here.

In any event, individual tests are preferred for reasons.

ASR is pretty used to empty responses like that one. It basically says "I don’t actually have any good, civil arguments or evidence in response to ASR’s reviews...but since I still don’t like their conclusions...here’s a disparaging meme so I can feel like I got one over on them."

Embarrassing enough once. But... 3 times?

Posting something like this, which is the opposite of what’s in the link, imbues you with what, pray tell? A silly sense of accomplishment?

All the best,
Nonoise