In defense of ABX testing
We audiophiles need to get ourselves out of the Stone Age, reject mythology, and say goodbye to superstition. Especially the reviewers, who do us a disservice by endlessly writing articles claiming that the latest tweak or gadget revolutionized the sound of their system. Likewise, any reviewer who claims that ABX testing is not applicable to high-end audio needs to find a new career path. As with anything, there is a right way and many wrong ways. Hail Science!

Here's an interesting thread on the hydrogenaudio website:

http://www.hydrogenaud.io/forums/index.php?showtopic=108062

This caught my eye in particular:

"The problem with sighted evaluations is very visible in consumer high-end audio, where all sorts of very poorly trained listeners claim that they have heard differences that, in technical terms, are impossibly small or nonexistent.

The corresponding problem is that blind tests deal with this problem of false positives very effectively, but can easily produce false negatives."
psag
"Your last post leaves me with the impression you do not think there are differences in cables and therefore they cannot be heard, especially when you get defensive when I asked the results of your controlled listening tests. Beats me why you wouldn't want to disclose the results."

You're allowed to have any impression you like. It has nothing to do with me; it's your choice, not mine. As for why I don't want to disclose the results, I already gave my reason. It was clearly stated in my last post. Here it is again.

"And as to the results of the tests, they're not relevant to this discussion. You only want me to list the results so you can comb through them to find the slightest detail, just so you can claim the whole thing is null and void and you get to be right."
Zd542, I get what you are saying in response to my post. The closest I have seen is what I mentioned: reviewers listening to one piece with some music, then swapping it out for another piece without changing anything else, and listening again. My earlier point, using the wine example, was that blind A/B testing would show that most reviewers (not all) have no clothes, and they can't have that. So the best I can hope for these days is what I mentioned earlier.

However, companies respond to letters, and not so much to phone calls or posts on chat boards. So maybe more letters to the magazines requesting blind A/B testing would help.

enjoy
I agree that only a limited number of switches are needed, if the test conditions are good. What are good test conditions? A treated room with good acoustics, high-quality electronics, well-recorded music, the ability to do rapid switching (having a second person to manipulate the hardware helps), and familiarity with the musical selections. That's all you need to eliminate subjectivity and get to the truth.
"01-29-15: Psag
I agree that only a limited number of switches are needed, if the test conditions are good. What are good test conditions? A treated room with good acoustics, high-quality electronics, well-recorded music, the ability to do rapid switching (having a second person to manipulate the hardware helps), and familiarity with the musical selections. That's all you need to eliminate subjectivity and get to the truth."

What I was referring to was a very simple test. You have two cables in the system: one copper, the other silver. The goal is just to see if you can pick out the silver or the copper, and that's it. Nothing subjective like which cable sounds better; that's just personal preference. So after you hear a 10-second clip of music, you say "copper" or "silver."

With such a small sample, you can't really weed out things that may produce bad results. For example, let's say there really was no difference that a test subject could hear between the two cables. That would mean both cables sound identical, but we won't know that until after the test. It would also mean that every answer given is only right by pure chance. Since there are only two possible answers, and assuming there is no audible difference, over time the answers would have to conform to a 50/50 split. With only 10 samples, there's a really good chance you won't get a 50/50 split. With 100 tries, you get much closer.

An easy way to visualize this, or even try it for yourself, is to flip a coin. Flip it 10 times: even though you should expect 5 heads and 5 tails, with so few tries you can easily get different results. The only way to reduce this type of error is to take a larger sample. Flip a coin 100 times and you'll get much closer to the 50/50 split you would expect from pure chance.
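The coin-flip point above is easy to check with a quick simulation. This is just a sketch of my own, not anything from the thread: the function name, the 10,000 repeated runs, and the 40–60% "close to even" band are arbitrary choices for illustration. It counts how often a run of fair coin flips lands outside that band, for runs of 10 flips versus runs of 100 flips.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

def fraction_outside_band(n_flips, n_runs=10_000):
    """Simulate n_runs runs of n_flips fair coin flips each, and
    return the fraction of runs whose heads proportion falls
    outside the 40-60% band around the expected 50/50 split."""
    outside = 0
    for _ in range(n_runs):
        heads = sum(random.random() < 0.5 for _ in range(n_flips))
        if not (0.4 <= heads / n_flips <= 0.6):
            outside += 1
    return outside / n_runs

# With 10 flips per run, a large share of runs stray well past 40-60%;
# with 100 flips per run, very few do.
print("10 flips: ", fraction_outside_band(10))
print("100 flips:", fraction_outside_band(100))
```

The exact numbers will vary with the seed, but the gap is dramatic: with 10 flips per run, roughly a third of runs land outside the 40–60% band, while with 100 flips per run only a few percent do. That is the whole argument for a larger trial count in an ABX session: it shrinks the chance that pure guessing looks like a real result.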