Amir and Blind Testing


Let me start by saying I like watching Amir from ASR, so please let’s not get harsh or the thread will be deleted. Amir has often noted that when we insert a new component into our system, our brains go into (to paraphrase) “analytical mode” and we start hearing imaginary improvements. He has said that when he switched to an expensive cable he heard improvements, but when he switched back to the cheap one he also heard improvements, because the brain switches from “music enjoyment mode” to “analytical mode.” Following this logic, which I agree with, wouldn’t blind testing, or any A/B testing, be compromised because our brains are always in analytical mode and therefore feeding us inaccurate data? It seems to me you need to relax for at least a few hours and listen to a variety of music before your brain can accurately assess whether something is an actual improvement. Perhaps A/B testing is a strawman argument, because the human brain is not a spectrum analyzer. We are too affected by our biases to come up with any valid data. Maybe.

chayro

Most of you on this forum have likely never heard of a gauge R&R (repeatability and reproducibility) study. Most also likely do not understand the difference between accuracy and precision. That's not a slight. These are difficult concepts, and much work has been done to define them and apply them to test and measurement equipment. I want to start with something most of us know quite well: the bathroom scale. If you are like me, you have a love/hate relationship with your bathroom scale. It's a simple device that can either make or break our day, and yet we typically do not think twice about whether or not it is telling us the truth.

What do I mean? Well, for starters I can get on my bathroom scale three times consecutively and get three different readings with a range of 2 or more pounds. Even worse, I find that I can move the scale around on the floor and get even more variation. This is one of the newer scales with a digital readout to tenths of a pound. While my bathroom scale displays a resolution of 0.1 lb, the repeatability is much worse, which implies that any single reading is likely off by a few pounds. I don't know the true accuracy because my bathroom scale has no reference back to a standard. I notice the scale at the doctor's office has much better repeatability. I see just 0 to 0.1 lb of variation if I step off and back on again, and the doctor's scale appears more precise based only upon the display showing hundredths of a pound. But I have rarely seen a calibration sticker on the scale in the doctor's office. I have seen stickers on the scales at a research department and at the hospital, probably because they publish reports.

Accuracy is typically not well defined. Typically, gauges are rated accurate to within a certain percentage of full scale. Let's say a bathroom scale is rated to +/-0.5% of full scale. (A $30 scale is not likely to be that good.) That means the manufacturer is stating that any reading on a 400 lb scale will be within +/-0.5% of 400 lbs, or +/-2 lbs. So I could have lost one pound overnight but my bathroom scale might tell me that I gained one pound! Isn't that frustrating?
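
To put numbers on the full-scale rating, here is a minimal Python sketch. The function names are made up for illustration, and it assumes the worst case, where the error on each weigh-in can land anywhere inside the rated band independently of the other.

```python
def full_scale_tolerance(full_scale_lb, pct_of_full_scale):
    """Return the +/- error band (lb) for a gauge rated as a percent of full scale."""
    return full_scale_lb * pct_of_full_scale / 100.0


def displayed_change_range(true_change_lb, tolerance_lb):
    """Return (min, max) displayed change between two weigh-ins.

    Assumption: each reading can be off by up to +/- tolerance_lb,
    so the difference of two readings can be off by up to twice that.
    """
    return (true_change_lb - 2 * tolerance_lb, true_change_lb + 2 * tolerance_lb)


tol = full_scale_tolerance(400.0, 0.5)      # 400 lb scale rated +/-0.5% of full scale
low, high = displayed_change_range(-1.0, tol)  # I truly lost one pound

print(f"Rated band: +/-{tol} lb")
print(f"Displayed change could be anywhere from {low} to {high} lb")
```

With a +/-2 lb band, a true one-pound loss can show up as anything from a five-pound loss to a three-pound gain, so a displayed one-pound gain is well within what the rating allows.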

What's my point?  Let's say you go to the butcher shop and you buy a 10 lb ham.  Then you stop by another shop and just to see, you weigh the ham on their scales and find it only weighs 9 lbs.  Wouldn't you be upset?   How about you stop at the gas station and buy 10 gallons of gasoline only to learn you actually got just 9 gallons.  Well, take comfort in knowing that by law those scales and gas pumps are calibrated back to a standard.  If you look at the scale at your butcher shop you should see a calibration sticker.  The same goes for your local gas pump.  Take a look on the face plate of the pump for the calibration sticker.  

If we count on these everyday items to tell us the truth, then why not expect the same of measurements of stereo gear? To know that the data is telling us the truth, it is crucial to know that the equipment was calibrated to a standard, what test equipment was used, and what procedure was followed, so that the measurements can be duplicated or verified by someone else. It is also important to know how the measurement data relate to how the piece of gear performs. For example, I can measure the resistance of two different speaker cables with an ohmmeter, or even a resistance bridge for more precision, and still find no difference. So why do they sound different? Some speculate that better cables reject RF noise. Sounds reasonable to me. So why hasn't someone published test data showing the RF rejection characteristics of different cables? Maybe they have but I just have not seen it. This would not be easy testing. It would require a Faraday cage and some sophisticated measurement equipment. Still, we cannot and should not take every measurement at face value and draw conclusions from it about what we are or are not hearing. I had my own saying in Engineering: "No one believes the test data except for the person who took it. Everyone believes the calculations except for the person who made them."

So in my vanity, I will take several readings on my bathroom scale but accept only the lowest reading.  I don’t do a statistical calculation of the group of readings.  That’s the very definition of biased testing, I think.  And what’s it matter?   When I go to the Doctor’s office they will not accept my weight based on my scale’s readout.  They take their own measurement on their scale.  No one believes the test data except for the one who took it…
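
To make the bias concrete, here is a small Python sketch comparing "keep the lowest reading" with a summary of all the readings. The three readings are made-up numbers, chosen only to have a spread of about 2 lb; they are not real data.

```python
import statistics

# Hypothetical repeat readings in pounds (illustrative only, not measured data).
readings = [182.4, 183.9, 181.8]

cherry_picked = min(readings)             # the "vanity" number
mean = statistics.mean(readings)          # what an unbiased summary would report
spread = max(readings) - min(readings)    # range across the repeat readings
stdev = statistics.stdev(readings)        # sample standard deviation
bias = cherry_picked - mean               # how far the vanity number sits below the mean

print(f"lowest reading: {cherry_picked:.1f} lb")
print(f"mean of all:    {mean:.1f} lb (range {spread:.1f} lb, stdev {stdev:.2f} lb)")
print(f"bias from picking the lowest: {bias:.1f} lb")
```

Picking the minimum always lands at or below the mean, which is why it counts as biased rather than merely noisy.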

Are you still going on about this? You don’t even expect measurements from 99.9% of audio companies, but you nitpick because a test site, one that produces likely accurate results within the framework of the measurements it is taking, produces results you don’t like?

My background is physics, so not an EE, but I understand most of the terms pretty well as we use similar measurements.

I personally don’t care if the amplifier I bought was 200 wpc into ohms or 195. I will never hear the difference, and it is within an acceptable margin of error or manufacturing tolerance. I would care about 170, because I paid for 200 and that is not an acceptable tolerance. In my industry, we specify batteries as either +/- a tolerance or +tolerance/-0 (no lower than the rated value), depending on the product or contract.

Now, if I am not mistaken, harmonic distortion measurements, which are more important than power as long as power is close, are relative measurements.

Also, as we discussed previously, it appears the test equipment in question both ships calibrated and has its own source and receiver. That provides a level of inherent feedback on the current calibration.

Last, due to the relative nature of the critical measurements, the best-measuring device you have tested, if available, could be used as a second calibration reference to set a minimum benchmark. For example, if the best device you tested had a THD of 0.0010% and you test it again and get 0.0010%, you can be confident in the current operation of your system for testing devices with equal or higher distortion. We have a wide range of "reference standards" in our labs and production for validating current calibration, not to mention that you are calibrating a whole fixture or system, not one item provided by a third-party vendor.
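
As a rough sketch of that kind of reference check in Python: re-measure a known device and compare against the value on record before trusting new readings. The function name and the 10% relative tolerance are assumptions for illustration only, not anything a particular test site or analyzer is known to use.

```python
def reference_check(recorded_thd_pct, remeasured_thd_pct, rel_tol=0.10):
    """Return True if a re-measurement of the reference device agrees with
    the value on record to within rel_tol (a fraction of the recorded value).

    rel_tol=0.10 is an arbitrary illustrative threshold.
    """
    allowed = rel_tol * recorded_thd_pct
    return abs(remeasured_thd_pct - recorded_thd_pct) <= allowed


# Best device on record measured 0.0010% THD.
still_good = reference_check(0.0010, 0.0010)   # same result again
drifted = reference_check(0.0010, 0.0015)      # 50% higher than on record

print("system OK for devices at or above 0.0010% THD:", still_good)
print("check passes after drift:", drifted)
```

A passing check only says the system can still resolve distortion down to the reference device's level; it says nothing about devices that measure better than the reference.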

 

Jssmith, I appreciate your comments but I wouldn’t trust Amir’s opinion on judging the sound of anything. He’s obviously a horrible listener and the pure fact is he’s a measuring geek. He knows nothing about the quality of the sound of products. 

In the late 1970s I worked at a high-end audio shop in DC, and they purchased what was then a "sophisticated" A/B switching unit, which allowed the user to switch back and forth between a pair of amplifiers (among other things). I remember setting up a customer for an extended A/B comparison between a McIntosh 2205 and a Luxman LRS (Laboratory Reference System) power amp. After a LONG audition the customer ordered the LRS amp, saying he thought it sounded "smoother" than the McIntosh. I later found out that our tech had ordered a new main board for the switching unit: it illuminated the lights for unit A and unit B when you pressed the button to make the switch, but the electronics inside the unit were not actually switching amplifiers, so my customer had only been listening to the McIntosh amp the whole time. When I found that out, I called him and let him know the situation, but he enjoyed the Luxman so much that he decided to keep it!