@adambennette not much discussion here about statistics... well, it’s not an exciting subject really.
Until, that is, you appreciate how often stats are used incorrectly to draw conclusions that have no basis. That’s just a general observation, so moving right along...
If in fact there is a difference and 40% say there is none, this is saying that those 40% of people have less than optimal hearing. Isn’t that conceding that it is a poor test audience, not to be relied upon?
If we are to trust our ears (whatever that means, despite it being some kind of mantra), ought not 100% agree that there is a difference?
And if, prior to the test, an unknown portion of the audience cannot trust their ears, on what basis can it then be said after the test that A and B are in fact different? These hard-of-hearing people may just as easily be saying there is a difference when none exists.
The good thing about the one-person test is that the variability in people's hearing is removed - if you have "poor" hearing, any AB test will be open to question and be discounted.
If "good" hearing, then multiple tests would need to be done in accordance with good practice (not so easy, as it happens). Finding some measure of confidence would involve discussing probability theory, although there do exist common-sense rules of thumb, which I don’t really like much.
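For anyone curious, here is a rough sketch of the probability side for the one-person case. It assumes each trial is an independent forced choice where pure guessing is right half the time; the function name and the 10-trial figure are mine, purely for illustration, and a real test would need more care than this.

```python
from math import comb


def chance_of_guessing(correct: int, trials: int, chance: float = 0.5) -> float:
    """Probability of scoring `correct` or more out of `trials` by guessing alone.

    Assumes independent trials, each with probability `chance` of a lucky hit
    (a plain binomial model, which is an assumption, not a given).
    """
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


if __name__ == "__main__":
    # Hypothetical example: one listener, 10 AB trials.
    for correct in (6, 8, 9, 10):
        p = chance_of_guessing(correct, 10)
        print(f"{correct}/10 correct: probability from guessing alone = {p:.3f}")
```

Under those assumptions, 6 out of 10 is something a guesser manages about 38% of the time, 8 out of 10 about 5.5%, and 9 out of 10 about 1.1%. So a handful of trials tells you very little unless the listener is nearly perfect.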
Clear as mud?
edit - by the way, there is no requirement that the individual(s) tested be into music, audio gear, or whatever. The best test subjects would be teenagers or even slightly younger. Just sayin'