cleeds,
A listener can’t "fail" a listening test - that’s a common misnomer about scientific listening tests.
Did you notice the word was in quotes? That indicates its use was qualified - used advisedly - in this case as a short-form term for not producing positive results in a blind listening test. I’d already clarified in more detail what inferences, strictly speaking, can be drawn from blind tests, which along with the quotes should have indicated I was using the term "fail" advisedly, not in a strict philosophical sense.
Secondly, it’s a mistake to think that scientists don’t talk of subjects "failing" tests. Of course they do. For instance, study subjects in medical trials can be said to have "failed to respond to the control treatment," etc.
More pointedly, you can test claims about individual people. If an individual claims to have a certain ability - e.g. to identify where hidden water is by dowsing - and controlled blind testing shows their positive hits turn out to be the same as expected for random guesses - one can rightly speak of that subject having "failed to demonstrate the ability in question under controlled test conditions." Exactly what I wrote about in the case of an individual audiophile who claims he can hear a difference between cable A and B, where the blind test results don’t support the claim.
A double-blind listening test doesn’t test the listener. It tests the devices under test.
Of course double blind (or single blind) listening tests can test a listener.
What do you think happens in a hearing test? It’s not testing the equipment; it’s testing what the listener can discern. The same can be said when testing an individual’s ability to discern between two audio cables.
Two different cables *may* be producing slightly different signals. Or they may not. But you can test if an individual reliably discerns between them. If they produce statistically significant positive results, it supports the claim they can hear a difference between the cables, and also implies there *is* a difference to be detected between the cables. But if they do not produce statistically significant positive results, you can't determine there is no difference between the cables; only that the listener in question failed to demonstrate the ability to discern between them under controlled conditions.
It may have been an off day for the individual, or it may be that they can’t reliably discern a difference, but other listeners can. So you can test claims relating to individuals via blind tests, using the device in question, but that does not necessarily constitute being able to come to conclusions about the device used in the test.
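To put rough numbers on "the same as expected for random guesses," here is a minimal Python sketch. It assumes a simple forced-choice test where a guesser has a 50% chance on each trial; the function name, the 16-trial length, and the scores are made up for illustration, not taken from any real test.

```python
from math import comb

def p_value_at_least(hits: int, trials: int, chance: float = 0.5) -> float:
    """Probability of scoring `hits` or more out of `trials` by guessing alone.

    Assumes independent trials with the same per-trial chance of a lucky hit
    (0.5 for a two-way forced choice). This is a one-sided binomial tail.
    """
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(hits, trials + 1)
    )

# Hypothetical scores for one listener over 16 trials (illustrative only).
for hits in (9, 12, 14):
    print(f"{hits}/16 correct: p = {p_value_at_least(hits, 16):.4f}")
```

A small p-value supports the listener's claim. A large one only means this listener, in this session, didn't demonstrate the ability; it says nothing final about the cables.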
If you want to test a more general question like "are there audible differences between cable A and cable B?" then you set up many more tests, with a wider range of listeners, and gather ever more evidence pro or con for the hypothesis.
One’s confidence grows in scale with the amount of evidence, and at some point it could be reasonable to conclude "cable A is not audibly different than cable B." Just as wide-ranging tests of human hearing set the general audible high frequency limit for humans, with qualifications, at 20kHz.
It’s just a standard inductive inference from particular instances to a general conclusion. It’s never conclusive, but no inductive inference is conclusive in any absolute sense.
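As a rough illustration of confidence scaling with evidence, the sketch below pools chance-level results and asks: how high could the true hit rate plausibly be, given what was observed? It assumes independent two-way forced-choice trials that can be pooled, and uses an exact one-sided binomial bound found by bisection. The function names and trial counts are hypothetical, and for very large trial counts a proper stats library would be the better tool.

```python
from math import comb

def prob_at_most(hits: int, trials: int, p: float) -> float:
    """P(X <= hits) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits + 1)
    )

def upper_bound_hit_rate(hits: int, trials: int, alpha: float = 0.05) -> float:
    """Largest true hit rate still consistent with the data at level alpha.

    One-sided exact bound: the p at which seeing `hits` or fewer
    would happen only `alpha` of the time. Found by bisection.
    """
    if hits >= trials:
        return 1.0
    lo, hi = hits / trials, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if prob_at_most(hits, trials, mid) > alpha:
            lo = mid  # mid is still plausible, so the bound is higher
        else:
            hi = mid
    return hi

# Hypothetical pooled results, all exactly at chance (50% correct).
for trials in (16, 48, 160):
    bound = upper_bound_hit_rate(trials // 2, trials)
    print(f"{trials // 2}/{trials} correct: true hit rate plausibly below {bound:.3f}")
```

The bound creeps toward 0.5 as trials accumulate but never reaches it, which is the inductive point: more evidence narrows what's plausible without ever delivering absolute certainty.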
(And purveyors of pseudo-science love to harp on about inductive inferences not being conclusive - "just because THOSE tests didn’t show an effect for my claim, it doesn’t mean there isn’t one that would be demonstrated by another test! You could be wrong you know, you scientific dogmatists!" And they use the lack of Absolute Certainty in the scientific method to insert their own wacky claims that "science hasn’t disproved!")