Thanks, Rzado, for the refresher course. Let me try to summarize for anyone who fell asleep in class. In a DBT, if you get a statistically significant result (at least 12 correct out of 16 in one of Rzado's examples), you can safely conclude that you heard a difference between the two sounds you were comparing. If you don't score that high, however, you can't be sure whether you heard a difference or not. And the fewer trials you do, the more uncertain you should be.
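For anyone who wants to check the 12-out-of-16 figure, here is a rough sketch of the arithmetic. It assumes a one-sided binomial test against pure guessing (50% per trial) and the conventional 0.05 cutoff; the function names are my own, not anything from the earlier posts.

```python
from math import comb

def tail_probability(correct, trials, p_guess=0.5):
    """Chance of scoring at least `correct` out of `trials` by guessing alone."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )

def min_correct_for_significance(trials, alpha=0.05):
    """Smallest score whose guessing probability is at or below alpha (assumed cutoff)."""
    for correct in range(trials + 1):
        if tail_probability(correct, trials) <= alpha:
            return correct
    return None  # no score reaches significance with this few trials

if __name__ == "__main__":
    print(round(tail_probability(12, 16), 4))  # about 0.038
    print(round(tail_probability(11, 16), 4))  # about 0.105
    print(min_correct_for_significance(16))    # 12
```

So 12 of 16 clears the 0.05 bar and 11 of 16 does not, which matches the threshold quoted above.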
This doesn't mean that DBTs are hopelessly inconclusive, however. Some, especially those that use a panel of subjects, involve a much higher number of trials. Also, there's nothing to stop anyone who gets an inconclusive result from conducting the test again. This can get statistically messy, because the tests aren't independent, and if you repeat the test often enough you're liable to get a significant result through dumb luck. But if you keep getting inconclusive results, the probability that you're missing something audible goes way down.
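To put rough numbers on both halves of that, here is a sketch under some loud assumptions: each repeat is a 16-trial test with a pass mark of 12, the repeats are treated as independent (which, as noted, real retests may not be), and the listener who really can hear a difference is given a hypothetical true hit rate of 70%. None of those figures come from a published test; they are only for illustration.

```python
from math import comb

def pass_probability(hit_rate, trials=16, pass_mark=12):
    """Chance of scoring at least pass_mark when each trial is right with prob hit_rate."""
    return sum(
        comb(trials, k) * hit_rate**k * (1 - hit_rate) ** (trials - k)
        for k in range(pass_mark, trials + 1)
    )

def lucky_pass_somewhere(repeats):
    """Chance a pure guesser passes at least one of `repeats` tests (independence assumed)."""
    return 1 - (1 - pass_probability(0.5)) ** repeats

def miss_every_time(repeats, hit_rate=0.7):
    """Chance a listener with the assumed hit rate fails all `repeats` tests."""
    return (1 - pass_probability(hit_rate)) ** repeats

if __name__ == "__main__":
    for repeats in (1, 5, 10):
        print(repeats,
              round(lucky_pass_somewhere(repeats), 3),
              round(miss_every_time(repeats), 3))
```

The first column of probabilities climbs with the number of repeats (the dumb-luck problem), while the second shrinks (a real, moderately audible difference becomes harder and harder to miss). Change the assumed 70% hit rate and the second column moves a lot, which is why a string of null results supports inaudibility without proving it.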
To summarize, a single DBT can prove that a difference is audible. A thousand DBTs can't prove that it's inaudible--but the inference is pretty strong.
As for my statement about statistics not being the weak link, I meant that there are numerous ways to do a DBT poorly. There are also numerous ways to misinterpret statistics, in this or any other field. Most of the published results that I am familiar with handle the statistics properly, however.