speakers for 24/96 audio


Is it correct to assume that 24/96 audio would be indistinguishable from CD quality when played through speakers whose response is 3 dB down at 20 kHz and rolls off rapidly above that?

Or, more precisely, that the only benefit comes from the move from 16 to 24 bits, not from the higher sample rate, since the higher-frequency content is filtered out anyway?

Related to this, what would you recommend for a sub-$5k speaker set with good high-frequency capability for 24/96 audio?

thanks!
mizuno
Well, Bryon, that was a very interesting article. I'm not sure what to think after reading it... is this yet another investigation into a micro-problem that doesn't really affect music reproduction, or is it a significant factor? I certainly don't know. I can't even venture a guess.

Anyway, Kunchur admits to listening to cassettes. I haven't heard cassettes for many years, but 16/44 CDs must sound like a revelation by comparison. ;-)
Hi Bryon,

Your question about the audibility of jitter that is on a time scale far shorter than the temporal resolution of our hearing is a good one. The answer is that we are not hearing the nanoseconds or picoseconds of timing error itself. What we are hearing are the spectral components corresponding to the FLUCTUATION in timing among different clock periods (actually, among different clock half-periods, since both the positive-going and negative-going edges of S/PDIF and AES/EBU signals are utilized), and their interaction with the spectral components of the audio.

For example, assume that the worst case jitter for a particular setup amounts to +/- 1 ns. The amount of mistiming for any given clock period will fluctuate within that maximum possible 1 ns of error, with the fluctuations occurring at frequencies that range throughout the audible spectrum (and higher). That is all referred to as the "jitter spectrum," which will consist of very low level broadband noise (corresponding to random fluctuation) plus larger discrete spectral components corresponding to specific contributors to the jitter.

Think of it as timing that varies within that +/- 1 ns or so range of error, but which varies SLOWLY, at audible rates.

All of those constituents of the jitter spectrum will in turn intermodulate with the audio data, resulting in spurious spectral components at frequencies equal to the sums of and the differences between the frequencies of the spectral components of the audio and the jitter.
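To put rough numbers on the sum-and-difference idea, here is a minimal sketch in plain Python. Everything except the +/- 1 ns figure is an assumption chosen for illustration: a single 10 kHz tone, a timing error that wobbles sinusoidally at 1 kHz, and a 48 kHz sample rate. Real jitter spectra are far messier than one sine wave, so treat this as a toy model rather than a measurement.

```python
import cmath
import math

FS = 48_000        # sample rate in Hz (assumed for illustration)
N = 4_800          # 0.1 s of samples, so every 10 Hz is an exact DFT bin
F_TONE = 10_000.0  # audio tone in Hz (assumed)
F_JIT = 1_000.0    # rate at which the timing error fluctuates, in Hz (assumed)
J_PEAK = 1e-9      # peak timing error of +/- 1 ns, as in the post


def jittered_tone():
    """Sample a unit sine at instants that wobble sinusoidally by J_PEAK."""
    samples = []
    for n in range(N):
        t = n / FS
        t_err = J_PEAK * math.sin(2 * math.pi * F_JIT * t)
        samples.append(math.sin(2 * math.pi * F_TONE * (t + t_err)))
    return samples


def level_db(x, freq):
    """Amplitude at one frequency (single-bin DFT), in dB relative to 1.0."""
    acc = sum(v * cmath.exp(-2j * math.pi * freq * n / FS)
              for n, v in enumerate(x))
    amp = 2 * abs(acc) / len(x)
    return 20 * math.log10(max(amp, 1e-20))  # floor avoids log10(0)


x = jittered_tone()
carrier_db = level_db(x, F_TONE)
lower_db = level_db(x, F_TONE - F_JIT)      # difference frequency, 9 kHz
upper_db = level_db(x, F_TONE + F_JIT)      # sum frequency, 11 kHz
off_db = level_db(x, F_TONE - F_JIT / 2)    # 9.5 kHz, where nothing is expected

# Small-angle phase-modulation estimate: sideband/carrier = pi * f_tone * J_peak
predicted_db = 20 * math.log10(math.pi * F_TONE * J_PEAK)

print(f"carrier   {carrier_db:8.2f} dB")
print(f"9 kHz     {lower_db:8.2f} dB")
print(f"11 kHz    {upper_db:8.2f} dB")
print(f"predicted {predicted_db:8.2f} dB")
```

In this toy case the spurious components land at 9 kHz and 11 kHz, roughly 90 dB below the tone. Because the sideband level scales with both the tone frequency and the jitter amplitude, larger jitter or higher audio frequencies raise it proportionally.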

If you haven't seen it, you'll find a lot of the material in this paper to be of interest (interspersed with some really heavy-going theoretical stuff, which can be skimmed over without missing out on the basic points):

http://www.scalatech.co.uk/papers/aes93.pdf

Malcolm Hawksford, btw, is a distinguished British academician who has researched and written extensively on audiophile-related matters.

One interesting point he makes is that the jitter spectrum itself, apart from the intermodulation that will occur between it and the audio, will typically include spectral components that are not only at audible frequencies, but that are highly correlated with the audio! He also addresses at some length the question of how much jitter may be audible.

So to answer your last question first, no, I don't think that the audibility of jitter on a nanosecond or picosecond scale has a relation to the plausibility of Kunchur's claim.

As far as point no. 1 in my previous post is concerned, yes I think that the quote you provided about closely spaced peaks being merged together does seem to provide a logical connection between his experimental results and a rationale for hi rez sample rates. It hadn't occurred to me to look at it that way. So that point would seem to be answered.

Best regards,
-- Al
Bryon,

I appreciate your questions. You are definitely curious enough to look into this and I commend you on your interest.

However, poor Kunchur seems a very confused individual.

His test simply shows how two pure tones can interfere with each other in a way that becomes audible. However, his conclusions are completely bogus. The listener is NOT hearing time-domain effects on a scale of microseconds. The listener is actually hearing changes in the combined waveform, which has been altered by offsetting one source relative to the other (combined meaning both waves plus all room reflections).

As I explained, this will lead to TOTAL destructive interference of the primary direct signal as heard by the listener at an offset of 2.5 cm. This is like a signal that is TOTALLY out of phase. The direct sound will be inaudible, and all the listener hears is the sound reflected around the room. Since we detect the direction of a sound from the relative timing of the wave front (or nerve bundle triggers) at each ear, we lose that ability when a signal is out of phase.

Poor Kunchur is conflating things in a bad way - this is bad science.

However, his remarks about speaker alignment and panels are partly valid. It is almost certain that large radiating surfaces can cause, at certain frequencies, the kind of interference he achieved in this experiment. This manifests itself in a speaker response that has many suckouts across the frequency spectrum. In fact, the anechoic response of a large panel will look like a comb, with many total suckouts across the frequency range. The result is that some sounds and some frequencies will not be as tightly imaged as with a point source speaker. Since most sounds are made up of many harmonics, this effect will not be complete, but on the whole it will lead to a larger, more diffuse soundstage, with some sounds imaging precisely and others more diffusely than with a point source speaker. There is an audio tool called a flanger that is used for electric guitar - it achieves a similar effect, but even stronger.
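For anyone who wants to check the geometry, the cancellation frequencies follow from simple arithmetic: two equal-level arrivals cancel when the path difference is an odd number of half wavelengths. The sketch below assumes a speed of sound of 343 m/s and equal levels from both sources. The 0.5 m path difference standing in for a large panel is a hypothetical figure, not something taken from the article.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 C (assumed)


def notch_frequencies(path_difference_m, f_max_hz=20_000.0):
    """Frequencies up to f_max_hz where two equal-level arrivals cancel."""
    first = SPEED_OF_SOUND / (2 * path_difference_m)  # half-wavelength condition
    notches = []
    k = 0
    while (2 * k + 1) * first <= f_max_hz:
        notches.append((2 * k + 1) * first)  # odd multiples of the first notch
        k += 1
    return notches


def combined_level(path_difference_m, freq_hz):
    """Relative amplitude (0 to 2) of two unit sines offset by the path difference."""
    delay_s = path_difference_m / SPEED_OF_SOUND
    return abs(2 * math.cos(math.pi * freq_hz * delay_s))


small_offset = notch_frequencies(0.025)  # the 2.5 cm offset discussed above
large_panel = notch_frequencies(0.5)     # hypothetical 0.5 m path difference

print("2.5 cm offset, first notch (Hz):", round(small_offset[0]))
print("0.5 m offset, notches below 20 kHz:", len(large_panel))
```

Under these assumptions a 2.5 cm offset puts a single notch in the audio band, near 6.9 kHz, while the 0.5 m case produces dozens of them, which is the comb shape described above.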

Also, jitter is not audible in the sense you describe. It is audible when non-random jitter over many thousands to hundreds of thousands of samples combines in a way that introduces new frequencies. We hear those new frequencies that are created by the non-random modulation of the clock (random jitter is just white noise at very low, inaudible levels).

We are totally UNABLE to hear jitter effects on a few samples.
07-05-11: Almarg
...we are not hearing the nanoseconds or picoseconds of timing error itself. What we are hearing are the spectral components corresponding to the FLUCTUATION in timing among different clock periods...

That's what I suspected, Al, but I wasn't sure.

And thanks for your explanation of jitter. I was aware that jitter resulted in frequency modulation, but I didn't know that it was a kind of intermodulation distortion. Your explanation is much appreciated.

Shadorne - You may be right that Kunchur's methodology is flawed. I've read a few other experiments on human temporal resolution with similar methodologies, but my memory of them is a little vague. In any case, I have a question about your observation that "Some sample rates are noted for being better than others for reducing audible jitter." I'd be interested to hear a technical explanation for why that is the case.

Finally, I have a general question about high resolution audio that anyone might be able to answer:

My understanding is that the principal advantage of larger bit depth is greater dynamic range. What is the principal advantage of higher sampling rates, if it is not better temporal resolution?

Bryon
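For reference, the bit depth side of that question can be put in numbers with the standard textbook estimate for an ideal N-bit quantizer driven by a full-scale sine, roughly 6.02 N + 1.76 dB. This is a theoretical ceiling assumed here for illustration. Real converters and playback chains fall short of it.

```python
import math


def ideal_dynamic_range_db(bits):
    """Theoretical SNR of an ideal quantizer with a full-scale sine input."""
    # 20*log10(2**bits) is about 6.02 dB per bit; 10*log10(1.5) is about 1.76 dB
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


for bits in (16, 24):
    print(f"{bits}-bit: {ideal_dynamic_range_db(bits):.2f} dB")

for rate in (44_100, 96_000):
    print(f"{rate} Hz: Nyquist limit {nyquist_hz(rate):.0f} Hz")
```

On these idealized figures, 16 bits gives about 98 dB and 24 bits about 146 dB, while moving from 44.1 kHz to 96 kHz raises the representable bandwidth from 22.05 kHz to 48 kHz.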
What is the principal advantage of higher sampling rates, if it is not better temporal resolution?

None above Redbook CD, except that it allows cheaper and better filtering, which may very slightly improve the audible band. However, higher sample rates do allow you to go to one-bit resolution (like the SACD format, which is a DSD stream, but SACD has very high levels of out-of-band noise, so to be honest I am not sure I accept that it is even as good as 24-bit/96).
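The filtering point can be sketched with simple arithmetic. Assuming the filter has to pass everything up to 20 kHz (an assumed passband edge) and be fully attenuating by half the sample rate, the room it has to make that transition looks like this:

```python
import math


def transition_band_hz(sample_rate_hz, passband_edge_hz=20_000.0):
    """Width between the passband edge and the Nyquist frequency, in Hz."""
    return sample_rate_hz / 2 - passband_edge_hz


def transition_band_octaves(sample_rate_hz, passband_edge_hz=20_000.0):
    """The same span expressed in octaves above the passband edge."""
    return math.log2((sample_rate_hz / 2) / passband_edge_hz)


for rate in (44_100, 96_000):
    width = transition_band_hz(rate)
    octaves = transition_band_octaves(rate)
    print(f"{rate} Hz: {width:.0f} Hz wide ({octaves:.2f} octaves)")
```

At 44.1 kHz the filter has only about 2 kHz, a small fraction of an octave, to go from full level to full attenuation. At 96 kHz it has 28 kHz, more than an octave, which is why a gentler and cheaper filter can do the job.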