speakers for 24/96 audio


Is it correct to assume that 24/96 audio would be indistinguishable from CD quality when listened to through speakers with a -3 dB point at 20 kHz and a rapid high-frequency roll-off?

Or more precisely, that the only benefit comes from the shift from 16 to 24 bit, not the increased sample rate, as the higher-frequency content is filtered out anyhow?

Related to this, what advice would you have for a sub-$5k speaker set with good high-frequency capabilities for 24/96 audio?

thanks!
mizuno

Showing 12 responses by shadorne

And if they do sound so unpleasant, why when I listen to higher res stuff through a Benchmark DAC it doesn't sound noticeably better?

You can buy Tom Petty's Mojo on CD or in HD and compare. There is a difference, but most of it is due to audio compression applied to the CD master to make it "hot" - see the CD loudness wars and what artists and producers do to the music to make it sell.

Basically they compress everything - especially drums - so that the dynamic range of peaks above RMS is usually no more than 6 to 10 dB, whereas a good pop/rock recording may have 20 dB peaks and a classical recording may have 30 dB peaks above RMS.
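For anyone who wants to check the "peaks above RMS" figure on their own files, here is a minimal sketch of the measurement (usually called crest factor). The function name and the test signals are made up for illustration, and hard clipping is only a crude stand-in for what a mastering limiter does.

```python
import math

def crest_factor_db(samples):
    """Peak level above RMS, in dB, for a sequence of sample values."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)

# One second of a 1 kHz sine sampled at 48 kHz: peak sits about 3 dB above RMS.
sine = [math.sin(2 * math.pi * 1000 * n / 48000) for n in range(48000)]

# Hard-clip the same sine at half scale (assumption: a crude model of heavy limiting).
clipped = [max(-0.5, min(0.5, s)) for s in sine]

print(f"sine:    {crest_factor_db(sine):.2f} dB")
print(f"clipped: {crest_factor_db(clipped):.2f} dB")
```

The clipped version comes out with a smaller peak-to-RMS figure, which is the same direction of change as on a "hot" master.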

The HD files - such as those on HDtracks - are usually much less compressed than the 16/44.1 equivalents.
To add to my last comment... from a purely "technical" perspective I would agree that CD quality is more than adequate as a playback medium. The problem is NOT the CD medium itself but what the producer and mastering engineer do to the music BEFORE it gets issued as a CD.

Of course 24/48 or 24/96 is essential in the studio because there is much more dynamic range and signal manipulation required in that environment.
That video is wrong. It shows a stair-step signal, which is NOT what the output of a DAC looks like. The output is smoothed by a filter in order to eliminate all that horrible spurious high-frequency signal from the stair steps. The output filter removes the stair step and restores the sine wave, so the signals are much more alike - even when sampling only a little above the Nyquist rate. There is absolutely no need to go to 100 Hz sampling to properly render a 10 Hz sine wave.
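A rough sketch of that point, assuming an ideal reconstruction filter (sinc interpolation, which a real DAC filter only approximates): a 10 Hz sine sampled at just 25 Hz is rebuilt between the sample instants. The names and the window length are illustrative choices; the finite window leaves a small truncation error.

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, first_index, fs, t):
    """Whittaker-Shannon interpolation: sum of sinc pulses weighted by the samples."""
    return sum(s * sinc(t * fs - (first_index + k)) for k, s in enumerate(samples))

F_TONE = 10.0  # Hz, the sine wave being sampled
FS = 25.0      # Hz, only 2.5 samples per cycle
N = 1000       # samples kept on each side of t = 0 (finite window)

samples = [math.sin(2 * math.pi * F_TONE * n / FS) for n in range(-N, N + 1)]

# Check the reconstruction halfway between sample instants, near the window centre.
points = [(k + 0.5) / FS for k in range(-5, 5)]
worst = max(
    abs(reconstruct(samples, -N, FS, t) - math.sin(2 * math.pi * F_TONE * t))
    for t in points
)
print(f"worst error between samples: {worst:.5f}")
```

The residual error shrinks as the window grows, so the stair steps carry enough information to recover the smooth sine.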

Warning: not everything you see from universities is accurate.

Also, it is WRONG to compare signals in this way. We hear frequencies, NOT the waveform as presented graphically! The closeness of the waveforms as presented graphically is NOT a proxy for how alike they will sound!
The closer you get to the Nyquist frequency, the more samples you need to properly reconstruct the original waveform - not possible to do for short high-frequency sounds.

Not so. A band-limited waveform is perfectly reconstructed. The mathematics are quite rigorous. The main issues with digital are:

1. Anti alias filtering (higher frequencies must be eliminated prior to ADC or they can fold in)
2. Jitter

Both of the above add spurious non-musical signals. Both can be managed.
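To illustrate item 1, a small sketch of where an unfiltered tone above half the sample rate "folds in". The function name is made up for illustration; the arithmetic is the standard folding rule for sampling.

```python
def alias_frequency(f_in, fs):
    """Frequency (Hz) at which a tone of f_in Hz shows up after sampling at fs Hz
    with no anti-alias filter in front of the ADC."""
    f = f_in % fs
    return fs - f if f > fs / 2 else f

# A 30 kHz tone sampled at 44.1 kHz lands in the audible band.
print(alias_frequency(30000, 44100))  # 14100
# A 10 kHz tone is below half the sample rate and is unaffected.
print(alias_frequency(10000, 44100))  # 10000
```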
On the S/N discussion, this is usually around 100 dB on good gear. I am certain this is achievable because my speakers can hit about 112 dB SPL at the listening position (12 feet back), as measured with an SPL meter, whilst I cannot hear any sound from the tweeter (when no music is playing) unless my ear is within about 6 inches. This translates to roughly 100 dB (taking into account the difference in distance, which is around 12 dB, and assuming the threshold for hearing hiss is around 20 dB in a room with inherent ambient noise).
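The arithmetic behind that estimate can be written out as a short sketch. The function and parameter names are hypothetical, and the 12 dB distance figure and 20 dB hiss threshold are the rough in-room estimates from the post, not measured values.

```python
def estimated_snr_db(peak_spl_db, hiss_threshold_db, distance_loss_db):
    """Rough S/N at the listening seat.

    peak_spl_db:       loudest clean level measured at the seat
    hiss_threshold_db: level at which hiss is just audible close to the tweeter
    distance_loss_db:  assumed drop in hiss level from the tweeter to the seat
    """
    hiss_at_seat_db = hiss_threshold_db - distance_loss_db
    return peak_spl_db - hiss_at_seat_db

# Figures from the post: 112 dB SPL peak, 20 dB threshold, about 12 dB for distance.
print(estimated_snr_db(112, 20, 12))  # 104, i.e. roughly 100 dB
```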

I think the ambient room noise and the speakers peak clean SPL are the limiting factors in a typical setup.

I think tape hiss or vinyl noise is limiting you to about 60 or 70 dB dynamic range on analog recordings.

I think high quality digital recordings can probably achieve around 90 dB dynamic range - limitations being the ambient noise during the recording process.

This is why CD is more than good enough for playback. This is why there are a few redbook CD recordings that are world class.

Of course, in a studio the signals are manipulated - this creates the need for even greater dynamic range (24 bit or 144 dB). Not that they will necessarily have better S/N, but they may want to boost some sounds by 20 dB or so and may apply digital filters (the accuracy of said filters improves significantly if you have more bits).
Most sounds last at least a hundredth of a second or longer. My point is that even for a 15 kHz sound you are likely to be hearing 15000/100 = 150 cycles. It is irrelevant that the amplitude of a few cycles may not be graphically represented perfectly. The context we are talking about is hearing, not the graphical presentation of a waveform.
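As a quick check on the "24 bit or 144 dB" figure, here is a sketch using the usual rule of thumb that each bit of linear PCM is worth about 6 dB. The function name is made up, and the simple 20*log10(2^bits) form ignores dither and noise shaping.

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM: ratio of full scale to one step, in dB."""
    return 20 * math.log10(2 ** bits)

print(f"16 bit: {dynamic_range_db(16):.1f} dB")  # about 96 dB
print(f"24 bit: {dynamic_range_db(24):.1f} dB")  # about 144 dB

# A 20 dB boost in the studio uses up a little over 3 bits of headroom.
print(f"20 dB is about {20 / dynamic_range_db(1):.1f} bits")
```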

Although Kijanki is right about the graphical accuracy, my point is that, as regards human hearing and music, this is not so relevant. In essence the engineers at Sony and Philips did a thorough job when they came up with redbook CD! Perhaps if redbook CD was not as good as it is then SACD would not have failed. The problem is that SACD and other higher resolution formats are very much into diminishing returns compared to a well produced CD.

I would add that the graphical representation of waveforms and the "digital staircase" feed one of the biggest and most enduring audiophile myths: that analog is inherently better than digital. In fact, most of the benefits of analog come from the added distortion that is pleasing to the ear - analog tape machines are wonderful devices for audio compression (removing dynamic range)!
Byron,

There is no solid evidence for this - so it is indeed controversial. If a mere few microseconds were important then speaker and listener position would matter down to a millimeter, or less than a tenth of an inch. It is generally accepted that 1 msec is the point at which time differences become audible (roughly 1 foot). Our ears are roughly 6 to 8 inches apart. Since temporal differences are detected by the difference in arrival at each ear, this all suggests that our "resolution" is close to that length, which is about 0.5 msec in time (at the speed of sound in air).
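The distance and time figures above can be checked with a short sketch. The speed of sound used here (343 m/s, roughly room temperature) is an assumption, and the function names are just for illustration.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C
INCH = 0.0254           # metres per inch

def distance_for_delay(seconds):
    """Distance sound travels in the given time, in metres."""
    return SPEED_OF_SOUND * seconds

def delay_for_distance(metres):
    """Time sound takes to cover the given distance, in seconds."""
    return metres / SPEED_OF_SOUND

print(f"1 msec  -> {distance_for_delay(1e-3) / (12 * INCH):.2f} ft")
print(f"7 inch  -> {delay_for_distance(7 * INCH) * 1e3:.2f} msec")
print(f"5 usec  -> {distance_for_delay(5e-6) * 1e3:.2f} mm")
```

So 1 msec is a bit over a foot, ear spacing corresponds to about half a millisecond, and a few microseconds corresponds to a millimeter or two of travel.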

What these findings may be related to is "jitter" - it has been shown mathematically that non-random time errors can produce audible "sidebands" around musical signals, and that jitter of 1 microsecond can be quite audible due to our ability to hear these non-musical sounds, tones or sidebands. If you increase the sample rate then you will change the way jitter affects the sound - a significantly higher sample rate would likely reduce the deleterious effects of jitter. Some sample rates are noted for being better than others for reducing audible jitter. Benchmark found that 110 kHz worked better than other rates with the DAC chip they use.
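A minimal sketch of the sideband arithmetic, assuming the simplest textbook case: purely sinusoidal jitter on a single tone, treated as narrowband phase modulation (each sideband is about half the peak phase deviation). The function name and the example figures are illustrative, and real jitter spectra are messier than this.

```python
import math

def jitter_sideband_db(tone_hz, jitter_peak_s):
    """Approximate level of each jitter sideband relative to the tone, in dB.

    Assumes sinusoidal jitter with peak timing error jitter_peak_s and a small
    phase deviation, so each sideband amplitude is about beta / 2.
    """
    beta = 2 * math.pi * tone_hz * jitter_peak_s  # peak phase deviation, radians
    return 20 * math.log10(beta / 2)

# 1 microsecond of peak jitter on a 10 kHz tone: sidebands only about 30 dB down.
print(f"{jitter_sideband_db(10_000, 1e-6):.1f} dB")
# 1 nanosecond on the same tone: about 90 dB down.
print(f"{jitter_sideband_db(10_000, 1e-9):.1f} dB")
```

The sidebands appear at the tone frequency plus and minus the jitter frequency, which is why they are not harmonically related to the music.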
Al,

Glad to hear we can all agree. Sony and Philips engineers did a great job with redbook CD, it would indeed be hard to go against all their research.

I agree that transients close to the Nyquist frequency are going to be the most challenging to reproduce faithfully. However, there is really not much in the way of sounds that one can call music above 15 kHz anyway.
That said, I think we are all in agreement that the main usefulness of 24 bits is in the creation of the recording.

Agreed.

However, I would add that high resolution recordings are targeted at audiophiles - so this new high resolution media is useful in that you tend to get a better quality recording that has NOT been heavily compressed for mass consumption. So they ARE useful to audiophiles, not so much because of the "improved resolution" but mostly because audio that gets formatted this way tends to come from a better quality master rather than a master intended for restaurant, pub, iPod & car FM radio play.
I agree with Al.

This shows how good we are at hearing sounds and has nothing to do with temporal resolution.

The wavelength at 7 kHz is about 5 cm. Therefore, in order to get the direct sound completely out of phase at the listener, one need only move one speaker back by 2.5 cm (half a wavelength). This will result in the direct sound being zero, and will probably reduce the SPL enough to be clearly audible. The fact that a movement of only 2.9 mm was audible suggests that reflections may also have played a role here.

The use of a pure signal of a single tone with no (audible) harmonics can often give surprising results! This is not reflective of musical instruments, which have many harmonics, so it is hard to draw any conclusion other than that a test tone produces an audible result. Anyway, my money is that there is enough of an amplitude difference here to make it audible in the case of a pure test tone. A pure test tone will fluctuate as you move around the room: you get peaks and suckouts depending on how it all adds up (reflections and direct sound).
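The half-wavelength arithmetic can be sketched for the direct sound only, assuming two equal-amplitude tones and ignoring every room reflection (which is exactly the part a real room adds). The names and the 343 m/s speed of sound are assumptions for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def wavelength_m(freq_hz):
    return SPEED_OF_SOUND / freq_hz

def combined_level_db(freq_hz, offset_m):
    """Level of two equal direct tones relative to the perfectly aligned case,
    when one path is longer than the other by offset_m."""
    phase = 2 * math.pi * offset_m / wavelength_m(freq_hz)
    amplitude = abs(math.cos(phase / 2))  # normalised so that aligned = 1
    return 20 * math.log10(amplitude) if amplitude > 0 else float("-inf")

print(f"wavelength at 7 kHz: {wavelength_m(7000) * 100:.1f} cm")
print(f"2.9 mm offset:       {combined_level_db(7000, 0.0029):.2f} dB")
print(f"half wavelength:     {combined_level_db(7000, wavelength_m(7000) / 2):.0f} dB")
```

Under these assumptions a 2.9 mm offset changes the direct sound by only a small fraction of a dB, while a half-wavelength offset cancels it almost completely, so the level change at any real listening position depends heavily on how the reflections add up.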
Byron,

I appreciate your questions. You are definitely curious enough to look into this and I commend you on your interest.

However, poor Kunchur seems a very confused individual.

His test simply shows how two pure tones can interfere with each other in a way that becomes audible. However, his conclusions are completely bogus. The listener is NOT hearing time-domain effects of microseconds. The listener is actually hearing changes in the combined resultant waveform, which has been altered by offsetting one source relative to the other (combined - meaning both waves and including all room reflections).

As I explained, this will lead to TOTAL destructive interference of the primary direct signal as heard by the listener at an offset of 2.5 cm. This is like a signal that is TOTALLY out of phase. The direct sound will be inaudible and all the listener hears is the sound around the room (reflected sounds). Since we detect the direction of sound from the relative timing of the wave front (or nerve bundle triggers) across each ear, we lose that ability when a signal is out of phase.

Poor Kunchur is conflating things in a bad way - this is bad science.

However, his remarks about speaker alignment and panels are partly valid. It is almost certain that large radiating surfaces can cause the kind of interference at certain frequencies that he achieved in this experiment. This manifests itself in a speaker response that has many suckouts across the frequency spectrum. In fact the anechoic response of a large panel will look like a comb with many total suckouts across the frequency range. The result is that some sounds and some frequencies will not be as tightly imaged as with a point source speaker. Since most sounds are made up from many harmonics this effect will not be complete, but on the whole it will lead to a larger, more diffuse soundstage with some sounds imaging precisely and others more diffuse than with a point source speaker. There is an audio tool called a flanger that is used for electric guitar - it achieves a similar effect but even stronger.

Also, jitter is not audible in the sense you describe. It is audible when non-random jitter over many thousands or hundreds of thousands of samples combines in a way that introduces new frequencies. We hear those new frequencies that are created by the non-random modulation of the clock (random jitter is just white noise at very low, inaudible levels).

We are totally UNABLE to hear jitter effects on a few samples.
What is the principal advantage of higher sampling rates, if it is not better temporal resolution?

None above redbook CD, except that it allows cheaper and better filtering, which may very slightly improve the audible band. However, higher sample rates do allow you to go to one-bit resolution (like the SACD format, which is a DSD stream, but SACD has very high levels of out-of-band noise - so to be honest I am not sure I accept that it is even as good as 24 bit/96).