Has anyone been able to define well or measure differences between vinyl and digital?


It’s obvious, right? They sound different, and I’m sure they measure differently. Well, we know the dynamic range of CDs is larger than that of vinyl.

But do we have an agreed description or agreed measurements of the differences between vinyl and digital?

I know this is a hot topic so I am asking not for trouble but for well reasoned and detailed replies, if possible. And courtesy among us. Please.

I’ve always wondered why vinyl sounds more open, airy and transparent in the midrange, and why CDs and most digital sound quieter yet lifeless compared with vinyl. YMMV of course; I am looking for the reasons, and for appreciation of one another’s experience.

johnread57

@akgwhiz , some good thoughts. Dither provides added dynamic range in digital. Is there an equivalent with hearing and analog noise? Don't neurons have discrete trigger levels?

 

@teo_audio

The place it counts is in the micro expression of transients and micro transients and the differences in level and timing between them.

This sums it up well.

This is where digital and class D fall apart. Those are the points of greatest distortion, in digital and class D.

I would agree, if "digital and class d" meant "mass market digital and mass market class d". Highest-end digital and class d are much harder to differentiate from highest-end analog.

In science, things are supposed to correlate to the situation at hand. Do you understand the question? Is the measurement relevant to the question at hand? If not, go back to the start and have a go at it again. Even when done, keep questioning the results. Facts don’t exist, so all finalized things can be gone over again and altered according to new results in the complex investigation of it all. That’s science.

Exactly. That's what I was pointing out to certain ASR folks. If a theory doesn't fit facts, keep working on the theory, instead of rejecting facts out of hand.

Engineering is specifically NOT exploration, engineering is designed for building things that work and use scientific theories turned into scientific law. Law...Which is a falsehood built for the engineering trade and training within it, for linear minds which are principally dogmatic in form and function.

I see it differently. Not a falsehood, but a model simple enough to be applicable in an economical way to day-to-day engineering.

In audio, the measurement and the analysis is wrong, just plain wrong. Too many engineering minds on the job, trying to play it safe and keep things ordered & black and white.

Measurements are measurements. If they are done competently, with calibrated instruments, and only what is actually measured is claimed, I'm happy to use them.

Analysis is a different story. Analysis always presupposes a theory, or at least a paradigm. And this I consider too rigid in current mainstream audio.

This is why the audiophile conundrum has existed for about 50 years. The ignorance of projection in the pundits that surround the engineering trade and ideals that are involved in the audio world. Interference (engineers from other areas) from outside audio (even more ignorant!!) helps keep the insanity frothing along nicely.

There are other reasons for relative ignorance of the hearing system's fundamental properties among practicing engineers. One of them is that not all relevant knowledge has even been discovered yet. Another is that some very relevant knowledge was discovered relatively recently, and practicing engineers weren't taught it.

To clarify, an engineer is not trained to commit to the scientific method or invention, they are trained to follow the books, as that is why they are engineers, not scientists who explore and change things as required when required.

Agree. People like me, trained as scientists, are often perceived as "irreverent" in regard to dozens of audio engineering handbooks published over the past several decades. Most engineers (not all) take doubting certain things written in these handbooks as a manifestation of sheer stupidity.

Meanwhile, a whole parallel world of peer-reviewed audio science publications exists. It is instructive to observe how drastically the theories changed over the past 50 years, prompted by more and more sophisticated experiments, and breakthrough discoveries in the field of mammalian auditory system physiology.

If you want to explore in formal sense, go back to school and get trained to see all as theories, which are subject to change from/on new data, tests and proofing, correlation, etc. Get trained as a scientist.

Not practical for most practicing engineers. The change will only occur gradually, as older generations retire and new ones take their place.

When this mess erupts into fully blown projections in insanity of following the dogmatic rule books of engineering, we end up with things like ASR.

I like pretty much all ASR measurements. What I don't agree with is some of the analysis they derive from the measurements. The ASR crowd is very uneven: there are bona fide luminaries posting there, and also folks who keep scoring points for slighting others. Guess who ends up with more points?

The longer a problem sits unsolved, unresolved.. the more fundamental the error in the formulation of the question.

Agree. As an example, the Ptolemaic system was generally believed to be true for about 14 centuries.

Thus, the audiophile conundrum is deeper than this surface level stuff that people generally think it is. It’s deep in the minds involved, regarding how they explore reality.

Indeed.

As long as dogmatic minds try to figure out what is wrong in audiophiles vs measurements, without moving to true and proper scientific method...the longer they’ll be spinning around and getting no real correlating clarity in any of it.

I'd say the truly dogmatic minds don't even try to figure out what is wrong. They just reject the facts as aberrations, just like Ptolemaic scholars of later centuries ignored observed deviations in planetary movements not explainable by their preferred theory.

Let's dissect the thinking behind ignoring one such fact in audio: certain types of music, for instance classical symphonies and gamelan, tend to not sound right when published in CD format.

What is usually offered as grounds for rejecting such a statement? The Sampling Theorem and one of the ways to calculate the dynamic range of a digital format.

 

The Sampling Theorem (https://en.wikipedia.org/wiki/Nyquist–Shannon_sampling_theorem) reads in its original edition:

If a function x(t) contains no frequencies higher than B hertz, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart.

This theorem is often taken as "proof" that a sampling frequency of 44.1 kHz is sufficient for encoding any meaningful music signal. Because, "obviously", everything above 20 kHz can't be heard by humans, and thus is not worth encoding.

Let's look closely. What does "contains no frequencies higher than B hertz" actually mean? It means, using the formulation in the same Wikipedia article, that "Strictly speaking, the theorem only applies to a class of mathematical functions having a Fourier transform that is zero outside of a finite region of frequencies."

Do analog signals corresponding to practical music pieces have a Fourier transform that is zero outside of a finite region of frequencies? Absolutely not! Because, as another theorem from Fourier analysis proves, only functions of infinite duration can have such a Fourier transform.

Let it slowly sink in. The Sampling Theorem, strictly speaking, is not applicable to analog signals corresponding to practical music pieces. But, obviously, some form of Fourier transform is widely used in audio digital signal processing. What's going on here?

What is actually being used, in discrete form, are variations of the Short-Time Fourier Transform.

A fragment of a signal, let's say with a duration of 20-25 milliseconds, is taken, then multiplied by a so-called "smoothing window". The resulting function of time is guaranteed to smoothly start as 0, and smoothly end as 0.

Then the signal is, mathematically, virtually replicated an infinite number of times. Since it is now of infinite duration, the Fourier Transform result has a limited range of frequencies.

Then the process repeats with another fragment of the signal, starting 10-12.5 milliseconds later than the previous piece. For the purposes of digital filters, this process sometimes virtually repeats with a shift of just one digital sample duration.
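For readers who want to see the framing steps described above in concrete form, here is a minimal Python sketch using only the standard library. The frame length (1024 samples, about 23 ms at 44.1 kHz), the half-frame hop, the Hann window, and the 1 kHz test tone are all assumptions chosen for illustration, not values taken from any particular product or codec.

```python
import math

SAMPLE_RATE = 44_100
FRAME = 1024          # assumed frame length, about 23 ms at 44.1 kHz
HOP = FRAME // 2      # next frame starts about 11.6 ms later

def hann(n):
    """Periodic Hann window: starts at 0 and tapers smoothly at both ends."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def windowed_frames(signal, frame=FRAME, hop=HOP):
    """Yield successive overlapping frames, each multiplied by the window."""
    w = hann(frame)
    for start in range(0, len(signal) - frame + 1, hop):
        yield [s * wi for s, wi in zip(signal[start:start + frame], w)]

def dft_magnitude(frame, k):
    """Magnitude of DFT bin k of one frame (direct sum, no FFT)."""
    n = len(frame)
    re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
    im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
    return math.hypot(re, im)

# Hypothetical input: a 1 kHz test tone, 0.1 s long.
tone = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(4410)]
frames = list(windowed_frames(tone))

# Find the strongest bin of the first frame and convert it to hertz.
peak_bin = max(range(FRAME // 2), key=lambda k: dft_magnitude(frames[0], k))
print(len(frames), "frames; strongest bin at about", round(peak_bin * SAMPLE_RATE / FRAME), "Hz")
```

The bin spacing here is 44,100 / 1024, roughly 43 Hz, so the tone is located only to the nearest bin. That is the kind of approximation being discussed: a finite frame gives finite frequency resolution.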

So, in practical applications, digital signal processing uses an approximation of Fourier Transform. Correspondingly, the Sampling Theorem only works approximately. Most of the time more than well enough. Sometimes not at all.

 

Now let's consider the issue of sufficient dynamic range. It is oft-cited that the CD format has a dynamic range of 96 dB. Let's see, approximately, how one could come to such a conclusion. 1 bit corresponds to about 6 dB. So, "obviously", 16 bits correspond to 6 dB x 16 = 96 dB.
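The 6 dB per bit figure comes from each extra bit doubling the number of amplitude steps. A short Python sketch of that arithmetic follows; the function names are made up for illustration, and the result is a ratio between full scale and one step, not an absolute sound pressure level.

```python
import math

def db_per_bit():
    """Level ratio for one extra bit (a factor of 2 in amplitude), in dB."""
    return 20 * math.log10(2)

def pcm_range_db(bits):
    """Ratio of full scale to one quantization step for `bits`-bit PCM, in dB."""
    return 20 * math.log10(2 ** bits)

print(round(db_per_bit(), 2))      # 6.02
print(round(pcm_range_db(16), 1))  # 96.3
```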

According to https://hub.yamaha.com/audio/music/what-is-dynamic-range-and-why-does-it-matter/:

As a group, classical recordings have the widest dynamic range of any genre. The same study cited above found that recorded classical music typically offers between about 20 dB and 32 dB of dynamic range. While that might seem like a lot, it’s still quite a bit smaller than that of a live symphony orchestra performance, which can be as large as 90 dB.

Technically, that is good news, isn't it? Live symphony orchestra dynamic range is 90 dB. CD dynamic range is presumably 96 dB. 90 < 96. So, CD should be able to reproduce the whole dynamic range of a symphony orchestra, right?

In practice though, we have those presumably stupid music producers and audio engineers, who fell to the Dark Side during the Loudness Wars, and who will only record classical music CDs with 20 dB to 32 dB of dynamic range.

What would happen if they attempted the 90 dB? They would need to allocate 90 / 6 = 15 bits to the dynamic range encoding. For the quietest sound, they'd only be left with 1 bit for encoding it. Wait, what?

Yes, imagine a quiet passage in a symphony, nevertheless involving a dozen instruments, each with a complex multi-overtone spectrum. With frequency slides, amplitude rides, tremolos etc. All of this would need to be recorded with just one bit at 44,100 Hz!

This is the exact equivalent of DSD encoding, only its frequency is 64 times lower. Correspondingly, the highest frequency that we can hope to encode with similar fidelity as DSD will be 44,100 / 2 / 64 = 344.5 Hz. Say goodbye to the "micro expression of transients and micro transients"!

How different is what the audio engineers are actually doing? Let's say they decided to limit the dynamic range to 30 dB. This corresponds to 30 / 6 = 5 bits. This leaves 11 bits to encode the quietest part of the symphony.

How good are 11 bits? 2 to the power of 11 is 2,048. 1/2,048 = 0.00049. An average digitization error would be half of that, which is about 0.025%. Interestingly, it is just below the widely accepted threshold of THD defining a hi-fi power amplifier, which is 0.03%.
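The arithmetic in this paragraph can be checked with a few lines of Python. This sketch simply follows the post's own reasoning (half of one step taken as the error figure); it is not a statement about how quantization error is actually perceived.

```python
BITS = 11
levels = 2 ** BITS         # number of steps available with 11 bits
step = 1 / levels          # one step as a fraction of full scale
half_step = step / 2       # the figure the post calls the average digitization error

print(levels)               # 2048
print(f"{step:.5f}")        # 0.00049
print(f"{half_step:.4%}")   # 0.0244%
```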

This is not a coincidence. If they'd allocated fewer bits for the quietest parts of the signal, they would hear noise and distortions in them, similarly to how they'd be able to hear noise and distortions introduced by a low-sound-quality power amplifier.

If they'd wanted to go audiophile quality for the quietest passages, they'd need to up the ante 3 bits more, to 14 bits, so that digitization noise and distortions would be approximately equal to that of a high-quality DAC, and thus would likely be unnoticeable even on high-quality professional headphones.

So, for faithful reproduction of a symphony we would need 90 / 6 = 15 bits for encoding the dynamic range, and 14 bits for encoding the shape of the signal. 15 + 14 = 29 bits. Uh-oh, but professional ADCs and DACs only encode 24 bits? How could they manage to effectively push to 29?

And here we come to an understanding of why arguably overkill digital formats are desirable. The seemingly excessive amount of information per second inherent in 24/192, DSD128, and especially in 24/384 and DSD256 can be divided between encoding the dynamic range, encoding the shape of the quietest signal, and sampling the signal frequently enough to capture its evolution over shorter periods of time.

How does all of the above relate to the current thread theme? By virtue of the analog recording and reproduction system, which in principle doesn't place fundamental limits, other than noise and maximum acceleration of mechanical parts, on either effective bit depth or sampling frequency.

It is commonly accepted that the best analog systems have about 70 dB of dynamic range. Which would roughly correspond to 70 / 6 = 12 bits. This gives an excuse for proponents of CD superiority over LP to claim that this must be so because obviously 16 > 12.

However, in order for the quietest signal to be still distinguishable, it only needs to be 6 dB, or 1 bit, above the noise floor. This leaves an equivalent of 11 bits for dynamic range, which is more than twice the 5 bits of the usable CD dynamic range.

Instead of the last 1 bit with which to encode the shape of the quiet signal, a high-quality analog system has many more. The actual number depends on the analog media granularity and its speed, yet the most important fact is that there is no hard stop similar to the one a digital system would have.

So, the analog system would reproduce the quiet passages in higher fidelity signal-shape wise, superimposed with noticeable noise of course. Yet the human hearing system is capable of filtering out this noise at higher levels of processing in the brain, and enjoying the quiet passage hidden underneath.

Viewed from this perspective, LP has twice as wide a usable dynamic range in comparison with CD. But higher noise and distortions. For classical music especially, this could be a desirable compromise. For some other genres, for instance, extremely-narrow-dynamic-range very-simple-signal-shape electronic dance music, CD could be preferable.

I would expect a classical recording made by multiple microphones sampled at 24/384, or even at 32/384, and delivered in DSD256 after careful mixing and mastering, to be the ultimate one for the time being. As I recall, they produce such recordings in Europe.

@fair

Ref: your long tech commentary and explanation above.

This is the single best explanation I’ve (ever) heard.

Thanks for taking the time to write this here.

 

Some have stated here that digital sounds 'lifeless'.

The reason for this is obvious and immutable. It arises because the analogue signal has been chopped up into billions of pieces. It is chopped up in two dimensions: frequency and time. Once it has been diced in this way, all the expensive gizmos in the world cannot put it back the way it was. It will never sound like the original analogue experience. Of the two dimensions, chopping time is by far the more damaging. However good your clock, the timing will be forever artificial. It will never again sound like the real thing.

Digital sound could be compared with digital images.  It could be said that with sufficient resolution digital imaging can be of very high quality.  This may be so, but for imaging, the image is not chopped in the time dimension. 

@akgwhiz    I do not agree that a preference for vinyl is caused by noise and distortion being "desirable".  The preference arises not from negative attributes of vinyl being perceived perversely as positive, but from the negative consequences of digitisation that cannot be reversed.

@fair ,

There are far too many errors and misinterpretations in your post. It will be highly misleading to someone who is not familiar with digital audio.

Start off with sampling theory and window functions. Infinite time is only required for infinite precision. We obviously do not need infinite precision as our ears do not have infinite dynamic range, and audio does not extend to 0 Hz. In purely practical terms, the inherent noise of the quietest rooms and the onset of pain set hard limits on what we need. Hence we do not even need infinite time. The windowing function does its required job. Sampling theorem is absolutely applicable to music. These theories are tested day in and day out. All our communications are based on them.

Short-time Fourier transforms are primarily analysis functions. They make pretty graphs, and are used for signal analysis. The data that comes out of them is bounded by the window width, which defines the lowest frequency that can be represented, and by the sample rate, which sets the upper bound; both together define how fine a frequency analysis can be done. They do what they do accurately, understanding their limitations.
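The point about finite time versus finite precision can be illustrated with a small Python sketch. It reconstructs a 1 kHz tone between two sample instants using the textbook sinc interpolation formula, but with only a finite number of samples on each side. The tone, the evaluation point, and the window sizes are assumptions picked for illustration.

```python
import math

FS = 44_100.0     # sampling rate, Hz
F0 = 1_000.0      # test tone, well below FS / 2

def x(t):
    """The continuous-time signal being sampled."""
    return math.sin(2 * math.pi * F0 * t)

def sinc(v):
    return 1.0 if v == 0 else math.sin(math.pi * v) / (math.pi * v)

def reconstruct(t, half_width):
    """Sinc interpolation using only samples n = -half_width .. half_width."""
    return sum(x(n / FS) * sinc(FS * t - n)
               for n in range(-half_width, half_width + 1))

t = 0.25 / FS     # a quarter of the way between two sample instants
for half_width in (50, 500, 5000):
    err = abs(reconstruct(t, half_width) - x(t))
    print(half_width, f"{err:.2e}")
```

The error is not zero with a finite number of samples, but it shrinks as more samples are included, which is the trade between duration and precision described above. Practical converters use far better-behaved filters than a bluntly truncated sinc, so this sketch is a pessimistic case.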

This is the exact equivalent of DSD encoding, only its frequency is 64 times lower. Correspondingly, the highest frequency that we can hope to encode with similar fidelity as DSD will be 44,100 / 2 / 64 = 344.5 Hz. Say goodbye to the "micro expression of transients and micro transients"!

I am going to highlight this last paragraph. This is 100% false. That is not how DSD works. The single bit in DSD is not equivalent to a single bit change in PCM. No direct comparisons can be made. Hence your conclusion cannot be drawn and can be assumed false.
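To make the difference concrete, here is a minimal Python sketch of a first-order delta-sigma modulator, the simplest relative of what DSD encoders do. Real DSD uses higher-order modulators, so this is an illustration under simplified assumptions and not a model of any actual encoder; the tone frequency and amplitude are arbitrary.

```python
import math

PCM_RATE = 44_100
DSD_RATE = 64 * PCM_RATE      # 2,822,400 one-bit samples per second
TONE_HZ = 1_000
AMPLITUDE = 0.5               # relative to the +1 / -1 output levels

def delta_sigma_1bit(samples):
    """First-order modulator: output is +1 or -1, and the error is fed back."""
    integrator, bits = 0.0, []
    for s in samples:
        bit = 1 if integrator >= 0 else -1
        integrator += s - bit   # running difference between input and output so far
        bits.append(bit)
    return bits

n = DSD_RATE // 100           # 10 ms, exactly 10 cycles of the tone
reference = [math.sin(2 * math.pi * TONE_HZ * i / DSD_RATE) for i in range(n)]
bits = delta_sigma_1bit([AMPLITUDE * r for r in reference])

# Correlate the one-bit stream with the tone to estimate the encoded amplitude.
recovered = 2 * sum(b * r for b, r in zip(bits, reference)) / n
print(sorted(set(bits)), round(recovered, 3))
```

Every output sample is just +1 or -1, yet the tone's amplitude comes back close to 0.5. The information sits in the density of the pulses, not in the size of a single step, which is why one DSD bit cannot be compared with one PCM least significant bit.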

There are two flaws in your statement of equivalence between 11 bits and 0.03% distortion detection. More like 3 flaws. That distortion limit is at full scale. Assume your stereo is set for 100 dB peaks, which is fairly loud, and you have low distortion playback. There is a particular distortion level evident at that volume. In your analysis, you are claiming to be able to hear distortion at the bit level, on sounds that are only 70 dB. Are you claiming to be able to hear 0.03% distortion on a 70 dB peak signal? Not average, peak. That is a low volume level. If you are in a very quiet room, 25 dB, that is only 45 dB above the noise of your room. That is 8 bits. So there are still 3 bits of additional digital range below the noise floor. Further, CD is dithered. Dither improves the dynamic range where our hearing is most sensitive, in exchange for added noise where it is not. That extends the dynamic range, where we are most sensitive, to 110 dB. Your argument fails with that information.
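A small Python sketch shows the basic effect of dither on a signal smaller than one quantization step. It uses plain triangular (TPDF) dither with a fixed random seed; the noise-shaped dither mentioned above is more elaborate, so this demonstrates only the principle and says nothing about the 110 dB figure. The 0.4-step tone is an assumption for illustration.

```python
import math
import random

RATE = 44_100
TONE_HZ = 1_000

def quantize(v):
    """Round to the nearest whole step (1 step = 1 LSB)."""
    return math.floor(v + 0.5)

def tpdf():
    """Triangular dither spanning +/-1 LSB (difference of two uniform values)."""
    return random.random() - random.random()

random.seed(0)
reference = [math.sin(2 * math.pi * TONE_HZ * i / RATE) for i in range(RATE)]
signal = [0.4 * r for r in reference]          # peak of only 0.4 LSB

plain = [quantize(s) for s in signal]
dithered = [quantize(s + tpdf()) for s in signal]

def recovered_amplitude(samples):
    """Amplitude of the 1 kHz component, by correlation with the original tone."""
    return 2 * sum(q * r for q, r in zip(samples, reference)) / len(samples)

print(recovered_amplitude(plain))               # 0.0, the tone is rounded away
print(round(recovered_amplitude(dithered), 2))  # close to 0.4
```

Without dither the tone vanishes entirely. With dither it survives underneath added noise, so the last bit is not a hard floor for what can be encoded.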

 

So, for faithful reproduction of a symphony we would need 90 / 6 = 15 bits for encoding the dynamic range, and 14 bits for encoding the shape of the signal. 15 + 14 = 29 bits.

This is obviously not at all accurate. You are stacking flaws in your understanding of how digital works to come to incorrect conclusions. The digital bit depth only needs to be large enough to encompass the full dynamic range. By shifting noise, we don't even need that many bits for the dynamic range. DSD has 1 bit depth. The noise is shifted to provide large dynamic range. CD has 16 bits. The noise is shifted to increase the dynamic range.
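As a rough illustration of what shifting noise means, the Python sketch below compares ordinary rounding with first-order error feedback, where each rounding error is subtracted from the next sample. The error is then averaged over blocks of 64 samples as a crude stand-in for the low-frequency part of the band. The test tone and block size are assumptions, and real noise shapers use higher-order, psychoacoustically tuned filters.

```python
import math

def round_plain(samples):
    """Ordinary rounding to the nearest whole step."""
    return [math.floor(s + 0.5) for s in samples]

def round_shaped(samples):
    """First-order error feedback: each rounding error is subtracted from the next sample."""
    out, err = [], 0.0
    for s in samples:
        v = s - err
        q = math.floor(v + 0.5)
        err = q - v
        out.append(q)
    return out

def lowband_error_rms(samples, quantized, block=64):
    """RMS of the error after averaging over `block` samples (a crude low-pass)."""
    means = []
    for start in range(0, len(samples) - block + 1, block):
        diff = [q - s for s, q in zip(samples[start:start + block],
                                      quantized[start:start + block])]
        means.append(sum(diff) / block)
    return math.sqrt(sum(m * m for m in means) / len(means))

# Hypothetical test signal: a 997 Hz tone, 100.3 steps in amplitude, at 44.1 kHz.
signal = [100.3 * math.sin(2 * math.pi * 997 * i / 44_100) for i in range(64 * 500)]
plain = lowband_error_rms(signal, round_plain(signal))
shaped = lowband_error_rms(signal, round_shaped(signal))
print(f"plain {plain:.4f}  shaped {shaped:.4f}")
```

The total error is not smaller with feedback. It is moved toward high frequencies, so the slowly varying part of the error drops. That is the sense in which the noise is shifted instead of removed.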

 

However, in order for the quietest signal to be still distinguishable, it only needs to be 6 dB, or 1 bit, above the noise floor. This leaves an equivalent of 11 bits for dynamic range, which is more than twice the 5 bits of the usable CD dynamic range.

 

You are basing this conclusion on a stack of fundamental flaws. It does not represent reality. More accurate is that we can hear below the noise floor. Vinyl has a signal-to-noise ratio of about 70 dB, sometimes higher, but the dynamic range can extend 10 or 20 dB further. CD almost beats this with a raw dynamic range over 90 dB. Dithering extends this to 110 dB, far higher than vinyl.

 

Viewed from this perspective, LP has twice as wide a usable dynamic range in comparison with CD. But higher noise and distortions.

This is also based on a stack of flawed assumptions. It is incorrect.