How Science Got Sound Wrong


I don't believe I've posted this before, or that it has been posted before, but I found it quite interesting despite its technical aspect. I didn't post this for a digital vs analog discussion; we've beaten that horse to death several times. I play 90% vinyl, but I can still enjoy my CDs.

https://www.fairobserver.com/more/science/neil-young-vinyl-lp-records-digital-audio-science-news-wil...
artemus_5
@erik_squires "Microtime, as the article envisions it, is not a thing."

Don't agree. It seems to me that he defines it quite clearly in terms of microsecond (neural) phenomena. And also, it seems to me that someone with a Ph.D. in this area is likely to know something about this area.

Where he could be clearer is about the relationship between math and science - like how to not screw it up when applying math to the physical world. But that's a highly technical subject all on its own (for access to the literature see Theory of Measurement by Krantz et al, Academic Press, in 3 volumes), and surprise, many scientists get it quite wrong. Let alone engineers.
It is an interesting article, and I certainly will not fault his credentials w.r.t. neurobiology, though it sounds like his knowledge w.r.t. the auditory processing system is two skin-layers deep, no doubt still deeper than mine. But even that I will not fault.

What I will fault is his knowledge of signal processing and how it relates to analog/digital conversion and analog signal reconstruction. He seems to possess the same limitations in his knowledge as Teo_Audio illustrates above with his record example, as Millercarbon alludes to, and as whoever did that calculation w.r.t. bandwidth.

I will start off with the usual example. Records, almost all of them made in the last 2 decades (and longer), were recorded, mixed, and mastered on digital recording and processing systems. Therefore, whatever disadvantages you think apply to digital systems w.r.t. this timing "thing" absolutely and unequivocally apply to records recorded digitally.

So back to the paper, Teo’s error in logic/knowledge, and miller’s interpretation. The most recent research shows that we lowly humans can detect a difference in arrival time of a signal between the two ears of about 5-10 microseconds. Using mainly that, plus other information, we can place the angle of a source in front of us to about 1 degree. At the speed of sound, 5 usec corresponds to roughly 1.7 mm of travel. Divide the circumference of the head by that and you get a few hundred steps, i.e. about 1 degree of resolution. Following?
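A quick sanity check on that arithmetic. Note the 343 m/s speed of sound and the 56 cm head circumference are assumed typical values, not figures from the post:

```python
# Checking the localization arithmetic above. Speed of sound (343 m/s)
# and head circumference (56 cm) are assumed typical values.
speed_of_sound = 343.0                   # m/s in air at room temperature
itd = 5e-6                               # 5 us interaural timing resolution
path_difference = speed_of_sound * itd   # extra distance sound travels

head_circumference = 0.56                # metres
steps = head_circumference / path_difference   # distinguishable positions
degrees_per_step = 360.0 / steps
print(f"{path_difference * 1000:.2f} mm, ~{degrees_per_step:.1f} degree resolution")
```

It comes out to about 1.7 mm of travel and roughly 1 degree of angular resolution, consistent with the numbers above.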

So how does the brain measure this timing? According to the latest research, it appears to have two mechanisms that overlap. One works on higher frequencies, those whose wavelength is smaller than the size of the head, and is based on group delay/correlation: the brain matches the same signal arriving at both ears and times the difference. The other, for lower frequencies, can detect phase, likely via a simple comparator and timing mechanism. Still following? You will note this happens at relatively low frequencies, i.e. still within the range identified for human hearing. I know, I know ... but the timing, what about the timing? So let’s talk about that.
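The correlation mechanism can be sketched in a few lines. This is a toy illustration only; the sample rate, noise stand-in, and delay are all invented values:

```python
import numpy as np

# Toy sketch of the correlation mechanism described above: match the
# signal arriving at both ears and time the difference.
fs = 192_000
rng = np.random.default_rng(0)
left = rng.standard_normal(4096)          # stand-in for one ear's signal
delay_samples = 7                         # ~36 us at this sample rate
right = np.roll(left, delay_samples)      # same signal, arriving later

# Slide one signal against the other; the best-matching lag is the delay
lags = np.arange(-50, 51)
corr = [np.dot(left, np.roll(right, -k)) for k in lags]
best_lag = lags[int(np.argmax(corr))]
print(best_lag)  # -> 7
```

The brain obviously does not do FFTs or dot products, but the principle, finding the lag at which the two ear signals line up best, is the same.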

First a statement: In a bandwidth-limited system (as digital audio systems are), any signal on those two (or more) channels will be time-accurate to the jitter and SNR limit of the system, and NOT the sampling rate. Let me state that another way. Any difference in timing captured by a digital audio system, assuming the signal is within the frequency limits of that system, will be captured. Let me state it a 3rd way with an example. We have a 96 kHz ADC with 10 picoseconds of jitter. We have two identical signals, bandwidth-limited to, say, 10 kHz. One signal arrives at the first ADC 1 microsecond before it arrives at the other ADC. We then store it and play it back. What will we get? ... We will get 2 signals, essentially exactly the same, with one signal delayed by 1 microsecond.
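This is easy to demonstrate numerically. Below is a sketch of that exact scenario: a 1 us inter-channel delay, far smaller than the ~10.4 us sample period at 96 kHz, survives sampling intact. The tone frequencies and FFT size are made up for the illustration, and the tones are placed on exact FFT bins so the demo is leakage-free:

```python
import numpy as np

# A 1 us inter-channel delay is far below the 96 kHz sample period,
# yet it is fully preserved in the sampled data.
fs, N = 96_000, 8192
t = np.arange(N) / fs
bins = np.array([85, 300, 680])           # ~1.0, 3.5, 8.0 kHz: inside the band
freqs = bins * fs / N

def signal(t):
    return sum(np.sin(2 * np.pi * f0 * t) for f0 in freqs)

delay = 1e-6                              # 1 microsecond
ch1 = signal(t)                           # first ADC
ch2 = signal(t - delay)                   # second ADC, signal arrives 1 us later

# Recover the delay from the sampled data via cross-spectrum phase
S1, S2 = np.fft.rfft(ch1), np.fft.rfft(ch2)
phase = np.angle(S2[bins] * np.conj(S1[bins]))
est = -phase / (2 * np.pi * freqs)        # each tone reports the same delay
print(est * 1e6)                          # ~[1. 1. 1.] microseconds
```

Every frequency component carries the delay in its phase, so the 1 us offset is recovered exactly, limited in practice only by jitter and noise, not by the sample rate.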

So, all those arguments the neurobiologist made in that extensive article, all his knowledge, are for naught because he does not understand digital signal processing, ADC systems, and analog reconstruction. If he did, he would have known that digital audio systems, within the limits of bandwidth, are not limited in inter-channel timing accuracy by the sample rate, but by the jitter. Whether the signals leave both channels at time A or time B does not matter, as long as the relationship in timing between the two channels is accurate .... which it is in digital audio systems.

.... and if you are reading this GK, not once did I need to consult wikipedia :-) ...
David, I don't quite follow your third last paragraph.

Your "first statement" is indeed a statement, capable of being true or false, but is it true? It needs justification, it seems to me. It is not the same at all as the sentence following, "Let me state that another way." And the sentences following "Let me state that a 3rd way ..." do not convince me that the phenomenon is independent of sampling rate.

The nature of the signals is irrelevant. It is the relative timing of the encoding that matters. If the sampling rate is not high enough, or the jitter rate not low enough, then two signals differing by 1 microsecond will be encoded as identical.

Perhaps an example will help you to understand my confusion. It seems to me that if sampling is done at a frequency of 1Hz, and two signals differing by 1 us are detected, they will be encoded in the same pulse about 999,999 times out of 1,000,000. Which logically implies that sampling rate is intrinsic to the issue. 

Perhaps you could point out the source of my confusion.
New here, but I found his points, yes his science, very intriguing. To the point I thought, heck, he's got it right. But I can't help but wonder: even a fully digital stream/source/path ultimately has to be reproduced through a vibrating speaker. It seems that this is a massive integration or smoothing, each connected (albeit complex) peak and trough lasting way longer than the neural timing. Accepting his points, maybe this is digital's way to get by as well as it does. BTW, I'm not picking sides, just the way I stated it.
terry9, No worries on being confused about this. I find that many audio writers, many people in the audio industry period, and certainly many (most) on audio forums do not get this concept. When you do the math (no, literally go through the math), which I have not done in years, it becomes quite obvious how it works (after the 3rd or 4th reading).

Let me do a more real-world signal. We have a 24-bit audio system, so it captures with a resolution of about 1/16.7 million, though practically it will be closer to 1/1-2 million. Let’s say the system is sampling at 100 kHz, and the system is bandwidth-limited to 20 kHz. Now let’s say we have a 1 kHz signal.

One key concept in a bandwidth-limited system is that you cannot have a pulse just 1 waveform long, i.e. you can’t have a 1 kHz waveform that lasts exactly 1 cycle. That would violate the bandwidth of the system, because in a bandwidth-limited system you cannot start and stop instantly. You can’t start and stop instantly in the real world either.
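You can see this by looking at the spectrum of a single isolated cycle. The sketch below (sample rate and FFT size are arbitrary choices of mine) shows that one cycle of a 1 kHz sine has energy spread far up the spectrum, only tens of dB below the 1 kHz peak, so it is not really a "1 kHz signal" at all:

```python
import numpy as np

# One isolated cycle of a 1 kHz sine: truncation spreads its energy
# well beyond 1 kHz, so a bandwidth-limited system cannot carry it.
fs = 100_000
n = np.arange(fs // 1000)                 # 100 samples = exactly one 1 kHz cycle
burst = np.zeros(4096)
burst[:len(n)] = np.sin(2 * np.pi * 1000 * n / fs)

spec = np.abs(np.fft.rfft(burst))
f = np.fft.rfftfreq(len(burst), 1 / fs)
peak = spec.max()                         # near 1 kHz
above_20k = spec[f > 20_000].max()        # energy beyond the audio band
db_down = 20 * np.log10(above_20k / peak)
print(f"energy above 20 kHz: {db_down:.1f} dB below the 1 kHz peak")
```

A 20 kHz bandwidth-limited system would strip that out-of-band energy, which is exactly why the single-cycle pulse cannot exist inside it unchanged.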

Here is where it gets harder. These two signals, both 1 kHz tones, separated by 1 microsecond, arrive at the two ADCs. Let’s assume that Signal B arrives at Channel 2, 1 microsecond before Signal A arrives at Channel 1. To make the math easy for me, let’s assume that Signal A arrives at exactly 0 phase. Here are the digital outputs for the first 10 samples at 1 kHz and 20 kHz. This is a DC-offset AC signal, so the numbers range from 0 to 2^24 - 1.

You can easily tell these numbers do not represent the same signal; there is definitely something different about them. Your next question may be about accuracy/resolution. Jitter will obviously impact the inter-channel timing accuracy. I have not looked at the math in a while, but I remember that as the signal level approaches the noise floor, the inter-channel timing uncertainty increases.

  • 1 kHz
  • Ch1 / Ch2
  • 8,388,608 / 8,441,314
  • 8,915,333 / 8,967,925
  • 9,439,979 / 9,492,249
  • 9,960,476 / 10,012,218
  • 10,474,769 / 10,525,779
  • 10,980,830 / 11,030,906
  • 11,476,660 / 11,525,605
  • 11,960,303 / 12,007,923
  • 12,429,850 / 12,475,958
  • 12,883,448 / 12,927,862

We can do it at 20 kHz as well
  • Ch1 / Ch2
  • 8,388,608 / 9,439,979
  • 16,366,648 / 16,628,630
  • 13,319,308 / 12,429,850
  • 3,457,907 / 2,646,210
  • 410,567 / 798,368
  • 8,388,607 / 9,439,979
  • 16,366,648 / 16,628,630
  • 13,319,308 / 12,429,850
  • 3,457,907 / 2,646,210
  • 410,567 / 798,368
This is 20 kHz, 90 dB down from full scale. As you can see, there are still substantial differences between the channels. This is 20-30 dB above the noise floor of a good ADC.

  • Ch1 / Ch2
  • 265 / 298
  • 281 / 314
  • 298 / 331
  • 314 / 347
  • 331 / 362
  • 347 / 378
  • 362 / 393
  • 378 / 407
  • 393 / 421
  • 407 / 434
