We are getting somewhere.
From where I am sitting you have not provided one explanation because every single explanation or example you have used is wrong, stacking misunderstanding on top of misunderstanding.
This is precisely how it should feel, from the point of view of someone remaining in an old paradigm. A new paradigm overturns some of the old paradigm's assumptions and conclusions, which obviously looks "wrong" in the context of the old paradigm.
Fourier analysis is not a paradigm, it is a mathematical translation from time to frequency, it just is.
Mathematically, Fourier analysis is a theory based on integral transforms with harmonic kernels. "Integral" means that there is an integral involved, calculated from a lower bound of integration to an upper bound.
The direct (forward) Fourier transform takes its bounds from the time domain. The reverse (inverse) Fourier transform takes its bounds from the frequency domain. The time domain and the frequency domain can each be, as classes of specific cases, continuous or discrete.
This theory is beautiful in its simplicity. For instance, formulas for direct and reverse transforms, in their traditional formulation, only differ in one sign in one place. The simplicity affords efficient implementation of the transforms in computer code.
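To make the "one sign in one place" remark concrete, here is a minimal sketch of the discrete case in plain Python. The function names `dft` and `idft` are illustrative only. It uses the common engineering convention in which the inverse also carries a 1/N scale factor; in a symmetric (unitary) convention the sign of the exponent would be the only difference.

```python
import cmath

def dft(x, sign=-1):
    """Naive O(N^2) discrete Fourier transform.

    sign=-1 is the forward transform; sign=+1 is the unscaled inverse.
    The only thing that changes between the two is the sign of the exponent.
    """
    n = len(x)
    return [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(spectrum):
    """Inverse transform: same harmonic kernel, opposite sign, scaled by 1/N."""
    n = len(spectrum)
    return [value / n for value in dft(spectrum, sign=+1)]

# Arbitrary test sequence (not taken from any recording).
signal = [0.0, 1.0, 0.0, -1.0, 0.5, 0.25, -0.75, 0.0]
roundtrip = idft(dft(signal))

# Largest deviation between the original and the round-tripped samples.
max_error = max(abs(a - b) for a, b in zip(signal, roundtrip))
print(max_error)
```

The round trip recovers the input to within floating-point rounding. Fast implementations (FFT) compute the same sums with far fewer operations.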
The accuracy, as I previously wrote, is based on suitable bandwidth limitations and appropriate windowing functions, much of which occurs naturally in audio, but is still supplemented by the appropriate analog filters, oversampling, and digital processing.
And here we move away from the theory and arrive at a paradigm. The "suitable bandwidth limitations" remove information contained in the original analog air pressure variations over time at the point of recording.
The central belief of the paradigm states that removal of frequency components below 20 Hz and above 20 kHz is perceptually benign, for all types of music and all listeners. Technically, this is the crux of our disagreement. I do not subscribe to this belief.
People are not just guessing at the implementation and not considering what the underlying waveforms can and do look like. Let me break just one section down to illustrate your logic flaws and misunderstandings. It carries through to the rest of what you have written:
You start with a flawed premise, proceed to a flawed understanding of digitization, and finish with an incorrect understanding of reconstruction.
I value your opinion. I couldn't have asked for a better illustration of what I had to endure in my prior discussions at ASR.
However, my professors, from leading European universities, and their teaching assistants, had other opinions, giving me straight As in all courses related to Fourier analysis and DSP.
The theory I'm using today contains the classic Fourier Analysis, and classic DSP based on it, as subsets. I absolutely do use them in domains of their applicability, when I believe they are going to provide accuracy sufficient for a task at hand.
Yet there is more, which came mostly from research conducted by others over the past three decades. Unfortunately, too much of it is still widely dispersed across numerous peer-reviewed papers, rather than concentrated in a few engineering handbooks.
Flawed premise: A 12 kHz sine wave does not suddenly appear, starting at 0. As I previously wrote, we are dealing with a bandwidth-limited and defined system. You cannot go from 0, silence, directly into what looks exactly like a sine wave. That transition exceeds the 20 kHz limit (or whatever limit we are using). Also, the digitizer, filters, etc. will have been running and settled to the required accuracy by the time this tone burst arrives. Whatever you send it will have been limited in frequency, by design, by the analog filters preceding the digitizer.
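The bandwidth point in this premise can be checked numerically. The sketch below is an illustration under stated assumptions, not a model of any real recording chain. It gates an ideal 12 kHz sine on and off abruptly and estimates how much of its energy lies above 20 kHz. The 192 kHz analysis rate, the 4-cycle burst length, and the 512-point transform are arbitrary choices.

```python
import cmath
import math

FS = 192_000        # analysis sample rate in Hz (assumed)
F_TONE = 12_000     # tone frequency from the post, in Hz
CYCLES = 4          # burst length in cycles (assumed)
N_FFT = 512         # zero-padded transform length (assumed)
BAND_EDGE = 20_000  # band edge from the post, in Hz

# An ideal sine that starts at zero phase and stops abruptly after CYCLES cycles.
burst_len = CYCLES * FS // F_TONE
burst = [math.sin(2 * math.pi * F_TONE * n / FS) for n in range(burst_len)]

def energy_spectrum(samples, n_fft):
    """One-sided energy spectrum of `samples`, zero-padded to n_fft points."""
    half = n_fft // 2
    spectrum = []
    for k in range(half + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * t / n_fft)
                  for t, s in enumerate(samples))
        # Bins strictly between DC and Nyquist stand for both +f and -f.
        weight = 1.0 if k in (0, half) else 2.0
        spectrum.append(weight * abs(acc) ** 2)
    return spectrum

spectrum = energy_spectrum(burst, N_FFT)
bin_hz = FS / N_FFT
total = sum(spectrum)
above = sum(e for k, e in enumerate(spectrum) if k * bin_hz > BAND_EDGE)
fraction_above = above / total

print(f"fraction of burst energy above {BAND_EDGE} Hz: {fraction_above:.4f}")
```

A nonzero fraction only shows that an ideal gated sine is not band-limited to 20 kHz. It says nothing about whether real instruments produce such onsets or whether the out-of-band part is audible, which is the actual point of disagreement here.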
When one writes enough code processing real-life music recorded with high enough fidelity, one absolutely starts believing that such music components do exist: going from zero to almost pain threshold in a matter of microseconds, and then rapidly decaying.
One of the best examples of music genres rich in such components that I know of is Indonesian Gamelan. It is a curious genre: worshiped by its devotees in native land, and almost completely ignored by listeners outside the region.
Even the best gamelan CD recordings of famous Indonesian ensembles sound to me like incoherent early practices. Live, classical gamelan compositions, played with passion by experienced musicians, sound heavenly to me.
Flawed understanding of digitization: As written above, the digitizer was already running when the tone burst arrived. Whether or not the sample clock is shifted globally by the equivalent of 1/8 of a period of a 12 kHz tone will have no impact on the digitization of the information in the band-limited analog signal.
This depends greatly on the nature of the band-limiting filter used. For analog filters this statement is generally true, with the understanding that perfect brick-wall filters don't exist, so there are still some smaller artifacts to be expected because of that. For digital filters applied to the stream of oversampled values in some ADC devices, not so much.
Flawed understanding of reconstruction: When I reconstruct the analog signal using the captured data, whether I use the original clock or the shifted one, the resulting waveform will be exactly the same. Relative to the data file, all the analog information will be shifted by about 10 microseconds. That will happen equally on all channels. The waveforms will look exactly the same in either case. One set of data files will have an extra 10 microseconds of silence at the front (or at the end).
See my comment above. Depends on the nature of bandwidth-limiting filter.
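For reference, the "about 10 microseconds" figure follows directly from the 1/8-period shift of a 12 kHz tone. A small arithmetic sketch, in which the 44.1 kHz sample rate is an assumption added for scale:

```python
F_TONE = 12_000.0   # tone frequency from the post, in Hz
FS_CD = 44_100.0    # sample rate in Hz (assumed, for scale only)

period_us = 1e6 / F_TONE                 # one period of the tone, in microseconds
shift_us = period_us / 8                 # the 1/8-period clock shift
shift_in_samples = shift_us * 1e-6 * FS_CD
phase_deg = 360.0 / 8                    # the same shift as phase at 12 kHz

print(f"period: {period_us:.2f} us")
print(f"shift:  {shift_us:.2f} us = {shift_in_samples:.3f} samples at {FS_CD:.0f} Hz")
print(f"phase at {F_TONE:.0f} Hz: {phase_deg:.0f} degrees")
```

So the shift is a bit under half a sample interval at 44.1 kHz. Whether a sub-sample shift is reproduced exactly depends on the band-limiting filter, as argued above.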
I am sure you believe this, but you used flawed logic, a flawed understanding of the waveform, and a flawed understanding of digitization, reconstruction, and the associated math.
I don't believe so. I used a more sophisticated understanding of those, knowing, both from learning the theory and from practical experience, that absolute predicted perfection isn't practically achievable, and that one needs to look very carefully at what artifacts are produced by this or that digitization method, and whether the artifacts may be heard under certain conditions by certain listeners.
I went back and looked at the research. In lab-controlled situations, humans can detect a very specific signal up to 25 dB below the noise floor, A-weighted. That is not listening to music; that is an experiment designed to give a human the best possible chance. For vinyl, that means that in a controlled experiment you could maybe hear a tone at -95 dB, referencing 0 dB as max. With CD, the same would be true at -110 dB (or more) due to the 100% use of dithering.
Research on masking of signal by noise is kind of the 101 of psychoacoustics. What we already established is that I'm more interested in how practically encountered noise masks practically encountered quiet music passages. I do realize that masking thresholds for specially constructed signals and noise patterns may be different.
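The decibel figures in the quoted paragraph can be unpacked with a little arithmetic. In the sketch below, the implied noise floors are simply the quoted detection levels plus the quoted 25 dB margin. They are an inference from the numbers as stated, not measurements of any medium.

```python
def db_to_amplitude_ratio(level_db):
    """Convert a level in dB (relative to full scale) to a linear amplitude ratio."""
    return 10.0 ** (level_db / 20.0)

DETECTION_MARGIN_DB = 25.0   # "up to 25 dB below the noise floor" (quoted)
DETECTABLE_VINYL_DB = -95.0  # quoted figure for vinyl
DETECTABLE_CD_DB = -110.0    # quoted figure for CD

# Noise floors implied by the quoted figures (inferred, not measured).
implied_floor_vinyl_db = DETECTABLE_VINYL_DB + DETECTION_MARGIN_DB
implied_floor_cd_db = DETECTABLE_CD_DB + DETECTION_MARGIN_DB

# The same detection levels expressed as fractions of full-scale amplitude.
ratio_vinyl = db_to_amplitude_ratio(DETECTABLE_VINYL_DB)
ratio_cd = db_to_amplitude_ratio(DETECTABLE_CD_DB)

print(implied_floor_vinyl_db, implied_floor_cd_db)
print(f"{ratio_vinyl:.3e} {ratio_cd:.3e}")
```

In linear terms the quoted tones sit at roughly 18 and 3 millionths of full-scale amplitude.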
To be sure we are on the same page. Class-D amplifiers are analog amplifiers. They are not digital. I will correct you. Perception of distortion. You are making an assumption of something that is there, without proof it is there.
Yes and no. Implemented with analog circuitry, yes. But at some point inside a class-D amp the analog signal is transformed into a sequence of discrete +V and -V segments, starting and ending at analog time boundaries.
So, it is kind of a hybrid. Analog in the time domain throughout, discrete at an intermediate stage in the amplitude domain. Not what most people would call classically digital, yet not quite purely analog either.
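The "discrete in amplitude, analog in time" stage can be sketched with a simple pulse-width modulator that compares the input against a triangle carrier. This is a generic textbook scheme used here for illustration, not a description of any particular amplifier. All numeric parameters are assumptions, and the fine time grid is only a simulation device: in a real circuit the switching instants are continuous.

```python
import math

V = 1.0                  # rail voltage, normalized (hypothetical)
F_CARRIER = 400_000.0    # switching frequency in Hz (assumed)
F_SIGNAL = 1_000.0       # input tone in Hz (assumed)
AMPLITUDE = 0.5          # input amplitude as a fraction of V (assumed)
STEPS = 200              # simulation grid points per carrier period

def triangle(phase):
    """Triangle carrier in [-1, 1]; phase runs over [0, 1) within one period."""
    return 4.0 * abs(phase - 0.5) - 1.0

def pwm_period(carrier_index):
    """Two-level output over one carrier period, sampled on a fine time grid."""
    levels = []
    for i in range(STEPS):
        phase = (i + 0.5) / STEPS
        t = (carrier_index + phase) / F_CARRIER
        x = AMPLITUDE * V * math.sin(2.0 * math.pi * F_SIGNAL * t)
        # The output only ever takes the values +V or -V.
        levels.append(V if x > V * triangle(phase) else -V)
    return levels

def period_average(carrier_index):
    """Crude stand-in for the output low-pass filter: mean over one period."""
    levels = pwm_period(carrier_index)
    return sum(levels) / len(levels)

peak_period = int(F_CARRIER / (4 * F_SIGNAL))  # carrier period at the tone's peak
levels_at_peak = pwm_period(peak_period)
average_at_peak = period_average(peak_period)

print(sorted(set(levels_at_peak)), average_at_peak)
```

The output takes only two amplitude values, yet its average over a carrier period tracks the input, because the information is carried by when the edges occur.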
Which theory is it that you are using? I noted many flaws in your understanding of critical elements of digital audio, and assertions that are also incorrect. I have already falsified your theory.
I see it differently. The old paradigm is falsified by phenomena for which it gives invalid predictions. For instance, according to the old paradigm, LPs should be long gone, having gone the way of cassette tape recorders and VCR video tapes. Yet LPs persisted, and the classic paradigm produces no convincing explanation as to why.
The new paradigm not only explains why LPs persisted for so long, but also specifically predicts what they'll be replaced with. To repeat once again, to the best of my understanding, eventually they'll be replaced by digital recordings with information density equal to, or higher than, that of the PCM 192/24 and DSD128 formats.
Perhaps not important to this discussion, but 16/44.1 is a delivery format. From what my colleagues tell me, it has not been used as a digitization format in decades, and depending on your point of demarcation, it has not been used as a digitization format since the 1980s, as all the hardware internally samples at a higher rate and bit depth.
Yet another phenomenon not explained by the old paradigm.
According to the old paradigm, the 16/44.1 format should be sufficient for capturing and delivering any type of music, yet in practice all those "pesky producers and sound engineers", for some mysterious reasons, want to use digital formats providing higher information density.
The new paradigm not only qualitatively explains this phenomenon, but also accurately describes the numerical parameters of the formats that were found sufficient by trial and error by many highly qualified practitioners.
The new paradigm also explains why gear providing even higher information densities, easily available these days (e.g. I own several nice 32/768 ADC/DACs), isn't as widely used at its highest settings in practical music production.
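For a sense of scale, the raw per-channel bit rates of the formats mentioned in this post can be compared directly. This sketch assumes that "information density" can be approximated by raw bit rate, which is only an upper bound on the information actually carried. It takes DSD128 as a 1-bit stream at 128 x 44.1 kHz.

```python
def pcm_bitrate(sample_rate_hz, bits_per_sample):
    """Raw per-channel bit rate of an uncompressed PCM stream, in bits/s."""
    return sample_rate_hz * bits_per_sample

def dsd_bitrate(multiple_of_44k1):
    """Raw per-channel bit rate of a 1-bit DSD stream, in bits/s."""
    return 44_100 * multiple_of_44k1

bitrates = {
    "PCM 44.1/16": pcm_bitrate(44_100, 16),
    "PCM 192/24": pcm_bitrate(192_000, 24),
    "DSD128": dsd_bitrate(128),
    "PCM 768/32": pcm_bitrate(768_000, 32),
}

baseline = bitrates["PCM 44.1/16"]
for name, rate in bitrates.items():
    print(f"{name:12s} {rate:>11,d} bit/s  ({rate / baseline:.2f}x CD)")
```

On this crude measure PCM 192/24 and DSD128 land within about 25% of each other, at roughly 6.5 and 8 times the CD rate, while 32/768 is about 35 times the CD rate.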