Erik says, “The idea of modeling multiple non-linear systems at once to derive a master model of behavior could probably be its own thing.” It certainly is. I recently finished a little project where I applied a machine-learning neural network model to classify 1000 clips of music according to a ‘harmonic signature’ (mostly “live content,” case #1, or mostly “synthesized,” case #2). Interestingly, although the FFT, spectral centroid, RMS energy, etc. were important in defining the signature, the most compelling predictors were the Mel-Frequency Cepstral Coefficients (MFCCs). Why is this important?
Because the purpose of these coefficients is to try to capture exactly what audiokineses refers to as “the transfer function of a device (how that device alters the input signal) through a psychoacoustic (i.e. perception-based) lens.” Two examples will explain. First, the sound of a baby crying. Why is the baby crying? Hungry? Lonely? Needs a diaper change? We could do technical analysis forever and not know, but the mother knows instantly. Second, someone singing, “I don’t know what to do.” Why? Boredom? Lost love? We know from the voice, not just the context.
I think that psychoacoustic perception is exactly where we need to look to understand that last 10% or 20% beyond the point where purely technical/engineering analysis stops reliably explaining what we know to be true in our ears.
The science is not dead; it’s getting more and more interesting.
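
For anyone who wants to see what the features mentioned above amount to, here is a minimal NumPy sketch. It is not the pipeline from the project described here. The frame length, the 26 mel filters, the 13 coefficients, and all function names are assumptions chosen for illustration, and the test signals are synthetic rather than real music clips.

```python
import numpy as np

def hz_to_mel(hz):
    # A common mel-scale formula (one of several conventions in use).
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (hypothetical helper)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def frame_features(frame, sample_rate, n_filters=26, n_mfcc=13):
    """Return (RMS energy, spectral centroid in Hz, MFCC vector) for one frame."""
    windowed = frame * np.hanning(frame.size)
    magnitude = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(frame.size, d=1.0 / sample_rate)

    rms = np.sqrt(np.mean(frame ** 2))
    centroid = np.sum(freqs * magnitude) / (np.sum(magnitude) + 1e-12)

    # MFCCs: power spectrum -> mel filterbank -> log -> DCT-II.
    mel_energy = mel_filterbank(n_filters, frame.size, sample_rate) @ (magnitude ** 2)
    log_mel = np.log(mel_energy + 1e-10)
    n = np.arange(n_filters)
    k = np.arange(n_mfcc)[:, None]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    mfcc = dct_basis @ log_mel
    return rms, centroid, mfcc

# Synthetic demo: a pure 440 Hz tone versus white noise.
sample_rate = 16000
t = np.arange(1024) / sample_rate
tone = np.sin(2.0 * np.pi * 440.0 * t)
noise = np.random.default_rng(0).standard_normal(1024)

rms, centroid, mfcc = frame_features(tone, sample_rate)
noise_rms, noise_centroid, noise_mfcc = frame_features(noise, sample_rate)
print("tone centroid (Hz):", centroid)
print("noise centroid (Hz):", noise_centroid)
print("tone MFCCs:", np.round(mfcc, 2))
```

The log and the mel spacing are where the perception-based part comes in: both roughly mimic how the ear compresses loudness and pitch. A classifier would typically be fed these vectors frame by frame, or their averages over a clip.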