In a blind test meant to detect a DIFFERENCE in sound, working with short-term memory is best, yes...
But a difference in sound is an IMPROVEMENT or a DEGRADATION in some direction, relative to many acoustical cues at the same time: timbre spectral envelope, timbre time envelope, imaging, soundstage, dynamics, LEV/ASW ratio, reverberation time, and other acoustical cues relating to the head/torso position relative to each speaker, for each ear... So to decide whether a difference qualifies as good or bad we need TIME, and the detection must be made in a KNOWN acoustical environment with known musical cues...
To evaluate these changes, only relatively long listening sessions, linked to body/feeling/brain memory, make sense as a guide for the direction of the room/speaker TUNING...
We have a very short-term memory for sound, yes, but also a body memory linked to the emotions associated with past long-term experiences of sound... That is why hearing must be learned, in acoustics and in music... Long-term memory is not a direct storage of the sound but an engram of the feeling related to the sound...
This acoustic feeling is what a musician uses to judge a sound or a piece of music as well done... Perfect pitch is another matter... Most musicians must learn to reproduce and recognize pitch... They learn how to feel it... And the feeling associated with the tone, regardless of the timbre, is memorized... That is why a minor chord and a major chord carry different feelings... Joy or sadness, for example...
By the way, I have studied practical acoustics in my room for the last 2 years, and I discovered that we don't understand what sound really is... I say that, but I am not a scientist, for sure... It is my informed opinion based on my listening experiments...
Acoustics is a marvellous, very complex field, and philosophically super deep...
For example, sound is not a wave; the wave is the image of the resonant source body...
For sure, without air waves there is no hearing of sound, but air also makes fire possible, and the air is not the fire, only a cause of its qualitative manifestation... To hear sound we need a resonant body with its own qualities as the source, interacting through air waves with our ears... To have fire we need combustible matter interacting with air, and our feeling body calls that "hot"...
Human ears/brain are able to detect QUALITATIVE information about the resonant sound source through the mediation of the wave image... Just as blind people use the wave image carried by a resonant echo to reconstruct, without eyes, the geometry of their environment and the various densities of the objects around them...