Yes, this is the current understanding and there is no indication that it is incorrect, but even with these cues, it can be difficult to judge height accurately. I spent a number of years doing R&D on hearing aids and similar audio "devices". Our group believed we were one of the first to look at how hearing aid design could be improved to preserve the positional cues most people take for granted. Unfortunately, that R&D was abandoned after I left, along with other programs, to pump up the balance sheet before selling. The work was a bit contentious at the time as well, since it indicated that differences in signal processing delay were masking timing cues.
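To put rough numbers on the timing-cue point, here is a minimal sketch using the textbook Woodworth spherical-head approximation for interaural time difference (ITD). The head radius, the function names, and the idea of modelling each device as a fixed extra delay are my assumptions for illustration, not a description of what our group actually measured.

```python
import math

HEAD_RADIUS_M = 0.0875    # assumed average head radius
SPEED_OF_SOUND = 343.0    # m/s in air at roughly room temperature


def woodworth_itd(azimuth_deg: float) -> float:
    """Natural ITD in seconds (far-ear arrival minus near-ear arrival).

    Azimuth is 0 for straight ahead and 90 for directly to one side.
    Spherical-head approximation: ITD = (a / c) * (theta + sin(theta)).
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))


def perceived_itd(azimuth_deg: float, near_delay_s: float, far_delay_s: float) -> float:
    """ITD after each ear's device adds its own (hypothetical) processing delay."""
    return woodworth_itd(azimuth_deg) + (far_delay_s - near_delay_s)


if __name__ == "__main__":
    natural = woodworth_itd(90.0)
    # Hypothetical case: the near-ear device is 1 ms slower than the far-ear one.
    shifted = perceived_itd(90.0, near_delay_s=0.003, far_delay_s=0.002)
    print(f"natural ITD at 90 degrees: {natural * 1e6:.0f} us")
    print(f"ITD with 1 ms device mismatch: {shifted * 1e6:.0f} us")
```

With these assumed values the largest natural ITD comes out at well under a millisecond, so a mismatch of even one millisecond between the two sides is larger than the entire cue and can flip its sign.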
"Technically", just as you have indicated, frequency filters that mimic the pinna, can provide a sense of height in head-phonic playback and encoded in only two channels. There has been a fair amount of research done with HATS (head and torso simulators) for recording, but, as you indicated, it requires tailoring to the individual to work properly. If you attempt that technique with speakers, you get not only the HATS transform, plus the listener ... and two pinnas are not better than one. W.R.T. your particular situation, making a wild ass guess, the microphone above his head, if not omni and not pointed at him, created a filtering effect that simulated height with pinna filtering. Curious if the wavefront from the electrostats is less impacted by torso/head/pinna than would normally occur with dynamic speakers. Interesting! I may have to pick up a pair now and do some testing.
Speaking of interesting, regarding the last post about the difficulty of creating a stable image outside the speakers: have you done much research on ambiphonics?
"Technically", just as you have indicated, frequency filters that mimic the pinna, can provide a sense of height in head-phonic playback and encoded in only two channels. There has been a fair amount of research done with HATS (head and torso simulators) for recording, but, as you indicated, it requires tailoring to the individual to work properly. If you attempt that technique with speakers, you get not only the HATS transform, plus the listener ... and two pinnas are not better than one. W.R.T. your particular situation, making a wild ass guess, the microphone above his head, if not omni and not pointed at him, created a filtering effect that simulated height with pinna filtering. Curious if the wavefront from the electrostats is less impacted by torso/head/pinna than would normally occur with dynamic speakers. Interesting! I may have to pick up a pair now and do some testing.
Speaking of interesting, to the last post about difficulty of creating a stable image outside the speakers, have you done much research on ambiphonics?
Regarding height cues out in the "real world", my understanding is that the way sound arriving from above diffracts around the head and outer ear (the pinna) is what gives us height cues. I have read papers and articles about encoding these "head and pinna transforms" into a signal to convey height information, but to really do it right, the equalization would have to be tailored to the individual's ears. (One possible application would be in the helmets of fighter pilots, so that an audible threat warning would also convey the direction. Head position tracking would of course have to be included.)
I don't see how height information could be encoded in a normal two-channel recording... BUT something weird happened to me years ago.