Image depth


Can anyone offer a technical explanation of how a stereo system recreates image depth? Why, for example, are some center images behind the speakers and others in front of the speakers?
Should there be any depth to a mono recording, or should the image be directly in line with the speakers?
cakids
Do an internet search using the phrase "stereo recording techniques". There are a vast number of articles describing different microphone placements and their impact on stereo sound reproduction.
Can anyone offer a technical explanation of how a stereo system recreates image depth?

As far as the recordings go, I'll leave that for others, but to get your system to extract the most depth, do this.

Remove anything between the speakers (equipment racks etc.) and you’ll increase the depth perspective as your ears and eyes hear and see it.

Then one "BIG" step further is to remove the back wall from in between the speakers like I did, leaving a little wall (1-2 m) behind each speaker for bass loading, and then hear and see your image depth go back much further, to the back wall of the next room.

Cheers George
What I’m getting from reading about recording techniques is that good recording techniques will fool the ear-brain location-finding function into creating an approximate aural image, but they do not actually duplicate the exact sonic signature (amplitude, phase) that would impinge on the ears during a live performance. The exception is binaural recording, which puts the mikes on a simulated human head.
Your ears/brain use several things for placement of a sound:
  • The difference in arrival time of approximately the same sound at your two ears gives you angular position. This works best from about 120Hz to about 1500Hz, but predominantly below 800Hz. This is the highly accurate timing you hear attributed to the ears/brain, but keep in mind it is phase difference, not absolute timing.
  • Spectral notches due to head shape provide front/back cues, emphasis on cues. It is not perfectly accurate.
  • Most of depth comes from volume cues.
  • Volume cues can also give angular position, predominantly above 1500Hz, though they start to contribute at about 1000Hz. Your head shields the far ear, resulting in a level difference between the ears that depends on frequency.
  • Some filtering of frequencies by your torso and pinna provides some level of height cues, but the ear/brain is not great at height detection.
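
To put rough numbers on the timing and volume cues in the list above, here is a minimal Python sketch. It is not from any poster's measurements. It assumes the textbook Woodworth spherical-head approximation for the arrival-time difference, an assumed head radius of 8.75 cm, a speed of sound of 343 m/s, and simple free-field inverse-square falloff for level versus distance. All function names are made up for illustration, and real heads and real rooms will deviate from this.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed (air at roughly 20 C)
HEAD_RADIUS = 0.0875    # m, assumed average head radius


def itd_seconds(azimuth_deg):
    """Arrival-time difference between the ears for a distant source.

    Uses the Woodworth spherical-head approximation (an assumption here):
    ITD = (r / c) * (theta + sin(theta)), theta in radians, 0 = straight ahead.
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))


def interaural_phase_deg(azimuth_deg, freq_hz):
    """Phase difference between the ears for a pure tone, in degrees (unwrapped)."""
    return 360.0 * freq_hz * itd_seconds(azimuth_deg)


def level_change_db(near_m, far_m):
    """Free-field level change when a point source moves from near_m to far_m."""
    return -20.0 * math.log10(far_m / near_m)


if __name__ == "__main__":
    for az in (10, 30, 90):
        print(f"azimuth {az:>2} deg: ITD ~ {itd_seconds(az) * 1e6:.0f} microseconds")
    for f in (200, 800, 1500):
        print(f"{f} Hz tone at 90 deg: phase difference ~ {interaural_phase_deg(90, f):.0f} deg")
    print(f"1 m -> 2 m: level change ~ {level_change_db(1.0, 2.0):.1f} dB")
```

Under these assumptions, the phase difference for a source fully to one side passes half a cycle somewhere below 1000Hz, which is one way to see why the phase cue gets ambiguous as frequency rises and level differences take over. The last line shows the familiar drop of about 6 dB per doubling of distance, which only holds in free field and not in a reverberant room.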

With most recording techniques, even for live music, most of the angular position information is lost. What you perceive in the recording is artificial, put there by the recording engineer.

There are microphone techniques, both with torso/head simulators and with stereo microphones, that can capture, or theoretically capture, the differential timing of what a human would hear. The big "however" is that being able to use that on playback is pretty much limited to headphones, though you may get lucky with speaker placement and the odd recording and extract some of it.

Height information is highly specific to individuals, so capturing it on 2 channel is pretty much impossible, and is not actually done.
So what does this mean? Most of the sound-stage, imaging, etc., even in live recordings, is indicative of the recording and far more influenced by the recording/mixing process than by the original event. To that end, most of what audiophiles describe is "simulated", especially height.

It also means that when someone says wall-to-wall sound-stage, that is probably either hyperbole or a pleasant but highly inaccurate representation of the music.