Soundstage Width and Depth


I’m curious about what your systems produce when it comes to soundstage. My speakers are about 8’ apart and I sit about 10’ from the front plane of the speakers. The speakers are toed in so that they each are pointed at a spot about 8” from my ears on each side. (Laser verified) My room is treated with bass absorption and diffusers.

In many recordings my soundstage is approx 28’ wide and, although this is tougher to determine, I would say on most recordings I’m hearing sounds 10’-15’ further back than the speaker plane. Some sounds, usually lead guitars, are presented slightly in front of the plane of the speakers. There are also recordings that produce height in the soundstage. Some fill the room floor to ceiling, while others are more on the same plane about 5’ from the floor. I do get layers, usually in about the same order front to back: guitars, lead singer, bass guitar, drums, violins, and backup instruments and singers. Again this is recording dependent. Intimate recordings that feature a singer playing a guitar usually have all of the sound between the speakers. Is this what everyone experiences? Could the depth be deeper? Do many of you hear sounds in front of the speaker plane? Do you have any recordings that accentuate the front to back soundstage?
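For anyone who wants to compare numbers, here is a rough sketch of the angles involved. It assumes the perceived 28’ stage sits at the speaker plane and is centred on the listening axis, which is a simplification (images heard further back would subtend a smaller angle). The function name is just for illustration.

```python
import math

def subtended_angle_deg(width_ft, distance_ft):
    """Full angle (degrees) subtended at the listener by a span of the given
    width, centred on the listening axis and distance_ft away."""
    return 2 * math.degrees(math.atan((width_ft / 2) / distance_ft))

# Speakers 8' apart, listener 10' from the speaker plane.
speaker_angle = subtended_angle_deg(8, 10)
# Perceived 28' stage, ASSUMED to lie at the speaker plane.
stage_angle = subtended_angle_deg(28, 10)

print(f"speakers: {speaker_angle:.1f} deg, perceived stage: {stage_angle:.1f} deg")
```

Under those assumptions the speakers span a bit under 45 degrees, while the perceived stage would span well over 100 degrees.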
baclagg
@erik_squires  My listening area is my living room and I had to balance aesthetics with functionality when it comes to room treatments. They have been my best investment to date. GIK was great to work with and my sound stage, imaging and tonality were all improved beyond my expectations.
Soundstage width which extends to the outside of the speakers can be encoded on the recording, but it can also be the result of strong early sidewall reflections. The Precedence Effect is not completely effective at suppressing directional cues from significant early lateral reflections, which can tend to pull sound images to the outside of the speaker plane. Toole calls this an "increase in Apparent Source Width (ASW)", and finds that most listeners enjoy it.

But this reflection-induced increase in Apparent Source Width comes at a price, if I understand Geddes correctly, and that price is clarity and/or imaging precision and/or depth of image, assuming the latter is on the recording.

In my opinion image depth and a sense of spaciousness and/or envelopment are all related, in this sense: They are spatial cues which are on the recording itself, rather than being contributed by room reflections (as is the case with increased Apparent Source Width). When the soundstage seems to go significantly deeper than the wall behind the speakers, and/or it seems that you are enveloped in a much larger acoustic space than your room, that is not coming from the acoustic signature of your small playback room.

We can think of the spatial cues which are on the recording as being in competition with the spatial cues generated by the playback room. The ear/brain system will tend to pick whichever cues are the most convincing. Unfortunately the playback room’s "small room signature" has a natural advantage, but with good speaker setup and/or good room treatment it is possible to weaken the playback room’s signature while effectively presenting the venue cues which are on the recording (whether they be real or synthetic).

Briefly, the technique includes minimizing strong, distinct ("specular") early reflections while preserving enough reverberant energy that we have a fair amount of relatively late-onset, spectrally-correct reflections. This is a bit more nuanced than merely hitting a target RT60, as RT60 tells you nothing about what is happening early on, and it is the earliest reflections which most strongly convey the characteristic signature of a small room.
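To put a rough number on "early", here is a minimal sketch that estimates how late a first sidewall reflection arrives, using the image-source method (mirror the speaker across the wall). The 8’ speaker spacing and 10’ listening distance come from the original post; the sidewall position (3’ outside the left speaker) is purely an assumption for illustration, and the wall is treated as a flat, perfectly reflective surface.

```python
import math

SPEED_OF_SOUND_FT_S = 1125.0  # roughly 343 m/s at room temperature

def sidewall_reflection(speaker, listener, wall_x):
    """Return (delay_ms, level_db) of the first sidewall reflection relative
    to the direct sound, for a flat wall along x = wall_x (positions in feet)."""
    sx, sy = speaker
    lx, ly = listener
    direct = math.hypot(sx - lx, sy - ly)
    # Image-source method: mirror the speaker across the wall.
    image_x = 2 * wall_x - sx
    reflected = math.hypot(image_x - lx, sy - ly)
    delay_ms = (reflected - direct) / SPEED_OF_SOUND_FT_S * 1000
    # Spreading loss only; absorption or diffusion at the wall is ignored.
    level_db = 20 * math.log10(direct / reflected)
    return delay_ms, level_db

# Left speaker 4' left of centre and 10' ahead; hypothetical wall 3' beyond it.
delay_ms, level_db = sidewall_reflection(speaker=(-4, 10), listener=(0, 0), wall_x=-7)
print(f"{delay_ms:.1f} ms late, {level_db:.1f} dB relative to direct")
```

With these assumed dimensions the reflection arrives about 3 ms after the direct sound and only a couple of dB down, which is the kind of strong early arrival that a single RT60 figure says nothing about.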

As others have noted, when you are hearing a significantly different spatial presentation from one recording to the next, THAT is a very good sign. It means that the recording’s venue cues are dominating over your playback room’s signature.

Duke
Duke,

The source "appearing" outside the speakers is "encoded" in the music with either mixing methods or microphone techniques, but (and it is a big but) how that appears on playback is highly dependent on the speaker, the listening position, the microphone or mixing technique and, very importantly, the listener themselves. The microphone technique or mixing (playing with timing) can have more impact on perceived playback than anything to do with the venue.


Keep in mind that how this works is not by recreation of a real source outside the speaker, as would be the case with a first reflection, but by tricking the brain with a delayed signal from the other speaker hitting the opposite ear. That generates timing information that the brain may perceive as equivalent to the timing information of a sound wrapping around the head, which is what it uses to determine direction. Not everyone interprets it the same way, and the effect can be positive or negative. It is influenced, it appears, by individual interpretation of timing and volume cues for direction, not to mention that what works for one recording on one system could fall apart completely with a different recording or system.
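For a sense of scale, the timing differences involved are well under a millisecond. Below is a rough sketch using Woodworth's spherical-head approximation for the interaural time difference (ITD) of a distant source. The head radius of 8.75 cm is a commonly quoted average and is an assumption here; real heads differ, which is part of why listeners do not all interpret these cues the same way.

```python
import math

SPEED_OF_SOUND_M_S = 343.0
HEAD_RADIUS_M = 0.0875  # assumed average head radius

def woodworth_itd_us(azimuth_deg):
    """Approximate ITD in microseconds for a distant source at the given
    azimuth (0 = straight ahead, 90 = directly to one side).
    Woodworth's spherical-head formula: ITD = (a / c) * (theta + sin(theta))."""
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (theta + math.sin(theta)) * 1e6

for angle in (0, 15, 30, 60, 90):
    print(f"{angle:2d} deg -> {woodworth_itd_us(angle):4.0f} us")
```

In this model the largest ITD, for a source directly to one side, is roughly 0.65 ms, so a mix that adds delays of that order between channels is working in the same range the ear/brain uses for direction.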


For those discussing "height", we have a hard enough time perceiving height in a real 3D environment. Nothing that comes out of your speakers (in a single plane) carries height beyond what your room acoustics may generate, and that would bear little relation to the actual recording environment (if it had any height at all in the first place).
Heaudio123 wrote: "The source "appearing" outside the speakers is "encoded" in the music with either mixing methods or microphone techniques, but (and it is a big but) how that appears on playback is highly dependent on the speaker, the listening position, the microphone or mixing technique and, very importantly, the listener themselves."

Thanks for adding this, as I know virtually nothing about microphone or mixing techniques.

Heaudio123 again: "Keep in mind that how this works is... by tricking the brain with a delayed signal from the other speaker hitting the opposite ear to generate timing information that the brain may perceive as equivalent to the timing information of a sound wrapping around the head to determine direction."

Very interesting! This "reflection timing = direction/angle" information is related to why cabinet edge diffraction is generally more detrimental to imaging on a wide cabinet than on a narrow one: The longer the time delay for the diffracted signal, the greater the angle (the further around to the side) of the false cue it conveys. So a narrow cabinet’s diffraction cues indicate a narrow false angle, while a wide cabinet’s cues indicate a wider false angle and thus blur the correct image more. However if the cabinet is sufficiently wide the Precedence Effect may start to mask those false angular cues. One of the reasons for flush-mounting studio monitors is to eliminate edge diffraction entirely, which makes the imaging more trustworthy.
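As a back-of-the-envelope sketch of the delays involved: for a driver centred on a flat baffle and a listener far away on axis, the sound diffracted at a side edge travels roughly half the baffle width further than the direct sound. That geometry is a simplification I am assuming here (real drivers are often off-centre, and edges can be rounded), and the baffle widths below are hypothetical examples.

```python
SPEED_OF_SOUND_M_S = 343.0

def edge_delay_ms(baffle_width_m):
    """Approximate delay (ms) of side-edge diffraction relative to the direct
    sound, ASSUMING a centred driver and a distant on-axis listener, so the
    extra path is about half the baffle width."""
    return (baffle_width_m / 2) / SPEED_OF_SOUND_M_S * 1000

# Hypothetical baffle widths in metres.
for width in (0.20, 0.40, 0.60, 0.90):
    print(f"{width:.2f} m baffle -> {edge_delay_ms(width):.2f} ms")
```

A narrow 20 cm baffle gives a delay of about 0.3 ms, while a much wider one pushes the delay toward and beyond 1 ms, which is roughly the region where the Precedence Effect is usually said to start taking over from summing localization.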

Regarding height cues out in the "real world", my understanding is that the way sound diffracts around the head and outer ear (the pinna) from above is what gives us height cues. I have read papers and articles about encoding these "head and pinna transforms" into a signal to convey height information, but to really do it right, the equalizations would have to be tailored to each individual’s head and ear shape. (One possible application would be in the helmets of fighter pilots, so that an audible threat warning could convey complete directional information, both azimuth and elevation. Head position tracking would have to be included because fighter pilots swivel their heads a lot.)

I don’t see how height information could be encoded in a normal two-channel recording... BUT something weird happened to me years ago:

I bought a new CD that had just been put out by a musician I was friends with, Coco Robichaux. Listening over my SoundLab electrostats (floor-to-ceiling full-range single-driver line-source-approximating speakers), I heard his voice coming from normal height on most songs, but on one song in particular his voice came from the bottom of the speaker, down at the floor! I played the song for others and some heard it coming from down near the floor and some did not.

So I asked Coco about that song. What he told me was very interesting: The recording process had been rushed, and on THAT song, the microphone had been incorrectly positioned ABOVE his head in the recording booth! So relative to the microphone location, his voice WAS coming from the direction of the floor.

I can only speculate about HOW this accidental height information was included: Perhaps the signal that the microphone picked up was altered by its location above his head, and upon playback my brain interpreted that as something it was familiar with, namely height information. Maybe my head and ears were sufficiently similar to Coco’s, at least from that angle. 

Duke
Yes, this is the most current knowledge and there is no indication it is incorrect, but even with these cues it can be difficult to accurately assess height. I spent a number of years doing R&D on hearing aids and similar audio "devices". Our group believed we were one of the first to look at how the design of the hearing aid could be improved with the goal of preserving the positional cues most take for granted. Unfortunately that R&D was abandoned after I left, as were other programs, to pump up the balance sheet before selling. It was a bit contentious at the time as well. It indicated issues with signal processing delay differences masking timing cues.

"Technically", just as you have indicated, frequency filters that mimic the pinna can provide a sense of height in headphone playback, encoded in only two channels. There has been a fair amount of research done with HATS (head and torso simulators) for recording but, as you indicated, it requires tailoring to the individual to work properly. If you attempt that technique with speakers, you get not only the HATS transform but also the listener's own ... and two pinnas are not better than one. W.R.T. your particular situation, making a wild ass guess: the microphone above his head, if not omni and not pointed at him, created a filtering effect that simulated height with pinna-like filtering. I'm curious whether the wavefront from the electrostats is less impacted by torso/head/pinna than would normally occur with dynamic speakers. Interesting! I may have to pick up a pair now and do some testing.


Speaking of interesting, regarding the last post about the difficulty of creating a stable image outside the speakers, have you done much research on Ambiophonics?

