If you don't have a wide sweet spot, are you really an audiophile?


Hi, it’s me, professional audio troll. I’ve been thinking about something as my new home listening room comes together:

The glory of having a wide sweet spot.

We focus far too much on the dentist chair type of listener experience. A sound which is truly superb only in one location. Then we try to optimize everything exactly in that virtual shoebox we keep our heads in. How many of us look for and optimize our listening experience to have a wide sweet spot instead?

I am reminded of listening to the Magico S1 Mk II speakers. While not flawless, one thing they do exceptionally well is, in a good room, provide a very good, stable stereo image across almost any reasonable listening location. Revels also do this. There’s no sudden feeling of the image clicking when you are exactly equidistant from the two speakers. The image is good and very stable. Even directly in front of one speaker you can still get a sense of what is in the center and on the opposite side. You don’t really notice a loss of focus when off axis like you can in so many setups.

Compare and contrast this with the opposite extreme, Sanders' ESLs, which are OK off axis, but when you are sitting in the right spot you suddenly feel like you are wearing headphones. The situation is very binary. You are either in the sweet spot or you are not.

From now on I’m declaring that I’m going all-in on wide-sweet-spot listening. Being able to relax on one side of the couch or the other, or meander around the house while enjoying great-sounding music, is a luxury we should all attempt to recreate.
erik_squires
Audio2design, thank you for your post. You are mostly right. It is all about timing and volume. You are also probably right about certain situations.
The vast majority of recording is done by multi-miking, not with stereo microphones. Then it becomes all about the volume differentials between the channels, which set where each sound was placed in the mix. Now the timing becomes paramount, and it can only be right when your head is equidistant from speakers that are properly balanced in volume, unless you prefer to go the ambisonic route. Your central nervous system was designed to work with head shading. It increases the volume differential between the ears, allowing more accurate location of the threat. Timing also changes. In order to produce an accurate image you have to be equidistant from speakers that are balanced correctly, and both speakers have to have the exact same frequency response curve. Very few systems meet all these criteria, so most do not image as well as is theoretically possible. Yes, the way the recording was done influences all of this.
With a good system one can sit comfortably in a chair and enjoy an accurate image. If you move side to side enough you will hear the center image melt. With line source speakers you can move all the way to a side wall and the instruments mixed to the other side will still be loud and clear coming from that side, as if you were at a concert, but the center image will be vague. With point source speakers the volume drops off much more sharply with distance, so the center image shifts entirely to the side you are on, including instruments that are in the center but mixed a little to the opposite side.
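
To put rough numbers on the timing and volume argument above, here is a minimal Python sketch. It assumes idealized free-field behavior with no room reflections: a point source falling about 6 dB per doubling of distance and a line source about 3 dB. The function name, the 2.5 m speaker spacing, and the seat positions are illustrative assumptions, not measurements from anyone's system.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed (air at roughly 20 C)

# Idealized free-field falloff (an assumption, real rooms differ):
# point source ~6 dB per doubling of distance -> 20*log10(ratio)
# line source  ~3 dB per doubling of distance -> 10*log10(ratio)
FALLOFF = {"point": 20.0, "line": 10.0}


def arrival_differences(x, y, spacing=2.5, source="point"):
    """Compare the two speakers as heard from a seat at (x, y), in metres.

    Speakers sit at (-spacing/2, 0) and (+spacing/2, 0).
    Returns (delay_ms, level_db) of the RIGHT speaker relative to the LEFT:
    a positive delay means the right speaker arrives later, and a negative
    level means the right speaker is quieter.
    """
    d_left = math.hypot(x + spacing / 2, y)
    d_right = math.hypot(x - spacing / 2, y)
    delay_ms = (d_right - d_left) / SPEED_OF_SOUND * 1000.0
    level_db = -FALLOFF[source] * math.log10(d_right / d_left)
    return delay_ms, level_db


if __name__ == "__main__":
    # Slide the seat toward the left speaker, 3 m back from the speaker line.
    for offset in (0.0, 0.3, 0.6, 1.0):
        for source in ("point", "line"):
            delay, level = arrival_differences(-offset, 3.0, source=source)
            print(f"{offset:.1f} m off center, {source:5s} source: "
                  f"right speaker {delay:.2f} ms later, {level:.2f} dB")
```

Under these assumptions the arrival-time difference is the same for both speaker types, while the line source loses half as many dB on the far side as the point source does for the same seat, which fits with the far channel staying audible near a side wall.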

I use line source ESLs which have been digitally corrected and produce identical frequency response curves. I frequently have to adjust the balance by a few dB for different records to improve the focus, something you would never notice in most systems because the image specificity is just not there. Volume and timing have to match up!
As you would expect, some recordings produce better images than others. Mono records cannot be listened to from the listening position.
It sounds like you are listening through a crack in a door. Weird. I sit off center when I listen to mono. Everything opens up.
I have listened to corrected point source speakers, particularly a friend's Watt/Puppy and JL Audio subwoofer system, and dead on center it produces a beautiful miniature image. Move off center and it falls apart, as you would expect.
It is sort of the exact opposite of what the OP says: the more noticeable the sweet spot, the better the system. If you cannot differentiate the exact center from two feet over, your system is not imaging. Some people may be happier this way. Ignorance is bliss.
 
Your central nervous system was designed to work with head shading. It increases the volume differential between the ears, allowing more accurate location of the threat. Timing also changes. In order to produce an accurate image you have to be equidistant from speakers that are balanced correctly, and both speakers have to have the exact same frequency response curve. Very few systems meet all these criteria, so most do not image as well as is theoretically possible. Yes, the way the recording was done influences all of this.


I think we are predominantly in agreement, and acoustic cross-talk cancellation is an area of both academic and professional research for me. Of note, the speakers I PM'ed you about have some ability to correct frequency response, both direct and reflected.

The volume differential from head shading is critical above about 1,500 Hz. In practice you can achieve it over a limited range, wider than just a single perfect sweet spot, if that is your goal. It is conditional on speaker dispersion; otherwise, when you correct, you will create as many problems as you solve.

W.R.T. timing, whether the timing in recordings is accurately portrayed in a stereo speaker setup is still debated in the current literature, and the argument is leaning towards the view that the timing information as perceived is not what was captured. I illustrated the reasons above, the biggest being crosstalk and the filtering caused by reinforcement and cancellation when the same sound arrives at different times. This is best illustrated by comparing timing panning over speakers and headphones, with both narrow-band (<=1000Hz) and wider-band signals. There are lots of trade-offs too: going wider on the speakers can improve extraction of timing detail, but it screws up other location aspects and hurts the center image. Go narrower and you get a more accurate center image. The reality is that 2-channel via speakers is imperfect. Signal processing will get us closer to reality, but it is an uphill commercial struggle and has its own technical issues. More speakers just increase cross-talk issues, but more speakers working under the concepts of ambisonics have the potential to move us forward.
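
The reinforcement and cancellation from the same sound arriving at two different times can be sketched as a simple comb filter. This is a minimal Python illustration, assuming a single delayed copy added to the direct sound and ignoring head filtering and room reflections. The function names and the 0.25 ms example delay are hypothetical, chosen only for illustration and not taken from the literature.

```python
import cmath
import math


def comb_notches(delay_s, f_max=20000.0):
    """Frequencies (Hz) up to f_max where a sound and an equal-level copy
    delayed by delay_s cancel: f = (2k + 1) / (2 * delay_s), k = 0, 1, ..."""
    if delay_s <= 0:
        raise ValueError("delay_s must be positive")
    notches = []
    k = 0
    while (2 * k + 1) / (2.0 * delay_s) <= f_max:
        notches.append((2 * k + 1) / (2.0 * delay_s))
        k += 1
    return notches


def comb_gain_db(freq_hz, delay_s, rel_amp=1.0):
    """Level (dB, relative to the direct sound alone) of direct sound plus
    one copy delayed by delay_s and scaled by rel_amp."""
    mag = abs(1.0 + rel_amp * cmath.exp(-2j * math.pi * freq_hz * delay_s))
    return 20.0 * math.log10(max(mag, 1e-12))  # floor avoids log10(0)


if __name__ == "__main__":
    delay = 0.00025  # 0.25 ms, an assumed example path difference
    print("notches (Hz):", [round(f) for f in comb_notches(delay, 9000.0)])
    for f in (500.0, 1000.0, 2000.0, 4000.0):
        # rel_amp=0.5 assumes the delayed copy is 6 dB down
        print(f"{f:6.0f} Hz: {comb_gain_db(f, delay, rel_amp=0.5):+.1f} dB")
```

In this simplified model a shorter delay pushes the first cancellation higher in frequency, and a weaker delayed copy makes the notches shallower, which is one way to see why narrow-band and wider-band signals can behave differently.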



Audio2design, thank you for your post. You are mostly right. It is all about timing and volume. You are also probably right about certain situations.
The vast majority of recording is done by multi-miking, not with stereo microphones. Then it becomes all about the volume differentials between the channels, which set where each sound was placed in the mix. Now the timing becomes paramount, and it can only be right when your head is equidistant from speakers that are properly balanced in volume, unless you prefer to go the ambisonic route. Your central nervous system was designed to work with head shading. It increases the volume differential between the ears, allowing more accurate location of the threat. Timing also changes. In order to produce an accurate image you have to be equidistant from speakers that are balanced correctly, and both speakers have to have the exact same frequency response curve. Very few systems meet all these criteria, so most do not image as well as is theoretically possible. Yes, the way the recording was done influences all of this.
Half truth....

The missing half is in acoustic science and is called the law of the first wavefront, which relates to the different possible timing thresholds of direct and reflected waves and their interpretation by the ears...

Imaging is not first a fact of digital recording tech but of acoustics...

I created my own mechanical equalizer for balancing the timing of the different waves without a microphone... It worked so well that my imaging, which I call depth imaging, fills the room... My measured standard is the range of the human voice and its timbre as perceived by the ears... not a set of very narrow test frequencies for a very minute location of the head using a mic...


Then imaging is FIRST: timing + the law of the first wavefront...
After that you can speak of timing + volume...

Missing this point is a complete reversal and misunderstanding of the phenomena...

Acoustic neurophysiology is FIRST, recording engineering second, for the explanation...
Post removed 
You may feel your response is erudite, but to me, you just told me "I like Oranges", after I told you it was 7 below freezing and snowing outside. Perhaps to you there was some correlation, but I am just shaking my head and I suspect others are too at this point.
Insulting is your only argument...

I just put a simple point here and you never answered it...

Any reader can read that for himself...

I will simplify my post for your own understanding...



Acoustics explains imaging... Engineering uses the acoustical explanation for better recording technique...

Yes, speaker toe-in matters, as does anything pertaining to timing and volume...

BUT the timing of the wavefront matters MOST because it is ACOUSTIC science first... It is the same thing for the concept of timbre, which is an acoustical one...

This was my point...






How in the world does this simple fact, which is totally true, correspond to your answer about oranges and freezing...

You are a very intelligent person, but you are not a very "gentle" or very trustful one, sorry...