I second Johnsonwu's observation re: a widebander, for its coherence. It doesn't have to be just a widebander; it can be augmented with a supertweeter and a sub (built in or not). So many variables here.
The best, and really the only, time I heard jaw-dropping, 3-D sound with that layering you allude to was with some entry-level Unity Audio speakers a couple of decades ago. Most of it was due to some very high-quality tube equipment and room placement/design. I've never heard it that good again. Ever.