Lots to unpack in this thread.
Would it not be best to start with how we perceive sound? Our hearing is sensitive to timing over a narrow frequency range, about 200-1500 Hz, so it would make sense to have time alignment over that range. Many speakers with mid-woofers in the 5-7" range are already time aligned there, because of where their crossover frequencies sit: a single driver covers the whole band.
In terms of sound stage, all the other information we use for position is frequency and volume based, not timing based. Given that, is there a good argument for time alignment over the whole frequency range?
@mijostyn, you appear to be advocating that a flat in-room frequency response is the ideal scenario. That is not supported by most people's listening impressions or by research into preference, all of which suggest a response that slopes downward at higher frequencies.
There is a misconception that the in-room frequency response should be perfectly flat in order to perfectly recreate the original performance. It sounds great on the surface, but the premise is flawed: you are not trying to recreate the performance, you are trying to recreate what the recording/mixing engineer heard. They have already adjusted the frequency response based on what they were hearing at their workstation, which is usually two somewhat near-field speakers, and the total response there ends up closer to downward sloping at higher frequencies. That is especially true when they do the final mix and test it on larger audio systems and/or headphones, which appear to sound best when targeted at a downward slope past about 3 kHz.
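To make the shape concrete, here is a rough Python sketch of that kind of target: flat up to about 3 kHz, then falling at a constant rate per octave. The 3 kHz knee is the figure from the paragraph above; the -1 dB/octave slope and the function name target_db are placeholders I made up for illustration, not measured or recommended values.

```python
import math

def target_db(freq_hz, knee_hz=3000.0, slope_db_per_octave=-1.0):
    """Target level in dB relative to flat.

    Flat (0 dB) up to knee_hz, then a constant slope per octave above it.
    Both defaults are illustrative assumptions, not research-backed numbers.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    if freq_hz <= knee_hz:
        return 0.0
    # Number of octaves above the knee: each doubling of frequency is one octave.
    octaves_above_knee = math.log2(freq_hz / knee_hz)
    return slope_db_per_octave * octaves_above_knee

if __name__ == "__main__":
    for f in (100, 1000, 3000, 6000, 12000):
        print(f"{f:>6} Hz: {target_db(f):+.1f} dB")
```

Changing slope_db_per_octave or knee_hz lets you compare a few candidate targets against a measured in-room response before settling on one by ear.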
Some of this may even hark back to the attenuation you would experience at a live orchestra or in a concert hall at typical seating distances from the instruments (the front row is rarely where the best sound is).