I am curious to know what you believe to be the source/reason/technical underpinning behind top level tube gear generally (exceptions exist, of course) being able to provide a greater sense of air and vast imaging (size, and depth in particular) compared to similarly high level solid state gear?
I know what you mean. IMO it has to do with the distortion signature. Neither a tube nor a solid state preamp will be making any significant distortion, but it's a simple fact that distortion is inescapable.
It's been shown that the lower ordered harmonics serve two functions, both of which are helpful. The first is that if they are present at sufficient amplitude, they can mask higher ordered harmonics that are otherwise perceived as brightness.
The second and more important function is that the 2nd and 3rd harmonics somehow help the ear perceive soundstage width and depth. You might easily be convinced that this is some sort of added effect rather than neutral reproduction, but if you listen to a direct microphone feed and compare that to the actual musical performance, you find that the soundstage is simply being presented in a more natural fashion.
I think more research could be done in this area, but I'm not holding my breath for it to happen. But it is a documented phenomenon.
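For anyone who wants to see what "lower ordered harmonics" means in numbers, here is a rough sketch in Python. It passes one cycle of a sine wave through a made-up, slightly nonlinear transfer curve and measures the level of each harmonic. The function names and the coefficients (0.01 and 0.003) are my own assumptions for illustration only; they are not measurements of any real tube or solid state preamp, and the sketch says nothing about how the ear responds to the result.

```python
import math

def harmonic_levels(transfer, n_harmonics=5, n_samples=1024, amplitude=0.5):
    """Amplitudes of harmonics 1..n_harmonics of a sine passed through `transfer`."""
    # Exactly one cycle, so every harmonic falls on an exact DFT bin.
    out = [transfer(amplitude * math.sin(2 * math.pi * n / n_samples))
           for n in range(n_samples)]
    levels = []
    for k in range(1, n_harmonics + 1):
        # Single-bin DFT: correlate the output with cosine and sine at harmonic k.
        re = sum(y * math.cos(2 * math.pi * k * n / n_samples)
                 for n, y in enumerate(out))
        im = sum(y * math.sin(2 * math.pi * k * n / n_samples)
                 for n, y in enumerate(out))
        levels.append(2 * math.hypot(re, im) / n_samples)
    return levels

def low_order_curve(x):
    # Hypothetical gentle nonlinearity: small squared and cubed terms only.
    return x + 0.01 * x ** 2 + 0.003 * x ** 3

def to_db(level, reference):
    """Level relative to the reference (here the fundamental), in dB."""
    return 20 * math.log10(level / reference)

levels = harmonic_levels(low_order_curve)
for k, level in enumerate(levels[1:3], start=2):
    print(f"harmonic {k}: {to_db(level, levels[0]):.1f} dB re fundamental")
```

With these assumed coefficients the 2nd harmonic comes out roughly 52 dB below the fundamental and the 3rd roughly 75 dB below, while the 4th and 5th are effectively absent. That is the general shape of a low-order distortion signature: very little distortion overall, and what there is sits in the 2nd and 3rd harmonics.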