I have come to believe, with limited data, that the issue is not the amplifier’s power rating but how consistently it performs across the audio band, and this is a place where the math doesn’t quite live up to the audible effects.
The problem isn't the math, it's how the amplifier is measured, which is something else altogether. Most traditional amplifiers (tube and solid state) that employ feedback don't or can't use enough, so distortion rises as frequency increases. This results in brightness/harshness, and is fundamentally at the heart of the tubes/transistors debate.
To get around this problem, distortion is usually measured at 100 Hz, which is too low a frequency for the problems I described to show up. This has become a tradition, so there are those who do it without realizing that it's only done that way to sweep dirt under the carpet.
Not enough feedback is used because you need in excess of 35 dB of it to keep the feedback itself from causing brightness (distortion), and traditional amplifiers lack the Gain Bandwidth Product to provide that much at 7 kHz and higher. In addition, phase margin is a problem, so amps with this much feedback can be unstable and go into oscillation.
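To put rough numbers on that, here is a minimal sketch of how available feedback falls with frequency. It assumes a simple single-pole open-loop response and treats feedback as open-loop gain divided by closed-loop gain. The 1 MHz Gain Bandwidth Product, 100 dB DC gain, and closed-loop gain of 30 are made-up illustrative values, not measurements of any real amplifier.

```python
import math

# Hypothetical example values, chosen only for illustration.
GBW_HZ = 1.0e6           # assumed Gain Bandwidth Product
DC_GAIN = 1.0e5          # assumed open-loop gain at DC (100 dB)
CLOSED_LOOP_GAIN = 30.0  # assumed closed-loop voltage gain (about 29.5 dB)


def open_loop_gain(f_hz, gbw_hz=GBW_HZ, dc_gain=DC_GAIN):
    """Magnitude of a single-pole open-loop response at f_hz."""
    f_pole = gbw_hz / dc_gain  # frequency where open-loop gain starts falling
    return dc_gain / math.sqrt(1.0 + (f_hz / f_pole) ** 2)


def feedback_db(f_hz, closed_loop_gain=CLOSED_LOOP_GAIN):
    """Approximate feedback in dB: open-loop gain over closed-loop gain."""
    return 20.0 * math.log10(open_loop_gain(f_hz) / closed_loop_gain)


def gbw_needed(f_hz, feedback_target_db, closed_loop_gain=CLOSED_LOOP_GAIN):
    """GBW needed for a target feedback at f_hz (valid well above the pole)."""
    return (10.0 ** (feedback_target_db / 20.0)) * closed_loop_gain * f_hz


if __name__ == "__main__":
    for f in (100.0, 1000.0, 7000.0, 20000.0):
        print(f"{f:>8.0f} Hz: {feedback_db(f):5.1f} dB of feedback")
    print(f"GBW for 35 dB at 7 kHz: {gbw_needed(7000.0, 35.0) / 1e6:.1f} MHz")
```

With these assumed values the amp has roughly 50 dB of feedback at 100 Hz but only about 14 dB at 7 kHz, and it would need a Gain Bandwidth Product near 12 MHz to hold 35 dB there. That is why a 100 Hz distortion figure can look fine while the top of the band does not. The model ignores phase margin entirely, so it says nothing about whether such an amp would be stable.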
If you can run enough feedback, the amp will sound just as smooth as any tube amp running zero feedback. I think what you are describing is really just the amp showing off its limitations.