I’m going to answer your question seriously, because it seems like a serious, well-posed question.
All I try to convey to others whenever discussing audio gear & music reproduction is my preference for gear that sounds as much as possible like music performed live in a real performance space. The example I usually use is the symphony orchestra in a large hall with good acoustics: the music comes at the listener as a series of large wavefronts. One doesn’t hear bass vs midrange vs treble, but instead, large waves of sound launched from the stage.
With music IRL, the upper mids & lower treble are never edgy; the treble sounds only as "airy" as the dimensions of the hall allow; bass hits the diaphragm, being felt as well as heard; and dynamics are epic & natural.
Only by comparing audio reproduction to the real thing can one rise above the obsession with gear voiced this or that particular way. This means letting go of things like enhanced/edgy transients; spotlighted & sculpted image placement in the soundstage; boomy, hyped bass; and other familiar audio tropes.
To paraphrase Duke Ellington: if it sounds like real music, it is good audio.