The OP's question, namely how noise that is inaudible beyond a very short distance from the speaker when no music is playing could still have audible significance, is an excellent one that I've pondered myself at times.
I think that Kijanki's answers are on the mark.
In the digital domain, the explanation is easy, namely jitter effects, as he indicated.
In the analog domain it is less clear, but the one explanation that occurs to me relates to intermodulation effects, as he also indicated. The ear is much more sensitive to some frequencies than to others, as can be seen in the figure in this Wikipedia writeup on the Fletcher-Munson Effect. Non-linearities in the speakers, and perhaps also in the electronic components in the analog signal path, will result to some degree in intermodulation effects, producing (at very low but conceivably significant levels) new frequencies corresponding to the sums and differences of the frequency components that are present. Hiss typically contains a mix of essentially all frequencies within some broad range, especially in the upper treble region (and beyond, at ultrasonic frequencies that are inaudible in themselves). Perhaps intermodulation of some of the frequency components of the music (and perhaps also frequency components of recorded noise, tape hiss, and/or LP surface noise) with those upper treble and ultrasonic system noise components results in difference frequencies in the lower treble or mid-range regions, where the ear is more sensitive.
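To put rough numbers on the intermodulation idea, here is a minimal Python sketch. Everything in it is an assumption chosen for illustration, not a measurement of any real speaker or amplifier: a 12 kHz tone stands in for a music component, a 15 kHz tone stands in for one component of the hiss, and the non-linearity is modeled as a small squared term with an arbitrary coefficient of 0.01. The function names are made up for this example. It only shows that a difference product at 3 kHz, where the ear is quite sensitive, appears once the signal path is even slightly non-linear.

```python
import cmath
import math


def tone_amplitude(samples, freq_hz, sample_rate_hz):
    """Estimate the amplitude of one frequency component (single-bin DFT)."""
    n = len(samples)
    acc = sum(
        s * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
        for k, s in enumerate(samples)
    )
    return 2.0 * abs(acc) / n


def two_tones_through_nonlinearity(f1_hz, f2_hz, a2, sample_rate_hz=96_000, n=9_600):
    """Sum two unit-amplitude tones and pass them through y = x + a2 * x**2.

    a2 is a hypothetical second-order distortion coefficient; a2 = 0 means
    a perfectly linear signal path.
    """
    out = []
    for k in range(n):
        t = k / sample_rate_hz
        x = math.cos(2 * math.pi * f1_hz * t) + math.cos(2 * math.pi * f2_hz * t)
        out.append(x + a2 * x * x)
    return out


# Illustrative values only (assumptions, not measurements).
MUSIC_HZ = 12_000   # stand-in for an upper-treble music component
HISS_HZ = 15_000    # stand-in for one frequency component of the hiss
FS = 96_000         # sample rate, high enough to hold the 27 kHz sum product

diff_hz = HISS_HZ - MUSIC_HZ  # second-order difference product
sum_hz = HISS_HZ + MUSIC_HZ   # second-order sum product

linear = two_tones_through_nonlinearity(MUSIC_HZ, HISS_HZ, a2=0.0)
bent = two_tones_through_nonlinearity(MUSIC_HZ, HISS_HZ, a2=0.01)

diff_linear = tone_amplitude(linear, diff_hz, FS)
diff_bent = tone_amplitude(bent, diff_hz, FS)

print(f"difference product at {diff_hz} Hz, sum product at {sum_hz} Hz")
print(f"level at {diff_hz} Hz, linear path:     {diff_linear:.6f}")
print(f"level at {diff_hz} Hz, non-linear path: {diff_bent:.6f}")
```

With a perfectly linear path there is essentially nothing at 3 kHz; with the squared term the difference product shows up at a level set by the distortion coefficient. A real system would involve a broad band of noise and higher-order terms as well, so this is only the simplest two-tone case.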
That's my speculation, anyway, elaborating on what IMO were excellent answers by Kijanki.
Regards,
-- Al