@mikhailark Reviewers like Amir at ASR also run multitone tests to examine intermodulation distortion (IMD) across a wider range of frequencies.
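As a rough illustration of what a multitone stimulus looks like, here is a minimal sketch in Python with NumPy. The function name `multitone`, the tone count, the frequency range, and the sample rate are my own assumptions for the example, not the settings any particular reviewer or analyzer uses. Each tone is snapped to an exact FFT bin so that, after playback through a device, anything showing up between the tones can be attributed to distortion or noise rather than spectral leakage.

```python
import numpy as np

def multitone(n_tones=32, f_lo=20.0, f_hi=20000.0, fs=48000,
              n_samples=65536, seed=0):
    """Sum of log-spaced sine tones, each centered on an FFT bin.

    All parameters are illustrative assumptions.
    Returns (signal, bins) where bins are the FFT bin indices of the tones.
    """
    rng = np.random.default_rng(seed)
    bin_width = fs / n_samples
    # Log-spaced target frequencies, rounded to the nearest FFT bin.
    targets = np.geomspace(f_lo, f_hi, n_tones)
    bins = np.unique(np.round(targets / bin_width).astype(int))
    n = np.arange(n_samples)
    # Random phases keep the crest factor of the summed signal moderate.
    phases = rng.uniform(0.0, 2.0 * np.pi, size=bins.size)
    signal = np.zeros(n_samples)
    for k, phase in zip(bins, phases):
        signal += np.sin(2.0 * np.pi * k * n / n_samples + phase)
    # Normalize to full scale (peak magnitude of 1.0).
    signal /= np.max(np.abs(signal))
    return signal, bins

signal, bins = multitone()
spectrum = np.abs(np.fft.rfft(signal))
```

In the clean stimulus, `spectrum` has energy only at the indices in `bins`. In a captured recording of a real device, the level of the "grass" between those bins is what the multitone plot shows.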
In terms of music comparison, software like DeltaWave Audio Null Comparator can facilitate exactly that. To use it with DACs, you have to contend with the limitations of the ADC that captures the output. For speakers and headphones, you also have microphone and room considerations; these are better handled by systems like Klippel, which combine repeated measurements to approximate anechoic results in ordinary rooms.
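To make the idea of a null comparison concrete, here is a minimal sketch in Python with NumPy. It is not how DeltaWave itself is implemented; the function name `null_test` and the approach (whole-sample alignment by cross-correlation, a single least-squares gain, then subtraction) are assumptions chosen for simplicity. Real captures typically also need sub-sample alignment and clock drift correction, which this sketch ignores.

```python
import numpy as np

def null_test(reference, capture):
    """Align, gain-match, and subtract two mono recordings.

    Hypothetical helper for illustration only.
    Returns (null_depth_db, lag_samples, gain); a more negative
    null depth means the two signals are closer to identical.
    """
    ref = np.asarray(reference, dtype=float)
    cap = np.asarray(capture, dtype=float)

    # Cross-correlate via FFT, zero-padded so the result is linear.
    n = len(ref) + len(cap) - 1
    nfft = 1 << (n - 1).bit_length()
    xcorr = np.fft.irfft(
        np.fft.rfft(cap, nfft) * np.conj(np.fft.rfft(ref, nfft)), nfft)
    lag = int(np.argmax(xcorr))
    if lag >= len(cap):          # indices past len(cap) are negative lags
        lag -= nfft

    # Drop the leading samples of whichever signal starts late.
    if lag >= 0:
        cap = cap[lag:]
    else:
        ref = ref[-lag:]
    m = min(len(ref), len(cap))
    ref, cap = ref[:m], cap[:m]

    # Least-squares gain that best maps the capture onto the reference.
    gain = np.dot(cap, ref) / np.dot(cap, cap)
    residual = ref - gain * cap

    eps = np.finfo(float).tiny   # avoids log10(0) on a perfect null
    depth = 10.0 * np.log10((np.sum(residual**2) + eps) / np.sum(ref**2))
    return depth, lag, gain

# Synthetic check: the "capture" is the reference delayed by 37 samples,
# attenuated by half, with a small amount of added noise.
rng = np.random.default_rng(1)
ref = rng.standard_normal(4800)
cap = 0.5 * np.concatenate([np.zeros(37), ref])
cap += 1e-4 * rng.standard_normal(cap.size)
depth, lag, gain = null_test(ref, cap)
```

With a DAC in the loop, `capture` would come from the ADC, so the residual includes the ADC's own noise and distortion; that is the digitization limitation mentioned above.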
I recommend Audio Science Review as a resource for learning more about measurement techniques.