And the sound changes all by itself day to day, hour to hour, month to month. But if you conduct the test over a month, those changes should average out. I would say if you cannot identify a difference in that scenario, it is not worth the upgrade.
I think ideally you could create a computer-controlled A/B switch that would only change (or not change, so the test stays double blind) after you make a choice. Then review the results and you have an additional data point to make your decision. You can always revert to feels if you like.
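
Here is a rough sketch of what the bookkeeping side of such a computer-controlled test might look like. It only covers the random schedule and the scoring; the actual switching of gear is left out, and the function names (make_schedule, score, p_value) are made up for illustration. It assumes each trial is an independent coin flip between A and B and that you log one guess per trial.

```python
import random
from math import comb


def make_schedule(n_trials, seed=None):
    """Pick A or B at random for each trial, so the source may or may not
    change between trials and the listener cannot predict it."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(n_trials)]


def score(schedule, guesses):
    """Count how many guesses match what was actually playing."""
    return sum(actual == guess for actual, guess in zip(schedule, guesses))


def p_value(correct, n_trials):
    """Chance of getting at least `correct` right by pure guessing
    (one-sided binomial with a 50/50 chance per trial)."""
    hits = sum(comb(n_trials, k) for k in range(correct, n_trials + 1))
    return hits / 2 ** n_trials


if __name__ == "__main__":
    schedule = make_schedule(16)      # kept hidden from the listener
    guesses = list("ABABABABABABABAB")  # placeholder; replace with logged guesses
    n_right = score(schedule, guesses)
    print(n_right, "of", len(schedule), "correct, p =", p_value(n_right, len(schedule)))
```

As a reference point, 12 correct out of 16 would happen by pure guessing about 3.8% of the time. Spreading the trials over a month just means logging guesses across many sessions and scoring them together at the end.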