Neither the paper nor the studies it references are perfect, and even its justification for 24/96 is quite weak, but with storage and bandwidth costs shrinking rapidly, there is little reason not to standardize on 24/96.
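To put the storage side of that in rough numbers, here is a quick back-of-the-envelope sketch. It assumes uncompressed stereo PCM (lossless compression such as FLAC would shrink both figures), and the function name `pcm_bytes_per_hour` is just something made up for illustration.

```python
def pcm_bytes_per_hour(sample_rate_hz, bit_depth, channels=2):
    """Size of one hour of uncompressed PCM audio, in bytes."""
    bits_per_second = sample_rate_hz * bit_depth * channels
    return bits_per_second * 3600 // 8

# Assumption: stereo, no compression, no container overhead.
redbook = pcm_bytes_per_hour(44_100, 16)
hires = pcm_bytes_per_hour(96_000, 24)

print(f"16/44.1: {redbook / 1e6:.1f} MB per hour")
print(f"24/96:   {hires / 1e6:.1f} MB per hour")
print(f"ratio:   {hires / redbook:.2f}x")
```

So 24/96 costs a bit over three times the space of Redbook before compression, which is not much at current storage prices.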
Somewhere I have a link to a study that showed slightly better timing discrimination in some subjects with a bandwidth just slightly over 20 kHz, but virtually no benefit from going much higher than that. This would suggest that Redbook may not be perfect for everyone, but 24/96 would cover everyone.
If you are worried about distortion above 20 kHz, you can always filter that content out at the playback stage.
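As a rough illustration of removing the ultrasonic content at playback, here is a minimal sketch of a windowed-sinc low-pass FIR filter for 96 kHz material, in plain Python. The 21 kHz cutoff, the 101 taps, and the Hamming window are all assumptions picked for the example, and the helper names `lowpass_taps` and `gain_at` are hypothetical. It is not how any particular player or DAC does it.

```python
import math

def lowpass_taps(cutoff_hz, sample_rate_hz, num_taps=101):
    """Hamming-windowed sinc low-pass FIR taps, normalized to unity gain at DC."""
    fc = cutoff_hz / sample_rate_hz  # cutoff as a fraction of the sample rate
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal low-pass impulse response; the centre tap is the sinc limit.
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]

def gain_at(taps, freq_hz, sample_rate_hz):
    """Magnitude of the filter's frequency response at one frequency."""
    w = 2 * math.pi * freq_hz / sample_rate_hz
    re = sum(t * math.cos(w * n) for n, t in enumerate(taps))
    im = sum(t * math.sin(w * n) for n, t in enumerate(taps))
    return math.hypot(re, im)

# Assumed values: 21 kHz cutoff on 96 kHz audio.
taps = lowpass_taps(cutoff_hz=21_000, sample_rate_hz=96_000)
for f in (1_000, 10_000, 30_000, 40_000):
    print(f"{f:>6} Hz: gain {gain_at(taps, f, 96_000):.4f}")
```

The audible band should pass through essentially untouched while content well above the cutoff is strongly attenuated. The point is that the choice can be made at playback, whereas information discarded at the recording or mastering stage cannot be recovered later.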