A transient is how a system responds to a sudden change, as opposed to its steady-state behavior, and it can be measured. An impulse response is the output produced by one specific input: a very short pulse applied to a system at rest. It is commonly used to check time alignment. Transient response is not the same thing as distortion, but most systems with good transient response are also low in distortion. It comes down to drivers with a good power-to-weight ratio and pistonic motion.
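To make the time-alignment check concrete, here is a minimal sketch that estimates the arrival-time offset between two drivers by cross-correlating their impulse responses. The impulse-response values, the 48 kHz sample rate, and the function name `estimate_delay_samples` are all made up for illustration; a real measurement would come from measurement software and be much longer.

```python
def estimate_delay_samples(ref, other):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive lag means `other` arrives later than `ref`.
    Brute-force cross-correlation; fine for short illustrative signals.
    """
    m = len(other)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-(len(ref) - 1), m):
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < m:
                score += r * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical impulse responses (assumed sample rate: 48 kHz).
SAMPLE_RATE_HZ = 48_000
tweeter_ir = [0.0, 1.0, -0.4, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
woofer_ir = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, -0.4, 0.1, 0.0, 0.0]

lag = estimate_delay_samples(tweeter_ir, woofer_ir)
delay_ms = 1000.0 * lag / SAMPLE_RATE_HZ
print(f"woofer arrives {lag} samples ({delay_ms:.3f} ms) after the tweeter")
```

With these made-up numbers the woofer lags by 4 samples, about 0.083 ms. Real woofer and tweeter impulse responses have different shapes, so the correlation peak is only a rough guide there.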
When it comes to transducers, “transient response” is how closely a driver can mimic the original signal. Given a theoretically perfect amplifier and speaker enclosure, any deviation from perfect transient response is indeed driver distortion. The more imperfect the response, the higher the distortion.
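One possible way to put a number on "deviation from the original signal" is a normalized RMS error between the input waveform and the measured output. This is only a sketch: the metric, the function name `transient_error`, and the sample values are assumptions for illustration, not a standard distortion measurement, and it assumes the two waveforms are already aligned in time and level.

```python
import math


def transient_error(reference, measured):
    """Normalized RMS deviation of `measured` from `reference`.

    0.0 means the output matches the input exactly; larger means
    a less faithful response. Assumes both are time- and level-aligned.
    """
    if len(reference) != len(measured):
        raise ValueError("waveforms must have the same length")
    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0.0:
        raise ValueError("reference waveform is silent")
    err_energy = sum((m - r) ** 2 for r, m in zip(reference, measured))
    return math.sqrt(err_energy / ref_energy)


# Hypothetical one-cycle burst and two made-up driver outputs.
burst = [0.0, 1.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0]
ideal_output = list(burst)                                   # perfect driver
ringing_output = [0.0, 0.8, 0.1, -0.7, 0.2, -0.1, 0.05, 0.0]  # overshoot and ringing

ideal_error = transient_error(burst, ideal_output)
ringing_error = transient_error(burst, ringing_output)
print(ideal_error, round(ringing_error, 3))
```

The perfect driver scores 0.0, while the ringing one scores about 0.31 with these made-up samples.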
You are correct that lower mass diaphragms tend to have lower distortion, especially if they can resist bending modes within their implemented bandwidth.
Sometimes a driver can have good piston behavior but still cause a comb-filtering effect, which is another form of distortion, though typically a less subjectively offensive one. Regardless, any deviation from perfection is a distortion.
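For anyone who wants to see where comb-filter notches land, here is a rough sketch of the simplest case: a signal summed with an equal-level copy of itself that arrives slightly later. The 0.1715 m path difference, the 343 m/s speed of sound, and the function names are assumptions chosen for illustration. Real drivers and rooms produce unequal levels, so actual notches are shallower.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed, roughly room temperature


def comb_notches_hz(path_difference_m, max_hz=20_000.0):
    """Notch frequencies for a signal plus an equal-level delayed copy.

    Cancellation occurs where the delay equals an odd number of
    half-periods: f = (2k + 1) / (2 * delay).
    """
    if path_difference_m <= 0.0:
        raise ValueError("path difference must be positive")
    delay_s = path_difference_m / SPEED_OF_SOUND_M_S
    notches = []
    k = 0
    while True:
        f = (2 * k + 1) / (2.0 * delay_s)
        if f > max_hz:
            break
        notches.append(f)
        k += 1
    return notches


def comb_gain_db(freq_hz, path_difference_m):
    """Level of the summed signal relative to one path alone, in dB."""
    delay_s = path_difference_m / SPEED_OF_SOUND_M_S
    magnitude = abs(2.0 * math.cos(math.pi * freq_hz * delay_s))
    return 20.0 * math.log10(max(magnitude, 1e-12))  # floor avoids log(0)


notches = comb_notches_hz(0.1715)  # about 0.5 ms of extra travel time
print([round(f) for f in notches[:3]])
print(round(comb_gain_db(2000.0, 0.1715), 2))
```

With these assumed numbers the notches fall near 1 kHz, 3 kHz, 5 kHz and so on, while the frequencies halfway between them (2 kHz, 4 kHz, ...) are boosted by about 6 dB.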
Higher quality drivers sound quicker, subjectively, because they are lower in distortion.
Ported, sealed, OB, panel…all such designs can sound slow/fast or dynamic/anemic depending on how well they are designed and the quality of their transducers. Unfortunately, most dealer-sold tower speakers under ≈$7K/pair (and stand-mounts under ≈$4K/pair) employ very mediocre transducers.