I'm not an expert in the field, but I believe what the research into head-related transfer functions (HRTFs) can teach us is that we localize sounds based on the complicated comb filtering caused by the shape of our head, body, and ears, and even our hairstyle.
It's not phase, it's amplitude that seems to matter here.
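
To make the comb filtering idea a bit more concrete, here's a rough sketch of the simplest possible case: a direct sound plus one delayed reflection of it (say, off the outer ear). This is only a toy model, not how a real HRTF is measured or computed. The function names, the single-reflection assumption, and the 0.1 ms delay are all my own illustrative choices.

```python
import cmath
import math


def comb_magnitude(freq_hz, delay_s, gain=1.0):
    """Amplitude response of y(t) = x(t) + gain * x(t - delay_s) at freq_hz.

    Hypothetical toy model: one direct path plus one reflection.
    """
    return abs(1 + gain * cmath.exp(-2j * math.pi * freq_hz * delay_s))


def notch_frequencies(delay_s, max_hz):
    """Frequencies up to max_hz where an equal-gain reflection cancels the
    direct sound: the odd multiples of 1 / (2 * delay_s)."""
    notches = []
    k = 0
    while True:
        f = (2 * k + 1) / (2 * delay_s)
        if f > max_hz:
            break
        notches.append(f)
        k += 1
    return notches


if __name__ == "__main__":
    delay = 1e-4  # assumed 0.1 ms reflection delay, purely illustrative
    for f in notch_frequencies(delay, 20_000):
        print(f"notch near {f:.0f} Hz")
    for f in (1_000, 5_000, 10_000, 15_000):
        print(f"{f:>6} Hz -> amplitude {comb_magnitude(f, delay):.3f}")
```

With a 0.1 ms delay the notches fall at roughly 5 kHz and 15 kHz, with a peak around 10 kHz. Change the delay (as a different direction of arrival would) and the pattern of peaks and notches in the amplitude spectrum shifts, which is the kind of cue I mean.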