It's a very good question and one that comes up often. Let me try to paint a picture for you.
First, all studio control rooms are heavily treated to minimize any acoustic aberrations. Studio monitors are very flat across the frequency range, capable of very high SPL and bulletproof. They are designed to be extremely durable and to provide a consistent sound over a very long time. The intention is to hear every detail of what is on the tracks.
Most control rooms have several systems.
If the room is used for live recording (and not all are), there will be a big pair of monitors mounted up high that can fill the entire room with high, distortion-free SPLs. Keep in mind that a big control room can easily be the size of a living room. This is traditionally the province of brands like JBL, Westlake Audio and Tannoy. These are used so that everyone involved in the session can listen as the band lays down tracks and/or the engineer plays back a mix.
In addition, the engineer will have one or two pairs of speakers on or aimed at his or her position at the mixing console. This is where you typically see small self-powered Tannoys and ATCs, Yamaha NSMs, and even the lowly 3" Auratone (basically a 3" car speaker in a 5" box finished in vinyl). The BBC engineers used Spendors for this task.
These speakers enable the engineer to do two things. First, s/he can monitor accurately at low volumes while working out the myriad details of a mix. Second, these kinds of speakers emulate what the vast majority of people hear on TV and in their car.
In fact, for many years the acid test was to make a cassette of the mix to listen to on the drive home. It had to sound good in the car. While I am no longer involved in the business, I wouldn't be surprised to find people making MP3s to listen to on their iPods and Zunes - it has to sound good on earbuds.
Note that a specialized room, say one used for mastering, will be set up differently from one used to build sound effects and dialogue. While accuracy is always paramount, as Albert's post demonstrates, there are many ways to skin the cat. Probably the closest parallel to what an audiophile would want is a mastering room like Steve's.
The best self-powered speakers are very accurate, can play very loud and are easy to move around. As an added bonus, they work very well where space is at a premium - for instance, a mobile truck or a small control room.
As for why (most) audiophiles wouldn't want the same gear: many people would not like the flat voicing, some of it is brutally expensive and requires massive power, a lot of it is either too big or too small, and probably more than anything else, much of it lacks any semblance of WAF.