First consider radiating area. Typical loudspeakers are only about 1% efficient overall (that is, for 100 watts of electrical input power, you get 1 watt of acoustical power out). The reason is a bad impedance mismatch between the surface of the driver and the surrounding air. Think of air as a medium like a fluid, only much lighter and more compressible. The driver "sees" only a very small, light load on its diaphragm, and so cannot impart much force to it, because the air moves out of the way so easily. The driver can exert much more force electromechanically than the air can accept acoustically, hence the "impedance mismatch".
Horns have such fantastic efficiency because they expand gradually, allowing the driver to "see" a much larger surface area of air. That is, the driver ends up loaded by the area of the horn opening rather than by its own diaphragm area. This makes the air appear much "stiffer" to the driver and results in a much better impedance match.
The other way to achieve better impedance matching is to use a lot of direct-radiator surface area. Doubling the radiating area gives you 3 dB of efficiency all by itself, just because of the improved coupling to the air. A 12" driver has four times the radiating area of a 6" driver, so it is inherently four times more efficient at coupling to the air, and it will generally require four times the enclosure volume as well. (The inside of the box "sees" four times as much air being pushed into it, so for the same compliance it needs four times the volume. It's the same as if there were four 6" drivers in the box.) The reason 12" drivers aren't vastly more efficient than 6" drivers in practice is that they have a much higher moving mass, as discussed below.
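To put numbers on the area argument, here is a rough Python sketch. It assumes the nominal driver diameter stands in for the radiating diameter (real cones radiate from a somewhat smaller effective diameter) and it simply applies the rule of thumb above that coupling scales in proportion to area. The function names are made up for illustration.

```python
import math

def area_ratio(d_large, d_small):
    """Ratio of radiating areas for two circular diaphragms.

    Assumes nominal diameter approximates the effective radiating diameter.
    """
    return (d_large / d_small) ** 2

def coupling_gain_db(ratio):
    """Gain in dB if coupling efficiency is proportional to radiating area."""
    return 10.0 * math.log10(ratio)

ratio = area_ratio(12.0, 6.0)  # 12" driver versus 6" driver
print(f"area ratio: {ratio:.1f}x")
print(f"gain from doubling area: {coupling_gain_db(2.0):.2f} dB")
print(f"gain for 12 in. vs 6 in.: {coupling_gain_db(ratio):.2f} dB")
```

Doubling the diameter quadruples the area, which is two doublings, so the gain works out to about 6 dB.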
For every additional octave of bass extension, you have to move four times the volume of air to maintain the same output level. This leads immediately to the necessity for large drivers with large excursions, which in turn require large enclosures to support them. There is a tradeoff that can be made, though, if you can live with a lower output level capability.
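The four-times rule can be sketched in a few lines of Python. This assumes the usual low-frequency approximation that, for a given displaced air volume, radiated sound pressure goes as frequency squared, so halving the frequency needs four times the volume for the same level. The function names are hypothetical.

```python
import math

def displacement_multiplier(octaves_lower):
    """How many times more air must be moved to hold the same level
    when extending the response this many octaves lower."""
    return 4.0 ** octaves_lower

def output_change_db(octaves_lower):
    """Level change if the displaced volume is held fixed instead.

    Pressure is assumed proportional to displaced volume, so a 4x
    shortfall in volume is a 4x shortfall in pressure.
    """
    return -20.0 * math.log10(displacement_multiplier(octaves_lower))

for octaves in (1, 2):
    print(f"{octaves} octave(s) lower: "
          f"{displacement_multiplier(octaves):.0f}x the air volume, "
          f"or {output_change_db(octaves):.1f} dB at fixed displacement")
```

The second function is the tradeoff in numbers: keep the same driver and excursion, and each octave of extension costs roughly 12 dB of maximum output.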
Driver efficiency is primarily a function of magnet force and moving mass. (Think back to F=ma. Sound is nothing but the acceleration of air molecules back and forth. The higher the acceleration, the louder the sound.) So if you increase the mass, you get a lower efficiency but also a lower resonant frequency (better bass extension). Thus you can take a 6" driver which would normally have a resonant frequency of 60 Hz and, by doubling both the moving mass and the suspension compliance, get the resonant frequency (and thus the extension) down one octave to 30 Hz. You lose 6 dB of efficiency and 12 dB of output capability in the process! (Remember the four times air volume principle? This is where it comes back to bite you.) But in many cases, a tradeoff like this is made in order to get good bass at limited output levels out of a small driver.
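As a sanity check on the 60 Hz to 30 Hz example, here is a small Python sketch using the standard mass-spring resonance formula, f = 1 / (2*pi*sqrt(M*C)). The 10 gram moving mass is an invented illustrative value, not data for any real driver, and the efficiency function assumes that efficiency goes as 1/mass squared with the magnet and cone area unchanged.

```python
import math

def resonant_frequency(mass_kg, compliance_m_per_n):
    """Resonance of a simple mass-spring system, in Hz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(mass_kg * compliance_m_per_n))

def compliance_for(freq_hz, mass_kg):
    """Compliance that puts a given mass at a given resonance."""
    return 1.0 / ((2.0 * math.pi * freq_hz) ** 2 * mass_kg)

def efficiency_change_db(mass_ratio):
    """Efficiency change when only the moving mass is scaled.

    Assumes efficiency is proportional to 1 / mass^2.
    """
    return -20.0 * math.log10(mass_ratio)

mass = 0.010                            # hypothetical 10 g moving mass
compliance = compliance_for(60.0, mass)  # chosen so resonance is 60 Hz

f_before = resonant_frequency(mass, compliance)
f_after = resonant_frequency(2.0 * mass, 2.0 * compliance)

print(f"before: {f_before:.1f} Hz, after doubling both: {f_after:.1f} Hz")
print(f"efficiency change from doubled mass: {efficiency_change_db(2.0):.1f} dB")
```

Doubling both mass and compliance quadruples their product, and the square root turns that into a factor of two in frequency, which is the one-octave drop. The doubled mass alone accounts for the roughly 6 dB efficiency loss.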
Hope this helps.