Because the CPU "thinks" in 32 bits, anything that isn't 32 bits has to be converted first. You would need to widen your memory address to 32 bits before the CPU could use it, which turns what would normally be a single CPU clock tick into several (it's slightly more complicated than that, but only in ways that make it worse). On top of that, compilers don't normally do this stuff for you, so you would end up writing a wrapper to handle every possible memory access, and writing it in assembly.
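To make the extra work concrete, here is a rough sketch of what such a wrapper has to do. It assumes a made-up 24-bit address stored in three bytes; the `addr24_t` type and both function names are invented for illustration, and it is written in C rather than assembly just to keep it readable.

```c
#include <stdint.h>

/* Hypothetical 24-bit address, stored as three bytes (lowest byte first). */
typedef struct {
    uint8_t bytes[3];
} addr24_t;

/* Narrow a 32-bit value to 24 bits; the top 8 bits are thrown away. */
static addr24_t addr24_pack(uint32_t value) {
    addr24_t a;
    a.bytes[0] = (uint8_t)(value & 0xFFu);
    a.bytes[1] = (uint8_t)((value >> 8) & 0xFFu);
    a.bytes[2] = (uint8_t)((value >> 16) & 0xFFu);
    return a;
}

/* Widen back to 32 bits before the value can be used as an address:
   three byte loads, two shifts and two ORs instead of one word load. */
static uint32_t addr24_unpack(addr24_t a) {
    return (uint32_t)a.bytes[0]
         | ((uint32_t)a.bytes[1] << 8)
         | ((uint32_t)a.bytes[2] << 16);
}
```

Every single memory access would have to go through something like `addr24_unpack` first, so one load becomes a handful of instructions each time.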

And you can't use those wasted bits for anything else anyway, so what's the point?