Being on the same package means the electrical signals have a much shorter distance to travel between memory and CPU, so you avoid the signal loss and interference that come from routing memory lines across the board.
As a result, you could drive a higher bandwidth, because the signal timing doesn't have to be as conservative.
Or you could use less power for the same speed. It's hard to tell which Apple did without some detailed benchmarks. I suppose one could benchmark memcpy and estimate the memory clock rate from the measured bandwidth.
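
For what it's worth, here is a rough sketch of that memcpy idea in Python. It times a large buffer copy and converts the result into an implied transfer rate using bandwidth = transfers per second × bus width in bytes. The function names, the 64 MiB buffer size, and the 128-bit bus width are all assumptions for illustration, not known figures for any Apple part. A single-threaded copy rarely saturates the memory bus, so treat the result as a lower bound at best.

```python
import time


def measure_copy_bandwidth(size_bytes=64 * 1024 * 1024, repeats=5):
    """Return the best observed copy rate in bytes/second, counting reads and writes."""
    # Fill with a non-zero byte so the source pages are really allocated.
    src = bytearray(b"\xa5") * size_bytes
    dst = bytearray(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # same-length slice assignment; CPython should do a bulk copy here
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Every copied byte is read once and written once, so traffic is 2x the size.
    return 2 * size_bytes / best


def implied_transfer_rate(bandwidth_bytes_per_s, bus_width_bits):
    """Transfers per second needed to sustain the bandwidth on a bus of this width."""
    return bandwidth_bytes_per_s / (bus_width_bits / 8)


if __name__ == "__main__":
    ASSUMED_BUS_WIDTH_BITS = 128  # hypothetical; substitute the real width if known
    bw = measure_copy_bandwidth()
    rate = implied_transfer_rate(bw, ASSUMED_BUS_WIDTH_BITS)
    print(f"copy traffic: {bw / 1e9:.1f} GB/s")
    print(f"implied rate: {rate / 1e6:.0f} MT/s (lower bound, assumed bus width)")
```

The buffer needs to be much larger than the last-level cache, otherwise you end up measuring cache bandwidth rather than DRAM. Note also that with DDR-style memory the transfer rate is twice the I/O clock, so the clock would be half the figure printed.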
It's unlikely you're going to transfer data any faster, since they're using commodity DRAMs like anyone else. They will, however, be able to save a clock's worth of latency here and there, which is useful.