I don't think you read my comment properly. I'm saying that shared-memory parallelism has had its day, even at the hardware level within a CPU.
Most programming languages use constructs that assume memory coherency for synchronization (atomics, for example). Hardware channels and DMA may well become more prevalent as more and more cores are packed onto a chip and keeping shared memory coherent becomes prohibitively expensive. That would be a totally different paradigm.
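
To make the contrast concrete, here is a rough sketch in Rust of the two styles at the language level. The function names `count_with_atomics` and `count_with_channel` are made up for illustration. Note the caveat: on today's machines `std::sync::mpsc` is itself built on coherent shared memory, so this only shows the difference in programming model, not what hardware channels would actually look like.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;

// Shared-memory style: every worker hammers the same memory location,
// and correctness relies on the hardware keeping that location coherent.
fn count_with_atomics(workers: u64, increments: u64) -> u64 {
    let counter = Arc::new(AtomicU64::new(0));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    counter.load(Ordering::SeqCst)
}

// Message-passing style: each worker only touches its own local state
// and sends a single result over a channel when it is done.
fn count_with_channel(workers: u64, increments: u64) -> u64 {
    let (tx, rx) = mpsc::channel::<u64>();
    for _ in 0..workers {
        let tx = tx.clone();
        thread::spawn(move || {
            let mut local = 0u64;
            for _ in 0..increments {
                local += 1;
            }
            tx.send(local).unwrap();
        });
    }
    // Drop the original sender so the receiver stops once all workers finish.
    drop(tx);
    rx.iter().sum()
}

fn main() {
    println!("atomics: {}", count_with_atomics(4, 1000));
    println!("channel: {}", count_with_channel(4, 1000));
}
```

Both versions compute the same total. In the second one the workers never write to a common location, which is the property that would carry over if the channel were a hardware primitive rather than a library built on shared memory.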