So it's called an "AI Engine", but its performance is worse than just running the same thing on the CPU? Doesn't that make it essentially useless for anything AI related? What's the point of this hardware then? Better power efficiency for tiny models? Surely someone must be using it for something?
The point is to offload ML workloads onto hardware that is energy efficient, not necessarily fast. You're trading runtime for lower power draw, which matters for battery life and sustained thermals.
Assuming NPUs aren't pulled from consumer hardware altogether, the performance side of that trade-off should keep improving as the hardware and software stacks mature, so the gap versus the CPU will narrow over time.
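To make the trade-off concrete: a slower accelerator can still win on energy, since energy is power times time. A quick back-of-the-envelope sketch (the wattages and timings below are made up purely for illustration; real figures vary wildly by chip and model):

```python
# Back-of-the-envelope comparison: energy (J) = average power (W) x time (s).
# All numbers here are hypothetical, just to illustrate the trade-off.

def energy_joules(power_watts: float, time_seconds: float) -> float:
    """Energy consumed by a workload at a given average power draw."""
    return power_watts * time_seconds

# Same hypothetical inference workload on two backends:
cpu_energy = energy_joules(power_watts=35.0, time_seconds=1.0)  # fast, power-hungry
npu_energy = energy_joules(power_watts=2.0, time_seconds=4.0)   # 4x slower, far lower power

print(f"CPU: {cpu_energy:.0f} J, NPU: {npu_energy:.0f} J")
print(f"NPU uses {cpu_energy / npu_energy:.1f}x less energy despite being slower")
```

So even at a quarter of the speed, the NPU in this toy example finishes the job on a fraction of the battery, which is exactly the niche it's meant to fill (background tasks like webcam effects, transcription, etc., where latency matters less than drain).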