Hacker News

> Hopefully Apple optimizes Core ML to map transformer workloads to the ANE.

If you want to convert models to run on the ANE there are tools provided:

> Convert models from TensorFlow, PyTorch, and other libraries to Core ML.

https://apple.github.io/coremltools/docs-guides/index.html



I thought Apple MLX could do that if you convert your model with it: https://mlx-framework.org/


MLX does not support the ANE.

https://github.com/ml-explore/mlx/issues/18


Yes it does.

That’s just an issue with stale and incorrect information. Here are the docs https://opensource.apple.com/projects/mlx/


No, it categorically doesn't. Beyond that, its CPU support is quite lacking (fp32 only). Currently there are two ways to target the ANE: Core ML and MPSGraph.


Nothing in that documentation says anything about the Apple Neural Engine. MLX runs on the GPU.


None of that uses the ANE.


It does indeed, and is more modern than Core ML.


It is less about conversion and more about extending ANE support for transformer-style models or giving developers more control.

The issue is in targeting specific hardware blocks. When you convert with coremltools, Core ML takes over scheduling and doesn't provide fine-grained control beyond a coarse choice of GPU, CPU, or ANE. Also, the ANE wasn't really designed with transformers in mind, so most LLM inference defaults to the GPU.


Neural Engine is optimized for power efficiency, not performance.

Look for Apple to add matmul acceleration to the GPU instead. That's how to truly speed up local LLMs.



