Thanks for being a part of WWDC25!

How did we do? We’d love to know your thoughts on this year’s conference. Take the survey here

How to Ensure Quantized Models Run on ANE on iPhone 15 (iOS 18 Beta 8)

When I use CoreML to infer a w8a8 model on iPhone 15 (iOS 18 beta 8), the model uses CPU inference instead of ANE, which results in slower inference speed. The model I am using is from the coremltools documentation, which indicates that on iOS 17, quantized models can run on ANE properly and achieve faster speeds. How can I make the quantized model run correctly on ANE to achieve the desired inference speed? To reproduce this issue, you can download the Weight & Activation quantized model from the following link: https://apple.github.io/coremltools/docs-guides/source/opt-quantization-perf.html.

How to Ensure Quantized Models Run on ANE on iPhone 15 (iOS 18 Beta 8)
 
 
Q