I’m trying to follow Apple’s “WWDC24: Bring your machine learning and AI models to Apple Silicon” session to convert the Mistral-7B-Instruct-v0.2 model into a Core ML package, but I’ve run into a roadblock that I can’t seem to overcome. I’ve uploaded my full conversion script here for reference:
https://pastebin.com/T7Zchzfc
When I run the script, it progresses through tracing and MIL conversion but then fails at the backend_mlprogram stage with this error:
https://pastebin.com/fUdEzzKM
The core of the error is:
ValueError: Op "keyCache_tmp" (op_type: identity) Input x="keyCache" expects list, tensor, or scalar but got state[tensor[1,32,8,2048,128,fp16]]
I’ve registered my KV-cache buffers in a StatefulMistralWrapper
subclass of nn.Module
, matching the keyCache
and valueCache
state names in my ct.StateType
definitions, but Core ML’s backend pass reports the state tensor as an invalid input. I’m using Core ML Tools 8.3.0 on Python 3.9.6, targeting iOS18, and forcing CPU conversion (MPS wasn’t available). Any pointers on how to satisfy the handle_unused_inputs
pass or properly declare/cache state for GQA models in Core ML would be greatly appreciated!
Thanks in advance for your help,
Usman Khan