InferenceError referencing context length in FoundationModels framework

Question

Created 11h

I'm experimenting with downloading an audio file of spoken content, transcribing it with the Speech framework, and then using FoundationModels to clean up the formatting by adding paragraph breaks. This is the code that does the cleanup:

import FoundationModels

private func cleanupText(_ text: String) async throws -> String? {
    print("Cleaning up text of length \(text.count)...")
    // The instructions restrict the model to inserting paragraph breaks only.
    let session = LanguageModelSession(instructions: "The content you read is a transcription of a speech. Separate it into paragraphs by adding newlines. Do not modify the content - only add newlines.")

    // Send the whole transcript as a single prompt and ask for a plain String back.
    let response = try await session.respond(to: .init(text), generating: String.self)
    return response.content
}

The input text is about 29,000 characters long, and I get this error:

InferenceError::inferenceFailed::Failed to run inference: Context length of 4096 was exceeded during singleExtend..

Is 4096 a reference to a max input length? Or is this a bug?

This is running on an M1 iPad Air, with iPadOS 26 Seed 1.

Answer 1

ziopiero OP

1h

I have the same problem. Note that 4096 is the maximum number of tokens, not the number of characters, but I don't know how the token count is calculated.
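One possible workaround is to estimate the token count from the character count and split the transcript into smaller pieces before sending each one to the model. The sketch below is only a rough illustration. The ratio of about 4 characters per token is an assumption for English text, not the framework's actual tokenizer, and `estimatedTokenCount` and `splitIntoChunks` are hypothetical helper names rather than FoundationModels API.

```swift
import Foundation

// Assumption: roughly 4 characters per token for English text.
// This is a heuristic only; the real tokenizer may count differently.
let assumedCharactersPerToken = 4

/// Rough token estimate derived from the character count (rounded up).
func estimatedTokenCount(for text: String) -> Int {
    (text.count + assumedCharactersPerToken - 1) / assumedCharactersPerToken
}

/// Hypothetical helper: splits `text` at spaces into chunks whose estimated
/// size stays within `tokenBudget`. Runs of spaces are collapsed, and a single
/// word longer than the budget still becomes its own (oversized) chunk.
func splitIntoChunks(_ text: String, tokenBudget: Int) -> [String] {
    let characterBudget = tokenBudget * assumedCharactersPerToken
    var chunks: [String] = []
    var current = ""
    var currentLength = 0  // Tracked separately to avoid recounting the string.

    for word in text.split(separator: " ") {
        let wordLength = word.count
        // Start a new chunk if adding a space plus this word would exceed the budget.
        if currentLength > 0 && currentLength + 1 + wordLength > characterBudget {
            chunks.append(current)
            current = ""
            currentLength = 0
        }
        if currentLength > 0 {
            current += " "
            currentLength += 1
        }
        current += word
        currentLength += wordLength
    }
    if currentLength > 0 {
        chunks.append(current)
    }
    return chunks
}
```

Under that assumption, 29,000 characters comes to about 7,250 tokens, which is well over 4096. Since the cleanup task returns roughly as much text as it receives, the instructions, the input, and the output together have to fit in the window, so a budget of around 1,500 tokens per chunk (for example `splitIntoChunks(transcript, tokenBudget: 1500)`) seems like a reasonable starting point. Each chunk would then go to its own `LanguageModelSession`, and the results can be joined afterwards.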
