Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

All subtopics
Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Activity

A Summary of the WWDC25 Group Lab - Machine Learning and AI Frameworks
At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Machine Learning and AI Frameworks. What are you most excited about in the Foundation Models framework? The Foundation Models framework provides access to an on-device Large Language Model (LLM), enabling entirely on-device processing for intelligent features. This allows you to build features such as personalized search suggestions and dynamic NPC generation in games. The combination of guided generation and streaming capabilities is particularly exciting for creating delightful animations and features with reliable output. The seamless integration with SwiftUI and the new design material Liquid Glass is also a major advantage. When should I still bring my own LLM via CoreML? It's generally recommended to first explore Apple's built-in system models and APIs, including the Foundation Models framework, as they are highly optimized for Apple devices and cover a wide range of use cases. However, Core ML is still valuable if you need more control or choice over the specific model being deployed, such as customizing existing system models or augmenting prompts. Core ML provides the tools to get these models on-device, but you are responsible for model distribution and updates. Should I migrate PyTorch code to MLX? MLX is an open-source, general-purpose machine learning framework designed for Apple Silicon from the ground up. It offers a familiar API, similar to PyTorch, and supports C, C++, Python, and Swift. MLX emphasizes unified memory, a key feature of Apple Silicon hardware, which can improve performance. It's recommended to try MLX and see if its programming model and features better suit your application's needs. MLX shines when working with state-of-the-art, larger models. Can I test Foundation Models in Xcode simulator or device? Yes, you can use the Xcode simulator to test Foundation Models use cases. However, your Mac must be running macOS Tahoe. You can test on a physical iPhone running iOS 18 by connecting it to your Mac and running Playgrounds or live previews directly on the device. Which on-device models will be supported? any open source models? The Foundation Models framework currently supports Apple's first-party models only. This allows for platform-wide optimizations, improving battery life and reducing latency. While Core ML can be used to integrate open-source models, it's generally recommended to first explore the built-in system models and APIs provided by Apple, including those in the Vision, Natural Language, and Speech frameworks, as they are highly optimized for Apple devices. For frontier models, MLX can run very large models. How often will the Foundational Model be updated? How do we test for stability when the model is updated? The Foundation Model will be updated in sync with operating system updates. You can test your app against new model versions during the beta period by downloading the beta OS and running your app. It is highly recommended to create an "eval set" of golden prompts and responses to evaluate the performance of your features as the model changes or as you tweak your prompts. Report any unsatisfactory or satisfactory cases using Feedback Assistant. Which on-device model/API can I use to extract text data from images such as: nutrition labels, ingredient lists, cashier receipts, etc? Thank you. The Vision framework offers the RecognizeDocumentRequest which is specifically designed for these use cases. It not only recognizes text in images but also provides the structure of the document, such as rows in a receipt or the layout of a nutrition label. It can also identify data like phone numbers, addresses, and prices. What is the context window for the model? What are max tokens in and max tokens out? The context window for the Foundation Model is 4,096 tokens. The split between input and output tokens is flexible. For example, if you input 4,000 tokens, you'll have 96 tokens remaining for the output. The API takes in text, converting it to tokens under the hood. When estimating token count, a good rule of thumb is 3-4 characters per token for languages like English, and 1 character per token for languages like Japanese or Chinese. Handle potential errors gracefully by asking for shorter prompts or starting a new session if the token limit is exceeded. Is there a rate limit for Foundation Models API that is limited by power or temperature condition on the iPhone? Yes, there are rate limits, particularly when your app is in the background. A budget is allocated for background app usage, but exceeding it will result in rate-limiting errors. In the foreground, there is no rate limit unless the device is under heavy load (e.g., camera open, game mode). The system dynamically balances performance, battery life, and thermal conditions, which can affect the token throughput. Use appropriate quality of service settings for your tasks (e.g., background priority for background work) to help the system manage resources effectively. Do the foundation models support languages other than English? Yes, the on-device Foundation Model is multilingual and supports all languages supported by Apple Intelligence. To get the model to output in a specific language, prompt it with instructions indicating the user's preferred language using the locale API (e.g., "The user's preferred language is en-US"). Putting the instructions in English, but then putting the user prompt in the desired output language is a recommended practice. Are larger server-based models available through Foundation Models? No, the Foundation Models API currently only provides access to the on-device Large Language Model at the core of Apple Intelligence. It does not support server-side models. On-device models are preferred for privacy and for performance reasons. Is it possible to run Retrieval-Augmented Generation (RAG) using the Foundation Models framework? Yes, it is possible to run RAG on-device, but the Foundation Models framework does not include a built-in embedding model. You'll need to use a separate database to store vectors and implement nearest neighbor or cosine distance searches. The Natural Language framework offers simple word and sentence embeddings that can be used. Consider using a combination of Foundation Models and Core ML, using Core ML for your embedding model.
1
0
304
2d
ActivityClassifier doesn't classify movement
I'm using a custom create ML model to classify the movement of a user's hand in a game, The classifier has 3 different spell movements, but my code constantly predicts all of them at an equal 1/3 probability regardless of movement which leads me to believe my code isn't correct (as opposed to the model) which in CreateML at least gives me a heavily weighted prediction My code is below. On adding debug prints everywhere all the data looks good to me and matches similar to my test CSV data So I'm thinking my issue must be in the setup of my model code? /// Feeds samples into the model and keeps a sliding window of the last N frames. final class WandGestureStreamer { static let shared = WandGestureStreamer() private let model: SpellActivityClassifier private var samples: [Transform] = [] private let windowSize = 100 // number of frames the model expects /// RNN hidden state passed between inferences private var stateIn: MLMultiArray /// Last transform dropped from the window for continuity private var lastDropped: Transform? private init() { let config = MLModelConfiguration() self.model = try! SpellActivityClassifier(configuration: config) // Initialize stateIn to the model’s required shape let constraint = self.model.model.modelDescription .inputDescriptionsByName["stateIn"]! .multiArrayConstraint! self.stateIn = try! MLMultiArray(shape: constraint.shape, dataType: .double) } /// Call once per frame with the latest wand position (or any feature vector). func appendSample(_ sample: Transform) { samples.append(sample) // drop oldest frame if over capacity, retaining it for delta at window start if samples.count > windowSize { lastDropped = samples.removeFirst() } } func classifyIfReady(threshold: Double = 0.6) -> (label: String, confidence: Double)? { guard samples.count == windowSize else { return nil } do { let input = try makeInput(initialState: stateIn) let output = try model.prediction(input: input) // Save state for continuity stateIn = output.stateOut let best = output.label let conf = output.labelProbability[best] ?? 0 // If you’ve recognized a gesture with high confidence: if conf > threshold { return (best, conf) } else { return nil } } catch { print("Error", error.localizedDescription, error) return nil } } /// Constructs a SpellActivityClassifierInput from recorded wand transforms. func makeInput(initialState: MLMultiArray) throws -> SpellActivityClassifierInput { let count = samples.count as NSNumber let shape = [count] let timeArr = try MLMultiArray(shape: shape, dataType: .double) let dxArr = try MLMultiArray(shape: shape, dataType: .double) let dyArr = try MLMultiArray(shape: shape, dataType: .double) let dzArr = try MLMultiArray(shape: shape, dataType: .double) let rwArr = try MLMultiArray(shape: shape, dataType: .double) let rxArr = try MLMultiArray(shape: shape, dataType: .double) let ryArr = try MLMultiArray(shape: shape, dataType: .double) let rzArr = try MLMultiArray(shape: shape, dataType: .double) for (i, sample) in samples.enumerated() { let previousSample = i > 0 ? samples[i - 1] : lastDropped let model = WandMovementRecording.DataModel(transform: sample, previous: previousSample) // print("model", model) timeArr[i] = NSNumber(value: model.timestamp) dxArr[i] = NSNumber(value: model.dx) dyArr[i] = NSNumber(value: model.dy) dzArr[i] = NSNumber(value: model.dz) let rot = model.rotation rwArr[i] = NSNumber(value: rot.w) rxArr[i] = NSNumber(value: rot.x) ryArr[i] = NSNumber(value: rot.y) rzArr[i] = NSNumber(value: rot.z) } return SpellActivityClassifierInput( dx: dxArr, dy: dyArr, dz: dzArr, rotation_w: rwArr, rotation_x: rxArr, rotation_y: ryArr, rotation_z: rzArr, timestamp: timeArr, stateIn: initialState ) } }
0
0
21
3h
Apple's Illusion of Thinking paper and Path to Real AI Reasoning
Hey everyone I'm Manish Mehta, field CTO at Centific. I recently read Apple's white paper, The Illusion of Thinking and it got me thinking about the current state of AI reasoning. Who here has read it? The paper highlights how LLMs often rely on pattern recognition rather than genuine understanding. When faced with complex tasks, their performance can degrade significantly. I was just thinking that to move beyond this problem, we need to explore approaches that combines Deeper Reasoning Architectures for true cognitive capability with Deep Human Partnership to guide AI toward better judgment and understanding. The first part means fundamentally rewiring AI to reason. This involves advancing deeper architectures like World Models, which can build internal simulations to understand real-world scenarios , and Neurosymbolic systems, which combines neural networks with symbolic reasoning for deeper self-verification. Additionally, we need to look at deep human partnership and scalable oversight. An AI cannot learn certain things from data alone, it lacks the real-world judgment an AI will never have. Among other things, deep domain expert human partners are needed to instill this wisdom , validate the AI's entire reasoning process , build its ethical guardrails , and act as skilled adversaries to find hidden flaws before they can cause harm. What do you all think? Is this focus on a deeper partnership between advanced AI reasoning and deep human judgment the right path forward? Agree? Disagree? Thanks
1
0
59
13h
Initializing session with transcript ignores tools
When I initialize a session with an existing transcript using this initializer: public convenience init(model: SystemLanguageModel = .default, guardrails: LanguageModelSession.Guardrails = .default, tools: [any Tool] = [], transcript: Transcript) The tools get ignored. I noticed that when doing that, the model never use the tools. When inspecting the transcript, I can see that the instruction entry does not have any tools available to it. I tried this for both transcripts that already include an instruction entry and ones that don't - both yielding the same result.. Is this the intended behavior / am I missing something here?
0
0
125
23h
Stream response
With respond() methods, the foundation model works well enough. With streamResponse() methods, the responses are very repetitive, verbose, and messy. My app with foundation model uses more than 500 MB memory on an iPad Pro when running from Xcode. Devices supporting Apple Intelligence have at least 8GB memory. Should Apple use a bigger model (using 3 ~ 4 GB memory) for better stream responses?
2
0
135
1d
Provide actionable feedback for the Foundation Models framework using LanguageModelFeedbackAttachment
We are really excited to have introduced the Foundation Models framework in WWDC25. When using the framework, you might have feedback about how it can better fit your use cases. The best way of provide your feedback is to file a feedback report with relevant details. Specific to the Foundation Models framework, it’s super important to add the following information in your report: The language model feedback attachment This attachment contains the session transcript, including the instructions, the prompts, and the responses. Without that, we can’t reason the model’s behavior, and hence can hardly take any action. If you believe that what you’d report is related to the system configuration, please capture a sysdiagnose and attach it to your feedback report as well. The framework is still new. Your actionable feedback helps us evolve the framework quickly, and we appreciate that. Thanks ——  The Foundation Models framework team
0
0
163
1d
InferenceError with Apple Foundation Model – Context Length Exceeded on macOS 26.0 Beta
Hello Team, I'm currently working on a proof of concept using Apple's Foundation Model for a RAG-based chat system on my MacBook Pro with the M1 Max chip. Environment details: macOS: 26.0 Beta Xcode: 26.0 beta 2 (17A5241o) Target platform: iPad (as the iPhone simulator does not support Foundation models) While testing, even with very small input prompts to the LLM, I intermittently encounter the following error: InferenceError::inference-Failed::Failed to run inference: Context length of 4096 was exceeded during singleExtend. Has anyone else experienced this issue? Are there known limitations or workarounds for context length handling in this setup? Any insights would be appreciated. Thank you!
3
0
140
3d
Foundation Models / Playgrounds Hello World - Help!
I am using Foundation Models for the first time and no response is being provided to me. Code import Playgrounds import FoundationModels #Playground { let session = LanguageModelSession() let result = try await session.respond(to: "List all the states in the USA") print(result.content) } Canvas Output What I did New file Code Canvas refreshes but nothing happens Am I missing a step or setup here? Please help. Something so basic is not working I do not know what to do. Running 40GPU, 16CPU MacBook Pro.. IOS26/Xcodebeta2/Tahoe allocated 8CPU, 48GB memory in Parallels VM. Settings for Playgrounds in Xcode Thank you for your help in advance.
5
0
158
4d
Swipe-to-Type Broken in iOS 26 Beta 1 & 2 Siri Typing Mode
I’ve been testing silent Siri engagement via typing on iOS 18 and also on iOS 26 beta 1 and beta 2. While normal typing works perfectly in type-to-Siri mode, I’ve noticed that swipe-to-type gestures don’t work within Siri’s input field. Interestingly, you still feel the usual haptic feedback associated with swipe typing, but no text appears in the Siri text box. Swipe-to-type continues to work flawlessly in other apps like Messages and Notes, so this seems to be an issue specific to Siri’s typing input handler in these betas. Hopefully, it will be fixed in the next release because swipe typing is essential to my silent Siri workflow.
1
0
46
6d
Max tokens for Foundation Models
Do we know what a safe max token limit is? After some iterating, I have come to believe 4096 might be the limit on device. Could you help me out by answering any of these questions: Is 4096 the correct limit? Do all devices have the same limit? Will the limit change over time or by device? The errors I get when going over the limit do not seem to say, hey you are over, so it's just by trial and error that I figure these issues out. Thanks for the fun new toys. Regards, Rob
1
0
82
6d
How to pass data to FoundationModels with a stable identifier
For example: I have a list of to-dos, each with a unique id (a GUID). I want to feed them to the LLM model and have the model rewrite the items so they start with an action verb. I'd like to get them back and identify which rewritten item corresponds to which original item. I obviously can't compare the text, as it has changed. I've tried passing the original GUIDs in with each to-do, but the extra GUID characters pollutes the input and confuses the model. I've tried numbering them in order and adding an originalSortOrder field to my generable type, but it doesn't work reliably. Any suggestions? I could do them one at a time, but I also have a use case where I'm asking for them to be organized in sections, and while I've instructed the model not to rename anything, it still happens. It's just all very nondeterministic.
2
0
147
6d
Foundation Models Error: Local Sanitizer Asset
Hi, I just upgraded to macOS Tahoe Beta 2 and now I'm getting this error when I try to initialize my Foundation Models' session: Error Resource (Local Sanitizer Asset) unavailable error. import FoundationModels #Playground { let session = LanguageModelSession() do { let result = try await session.respond(to: "Tell me 3 colors") print(result.content) } catch { print("Error", error) } } I couldn't find any resource guiding me on how to solve this. Any help/workaround? Thank you!
1
4
257
1w
Safety Guardrail errors for tiny prompt (dropped into large app)
I was able to open a new project and play around with the Foundation Model, but when I dropped this class in a production app (with a lot of files) I'm running into Safety Guardrail errors for this very small prompt. Specifically it's "Safety guardrail was triggered after consecutive failures during streaming." Does it have something to do with the size of the app? I don't know what else to try to get it to work? import FoundationModels import Playgrounds @available(iOS 26.0, *) #Playground { Task { do { let session = LanguageModelSession() let prompt = "Write a short story about a talking cat." let response = try await session.respond(to: prompt) print(response) } catch { print("Error: \(error)") } } }
3
2
157
1w
macOS 26 Beta 2 - Foundation Models - Symbol not found
It seems like there was an undocumented change that made Transcript.init(entries: [Transcript.Entry] initializer private, which broke my application, which relies on (manual) reconstruction of Transcript entries. Worked fine on beta 1, on beta 2 there's this error dyld[72381]: Symbol not found: _$s16FoundationModels10TranscriptV7entriesACSayAC5EntryOG_tcfC Referenced from: <44342398-591C-3850-9889-87C9458E1440> /Users/mika/experiments/apple-on-device-ai/fm Expected in: <66A793F6-CB22-3D1D-A560-D1BD5B109B0D> /System/Library/Frameworks/FoundationModels.framework/Versions/A/FoundationModels Is this a part of an API transition, if so - Apple, please update your documentation
3
0
211
1w
visionOS 26 beta 2: Symbol Not Found on Foundation Models
When I try to run visionOS 26 beta 2 on my device the app crashes on Launch: dyld[904]: Symbol not found: _$s16FoundationModels10TranscriptV7entriesACSayAC5EntryOG_tcfC Referenced from: <A71932DD-53EB-39E2-9733-32E9D961D186> /private/var/containers/Bundle/Application/53866099-99B1-4BBD-8C94-CD022646EB5D/VisionPets.app/VisionPets.debug.dylib Expected in: <F68A7984-6B48-3958-A48D-E9F541868C62> /System/Library/Frameworks/FoundationModels.framework/FoundationModels Symbol not found: _$s16FoundationModels10TranscriptV7entriesACSayAC5EntryOG_tcfC Referenced from: <A71932DD-53EB-39E2-9733-32E9D961D186> /private/var/containers/Bundle/Application/53866099-99B1-4BBD-8C94-CD022646EB5D/VisionPets.app/VisionPets.debug.dylib Expected in: <F68A7984-6B48-3958-A48D-E9F541868C62> /System/Library/Frameworks/FoundationModels.framework/FoundationModels dyld config: DYLD_LIBRARY_PATH=/usr/lib/system/introspection DYLD_INSERT_LIBRARIES=/usr/lib/libLogRedirect.dylib:/usr/lib/libBacktraceRecording.dylib:/usr/lib/libMainThreadChecker.dylib:/usr/lib/libViewDebuggerSupport.dylib:/System/Library/PrivateFrameworks/GPUToolsCapture.framework/GPUToolsCapture Symbol not found: _$s16FoundationModels10TranscriptV7entriesACSayAC5EntryOG_tcfC Referenced from: <A71932DD-53EB-39E2-9733-32E9D961D186> /private/var/containers/Bundle/Application/53866099-99B1-4BBD-8C94-CD022646EB5D/VisionPets.app/VisionPets.debug.dylib Expected in: <F68A7984-6B48-3958-A48D-E9F541868C62> /System/Library/Frameworks/FoundationModels.framework/FoundationModels dyld config: DYLD_LIBRARY_PATH=/usr/lib/system/introspection DYLD_INSERT_LIBRARIES=/usr/lib/libLogRedirect.dylib:/usr/lib/libBacktraceRecording.dylib:/usr/lib/libMainThreadChecker.dylib:/usr/lib/libViewDebuggerSupport.dylib:/System/Library/PrivateFrameworks/GPUToolsCapture.framework/GPUToolsCapture Message from debugger: Terminated due to ****** 6
5
0
102
1w
Overly strict foundation model rate limit when used in app extension
I am calling into an app extension from a Safari Web Extension (sendNativeMessage, which in turn results in a call to NSExtensionRequestHandling’s beginRequest). My Safari extension aims to make use of the new foundation models for some of the features it provides. In my testing, I hit the rate limit by sending 4 requests, waiting 30 seconds between each. This makes the FoundationModels framework (which would otherwise serve my use case perfectly well) unusable in this context, because the model is called in response to user input, and this rate of user input is perfectly plausible in a real world scenario. The error thrown as a result of the rate limit is “Safety guardrail was triggered after consecutive failures during streaming.", but looking at the system logs in Console.app shows the rate limit as the real culprit. My suggestions: Please introduce sensible rate limits for app extensions, through an entitlement if need be. If it is rate limited to 1 request per every couple of seconds, that would already fix the issue for me. Please document the rate limit. Please make the thrown error reflect that it is the result of a rate limit and not a generic guardrail violation. IMPORTANT: please indicate in the thrown error when it is safe to try again. Filed a feedback here: FB18332004
3
1
126
1w