
Get started with Foundation Models adapter training
Teach the on-device language model new skills specific to your app by training a custom adapter. This toolkit contains a Python training workflow and utilities to package adapters for use with the Foundation Models framework.
Overview
While the on-device system language model is powerful, it may not be capable of all specialized tasks. Adapters are an advanced technique that adapt a large language model (LLM) with new skills or domains. With the adapter training toolkit, you can train adapters to specialize the on-device system LLM’s abilities, and then use your adapter with the Foundation Models framework. On this page you can download the toolkit, and learn about the full process of training and deploying custom adapters for your app.
The adapter training toolkit contains:
- Python sample code for each adapter training step
- Model assets that match a specific system model version
- Utilities to export an `.fmadapter` package
- Utilities to bundle an adapter as a background asset pack
Important
Each adapter is compatible with a single specific system model version. To support people whose devices run OS versions with different system model versions, you will need to train a separate adapter for every version of the system model.
Foundation Models Framework Adapter Entitlement
When you’re ready to deploy adapters in your app, the Account Holder of a membership in the Apple Developer Program will need to request the Foundation Models Framework Adapter Entitlement. You don’t need this entitlement to train or locally test adapters.
Download toolkit
To download any adapter training toolkit version, you’ll need to be a member of the Apple Developer Program and will first need to agree to the terms and conditions of the toolkit.
Remember you may need to download multiple toolkit versions. Each version contains the unique model assets compatible with a specific OS version range. To support people on different OS versions using your app, you must train an adapter for each version of the toolkit.
| Version | Changes | OS Compatibility |
| --- | --- | --- |
| Beta 0.1.0 | Initial release | macOS 26* |
* Custom adapter support on iOS, iPadOS, and visionOS coming soon.
When do new versions come out? A new toolkit will be released for every system model update. The system model is shared across iOS, macOS, and visionOS, and system model updates will occur as part of those platforms’ OS updates (though not every OS update will have a model update). Be sure to install and use the latest beta software releases so that you have time to train a new adapter before people start using your app with the new system model version. Additionally, with the Foundation Models Framework Adapter Entitlement, the Account Holder of your membership in the Apple Developer Program will get an email update when a new toolkit version is available. Otherwise, when a new beta comes out, check here for any new toolkit versions.
How to train adapters
This guide provides a conceptual walkthrough of the steps to train an adapter. Each toolkit version also includes a sample code end-to-end Jupyter notebook in `./examples`.
Requirements
- A Mac with Apple silicon, or a Linux machine with GPUs
- Python 3.11 or later
1. When to consider an adapter
Adapters are an effective way to teach the model specialized tasks, but they have steep requirements to train (and re-train for OS updates), so adapters aren’t suitable for all situations. Before considering adapters, try to get the most out of the system model using prompt engineering or tool calling. With the Foundation Models framework, tool calling is an effective way to give the system model access to outside knowledge sources or services.
Adapter training is worth considering if you have a dataset suitable for use with an LLM, or if your app is already using a fine-tuned server-based LLM and you want to try replicating that functionality with the on-device LLM for reduced costs. Other reasons to use an adapter include:
- You need the model to become a subject-matter expert.
- You need the model to adhere to a specific style, format, or policy.
- Prompt engineering isn’t achieving the required accuracy or consistency for your task.
- You want lower latency at inference. If your prompt-engineered solution requires lengthy prompts with examples on every call, an adapter specialized for that task needs only minimal prompting.
Take into consideration that you will need:
- A dataset of prompt and response pairs that demonstrate your target skill
- A process for evaluating the quality of your adapters
- A process to load your adapters into your app from a server
Each adapter will take approximately 160 MB of storage space in your app. Like other big assets, adapters shouldn’t be part of your app’s main bundle because with multiple adapter versions your app will become too big for people to install. Instead, host your adapters on a server so that each person using your app can download just one adapter compatible with their device. For more on how, see Bundle adapters as asset packs below.
2. Set up virtual environment
Once you’ve downloaded the toolkit, it’s recommended to set up a Python virtual environment, using a Python environment manager like conda or venv:
conda create -n adapter-training python=3.11
conda activate adapter-training
cd /path/to/toolkit
3. Install dependencies
Next, use pip to install all the packages required by the toolkit:
pip install -r requirements.txt
Finally, start running the toolkit’s walkthrough Jupyter notebook to finish setup:
jupyter notebook ./examples/end_to_end_example.ipynb
4. Test generation
Verify your setup is ready by loading and running inference with the system base model assets in the `assets` folder. The Jupyter notebook in `examples` demonstrates how to run inference, or you can run `examples/generate.py` from the command line:
python -m examples.generate --prompt "Prompt here"
Note
Toolkit model assets include base model weights optimized for efficient adapter training. You are only permitted to use these assets for training adapters. The behavior of the toolkit base model may not match the performance of the Foundation Models framework or other features, since it has no adapters.
5. Prepare a dataset
To train an adapter, you’ll need to prepare a dataset in the `jsonl` format expected by the model. As a rough estimate of how much data you’ll need, consider:
- 100 to 1,000 samples to teach the model basic tasks
- 5,000+ samples to teach the model complex tasks
The full expected data schema, including special fields you need to support guided generation and improve AI safety, can be found in the toolkit in `Schema.md`. The most basic schema is a list of prompt and response pairs:
[{"role": "user", "content": "PROMPT"}, {"role": "assistant", "content": "RESPONSE"}]
Here `"role"` identifies who is providing the content. The role `"user"` can refer to any entity providing the input prompt, such as you the developer, people using your app, or a mix of sources. The role `"assistant"` always refers to the model. Replace the `"content"` values above with your prompt and response, which can be text written in any language supported by Apple Intelligence.
Utilities to help you prepare your data, including options for specifying language and locale, can be found in `examples/data.py`.
After formatting, split your data into train and eval sets. The train set is used to optimize the adapter parameters during training. The eval set is used to monitor performance during training, such as identifying overfitting, and provides feedback to help you tune hyperparameters.
Tip
Focus on quality over quantity. A smaller dataset of clear, consistent, and well-structured samples may be more effective than a larger dataset of noisy, low-quality samples.
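As a concrete sketch of the preparation step, the snippet below writes samples in the basic schema shown earlier and splits them 90/10 into train and eval files. The sample contents, filenames, and split ratio are illustrative choices, not toolkit requirements:

```python
import json
import random

# Illustrative samples in the basic schema: each line of the jsonl file is
# one sample, a list of a user prompt and an assistant response.
samples = [
    [
        {"role": "user", "content": f"Summarize ticket #{i}"},
        {"role": "assistant", "content": f"Summary of ticket #{i}"},
    ]
    for i in range(100)
]

# Shuffle, then hold out 10% of samples for the eval set.
random.seed(0)
random.shuffle(samples)
split = int(len(samples) * 0.9)

for path, subset in [("train.jsonl", samples[:split]), ("valid.jsonl", samples[split:])]:
    with open(path, "w") as f:
        for sample in subset:
            f.write(json.dumps(sample) + "\n")
```

Keeping the shuffle seeded makes the split reproducible, so later training runs compare against the same eval set.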
6. Start adapter training
Adapter training is faster and less memory-intensive than fine-tuning an entire large language model. This is because the system model uses a parameter-efficient fine-tuning (PEFT) approach known as LoRA (Low-Rank Adaptation). In LoRA, the original model weights are frozen, and small trainable weight matrices called “adapters” are embedded through the model’s network. During training, only adapter weights are updated, significantly reducing the number of parameters to train. This approach also allows the base system model to be shared across many different use cases and apps that can each have a specialized adapter.
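The LoRA idea above can be sketched numerically. This is a toy illustration (not the toolkit's implementation): a frozen weight matrix `W` gains a low-rank correction `B @ A`, and only the small matrices `A` and `B` would be trained:

```python
import random

# Toy LoRA sketch: hidden size d, and a much smaller adapter rank r.
d, r = 8, 2

random.seed(0)
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]  # frozen base weight
A = [[random.gauss(0, 1) for _ in range(d)] for _ in range(r)]  # trainable "down" projection
B = [[0.0] * r for _ in range(d)]                               # trainable "up" projection, zero init

def matvec(M, x):
    # Plain matrix-vector product.
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def adapted_forward(x):
    # Base output plus the low-rank correction: W x + B (A x).
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + dlt for b, dlt in zip(base, delta)]

# With B initialized to zero, the adapted layer starts out identical to the
# base layer; training would update only 2*r*d = 32 adapter parameters
# instead of all d*d = 64 base parameters (the gap grows with model size).
trainable_params = r * d + d * r
frozen_params = d * d
```

At realistic model sizes the same ratio is what makes adapter training tractable: the trainable parameter count scales with `r`, not with the full weight dimensions.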
Start training by running the walkthrough Jupyter notebook in `examples`, or the sample code in `examples/train_adapter.py`. You can modify and customize the training sample code to meet your use case’s needs. For convenience, `examples/train_adapter.py` can be run from the command line:
python -m examples.train_adapter \
--train-data /path/to/train.jsonl \
--eval-data /path/to/valid.jsonl \
--epochs 5 \
--learning-rate 1e-3 \
--batch-size 4 \
--checkpoint-dir /path/to/my_checkpoints/
Use the data you prepared for `train-data` and `eval-data`. The additional training arguments are:
- `epochs` is the number of training iterations. More epochs will take longer, but may improve your adapter’s quality.
- `learning-rate` is a floating-point number indicating how much to adjust the model’s parameters at each step. Tailor it to your specific use case.
- `batch-size` is the number of examples in a single training step. Choose a batch size based on the machine you’re running the training process on.
- `checkpoint-dir` is a folder you create so that the training process can save checkpoints of your adapter as it trains.
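Some back-of-envelope arithmetic helps when choosing these values. With illustrative numbers (not toolkit requirements), the epoch and batch-size arguments determine the total number of optimizer steps:

```python
import math

# Illustrative numbers: 5,000 training samples, the command-line values above.
num_samples = 5000   # lines in train.jsonl
batch_size = 4
epochs = 5

# Each epoch is one pass over the data in batches of batch_size.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * epochs
```

Here `steps_per_epoch` is 1,250 and `total_steps` is 6,250; doubling the batch size halves the step count per epoch but raises memory use per step.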
During and after training, you can compare your adapter’s checkpoints to pick the one that best meets your quality goals. Checkpoints are also handy for resuming training in case the process fails midway, or you decide to train again for a few more epochs.
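Checkpoint comparison can be as simple as tracking eval loss per checkpoint. A minimal sketch, with made-up filenames and loss values:

```python
# Hypothetical eval losses recorded for each saved checkpoint; the filenames
# and numbers here are made up for illustration.
eval_loss_by_checkpoint = {
    "adapter-epoch-1.pt": 1.92,
    "adapter-epoch-2.pt": 1.41,
    "adapter-epoch-3.pt": 1.38,
    "adapter-epoch-4.pt": 1.45,  # loss rising again may signal overfitting
}

# Pick the checkpoint with the lowest eval loss.
best_checkpoint = min(eval_loss_by_checkpoint, key=eval_loss_by_checkpoint.get)
```

In this made-up run, the epoch-3 checkpoint would be the one to carry forward, since eval loss starts climbing afterward.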
7. Optionally train the draft model
After training an adapter, you can train a matching draft model. Each toolkit includes assets for the system draft model, which is a small version of the system base model that can speed up inference via a technique called speculative decoding. Training the draft model is very similar to training an adapter, with some additional metrics so that you can measure how much your draft model speeds up inference. This step is optional. If you choose not to train the draft model, speculative decoding will not be available for your adapter’s use case. For more details on how draft models work, please refer to the papers Leviathan et al., 2022 (arXiv:2211.17192) and Chen et al., 2023 (arXiv:2302.01318).
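The speculative decoding idea can be illustrated with a toy sketch. This is purely illustrative arithmetic (the real system operates on language model token distributions): a cheap draft model proposes several tokens, the target model verifies them, and the longest agreeing prefix is kept, so multiple tokens can be produced for roughly the cost of one target-model pass:

```python
def target_next(prefix):
    # Stand-in for the expensive target model's next-token choice.
    return (sum(prefix) + 1) % 7

def draft_next(prefix):
    # Stand-in for the cheap draft model: usually agrees, sometimes wrong.
    guess = (sum(prefix) + 1) % 7
    return guess if sum(prefix) % 3 != 0 else 0

def speculative_step(prefix, k=4):
    # 1. The draft model proposes k tokens autoregressively.
    cur = list(prefix)
    proposal = []
    for _ in range(k):
        token = draft_next(cur)
        proposal.append(token)
        cur.append(token)
    # 2. The target model verifies: keep tokens until the first disagreement,
    #    then emit the target's own token at that position.
    cur = list(prefix)
    accepted = []
    for token in proposal:
        if target_next(cur) != token:
            break
        accepted.append(token)
        cur.append(token)
    accepted.append(target_next(cur))
    return accepted
```

When the draft agrees often, each step emits several tokens; when it disagrees immediately, the step still emits one target token, so output quality is unchanged and only speed varies. That is why a poorly trained draft model yields little speedup.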
Just like adapter training, you can train using the `examples` Jupyter notebook, or by running the sample code in `examples/train_draft_model.py` from the command line:
python -m examples.train_draft_model \
--base-checkpoint /path/to/my_checkpoints/base-model-final.pt \
--train-data /path/to/train.jsonl \
--eval-data /path/to/valid.jsonl \
--epochs 5 \
--learning-rate 1e-3 \
--batch-size 4 \
--checkpoint-dir /path/to/my_checkpoints/
Training arguments are the same as for adapter training, except:
- `base-checkpoint` is the base model checkpoint, produced by adapter training, that the draft model is trained against. Choose the checkpoint you intend to export for your adapter.
- `checkpoint-dir` is where you’d like your draft model checkpoints saved.
After you train the draft model, if you’re not seeing much inference speedup, try experimenting with retraining the draft model using different hyper-parameters, more epochs, or alternative data to improve performance.
8. Evaluate adapter quality
Congratulations, you’ve trained an adapter! After training, you will need to evaluate how well your adapter has improved the system model’s behavior for your specific use case. Since each adapter is specialized, evaluation needs to be a custom process that makes sense for your specific use case. Typically, adapters are evaluated by both quantitative metrics, such as match to a target dataset, and qualitative metrics, such as human grading or auto-grading by a larger server-based LLM. You will want to come up with a standardized eval process, so that you can evaluate each of your adapters for each model version, and ensure they all meet your performance goals. Be sure to also evaluate your adapter for AI safety.
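As one example of a quantitative metric, a simple exact-match rate against a held-out reference set can be computed in a few lines. The function name and sample data below are illustrative:

```python
def exact_match_rate(predictions, references):
    # Fraction of adapter responses that exactly match the reference answer,
    # after normalizing whitespace and case.
    matches = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return matches / len(references)

# Made-up responses: two of three match their references.
predictions = ["Paris", "blue ", "42"]
references = ["paris", "green", "42"]
score = exact_match_rate(predictions, references)
```

Exact match suits tasks with one correct answer; for open-ended generations, pair it with qualitative grading as described above.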
To start running inference with your new adapter, see the walkthrough Jupyter notebook, or call the sample code `examples/generate.py` from the command line:
python -m examples.generate \
--prompt "Your prompt here" \
--base-checkpoint /path/to/my_checkpoints/base-model-final.pt \
--draft-checkpoint /path/to/my_checkpoints/draft-model-final.pt
Include the `--draft-checkpoint` argument only if you trained a draft model.
9. Export adapter
When you’re ready to export, the toolkit includes utility functions to export your adapter in the `.fmadapter` package format that Xcode and the Foundation Models framework expect. Unlike the customizable training sample code, the code in the `export` folder should not be modified, since the export logic must match exactly to make your adapter compatible with the system model and Xcode.
Export is covered in the walkthrough Jupyter notebook in `examples`, and the export utility can be run from the command line:
python -m export.export_fmadapter \
--adapter-name my_adapter \
--base-checkpoint /path/to/my_checkpoints/base-model-final.pt \
--draft-checkpoint /path/to/my_checkpoints/draft-model-final.pt \
--output-dir /path/to/my_exports/
If you trained the draft model, the `--draft-checkpoint` argument bundles your draft model checkpoint as part of the `.fmadapter` package. Omit this argument otherwise.
Now that you have `my_adapter.fmadapter`, you’re ready to start using your custom adapter with the Foundation Models framework!
How to deploy adapters
This guide covers how to try out custom adapters in your app with the Foundation Models framework, and then the longer process of preparing your adapters for deployment. Since each adapter file is large (160 MB+), you’ll set up your app to download the correct adapter for each person’s device.
Requirements
- A trained adapter in `.fmadapter` format
- Xcode 26 or later
- Foundation Models Framework Adapter Entitlement
- Optional: a server to host your adapters
1. Try out a local adapter in Xcode
You can try custom adapters in your app locally, before going through the full deployment process. First, store your `.fmadapter` files locally on the Mac where you run Xcode, in a different directory than your app. You can open `.fmadapter` files in Xcode to preview the adapter’s metadata and version compatibility. If you’ve trained multiple adapters, find the adapter compatible with the macOS version of the Mac you’re running Xcode on. Select the compatible adapter file in Finder, and press ⌥⌘C to copy its full file path to your Mac clipboard.
Next, open your app in Xcode. With the Foundation Models framework, initialize a `SystemLanguageModel.Adapter` from a local URL using the full file path from your clipboard:
let localURL = URL(filePath: "absolute/path/to/my_adapter.fmadapter")
let adapter = try SystemLanguageModel.Adapter(fileURL: localURL)
Then initialize a `SystemLanguageModel` with your adapter, and a `LanguageModelSession` with the adapted model. You’re ready to try out prompts using your adapter:
let adaptedModel = SystemLanguageModel(adapter: adapter)
let session = LanguageModelSession(model: adaptedModel)
let response = try await session.respond(to: "Your prompt here")
Important
Don’t include `.fmadapter` files in your app’s Xcode target, since adapter files are too big to include as normal assets. If you do import or drag-and-drop an adapter file into your Xcode project for local testing, be sure to remove the file before you publish your app.
2. Bundle adapters as asset packs
Since each person using your app will just need a single specific adapter compatible with their device, the recommended approach is to host your adapter assets on a server, and use the Background Assets framework to manage downloads. For hosting your adapter assets, you can use your own server or choose to have Apple host your adapter assets. To understand your options, read about Apple-Hosted Background Assets in the App Store Connect documentation.
The Background Assets framework has a type of asset pack specific to Foundation Models adapters. To bundle your adapters in the asset pack format, return to the adapter training toolkit. The recommended way to produce an asset pack is using Python, but note that the asset pack code needs the `ba-package` command-line tool, a utility included with Xcode. If you trained your adapters on a Linux GPU machine, at this point follow the setup steps of How to train adapters to set up a Python environment on your Mac just for this bundling step. If you trained your adapters on a Mac, you have everything you need.
In the adapter training toolkit, see the walkthrough Jupyter notebook in `examples`, or see the full adapter asset pack bundling code in `export/produce_asset_pack.py`.
3. Upload your asset packs to a server
Once you’ve generated asset packs for each adapter, upload your asset packs to your chosen server. If Apple is hosting your adapters, follow the instructions in Upload Apple-hosted asset packs in the App Store Connect documentation.
4. Configure an asset download target in Xcode
Open your app’s Xcode project. To download adapters at runtime, your project will need a special asset downloader extension target. In Xcode, choose File > New > Target... and, in the sheet that appears, choose the Background Download template under the Application Extension section. Click Next. A dialog will appear that asks for a "Product Name" and "Extension Type". For the product name, choose a descriptive name like "AssetDownloader". For the extension type choose:
- Apple-Hosted, Managed if Apple is hosting your adapters (simplest option).
- Self-hosted, Managed if you are using your own server but would like a person’s OS to automatically handle the download lifecycle (simple option).
- Self-hosted, Unmanaged if you are using your own server and want to code a custom download lifecycle (advanced option).
Click Finish. With your new downloader target, follow the instructions in the article Configuring your Background Assets project so that your extension and app can work together. Afterwards, check that your app target’s information property list (or `Info.plist` file) contains these additional required fields specific to your extension type:
- Apple-Hosted, Managed:
  - `BAHasManagedAssetPacks` = yes
  - `BAAppGroupID` = the app group where your assets are hosted
  - `BAUsesAppleHosting` = yes
- Self-hosted, Managed:
  - `BAHasManagedAssetPacks` = yes
  - `BAAppGroupID` = the app group where your assets are hosted
  - `BAManifestURL` = the URL of the manifest file for your adapter asset on your server
  - `BAInitialDownloadRestrictions` = a dictionary-type field containing a key `BADownloadDomainAllowList` with an array value. Set `Item 0` to your server’s location in DNS format.
- Self-hosted, Unmanaged:
  - `BAHasManagedAssetPacks` = no
  - `BAManifestURL` = the URL of the manifest file for your adapter asset on your server
  - `BAInitialDownloadRestrictions` = a dictionary-type field containing a key `BADownloadDomainAllowList` with an array value. Set `Item 0` to your server’s location in DNS format.
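As a sanity check on these keys, the Self-hosted, Managed fields can be written with Python’s standard `plistlib` to confirm they form a valid property list. The app group, manifest URL, and domain below are placeholder values; substitute your own:

```python
import plistlib

# Placeholder values for the Self-hosted, Managed configuration.
info = {
    "BAHasManagedAssetPacks": True,
    "BAAppGroupID": "group.com.example.myapp",                    # hypothetical app group
    "BAManifestURL": "https://assets.example.com/manifest.json",  # hypothetical URL
    "BAInitialDownloadRestrictions": {
        "BADownloadDomainAllowList": ["assets.example.com"],      # hypothetical domain
    },
}

plist_bytes = plistlib.dumps(info)        # serialize as an XML property list
round_trip = plistlib.loads(plist_bytes)  # parse it back to verify
```

In your actual project you would enter these fields through Xcode’s property list editor rather than generating the file by hand; the point here is only the shape of the nested dictionary and array.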
Finally, it’s time to get coding!
5. Choose a compatible adapter at runtime
In your downloader target, open the generated code file `BackgroundDownloadHandler.swift`. This file is what the Background Assets framework will use to download your adapters. Fill it out based on your extension type.
- Apple-Hosted, Managed or Self-hosted, Managed:
You will see a single function, `shouldDownload`. Fill this function in with the following code to choose an adapter asset compatible with the runtime device:
func shouldDownload(_ assetPack: AssetPack) -> Bool {
    // Check for any non-adapter assets your app has, such as shaders.
    // Return false to filter out asset packs you don't want downloaded.
    if assetPack.id.hasPrefix("mygameshader") {
        return true
    }
    return SystemLanguageModel.Adapter.isCompatible(assetPack)
}
- Self-hosted, Unmanaged:
You will see many different functions, for manual fine-grained control over the download lifecycle of your assets. Refer to the Background Assets framework documentation for instructions.
6. Load adapter assets in your app
Once you’ve finished your downloader extension, return to your app target and choose any Swift file to start loading adapters. First, clean up any outdated adapters that may be on a person’s device:
SystemLanguageModel.Adapter.removeObsoleteAdapters()
Next, initialize a `SystemLanguageModel.Adapter` using your adapter’s base name (without a file extension). If a person’s device doesn’t have your adapter downloaded yet, your downloader app extension will start downloading an adapter asset pack compatible with the current device:
let adapter = try SystemLanguageModel.Adapter(name: "myAdapter")
If a person has just installed your app, or their device needs an updated adapter, the download will start. Since adapters can be approximately 160 MB, expect some download time and design your app accordingly. In particular, if a person using your app isn’t connected to a network, they won’t be able to use your adapter right away. To get the download status of an adapter:
let assetPackIDList = SystemLanguageModel.Adapter.compatibleAdapterIdentifiers(name: "myAdapter")
if let assetPackID = assetPackIDList.first {
    let downloadStatus = AssetPackManager.shared.status(ofAssetPackID: assetPackID)
}
The download status `DownloadStatusUpdate` is an enum that can be used to show a progress UI if your app is waiting on the download to complete. If no download was needed, the status will be `finished` immediately. If a download was needed, wait for the status to become `finished`.
At last, it’s time to initialize a system language model with your adapter and get prompting:
let adaptedModel = SystemLanguageModel(adapter: adapter)
let session = LanguageModelSession(model: adaptedModel)
let response = try await session.respond(to: "Your prompt here")