Get started with Foundation Models adapter training

Teach the on-device language model new skills specific to your app by training a custom adapter. This toolkit contains a Python training workflow and utilities to package adapters for use with the Foundation Models framework.

    Overview

    While the on-device system language model is powerful, it may not be capable of every specialized task. Adapters are an advanced technique for teaching a large language model (LLM) new skills or domains. With the adapter training toolkit, you can train adapters to specialize the on-device system LLM’s abilities, and then use your adapter with the Foundation Models framework. On this page you can download the toolkit and learn about the full process of training and deploying custom adapters for your app.

    The adapter training toolkit contains:

    • Python sample code for each adapter training step
    • Model assets that match a specific system model version
    • Utilities to export an .fmadapter package
    • Utilities to bundle an adapter as a background asset pack

    Foundation Models Framework Adapter Entitlement

    When you’re ready to deploy adapters in your app, the Account Holder of a membership in the Apple Developer Program will need to request the Foundation Models Framework Adapter Entitlement. You don’t need this entitlement to train or locally test adapters.

    Get entitlement

    Download toolkit

    To download any version of the adapter training toolkit, you’ll need to be a member of the Apple Developer Program and agree to the toolkit’s terms and conditions.

    Get toolkit

    Remember that you may need to download multiple toolkit versions. Each version contains unique model assets compatible with a specific OS version range. To support people using your app on different OS versions, you must train an adapter with each version of the toolkit.

    Version      Changes          OS Compatibility
    Beta 0.1.0   Initial release  macOS 26*

    * Custom adapter support on iOS, iPadOS, and visionOS coming soon.

    When do new versions come out? A new toolkit will be released for every system model update. The system model is shared across iOS, macOS, and visionOS, and system model updates will occur as part of those platforms’ OS updates (though not every OS update will have a model update).

    Be sure to install and use the latest beta software releases so that you have time to train a new adapter before people start using your app with the new system model version. Additionally, with the Foundation Models Framework Adapter Entitlement, the Account Holder of your membership in the Apple Developer Program will get an email update when a new toolkit version is available. Otherwise, when a new beta comes out, check here for any new toolkit versions.

    How to train adapters

    This guide provides a conceptual walkthrough of the steps to train an adapter. Each toolkit version also includes an end-to-end sample code Jupyter notebook in ./examples.

    Requirements

    • A Mac with Apple silicon, or a Linux machine with a GPU
    • Python 3.11 or later

    1. When to consider an adapter

    Adapters are an effective way to teach the model specialized tasks, but they have steep requirements to train (and re-train for OS updates), so adapters aren’t suitable for all situations. Before considering adapters, try to get the most out of the system model using prompt engineering or tool calling. With the Foundation Models framework, tool calling is an effective way to give the system model access to outside knowledge sources or services.

    Adapter training is worth considering if you have a dataset suitable for use with an LLM, or if your app is already using a fine-tuned server-based LLM and you want to try replicating that functionality with the on-device LLM for reduced costs. Other reasons to use an adapter include:

    • You need the model to become a subject-matter expert.
    • You need the model to adhere to a specific style, format, or policy.
    • Prompt engineering isn’t achieving the required accuracy or consistency for your task.
    • You want lower latency at inference. If your prompt-engineered solution requires lengthy prompts with examples for every call, an adapter specialized for that task needs only minimal prompting.

    Take into consideration that you will need:

    • A dataset of prompt and response pairs that demonstrate your target skill
    • A process for evaluating the quality of your adapters
    • A process to load your adapters into your app from a server

    Each adapter takes approximately 160 MB of storage space in your app. Like other large assets, adapters shouldn’t be part of your app’s main bundle, because with multiple adapter versions your app would become too big for people to install. Instead, host your adapters on a server so that each person using your app can download just the one adapter compatible with their device. For more on how, see Bundle adapters as asset packs below.

    2. Set up virtual environment

    Once you’ve downloaded the toolkit, set up a Python virtual environment using an environment manager like conda or venv. For example, with conda:

    conda create -n adapter-training python=3.11
    conda activate adapter-training
    cd /path/to/toolkit

    3. Install dependencies

    Next, use pip to install all the packages required by the toolkit:

    pip install -r requirements.txt

    Finally, start running the toolkit’s walkthrough Jupyter notebook to finish setup:

    jupyter notebook ./examples/end_to_end_example.ipynb

    4. Test generation

    Verify that your setup is ready by loading and running inference with the system base model assets in the assets folder. The Jupyter notebook in examples demonstrates how to run inference, or you can run examples/generate.py from the command line:

    python -m examples.generate --prompt "Prompt here"

    5. Prepare a dataset

    To train an adapter, you’ll need to prepare a dataset in the jsonl format expected by the model. As a rough estimate of how much data you’ll need, consider:

    • 100 to 1,000 samples to teach the model basic tasks
    • 5,000+ samples to teach the model complex tasks

    The full expected data schema, including special fields you need to support guided generation and improve AI safety, can be found in the toolkit in Schema.md. The most basic schema is a list of prompt and response pairs:

    [{"role": "user", "content": "PROMPT"}, {"role": "assistant", "content": "RESPONSE"}]

    Here "role" identifies who is providing the content. The role "user" can refer to any entity providing the input prompt, such as you the developer, people using your app, or a mix of sources. The role "assistant" always refers to the model. Replace the "content" values above with your prompt and response, which can be text written in any language supported by Apple Intelligence.
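
    For example, one line of a dataset for a hypothetical recipe-editing skill might look like this (the content values are purely illustrative):

    [{"role": "user", "content": "Rewrite this recipe step to be concise: Take your onions and chop them all up into very small pieces."}, {"role": "assistant", "content": "Finely dice the onions."}]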

    Utilities to help you prepare your data, including options for specifying language and locale, can be found in examples/data.py.

    After formatting, split your data into train and eval sets. The train set is used to optimize the adapter parameters during training. The eval set is used to monitor performance during training, such as identifying overfitting, and provides feedback to help you tune hyper-parameters.
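
    As a minimal sketch of this split, assuming your formatted samples live in a single dataset.jsonl file (the 90/10 ratio is an illustrative choice; the train.jsonl and valid.jsonl names match the training command in the next step):

    import random

    random.seed(0)  # Make the shuffle reproducible.
    with open("dataset.jsonl") as f:
        samples = f.readlines()
    random.shuffle(samples)

    cut = int(len(samples) * 0.9)  # For example, 90% train / 10% eval.
    with open("train.jsonl", "w") as f:
        f.writelines(samples[:cut])
    with open("valid.jsonl", "w") as f:
        f.writelines(samples[cut:])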

    6. Start adapter training

    Adapter training is faster and less memory-intensive than fine-tuning an entire large language model. This is because the system model uses a parameter-efficient fine-tuning (PEFT) approach known as LoRA (Low-Rank Adaptation). In LoRA, the original model weights are frozen, and small trainable weight matrices called “adapters” are embedded throughout the model’s network. During training, only adapter weights are updated, significantly reducing the number of parameters to train. This approach also allows the base system model to be shared across many different use cases and apps that can each have a specialized adapter.
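
    To build intuition for what LoRA trains, here’s a minimal NumPy sketch of the idea. The dimensions, rank, and scaling below are illustrative assumptions, not the system model’s actual configuration:

    import numpy as np

    rng = np.random.default_rng(0)
    d_out, d_in, rank, alpha = 512, 512, 8, 16  # Illustrative sizes only.

    W = rng.standard_normal((d_out, d_in))        # Frozen base weights.
    A = rng.standard_normal((rank, d_in)) * 0.01  # Trainable low-rank factor.
    B = np.zeros((d_out, rank))                   # Trainable factor, starts at zero.

    def adapted_forward(x):
        # Only A and B are updated during training; W stays frozen, so the
        # adapter adds just rank * (d_in + d_out) trainable parameters.
        return W @ x + (alpha / rank) * (B @ (A @ x))

    y = adapted_forward(rng.standard_normal(d_in))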

    Start training by running the walkthrough Jupyter notebook in examples, or the sample code in examples/train_adapter.py. You can modify and customize the training sample code to meet your use case’s needs. For convenience, examples/train_adapter.py can be run from the command line:

    python -m examples.train_adapter \
    --train-data /path/to/train.jsonl \
    --eval-data /path/to/valid.jsonl \
    --epochs 5 \
    --learning-rate 1e-3 \
    --batch-size 4 \
    --checkpoint-dir /path/to/my_checkpoints/

    Use the data you prepared for train-data and eval-data. The additional training arguments are:

    • epochs is the number of full passes through the training data. More epochs will take longer, but may improve your adapter’s quality.
    • learning-rate is a floating-point number indicating how much to adjust the model’s parameters at each step. Tailor it to your specific use case.
    • batch-size is the number of examples in a single training step. Choose a batch size based on the machine you’re running the training process on.
    • checkpoint-dir is a folder you create so that the training process can save checkpoints of your adapter as it trains.

    During and after training, you can compare your adapter’s checkpoints to pick the one that best meets your quality goals. Checkpoints are also handy for resuming training in case the process fails midway, or you decide to train again for a few more epochs.

    7. Optionally train the draft model

    After training an adapter, you can train a matching draft model. Each toolkit version includes assets for the system draft model, a smaller version of the system base model that can speed up inference via a technique called speculative decoding. Training the draft model is very similar to training an adapter, with some additional metrics so that you can measure how much your draft model speeds up inference. This step is optional. If you choose not to train the draft model, speculative decoding won’t be available for your adapter’s use case. For more details on how draft models work, refer to Leviathan et al., 2022 (arXiv:2211.17192) and Chen et al., 2023 (arXiv:2302.01318).
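
    To see why a draft model helps, here is a schematic Python sketch of greedy speculative decoding. The draft_next and target_next callables are hypothetical stand-ins for the two models; a real implementation verifies all proposed tokens in a single batched forward pass of the target model, which is where the speedup comes from:

    from typing import Callable

    def speculative_decode(
        draft_next: Callable[[list[int]], int],   # Hypothetical: draft model's greedy next token.
        target_next: Callable[[list[int]], int],  # Hypothetical: target model's greedy next token.
        prompt: list[int],
        max_new_tokens: int = 32,
        k: int = 4,
    ) -> list[int]:
        tokens = list(prompt)
        while len(tokens) - len(prompt) < max_new_tokens:
            # 1. The cheap draft model proposes k tokens autoregressively.
            proposal, context = [], list(tokens)
            for _ in range(k):
                token = draft_next(context)
                proposal.append(token)
                context.append(token)
            # 2. The target model verifies the proposals; keep the longest
            #    prefix it agrees with.
            accepted = 0
            for i, token in enumerate(proposal):
                if target_next(tokens + proposal[:i]) == token:
                    accepted += 1
                else:
                    break
            tokens += proposal[:accepted]
            # 3. On a rejection, fall back to one token from the target model.
            if accepted < k:
                tokens.append(target_next(tokens))
        return tokens[: len(prompt) + max_new_tokens]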

    Just like adapter training, you can train using the Jupyter notebook in examples, or by running the sample code in examples/train_draft_model.py from the command line:

    python -m examples.train_draft_model \
    --base-checkpoint /path/to/my_checkpoints/base-model-final.pt \
    --train-data /path/to/train.jsonl \
    --eval-data /path/to/valid.jsonl \
    --epochs 5 \
    --learning-rate 1e-3 \
    --batch-size 4 \
    --checkpoint-dir /path/to/my_checkpoints/

    Training arguments are the same as for adapter training, except for:

    • base-checkpoint is the adapter-trained base model checkpoint to use as the target for draft model training. Choose the checkpoint you intend to export for your adapter.
    • checkpoint-dir is where you’d like your draft model checkpoints saved.

    After you train the draft model, if you’re not seeing much inference speedup, try retraining the draft model with different hyper-parameters, more epochs, or alternative data to improve performance.

    8. Evaluate adapter quality

    Congratulations, you’ve trained an adapter! After training, you will need to evaluate how well your adapter has improved the system model’s behavior for your specific use case. Since each adapter is specialized, evaluation needs to be a custom process that makes sense for your specific use case. Typically, adapters are evaluated by both quantitative metrics, such as match to a target dataset, and qualitative metrics, such as human grading or auto-grading by a larger server-based LLM. You will want to come up with a standardized eval process, so that you can evaluate each of your adapters for each model version, and ensure they all meet your performance goals. Be sure to also evaluate your adapter for AI safety.
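
    As one example of a quantitative metric, here’s a minimal sketch that scores exact matches against your eval set. The generate callable is a hypothetical stand-in for however you run inference with your adapter (for example, by adapting examples/generate.py):

    import json

    def exact_match_rate(eval_path, generate):
        matches = total = 0
        with open(eval_path) as f:
            for line in f:
                turns = json.loads(line)
                prompt = next(t["content"] for t in turns if t["role"] == "user")
                expected = next(t["content"] for t in turns if t["role"] == "assistant")
                if generate(prompt).strip() == expected.strip():
                    matches += 1
                total += 1
        return matches / max(total, 1)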

    To start running inference with your new adapter, see the walkthrough Jupyter notebook, or run the sample code examples/generate.py from the command line:

    python -m examples.generate \
    --prompt "Your prompt here" \
    --base-checkpoint /path/to/my_checkpoints/base-model-final.pt \
    --draft-checkpoint /path/to/my_checkpoints/draft-model-final.pt

    Include the --draft-checkpoint argument only if you trained a draft model.

    9. Export adapter

    When you’re ready to export, the toolkit includes utility functions to export your adapter in the .fmadapter package format that Xcode and the Foundation Models framework expect. Unlike the customizable sample code for training, code in the export folder should not be modified, since the export logic must match exactly for your adapter to be compatible with the system model and Xcode.

    Export is covered in the walkthrough Jupyter notebook in examples, and the export utility can be run from the command line:

    python -m export.export_fmadapter \
    --adapter-name my_adapter \
    --base-checkpoint /path/to/my_checkpoints/base-model-final.pt \
    --draft-checkpoint /path/to/my_checkpoints/draft-model-final.pt \
    --output-dir /path/to/my_exports/

    If you trained the draft model, the --draft-checkpoint argument bundles your draft model checkpoint as part of the .fmadapter package. Exclude this argument otherwise.

    Now that you have my_adapter.fmadapter, you’re ready to start using your custom adapter with the Foundation Models framework!

    How to deploy adapters

    This guide covers how to try out custom adapters in your app with the Foundation Models framework, and then the longer process of preparing your adapters for deployment. Since each adapter file is large (160 MB+), you’ll set up your app to download the correct adapter for each person’s device.

    Requirements

    1. Try out a local adapter in Xcode

    You can try custom adapters in your app locally before going through the full deployment process. First, store your .fmadapter files on the Mac where you run Xcode, in a directory outside your app’s project. You can open .fmadapter files in Xcode to preview an adapter’s metadata and version compatibility. If you’ve trained multiple adapters, find the adapter compatible with the macOS version of the Mac you’re running Xcode on. Select the compatible adapter file in Finder, and use ⌥⌘C to copy its full file path to the clipboard.

    Next, open your app in Xcode. With the Foundation Models framework, initialize a SystemLanguageModel.Adapter from a local URL using the full file path from your clipboard:

    let localURL = URL(filePath: "/absolute/path/to/my_adapter.fmadapter")
    let adapter = try SystemLanguageModel.Adapter(fileURL: localURL)

    Then initialize a SystemLanguageModel with your adapter, and a LanguageModelSession with the adapted model. You’re ready to try out prompts using your adapter:

    let adaptedModel = SystemLanguageModel(adapter: adapter)
    let session = LanguageModelSession(model: adaptedModel)
    let response = try await session.respond(to: "Your prompt here")

    2. Bundle adapters as asset packs

    Since each person using your app needs just a single adapter compatible with their device, the recommended approach is to host your adapter assets on a server, and use the Background Assets framework to manage downloads. You can host the assets on your own server, or have Apple host them. To understand your options, read about Apple-Hosted Background Assets in the App Store Connect documentation.

    The Background Assets framework has a type of asset pack specific to Foundation Models adapters. To bundle your adapters in the asset pack format, return to the adapter training toolkit. The recommended way to produce an asset pack is with Python, but note that the asset pack code needs the ba-package command line tool, a utility included with Xcode. If you trained your adapters on a Linux GPU machine, follow the setup steps in How to train adapters to create a Python environment on your Mac just for this bundling step. If you trained your adapters on a Mac, you have everything you need.

    In the adapter training toolkit, see the walkthrough Jupyter notebook in examples, or the full adapter asset pack bundling code in export/produce_asset_pack.py.

    3. Upload your asset packs to a server

    Once you’ve generated asset packs for each adapter, upload them to your chosen server. If Apple is hosting your adapters, follow the instructions in Upload Apple-hosted asset packs in the App Store Connect documentation.

    4. Configure an asset download target in Xcode

    Open your app’s Xcode project. To download adapters at runtime, your project needs a special asset downloader extension target. In Xcode, choose File > New > Target…, and in the sheet that appears, choose the Background Download template under the Application Extension section. Click Next. A dialog will appear asking for a Product Name and Extension Type. For the product name, choose a descriptive name like "AssetDownloader". For the extension type, choose:

    • Apple-Hosted, Managed if Apple is hosting your adapters (simplest option).
    • Self-hosted, Managed if you are using your own server but would like a person’s OS to automatically handle the download lifecycle (simple option).
    • Self-hosted, Unmanaged if you are using your own server and want to code a custom download lifecycle (advanced option).

    Click Finish. With your new downloader target, follow the instructions in the article Configuring your Background Assets project so that your extension and app can work together. Afterwards, check that your app target’s information property list (the Info.plist file) contains these additional required fields specific to your extension type:

    • Apple-Hosted, Managed:
      • BAHasManagedAssetPacks = yes
      • BAAppGroupID = the app group where your assets are hosted
      • BAUsesAppleHosting = yes
    • Self-hosted, Managed:
      • BAHasManagedAssetPacks = yes
      • BAAppGroupID = the app group where your assets are hosted
      • BAManifestURL = the URL of the manifest file for your adapter asset on your server
      • BAInitialDownloadRestrictions set to a dictionary type field containing a key BADownloadDomainAllowList with an array value. Set Item 0 to your server’s location in DNS format.
    • Self-hosted, Unmanaged:
      • BAHasManagedAssetPacks = no
      • BAManifestURL = the URL of the manifest file for your adapter asset on your server
      • BAInitialDownloadRestrictions set to a dictionary type field containing a key BADownloadDomainAllowList with an array value. Set Item 0 to your server’s location in DNS format.

    Finally, it’s time to get coding!

    5. Choose a compatible adapter at runtime

    In your downloader target, open the generated code file BackgroundDownloadHandler.swift. This file is what the Background Assets framework uses to download your adapters. Fill it out based on your extension type.

    • Apple-Hosted, Managed or Self-hosted, Managed:

      You will see a single function shouldDownload. Fill this function in with the following code to choose an adapter asset compatible with the runtime device:

      func shouldDownload(_ assetPack: AssetPack) -> Bool {
          // Download any non-adapter asset packs your app has, such as shaders.
          // Return false instead to filter out asset packs you don't want.
          if assetPack.id.hasPrefix("mygameshader") {
              return true
          }
          // For adapter asset packs, only download one compatible with this device.
          return SystemLanguageModel.Adapter.isCompatible(assetPack)
      }
    • Self-hosted, Unmanaged:

      You will see many different functions for manual, fine-grained control over the download lifecycle of your assets. Refer to the Background Assets framework documentation for instructions.

    6. Load adapter assets in your app

    Once you’ve finished your downloader extension, return to your app target and choose any Swift file to start loading adapters. First, clean up any outdated adapters that may be on a person’s device:

    SystemLanguageModel.Adapter.removeObsoleteAdapters()

    Next, initialize a SystemLanguageModel.Adapter using your adapter’s base name (without the file extension). If a person’s device doesn’t have your adapter downloaded yet, your downloader app extension will start downloading an adapter asset pack compatible with the current device:

    let adapter = try SystemLanguageModel.Adapter(name: "myAdapter")

    If a person has just installed your app, or their device needs an updated adapter, a download will start. Since adapters can be approximately 160 MB, expect some download time and design your app accordingly. In particular, a person who isn’t connected to a network won’t be able to use your adapter right away. To get the download status of an adapter:

    let assetPackIDs = SystemLanguageModel.Adapter.compatibleAdapterIdentifiers(name: "myAdapter")
    if let assetPackID = assetPackIDs.first {
        let downloadStatus = AssetPackManager.shared.status(ofAssetPackID: assetPackID)
    }

    The download status is a DownloadStatusUpdate enum that you can use to show a progress UI while your app waits for the download to complete. If no download was needed, the status will be finished immediately. If a download was needed, wait for the status to become finished.

    At last, it’s time to initialize a system language model with your adapter and get prompting:

    let adaptedModel = SystemLanguageModel(adapter: adapter)
    let session = LanguageModelSession(model: adaptedModel)
    let response = try await session.respond(to: "Your prompt here")