Learn about Apple Immersive Video technologies

    Discover the capabilities of the Apple Immersive Video and Apple Spatial Audio Format technologies for creating truly immersive experiences. Meet the new ImmersiveMediaSupport framework, which provides functionality for reading and writing the metadata required for Apple Immersive Video. Learn guidelines for encoding and publishing Apple Immersive Video content in standalone files for playback and for streaming over HLS. To get the most out of this session, we recommend first watching "Explore video experiences for visionOS".

    Chapters

    • 0:00 - Introduction
    • 0:48 - Apple Immersive Video overview
    • 2:36 - Apple Immersive Video metadata
    • 5:13 - Read AIVU files
    • 7:16 - Write AIVU files
    • 8:43 - Publish Apple Immersive Video content
    • 10:29 - Preview Apple Immersive Video content
    • 11:21 - Apple Spatial Audio Format
    • 12:39 - Apple Positional Audio Codec

    Resources

    • Authoring Apple Immersive Video
    • AVFoundation
    • AVPlayerItemMetadataOutput
    • Core Media
    • HTTP Live Streaming (HLS) authoring specification for Apple devices
    • Immersive Media Support
    • What's new in HTTP Live Streaming

    Related Videos

    WWDC25

    • What's new for the spatial web
    • Learn about the Apple Projected Media Profile
    • Explore video experiences for visionOS
    • Support immersive video playback in visionOS apps

    Hi, I’m Blake, an engineer on the Apple Immersive Video team. In this video, I’m going to explain the new capabilities in macOS and visionOS 26 for creating Apple Immersive Video.

    I'll build on the foundations laid in the "Explore video experiences for visionOS" session from WWDC25, which covers the video profiles available in visionOS 26 and gives a high-level overview of Apple Immersive Video, so it's important to watch that one first.

    In this video, I’m going to cover the capabilities of Apple Immersive Video and Spatial Audio technologies for you to be able to create truly immersive experiences. And I’ll start with Apple Immersive Video.

    Apple Immersive Video is the highest quality immersive experience for video playback on Apple Vision Pro, with high-fidelity video and fully immersive audio to put you in the experience as if you were there.

    And because the content is so immersive, it requires specific cameras that are capable of capturing this high-fidelity video, such as the Blackmagic URSA Cine Immersive, designed from the ground up for Apple Immersive Video.

    Apple Immersive Video cameras are uniquely calibrated at the factory to capture the exact curvature of each of the stereoscopic lenses.

    And this calibration information is included with every video file. The calibration is used in the video metadata to correctly project the video.

    This table, from the "Explore video experiences for visionOS" session at WWDC25, shows the different formats that are supported in visionOS 26. Apple Immersive Video specifically uses a parametric projection type to support these camera calibrations.

    macOS and visionOS 26 now feature the Immersive Media Support framework, allowing you to create custom workflows. It enables reading and writing the essential metadata for Apple Immersive Video, and provides capabilities for previewing content in editorial workflows. To help you create tools that support video production pipelines, like non-linear editing software or video compression and encoding tools, I'll go over how to read and write Apple Immersive Video, how to publish your content for everyone to watch, and how to preview your content during the production process. But first, I'll start with the metadata, which enables Apple Immersive Video experiences.

    Apple Immersive Video can be produced using multiple cameras.

    And because each camera has a unique calibration, the combination of these cameras describes the venues captured. The VenueDescriptor type in the Immersive Media Support framework contains a combination of all of the cameras used in the venue. This VenueDescriptor information is stored as Apple Immersive Media Embedded or AIMEData, which I’ll cover in more detail later in this session.

    The VenueDescriptor type holds the reference to the cameras and the camera view model, the ability to add and remove cameras, the reference to your AIMEData, and the ability to save it to a URL, which will be important later. Each camera used in your video is capable of including more information than just the camera calibration. For example, the points of a mask, or edge-blend, use alpha to mask out the edges of the content.

    And there are several more capabilities for camera calibrations, like setting the camera origin position information. Custom backdrop environments can be included with camera calibrations as well. For all of the capabilities of the VenueDescriptor and the ImmersiveCamera, check out the Immersive Media Support documentation.

    Because the camera calibrations are specific to the video frames in your output video, dynamic metadata is present to define which calibration should be used for a given frame. There are additional timed dynamic metadata commands, represented as presentation commands in the Immersive Media Support framework, which are muxed into your output QuickTime file.

    Every video frame can have multiple presentation commands associated with it, and these commands travel along with each frame in your video track.

    Another PresentationCommand is a shot flop, used in editing for a variety of reasons, where the image and eyes are flopped over the y-axis.

    It’s important to note that because the immersive camera uses stereoscopic lenses, it makes a shot flop a more challenging editorial process since the image and eyes are swapped. But using the PresentationCommand, this is all handled automatically by visionOS during playback.

    Beyond the camera calibration and shot flop commands, there are fades, which are dynamically rendered and not baked into the video frame. For more details on these commands, refer to the PresentationDescriptor and PresentationCommand types. Now, I'll describe how to use Apple Immersive Video in your own apps. To segment content as HLS, edit Apple Immersive Video files, or create your own custom player, reading the metadata is important. And for a single, file-based, standalone Apple Immersive Video experience, typically used in production, there is now an Apple Immersive Video Universal file type.

    The Apple Immersive Video Universal, or AIVU file, is a container of your output video with the PresentationDescriptor muxed into it and has the VenueDescriptor as metadata included within it as well.

    AIVU files can be played from the Files app through Quick Look on visionOS. To play back Apple Immersive Video in your own app as a standalone file or over HLS, check out "Support Immersive Video Playback in visionOS Apps" from WWDC25.

    If you are building an app or service to stream Apple Immersive Video or to share your Apple Immersive Video content with others, AIVU files are the best way to easily ingest or share your content with all the necessary metadata.

    Along with the new Immersive Media Support framework, there are also new APIs in AVFoundation to help with reading and writing Apple Immersive Video. To read the VenueDescriptor from an AIVU file, use the familiar AVFoundation APIs to load the asset's metadata. There is a new quickTimeMetadataAIMEData identifier for filtering the specific metadata to load AIMEData as a VenueDescriptor. To read the PresentationDescriptor metadata, get the metadata group timed with each presentation timestamp for the video frames. Filter based on the quickTimeMetadataPresentationImmersiveMedia identifier, and decode the value into a PresentationDescriptor type.

    And for more information on how to get the timed metadata group, refer to the AVPlayerItemMetadataOutput API in AVFoundation.
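
    As a rough sketch of that playback path (not the session's sample code), you could attach an AVPlayerItemMetadataOutput to your player item so the timed groups arrive during playback, then hand them to a decoding helper like the presentation(timedMetadata:) function shown in the code listings below. The identifier constant is assumed to be available as written here.

      // Sketch: receive timed PresentationDescriptor metadata during playback.
      import AVFoundation
      import ImmersiveMediaSupport

      final class PresentationMetadataReceiver: NSObject, AVPlayerItemMetadataOutputPushDelegate {
          func attach(to playerItem: AVPlayerItem) {
              // Only request the immersive-media presentation metadata.
              let output = AVPlayerItemMetadataOutput(
                  identifiers: [AVMetadataIdentifier.quickTimeMetadataPresentationImmersiveMedia.rawValue]
              )
              output.setDelegate(self, queue: .main)
              playerItem.add(output)
          }

          func metadataOutput(_ output: AVPlayerItemMetadataOutput,
                              didOutputTimedMetadataGroups groups: [AVTimedMetadataGroup],
                              from track: AVPlayerItemTrack?) {
              Task {
                  // Decode with the helper from the "Read PresentationDescriptor" listing.
                  let descriptors = try await presentation(timedMetadata: groups)
                  // Apply the commands that correspond to the current frames here.
                  _ = descriptors
              }
          }
      }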

    To write Apple Immersive Video, whether for a production tool or as output from non-linear editing software, you are able to create your own AIVU files.

    When creating Apple Immersive Video, there are a few important things to know. For your video asset's projection kind, you must use AppleImmersiveVideo. This projection kind is defined as the parametric kind specific to Apple Immersive Video, so the system knows how to derive the projection. You also need to write your VenueDescriptor and PresentationCommand values to your video asset's metadata using AVAssetWriter. Use the VenueDescriptor to retrieve the AIMEData to be saved to an AVMetadataItem with the AIMEData identifier.

    For your PresentationCommands, use the PresentationDescriptor reader to get the commands for a specific time. Then use the presentation identifier I mentioned earlier to create timed AVMetadataItems that align with the provided times and durations of your video frame buffers.
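
    The Authoring Apple Immersive Video sample project covers the full AVAssetWriter setup. As a minimal sketch of just the muxing step, assuming the timed items come from a helper like getMetadataItem(reader:time:frameDuration:) shown in the code listings below, the presentation metadata can travel on its own track through an AVAssetWriterInputMetadataAdaptor:

      import AVFoundation

      // Create a timed-metadata input and adaptor once, before startWriting().
      // The format hint comes from a representative timed metadata group.
      func makePresentationMetadataAdaptor(writer: AVAssetWriter,
                                           sampleGroup: AVTimedMetadataGroup) -> AVAssetWriterInputMetadataAdaptor {
          let input = AVAssetWriterInput(mediaType: .metadata,
                                         outputSettings: nil,
                                         sourceFormatHint: sampleGroup.copyFormatDescription())
          let adaptor = AVAssetWriterInputMetadataAdaptor(assetWriterInput: input)
          writer.add(input)
          return adaptor
      }

      // For each video frame, wrap the frame's metadata item in a group that
      // spans the frame's time range and append it alongside the video samples.
      func appendPresentationMetadata(_ item: AVMetadataItem,
                                      time: CMTime,
                                      duration: CMTime,
                                      to adaptor: AVAssetWriterInputMetadataAdaptor) {
          let group = AVTimedMetadataGroup(items: [item],
                                           timeRange: CMTimeRange(start: time, duration: duration))
          if adaptor.assetWriterInput.isReadyForMoreMediaData {
              adaptor.append(group)
          }
      }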

    Once you’ve created your AIVU files, you will be able to verify them using the AIVUValidator’s validate function in the Immersive Media Support framework. This will throw an error for any issues with validation or return true if it’s valid.

    For details on how to use AVAssetWriter for writing AIVU files, refer to the Authoring Apple Immersive Video sample project.

    To publish Apple Immersive Video content, use HLS segmentation to stream your video directly to your application.

    Apple Vision Pro is capable of rendering MV-HEVC at a recommended resolution of 4320 by 4320 per eye, 90 frames per second, with a P3-D65-PQ color space, and Apple Spatial Audio, which I’ll talk about later in this video.

    The recommended tiers for segmenting Apple Immersive Video range from 25 to 100 megabits per second for average bandwidth and from 50 to 150 megabits per second for peak. It's important to consider the tradeoff between quality and size when building out your own tiers, while keeping the same resolution and frame rate. When building the HLS playlist, you will need to include your VenueDescriptor as AIMEData, saved to a file alongside your HLS playlist, for Apple Vision Pro to render your content correctly.

    To create your AIME file, save your VenueDescriptor object using the save function and copy that AIME file into your HLS playlist. It’s important to retain the metadata track with your video segments when segmenting the QuickTime file to keep the PresentationDescriptor commands. In the HLS multivariant playlist, there are a few important tags to call out. Apple Immersive Video requires version 12 or higher, the venue description data ID pointing to your AIME file, a content type of fully immersive, and in addition to using APAC Audio, which I’ll talk about later in this video, the required video layout needs to be stereo video and use the Apple Immersive Video projection.
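
    As an illustration only, a multivariant playlist along those lines might look roughly like the sketch below. The DATA-ID string, the CODECS entries, and the REQ-VIDEO-LAYOUT value are assumptions here; take the authoritative tag and attribute values from the HTTP Live Streaming (HLS) authoring specification for Apple devices listed in the resources. The APAC audio rendition referenced by the AUDIO attribute is sketched later, in the Spatial Audio section.

      #EXTM3U
      #EXT-X-VERSION:12

      ## Venue description: URI points to the AIME file copied alongside the
      ## playlist. The DATA-ID value shown is an assumption.
      #EXT-X-SESSION-DATA:DATA-ID="com.apple.hls.venue-description",URI="primary.aime"

      ## One video tier; repeat per tier (25-100 Mb/s average, 50-150 Mb/s peak).
      ## REQ-VIDEO-LAYOUT must declare stereo video with the Apple Immersive Video
      ## projection; the exact value here is an assumption, and the tag declaring
      ## the fully immersive content type is omitted. See the authoring specification.
      #EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=100000000,BANDWIDTH=150000000,RESOLUTION=4320x4320,FRAME-RATE=90.000,CODECS="hvc1,apac",REQ-VIDEO-LAYOUT="CH-STEREO/PROJ-AIV",AUDIO="spatial-audio"
      video/tier1/prog_index.m3u8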

    One other important new API in the Immersive Media Support framework is ImmersiveMediaRemotePreviewSender and Receiver. It's important to note that this method of previewing only supports a lower-bitrate version of Apple Immersive Video, and it should be used in editorial workflows where quick previewing is useful and the full video files haven't been processed yet. One example of this would be viewing content on Apple Vision Pro while editing the video.

    These APIs are designed to send Apple Immersive Video frames from a Mac to Apple Vision Pro. ImmersiveMediaRemotePreviewSender and Receiver enable sending the immersive video frames to one or multiple receivers. Using a custom compositor, this allows live previewing in your visionOS application. For more information, check out the Immersive Media Support documentation.

    Spatial Audio is just as important as video when creating a compelling immersive experience. We have created a new format for Spatial Audio called Apple Spatial Audio Format, or ASAF. ASAF is used in production to create truly immersive audio experiences. The Apple Positional Audio Codec, or APAC, is used to encode this audio format for delivery.

    ASAF enables truly externalized audio experiences by ensuring acoustic cues are used to render the audio. It’s composed of new metadata coupled with linear PCM, and a powerful new spatial renderer that’s built into Apple platforms. It produces high resolution Spatial Audio through numerous point sources and high resolution sound scenes, or higher order ambisonics. The rendered audio is completely adaptive based on the object position and orientation, as well as listener position and orientation. None of it is baked in. And the sounds in ASAF come from all directions in any position, and at any distance. ASAF is carried inside of broadcast Wave files with linear PCM signals and metadata.

    You typically use ASAF in production, and to stream ASAF audio, you will need to encode that audio as an MP4 APAC file.

    APAC efficiently distributes ASAF, and APAC is required for any Apple immersive video experience. APAC playback is available on all Apple platforms except watchOS, and supports Channels, Objects, Higher Order Ambisonics, Dialogue, Binaural audio, interactive elements, as well as provisioning for extendable metadata. Because of the efficiency of this codec, it enables immersive spatial experiences at bitrates as low as 64 kilobits per second. To deliver spatial audio with HTTP Live Streaming, you need to include the media tag with the audio channel information, and specify APAC as an audio codec in the stream info tag.
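
    Continuing the playlist sketch from the publishing section, the audio side might be declared roughly as follows. The CHANNELS placeholder and the APAC entry in CODECS are assumptions; the exact attribute values are given in the HLS documents listed in the resources.

      ## Illustration only: APAC Spatial Audio rendition for the multivariant playlist.
      #EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="spatial-audio",NAME="APAC",DEFAULT=YES,URI="audio/prog_index.m3u8",CHANNELS="<channel information>"

      ## The video tier references the audio group and lists an APAC codec entry.
      #EXT-X-STREAM-INF:BANDWIDTH=150000000,CODECS="hvc1,apac",AUDIO="spatial-audio",REQ-VIDEO-LAYOUT="CH-STEREO/PROJ-AIV"
      video/tier1/prog_index.m3u8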

    For new capabilities in HLS, specifically for supporting APAC audio, refer to the What’s New in HLS article.

    ASAF content can be created and encoded into APAC using Apple’s Pro Tools plugins, available on a per-user license, or Blackmagic Design’s DaVinci Resolve Studio Editor.

    In this session, I've covered the foundations of the metadata that makes Apple Immersive Video what it is, how to read and write it with the Immersive Media Support framework, and Spatial Audio.

    Expand your app to support truly immersive experiences by supporting Apple Immersive Video and Spatial Audio. For more information on other immersive video formats for visionOS, check out "Learn About the Apple Projected Media Profile." To learn how to play Apple Immersive Video, watch "Support Immersive Video Playback in visionOS Apps" from WWDC25.

    I really love watching Apple Immersive Video, so I’m very excited for you to create more experiences. Oh, and send me your Apple Immersive Video Universal files so I can watch them. Thanks.

    • 6:23 - Read VenueDescriptor from AIVU file

      func readAIMEData(from aivuFile: URL) async throws -> VenueDescriptor? {
          let avAsset = AVURLAsset(url: aivuFile)
          let metadata = try await avAsset.load(.metadata)
          // Find the AIME metadata item; bail out if the file has none.
          guard let aimeItem = metadata.first(where: { $0.identifier == .quickTimeMetadataAIMEData }) else {
              return nil
          }
          if let dataValue = try await aimeItem.load(.value) as? NSData {
              return try await VenueDescriptor(aimeData: dataValue as Data)
          }
          return nil
      }
    • 6:50 - Read PresentationDescriptor from AIVU playback

      func presentation(timedMetadata: [AVTimedMetadataGroup]) async throws -> [PresentationDescriptor] {
          var presentations: [PresentationDescriptor] = []
          for group in timedMetadata {
              for metadata in group.items {
                  if metadata.identifier == .quickTimeMetadataPresentationImmersiveMedia {
                      // Decode each presentation payload into a PresentationDescriptor.
                      if let data = try await metadata.load(.dataValue) {
                          presentations.append(
                              try JSONDecoder().decode(PresentationDescriptor.self, from: data)
                          )
                      }
                  }
              }
          }
          return presentations
      }
    • 7:52 - Create AVMetadataItem from VenueDescriptor

      func getMetadataItem(from metadata: VenueDescriptor) async throws -> AVMetadataItem {
          let aimeData = try await metadata.aimeData
          let aimeMetadataItem = AVMutableMetadataItem()
          aimeMetadataItem.identifier = .quickTimeMetadataAIMEData
          aimeMetadataItem.dataType = String(kCMMetadataBaseDataType_RawData)
          aimeMetadataItem.value = aimeData as NSData
              
          return aimeMetadataItem
      }
    • 8:02 - Create timed AVMetadataItem from PresentationDescriptorReader

      func getMetadataItem(reader: PresentationDescriptorReader,
                           time: CMTime, frameDuration: CMTime) throws -> AVMetadataItem? {
          let commands = reader.outputPresentationCommands(for: time) ?? []
          if commands.isEmpty { return nil }

          // Encode the commands for this frame and wrap them in a timed metadata item.
          let descriptor = PresentationDescriptor(commands: commands)
          let encodedData = try JSONEncoder().encode(descriptor)
          let presentationMetadata = AVMutableMetadataItem()
          presentationMetadata.identifier = .quickTimeMetadataPresentationImmersiveMedia
          presentationMetadata.dataType = String(kCMMetadataBaseDataType_RawData)
          presentationMetadata.value = encodedData as NSData
          presentationMetadata.time = time
          presentationMetadata.duration = frameDuration

          return presentationMetadata
      }
    • 8:20 - Validate AIVU file

      func validAIVU(file aivuFile: URL) async throws -> Bool { 
          return try await AIVUValidator.validate(url: aivuFile)
      }
    • 9:31 - Save AIME file

      let aimeFile = FileManager.default.temporaryDirectory.appendingPathComponent("primary.aime")
      try? await venueDescriptor.save(to: aimeFile)
    • 0:00 - Introduction
    • visionOS 26 offers new capabilities for creating Apple Immersive Video experiences with Spatial Audio.

    • 0:48 - Apple Immersive Video overview
    • Apple Immersive Video provides high-fidelity, stereoscopic video playback with fully immersive audio on Apple Vision Pro. Specialized cameras, like the Blackmagic URSA Cine Immersive, are calibrated to capture the exact curvature of each stereoscopic lens, and this calibration info is carried with the video files for correct projection. macOS and visionOS 26 support this format through the Immersive Media Support framework, enabling custom workflows for content creation, previewing, and publishing.

    • 2:36 - Apple Immersive Video metadata
    • Apple Immersive Video can be produced using multiple cameras, each with a unique calibration. The combination of these cameras describes the venues captured. VenueDescriptors include camera information, edge-blending masks, custom backdrops, and dynamic calibration data for each video frame, and are stored as Apple Immersive Media Embedded, or AIMEData. The Immersive Media Support framework enables the integration of presentation commands such as shot flops and fades, which are dynamically rendered and handled automatically by visionOS during playback, simplifying the editorial process for stereoscopic immersive videos. Refer to the PresentationDescriptor and PresentationCommand types for more details.

    • 5:13 - Read AIVU files
    • An Apple Immersive Video Universal (AIVU) file is a container of output video with the metadata muxed in. You can play AIVU files on visionOS via Quick Look in the Files app, and in custom apps using AVKit. The new quickTimeMetadataAIMEData AVAsset metadata identifier provides access to AIMEData as a VenueDescriptor, and the PresentationDescriptor metadata is available through AVTimedMetadataGroup.

    • 7:16 - Write AIVU files
    • To create Apple Immersive Video (AIVU) files, use the AppleImmersiveVideo projection kind and write VenueDescriptor and PresentationCommand values to your asset's metadata using AVAssetWriter. The AIVUValidator's validate function can then verify the files. For more details, see the "Authoring Apple Immersive Video" sample project.

    • 8:43 - Publish Apple Immersive Video content
    • To publish your Apple Immersive Video content, use HLS segmentation with MV-HEVC video at 4320x4320 per eye, 90 frames per second, and a P3-D65-PQ color space. The recommended tiers for segmenting Apple Immersive Video range from 25 to 100 Mbps average bandwidth and 50 to 150 Mbps peak. Include your AIME file (VenueDescriptor) with your HLS multivariant playlist, as well as the APAC audio track. Your playlist needs to specify version 12 or higher, the fully immersive content type, and the stereo video layout with the Apple Immersive Video projection.

    • 10:29 - Preview Apple Immersive Video content
    • The new ImmersiveMediaRemotePreviewSender and Receiver APIs in the Immersive Media Support framework support low-bitrate live previewing of Apple Immersive Video from Mac to Apple Vision Pro during editorial workflows, allowing for real-time viewing while editing. Check out the Immersive Media Support documentation for more details.

    • 11:21 - Apple Spatial Audio Format
    • Apple Spatial Audio Format (ASAF) is a new production format that uses new metadata, linear PCM, and a spatial renderer to create high-resolution Spatial Audio. ASAF enables externalized audio with adaptive sound from all directions, distances, and positions. ASAF is carried inside of Broadcast Wave Files with linear PCM signals and metadata.

    • 12:39 - Apple Positional Audio Codec
    • To stream ASAF audio via HLS, encode it as an MP4 APAC file using Apple Pro Tools plugins or Blackmagic Design's DaVinci Resolve Studio Editor. APAC is required for any Apple Immersive Video experience, and is available on all Apple platforms except watchOS, enabling efficient spatial audio delivery at low bitrates. Include the media tag with channel information, and specify APAC in the stream info tag to deliver spatial audio with HLS.
