What is the first reliable position of the Apple Vision Pro device?

In several visionOS apps, we readjust our scenes to the user's eye level (the height of their head). However, we have found that WorldTrackingProvider returns incorrect positions for an unknown number of initial frames.

See the code below, which you can copy and paste into any immersive space. Relaunch the space and observe that the value of numberOfBadWorldInfos is inconsistent from one launch to the next.

a. What is the most reliable way to get the device's position?

b. Is this indeed a bug?

c. Are we using worldInfo improperly?

d. As a workaround, our apps let 10 frames pass before using worldInfo. Should we set this threshold differently?

import ARKit
import Combine
import OSLog
import SwiftUI
import RealityKit
import RealityKitContent

let SUBSYSTEM = Bundle.main.bundleIdentifier!
struct ImmersiveView: View {
    let logger = Logger(subsystem: SUBSYSTEM, category: "ImmersiveView")
    let session = ARKitSession()
    let worldInfo = WorldTrackingProvider()
    @State var sceneUpdateSubscription: EventSubscription? = nil
    @State var deviceTransform: simd_float4x4? = nil
    
    @State var numberOfBadWorldInfos = 0
    @State var isBadWorldInfoLoged = false

    var body: some View {
        RealityView { content in
            try? await session.run([worldInfo])
            sceneUpdateSubscription = content.subscribe(to: SceneEvents.Update.self) { event in
                guard let pose = worldInfo.queryDeviceAnchor(atTimestamp: CACurrentMediaTime()) else {
                    return
                }
                
                // `worldInfo` does not return correct values for the first few frames (exact number of frames is unknown)
                // - known SO: https://stackoverflow.com/questions/78396187/how-to-determine-the-first-reliable-position-of-the-apple-vision-pro-device
                deviceTransform = pose.originFromAnchorTransform
                if deviceTransform!.columns.3.y < 1.6 {
                    numberOfBadWorldInfos += 1
                    logger.warning("\(#function) \(#line) deviceTransform.columns.3.y \(deviceTransform!.columns.3.y), numberOfBadWorldInfos \(numberOfBadWorldInfos)")
                } else {
                    
                    if !isBadWorldInfoLoged {
                        logger.info("\(#function) \(#line) deviceTransform.columns.3.y \(deviceTransform!.columns.3.y), numberOfBadWorldInfos \(numberOfBadWorldInfos)")
                    }
                    isBadWorldInfoLoged = true // stop logging.
                }
            }
        }
    }
}

Hi @VaiStardom

A few suggestions:

  • Try moving try? await session.run([worldInfo]) into a .task.
  • Before you query the device anchor, confirm that the WorldTrackingProvider is running by checking worldTrackingProvider.state == .running.
  • Confirm that the anchor is tracked (isTracked); otherwise its transform may not be accurate.
  • If you don't need access to the transforms of the content you place at eye level, consider using a head anchor entity:
let headAnchor = AnchorEntity(.head)
headAnchor.addChild(someEntityYouWantToPositionRelativeToTheHead)
content.add(headAnchor)
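
The first three suggestions above can be sketched together roughly as follows. This is only a minimal sketch, assuming a visionOS immersive space; the names EyeLevelView and trackedDeviceTransform are hypothetical and not taken from the original code. It uses the same ARKit calls as the question (session.run, queryDeviceAnchor(atTimestamp:), originFromAnchorTransform) plus the state and isTracked checks.

```swift
import ARKit
import QuartzCore
import RealityKit
import SwiftUI

/// Hypothetical helper: returns the device transform only when the provider
/// is running and the device anchor reports itself as tracked; otherwise nil.
func trackedDeviceTransform(from provider: WorldTrackingProvider) -> simd_float4x4? {
    // Suggestion 2: skip the query until the provider is running.
    guard provider.state == .running else { return nil }
    // Suggestion 3: ignore anchors that are not tracked yet.
    guard let anchor = provider.queryDeviceAnchor(atTimestamp: CACurrentMediaTime()),
          anchor.isTracked else { return nil }
    return anchor.originFromAnchorTransform
}

/// Hypothetical view name, for illustration only.
struct EyeLevelView: View {
    @State private var session = ARKitSession()
    @State private var worldInfo = WorldTrackingProvider()
    @State private var subscription: EventSubscription?
    @State private var deviceTransform: simd_float4x4?

    var body: some View {
        RealityView { content in
            // Poll once per scene update; keep the last accepted transform.
            subscription = content.subscribe(to: SceneEvents.Update.self) { _ in
                if let transform = trackedDeviceTransform(from: worldInfo) {
                    deviceTransform = transform
                }
            }
        }
        .task {
            // Suggestion 1: start the session from .task instead of inside
            // the RealityView closure, and surface errors instead of try?.
            do {
                try await session.run([worldInfo])
            } catch {
                print("Failed to start ARKitSession: \(error)")
            }
        }
    }
}
```

Whether these checks alone are enough to filter out every early bad reading is not something this sketch can promise; it only shows where each check would go.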

If this resolves the issue, please accept the answer; otherwise, please follow up so I can help you achieve your goal.

Hi Vision Pro Engineer,

Thanks for your reply, and apologies for my late reply.

We tried points 1, 2, and 3 of your recommendations and have since moved our logic into the following DeviceTracker class.

With your recommendations, we appear to be able to remove the frame-counter properties. However, the arbitrary yValue > 0.3 condition in the code below needs to remain; otherwise, we still get negative values (see image).

import ARKit
import Combine
import OSLog
import RealityKit
import SwiftUI

@Observable
public final class DeviceTracker {
    private let session = ARKitSession()
    private let worldInfo = WorldTrackingProvider()
    private var sceneUpdateSubscription: EventSubscription?
    private let logger = Logger(subsystem: SUBSYSTEM, category: "DeviceTransformTracker")
    
    public var deviceTransform: simd_float4x4? = nil
    
    public init() {
        // Defer the task until self is fully initialized
        Task.detached(priority: .userInitiated) { [session, worldInfo, logger] in
            do {
                try await session.run([worldInfo])
            } catch {
                logger.error("\(#function) \(#line) Failed to start ARKitSession: \(error.localizedDescription)")
            }
        }
    }
    
    public func subscribe(content: RealityViewContent) {
        sceneUpdateSubscription = content.subscribe(to: SceneEvents.Update.self) { [weak self] _ in
            guard let self = self else { return }
            guard self.worldInfo.state == .running else {
                logger.warning("\(#function) \(#line) worldInfo.state not running")
                return
            }
            guard let deviceAnchor = self.worldInfo.queryDeviceAnchor(atTimestamp: CACurrentMediaTime()) else {
                logger.warning("\(#function) \(#line) missing deviceAnchor for CACurrentMediaTime() \(String(describing: CACurrentMediaTime()))")
                return
            }
            guard deviceAnchor.isTracked else {
                logger.warning("\(#function) \(#line) deviceAnchor not yet tracked")
                return
            }
            
            let yValue = deviceAnchor.originFromAnchorTransform.columns.3.y
            logger.warning("\(#function) \(#line) y value = \(deviceAnchor.originFromAnchorTransform.columns.3.y)")
            if yValue > 0.3 {
                self.deviceTransform = deviceAnchor.originFromAnchorTransform
            }
        }
    }
    
    public func cancel() {
        sceneUpdateSubscription?.cancel()
    }
}

By the way, thank you for your recommendation to use AnchorEntity(.head). We have tried it before but kept getting inconsistencies in entity placement and readjustment. We are still trying to figure out how best to use it.

Adding all our content as children of AnchorEntity(.head) requires a lot of refactoring, since several of our entities are controlled by Timeline animations and, as mentioned, we are not getting correct head positioning.

Currently, our best working solution for eye-level placement is the DeviceTracker shared above.
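
One possible alternative to a fixed frame count or a fixed yValue > 0.3 cutoff is to wait until several consecutive height readings agree with each other. The sketch below is only an illustration of that idea, not something recommended in this thread or confirmed by Apple: the type StableHeightGate, its parameters, and the sample values are all hypothetical, and the defaults would need tuning on a real device. Note that a reading that is stable but wrong would still pass this gate.

```swift
/// Hypothetical heuristic: accept a height sample only after `requiredCount`
/// consecutive samples lie within `tolerance` metres of each other.
struct StableHeightGate {
    let requiredCount: Int
    let tolerance: Float
    private var window: [Float] = []

    init(requiredCount: Int = 5, tolerance: Float = 0.02) {
        // Illustrative defaults, not measured values.
        self.requiredCount = max(1, requiredCount)
        self.tolerance = tolerance
    }

    /// Feeds one sample. Returns the sample once the sliding window is full
    /// and its spread (max - min) is within the tolerance; otherwise nil.
    mutating func feed(_ y: Float) -> Float? {
        window.append(y)
        if window.count > requiredCount {
            window.removeFirst()
        }
        guard window.count == requiredCount,
              let lowest = window.min(),
              let highest = window.max(),
              highest - lowest <= tolerance else {
            return nil
        }
        return y
    }
}

// Usage with made-up samples: the early readings jump around, then settle.
var gate = StableHeightGate(requiredCount: 3, tolerance: 0.02)
let samples: [Float] = [-0.4, 0.1, 1.2, 1.61, 1.62, 1.61]
let accepted = samples.compactMap { gate.feed($0) }
// Only the final sample is accepted, because it is the first point at which
// the last three readings agree to within 0.02.
```

In DeviceTracker, feed(_:) would be called with deviceAnchor.originFromAnchorTransform.columns.3.y after the isTracked guard, and deviceTransform would be set only when it returns a value.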
