I am trying out ARKit image tracking on Vision Pro, but there seems to be a problem when adding a reference image.
Here is my code:
let images = ReferenceImage.loadReferenceImages(inGroupNamed: "photos")
print("Images: \(images)")
try await appState!.arkitSession.run([imageTracking])
It successfully prints those images; however, it sometimes prints an error message like this:
ARImageTrackingRemoteService: Adding reference image <ARReferenceImage: 0x3032399e0 name="chair" physicalSize=(0.070, 0.093)> failed.
When this error message is printed, the corresponding image can not be tracked.
I do not understand why this happens: sometimes the image is added successfully, but other times it is not, even for the same image. This makes my app unstable.
There are also some other error messages, and I do not know whether they are related:
ARPredictorRemoteService <0x1042154a0>: Query queue is not running.
Execution of the command buffer was aborted due to an error during execution. Insufficient Permission (to submit GPU work from background) (00000006:kIOGPUCommandBufferCallbackErrorBackgroundExecutionNotPermitted)
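For reference, the tracked images are consumed roughly like this (a simplified sketch; the real handling is app-specific, and imageTracking is the ImageTrackingProvider passed to run above):

func processImageUpdates(_ imageTracking: ImageTrackingProvider) async {
    // Iterate the anchor updates published by the provider.
    for await update in imageTracking.anchorUpdates {
        let anchor = update.anchor
        guard anchor.isTracked else { continue }
        print("Tracking \(anchor.referenceImage.name ?? "unnamed") at \(anchor.originFromAnchorTransform)")
    }
}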
Hello
When processing an ARPlaneAnchor's geometry via its ARPlaneGeometry, triangleIndices is an array of Int16. It's supposed to be an index buffer, but a Metal index buffer can only be uint16 or uint32. What am I supposed to do with negative indices? They are rare, but they do appear sometimes.
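For illustration, here is one way the indices could be copied into a Metal index buffer; the assumption that a negative Int16 is just an index above 32767 whose bit pattern was read through a signed type is mine, not something the documentation confirms:

import ARKit
import Metal

// Sketch: copy ARPlaneGeometry.triangleIndices into a uint16 index buffer.
// Assumption (unverified): negative values are unsigned indices >= 32768
// reinterpreted as signed, so a bit-pattern cast recovers them.
func makeIndexBuffer(from geometry: ARPlaneGeometry, device: MTLDevice) -> MTLBuffer? {
    let indices = geometry.triangleIndices.map { UInt16(bitPattern: $0) }
    return device.makeBuffer(bytes: indices,
                             length: indices.count * MemoryLayout<UInt16>.stride,
                             options: .storageModeShared)
}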
Thank you
I am developing an app that needs high-quality immersion on visionOS. I found that when certain system messages pop up, the virtual objects become transparent, which breaks the immersion. How can I disable such pop-up messages while the ImmersiveSpace is open?
When I show a window while a sky sphere is shown, the handles to drag/close/resize the window are hidden. The colliders still work, so they are there, but only the visuals are hidden. I already know from another project that this also happens to volumes.
They only appear once you get closer to the window or if the sky sphere gets removed.
Is this a known issue or is there a fix for that?
.persistentSystemOverlays(.visible) does not fix it.
Xcode 16.3.0 Beta, visionOS 2.4
I have been using ARKit to get hand tracking data in a continuous loop by iterating over the AnchorUpdateSequence.
I want to try out .predicted hand tracking, but it seems that the ARKitSession and HandTrackingProvider APIs do not allow me to enable this feature.
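For reference, the continuous loop I'm referring to looks roughly like this (a simplified sketch; the actual joint handling is omitted):

import ARKit

let session = ARKitSession()
let handTracking = HandTrackingProvider()

func trackHands() async throws {
    try await session.run([handTracking])
    // Consume hand anchor updates as they arrive.
    for await update in handTracking.anchorUpdates {
        let anchor = update.anchor
        guard anchor.isTracked, let skeleton = anchor.handSkeleton else { continue }
        // Joint transforms are expressed relative to the hand anchor.
        let wrist = skeleton.joint(.wrist)
        let worldTransform = anchor.originFromAnchorTransform * wrist.anchorFromJointTransform
        _ = worldTransform
    }
}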
The goal is to achieve precise joint tracking for clinical assessment. The doctor is wearing the Apple Vision Pro and observing the patient's movement.
Do you have any recommended best practices for integrating real-time joint tracking and displaying them on the patient within visionOS?
We attempted to use VNHumanBodyPose3DObservation, which theoretically should work, but we are unable to display the detected joints in an Immersive Space for real-time validation. This makes it difficult for the doctor to ensure accurate tracking; if possible, a photo or video of the range-of-motion assessment would also be needed for the patient record.
Are there alternative methods to achieve precise real-time joint tracking without requiring main camera access (com.apple.developer.arkit.main-camera-access.allow)?
I am using a RealityKit Entity to display virtual content; however, I find that sometimes a real object in front of the virtual content cannot occlude it.
For example, I place an Entity in a room, but when I walk into another room, I can still see the Entity through the wall.
I wonder how I should fix this problem. Thank you!
Hi community,
I have a pair of stereo images, one for each eye. How should I render them on visionOS?
I know that for 3D videos, the AVPlayerViewController could display them in fullscreen mode. But I couldn't find any docs relating to 3D stereo images.
I guess my question can be put in a more general way: is there any way to render different content for each eye? This could also be helpful to someone who only has sight in one eye.
In ARKit for visionOS, I can track the user's head with a HeadAnchor, but it does not give me the location. However, I can get the device's transform by calling queryDeviceAnchor(atTimestamp: CACurrentMediaTime()) on a WorldTrackingProvider.
Why the difference? If I know the device's transform, I effectively know the head's transform.
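For reference, the query I mean looks roughly like this (a sketch; the provider is assumed to already be running):

import ARKit
import QuartzCore
import simd

let session = ARKitSession()
let worldTracking = WorldTrackingProvider()

func startTracking() async throws {
    try await session.run([worldTracking])
}

// Once the provider is running, this returns the device (effectively head) transform.
func currentDeviceTransform() -> simd_float4x4? {
    guard let device = worldTracking.queryDeviceAnchor(atTimestamp: CACurrentMediaTime()) else {
        return nil
    }
    return device.originFromAnchorTransform
}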
Hi there,
I'm trying to merge the mesh anchors into a single mesh, but I couldn't find any resources on this. Here is the code where I build a mesh from each mesh anchor and assign it to a ModelComponent with a shader-graph material.
func run(_ sceneRec: SceneReconstructionProvider) async {
for await update in sceneRec.anchorUpdates {
switch update.event {
case .added, .updated:
// Get or create entity for this anchor
let anchorEntity = anchors[update.anchor.id] ?? {
let entity = ModelEntity()
root?.addChild(entity)
anchors[update.anchor.id] = entity
return entity
}()
// Remove any existing children
for child in anchorEntity.children {
child.removeFromParent()
}
// Generate the mesh from the anchor
guard let mesh = try? await MeshResource(from: update.anchor) else { continue }
guard let shape = try? await ShapeResource.generateStaticMesh(from: update.anchor) else { continue }
print("Mesh added, vertices: \(update.anchor.geometry.vertices.count), bounds: \(mesh.bounds)")
// Get the material to use
var material: RealityKit.Material
if isMaterialLoaded, let loadedMaterial = self.shaderMaterial {
material = loadedMaterial
} else {
// Use a temporary material until the shader loads
var tempMaterial = UnlitMaterial()
tempMaterial.color = .init(tint: .purple.withAlphaComponent(0.5))
material = tempMaterial
}
await MainActor.run {
anchorEntity.components.set(ModelComponent(mesh: mesh, materials: [material]))
anchorEntity.setTransformMatrix(update.anchor.originFromAnchorTransform, relativeTo: nil)
// Add collision component with static flag - required for spatial interactions
anchorEntity.components.set(CollisionComponent(
shapes: [shape],
isStatic: true,
filter: .default
))
// Make entity interactive - enables spatial taps, drags, etc.
anchorEntity.components.set(InputTargetComponent())
let shadowComponent = GroundingShadowComponent(
castsShadow: true,
receivesShadow: true
)
anchorEntity.components.set(shadowComponent)
}
case .removed:
// Handling for removed anchors (added here for completeness)
anchors[update.anchor.id]?.removeFromParent()
anchors[update.anchor.id] = nil
}
}
}
I then use a spatial tap gesture to set the position parameter in the shader graph material that creates a nice gradient from the tap position on the mesh to the rest of the mesh.
SpatialTapGesture()
.targetedToAnyEntity()
.onEnded { value in
let tappedEntity = value.entity
// Check if the tapped entity is a child of tracking.meshAnchors
if isChildOfMeshAnchors(entity: tappedEntity) {
// Get local position (in the entity's coordinate space)
let localPosition = value.location3D
// Convert to world position (scene coordinate space)
let worldPosition = value.convert(localPosition, from: .local, to: .scene)
print("Tapped mesh anchor at local position: \(localPosition)")
print("Tapped mesh anchor at world position: \(worldPosition)")
// Update the material parameter with the tap position
updateMaterialTapPosition(entity: tappedEntity, position: worldPosition)
} else {
print("Tapped entity is not a mesh anchor")
}
}
My issue is that because there are several mesh anchors, the gradient often gets cut off at the edge of the mesh generated from an individual mesh anchor, as opposed to a nice continuous gradient across the entire scene-reconstruction mesh. I couldn't find any documentation on how to merge the meshes from the mesh anchors; any tips would be helpful! Thank you!
I need help to wrap my head around this...
If I import the Reality Composer Pro package and load it into an ARView, I see about 1.3 GB of memory usage and about 180-220% CPU usage. The frame rate starts at around 60 fps and eventually drops to around 30 fps.
If I export the USDZ from Reality Composer Pro and load that into the same ARView, I see about 1 GB of memory usage and around 150% CPU usage; the frame rate holds at 60 fps longer but eventually drops.
If I load that same USDZ into a QuickLook view, I see about 55 MB of memory usage, 9-11% CPU, and the frame rate stays locked at 116 fps. The only thing I notice is that the button I have is slightly less responsive, but it all still works fine.
I don't understand. How can I make the ARView work as efficiently as QuickLook?
When I've made an animated USDZ, at what frame rate will the animation be rendered in QuickLook? Is it the same across all devices (iPhone, Apple Vision Pro, etc.) and viewing environments (QuickLook, inside an ARView, etc.)?
Suppose I export my file at 30 fps and the device draws at 60 fps: does the device interpolate between frames automatically, animate at the lower frame rate, or play it at twice the speed? What if it were 24 fps?
My primary concern with understanding frame rates is a bit of trouble I've had making perfectly looping animations. There always seems to be the slightest stutter between iterations.
Thanks in advance for any insights you're able to provide!
Hi,
We’ve been successfully using the RoomPlan API in our application for over two years. Recently, however, users have reported encountering persistent capture errors during their sessions. Specifically, the errors observed are:
CaptureError.worldTrackingFailure
CaptureError.exceedSceneSizeLimit
What we have observed:
Persistent Errors: The errors continue to occur even after initiating new capture sessions.
Normal Usage: Our implementation adheres to typical usage patterns of the RoomPlan API without exceeding any documented room size limits.
Limited Feature Usage: We are not utilizing the WorldTracking feature for the StructureBuilder functionality to stitch rooms together.
Potential State Caching: Given that these errors persist across sessions, we suspect that there might be memory or state cached between sessions that is not being cleared, particularly since we are not taking advantage of StructureBuilder.
Request:
Could you please advise if there is any internal caching or memory retention between capture sessions that might lead to these errors? Additionally, we would appreciate guidance on how to clear or manage this state when the StructureBuilder feature is not in use.
Here is a generalised version of our capture session initialization code to help diagnose the issue.
struct RoomARCaptureView: UIViewRepresentable {
typealias Handler = (CapturedRoom, Error?) -> Void
@Binding var stop: Bool
@Binding var done: Bool
let completion: Handler?
func makeUIView(context: Self.Context) -> RoomCaptureView {
let view = RoomCaptureView(frame: .zero)
view.delegate = context.coordinator
view.captureSession.run(configuration: .init())
return view
}
func updateUIView(_ uiView: RoomCaptureView, context: Self.Context) {
if stop {
// Stop the session only once, multiple times causes issues with the final presentation
uiView.captureSession.stop()
stop = false
done = true
}
}
static func dismantleUIView(_ uiView: RoomCaptureView, coordinator: Self.Coordinator) {
uiView.captureSession.stop()
}
func makeCoordinator() -> ARViewCoordinator {
ARViewCoordinator(completion)
}
@objc(ARViewCoordinator)
class ARViewCoordinator: NSObject, RoomCaptureViewDelegate {
var completion: Handler?
public required init?(coder: NSCoder) {}
public func encode(with coder: NSCoder) {}
public init(_ completion: Handler?) {
super.init()
self.completion = completion
}
public func captureView(shouldPresent roomDataForProcessing: CapturedRoomData, error: (Error)?) -> Bool {
return true
}
public func captureView(didPresent processedResult: CapturedRoom, error: (Error)?) {
completion?(processedResult, error)
}
}
}
Thank you for your assistance.
I’m working on a Vision Pro app using Metal and need to implement multi-pass rendering. Specifically, I want to render intermediate results to a texture, then use that texture in a second pass for post-processing before presenting the final output.
What’s the best approach in visionOS? Should I use multiple render passes in a single command buffer or separate command buffers? Any insights on efficiently handling this in RealityKit or Metal?
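For concreteness, here is a minimal sketch of the single-command-buffer approach I have in mind, assuming plain Metal; the pipeline states, drawable texture, and post-process shader are placeholders:

import Metal

// Two render passes in one command buffer: pass 1 renders the scene into an
// offscreen texture, pass 2 samples it for post-processing into the drawable.
func encodeFrame(device: MTLDevice,
                 commandQueue: MTLCommandQueue,
                 scenePipeline: MTLRenderPipelineState,
                 postPipeline: MTLRenderPipelineState,
                 drawableTexture: MTLTexture) {
    // Offscreen color target for the first pass.
    let desc = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba16Float,
                                                        width: drawableTexture.width,
                                                        height: drawableTexture.height,
                                                        mipmapped: false)
    desc.usage = [.renderTarget, .shaderRead]
    guard let intermediate = device.makeTexture(descriptor: desc),
          let commandBuffer = commandQueue.makeCommandBuffer() else { return }

    // Pass 1: render the scene into the intermediate texture.
    let scenePass = MTLRenderPassDescriptor()
    scenePass.colorAttachments[0].texture = intermediate
    scenePass.colorAttachments[0].loadAction = .clear
    scenePass.colorAttachments[0].storeAction = .store
    if let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: scenePass) {
        encoder.setRenderPipelineState(scenePipeline)
        // ... draw scene geometry ...
        encoder.endEncoding()
    }

    // Pass 2: post-process the intermediate texture into the final drawable.
    let postPass = MTLRenderPassDescriptor()
    postPass.colorAttachments[0].texture = drawableTexture
    postPass.colorAttachments[0].loadAction = .dontCare
    postPass.colorAttachments[0].storeAction = .store
    if let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: postPass) {
        encoder.setRenderPipelineState(postPipeline)
        encoder.setFragmentTexture(intermediate, index: 0)
        // ... draw a full-screen triangle sampling the intermediate texture ...
        encoder.endEncoding()
    }

    commandBuffer.commit()
}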
Thanks!
I have been referencing the Object Tracking tutorial from WWDC 2024 for visionOS, which shows how Create ML is used to create a reference object that can then be tracked in an ARKit session.
I am looking to build this feature in an AR app for iPhone (I am using an iPhone 13 Pro Max), and I have created a couple of reference objects with Create ML.
Hello,
I'm trying to view the components of an Entity I'm creating in RealityKit by reading from a USDZ file. I have the following code snippet in my app.
if let appleEntity = try? Entity.loadModel(named: "apple_tile") {
let c = appleEntity.components
for comp in c { // <- compiler error here
print(comp)
}
}
The compiler error I'm receiving says "For-in loop requires 'Entity.ComponentSet' to conform to 'Sequence'". However, I thought it did conform, according to the documentation for Entity.ComponentSet.
Curious if anyone else has had this problem. I'm running Xcode 15.4, and my Swift version is:
xcrun swift -version
swift-driver version: 1.90.11.1 Apple Swift version 5.10 (swiftlang-5.10.0.13 clang-1500.3.9.4)
Target: x86_64-apple-macosx14.0
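As a fallback, per-type lookups do compile, which I assume is the intended pattern when iterating the whole set isn't available in this SDK version (ModelComponent and Transform here are just example component types):

if let appleEntity = try? Entity.loadModel(named: "apple_tile") {
    // Look up individual component types instead of iterating the whole set.
    if let model = appleEntity.components[ModelComponent.self] {
        print(model)
    }
    if let transform = appleEntity.components[Transform.self] {
        print(transform)
    }
}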
Hey everyone,
I'm working on an object viewer where users can place objects in a real room using AR, and I want both visionOS (Apple Vision Pro) and iOS devices (iPad, iPhone) to participate in the same shared spatial experience. The idea is that a user with a Vision Pro can place an object, and peers using iPhones/iPads can see the same object in the same position in their AR view.
I've looked into ARKit's Shared ARWorldMap and MultipeerConnectivity, but I'm not sure if this extends seamlessly to visionOS or if Apple has an official way to sync spatial data between visionOS and iOS devices.
Has anyone tried sharing a spatial world between visionOS and iOS?
Are there any built-in frameworks that allow for a shared multiuser AR session across these devices?
If not, what would be the best way to sync object positions between them?
Would love to hear if anyone has insights or experience with this! 🚀
Thanks!
In an earlier beta, BillboardComponent had rotationAxis and upDirection properties which allowed more fine-grained control of how an entity rotates towards the camera.
Currently, it is only possible to orient the z axis of the entity.
Looking at the robot in the documentation, the rotation of its z axis causes its feet to lift off the ground.
Before, it was possible to constrain the rotation to one axis (y, for example) so that the robot's feet stayed on the ground with:
billboard.upDirection = [0, 1, 0]
billboard.rotationAxis = [0, 1, 0]
Is there an alternative way to achieve this? Are these properties (or similar) coming back?
Can an app made with the RoomPlan API be used on iPhones without LiDAR? If so, how much accuracy would be lost compared to iPhones with LiDAR?
If not, is there an API similar to RoomPlan that works on iPhones without LiDAR?
Hi, I'm developing a virtual camera system using ReplayKit to capture scene video by directly accessing raw video buffers. The capture mechanism works flawlessly when repeatedly starting and stopping video capture within a continuous immersive environment. However, a critical issue arises when interrupting the immersive space:
Step 1: Enter the immersive environment and start and stop video capture (multiple times with no issues)
Step 2: Press the crown button to exit the immersive environment
Step 3: Return to the immersive space
Step 4: Attempt to start the video capture
At this point, the startCapture method throws an unexpected error, disrupting the video capture workflow.
This is the Xcode error that I see: "[ERROR] -[RPScreenRecorder startCaptureWithHandler:completionHandler:]_block_invoke_2:500 failed to start due to error: Error Domain=com.apple.ReplayKit.RPRecordingErrorDomain Code=-5803 "Recording failed to start" UserInfo={NSLocalizedDescription=Recording failed to start}"
I have tried all possible ways of calling stopCapture, including in onDisappear and other places, and nothing seems to solve this.
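For reference, the start/stop calls in question are roughly the following (a simplified sketch; the buffer processing is omitted):

import ReplayKit

func startRecording() {
    RPScreenRecorder.shared().startCapture { sampleBuffer, bufferType, error in
        // Raw video/audio sample buffers arrive here while capture is running.
        if bufferType == .video {
            // ... process sampleBuffer ...
        }
    } completionHandler: { error in
        if let error {
            // This is where the -5803 "Recording failed to start" error surfaces
            // after re-entering the immersive space.
            print("startCapture failed: \(error)")
        }
    }
}

func stopRecording() {
    RPScreenRecorder.shared().stopCapture { error in
        if let error {
            print("stopCapture failed: \(error)")
        }
    }
}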