Improving person segmentation and occlusion quality in RealityKit

I’m building an app that uses RealityKit and specifically ARConfiguration.FrameSemantics.personSegmentationWithDepth.

The goal is to place one AR object behind a person and another AR object in front of the person, while keeping the result as photorealistic as possible.

Through testing, I’ve noticed that the edges of the person segmentation mask often don’t match the actual person closely, and parts of the person end up transparent, with the AR object bleeding through. It’s something like a “bad green screen” effect; I’d expect to see a little of that, but not to this extent. I’ve been testing on iPhone 16, iPhone 14 Pro, iPad Pro 12.9-inch (6th generation), and iPhone 12 Pro, with similar results across all devices.

I’m wondering what else I can do to improve this… whether through code changes, platform (different iPhone models), or environment (lighting, distance, etc.).

Attaching some example screen grabs and a minimal reproducible code sample. Appreciate any insights!

import ARKit
import SwiftUI
import RealityKit

struct RealityViewContainer: UIViewRepresentable {
    
    func makeUIView(context: Context) -> ARView {
        let arView = ARView(frame: .zero)
        arView.environment.sceneUnderstanding.options.insert(.occlusion)
        arView.renderOptions.insert(.disableMotionBlur)
        arView.renderOptions.insert(.disableDepthOfField)

        let configuration = ARWorldTrackingConfiguration()
        configuration.planeDetection = [.horizontal]
        if ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) {
            configuration.frameSemantics.insert(.personSegmentationWithDepth)
        }

        arView.session.run(configuration)
        
        arView.session.delegate = context.coordinator
        context.coordinator.arView = arView
        return arView
    }

    // Required by UIViewRepresentable; no dynamic updates are needed for this sample.
    func updateUIView(_ uiView: ARView, context: Context) {}

    func makeCoordinator() -> Coordinator {
        Coordinator(self)
    }
    
    class Coordinator: NSObject, ARSessionDelegate {
        var parent: RealityViewContainer
        var floorAnchor: ARPlaneAnchor?
        weak var arView: ARView?  // set from makeUIView(context:)
        
        init(_ parent: RealityViewContainer) {
            self.parent = parent
        }
        
        func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
            if let arView, floorAnchor == nil {
                for anchor in anchors {
                    if let horizontalPlaneAnchor = anchor as? ARPlaneAnchor,
                       horizontalPlaneAnchor.alignment == .horizontal,
                       horizontalPlaneAnchor.transform.columns.3.y < arView.cameraTransform.translation.y { // filter out ceiling
                        floorAnchor = horizontalPlaneAnchor
                        // BackgroundEntity and ForegroundEntity are custom Entity subclasses (defined elsewhere in the project).
                        let backgroundEntity = BackgroundEntity()
                        let anchorEntity = AnchorEntity(anchor: horizontalPlaneAnchor)
                        anchorEntity.addChild(backgroundEntity)
                        let foregroundEntity = ForegroundEntity()
                        backgroundEntity.addChild(foregroundEntity)
                        arView.scene.addAnchor(anchorEntity)
                        
                        arView.installGestures([.rotation, .translation], for: backgroundEntity)

                        break  // Stop after adding the first horizontal plane (floor)
                    }
                }
            }
        }        
    }
}

Hello @scientifikent,

People occlusion is implemented by ARView. I recommend that you file a bug report using Feedback Assistant to request improvements in people occlusion.

I’m wondering what else I can do to improve this… whether through code changes, platform (different iPhone models), or environment (lighting, distance, etc.).

Environmental factors certainly make a difference. For the best results, you would want to have high contrast between the person and the background in a well-lit room.
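
If it helps to narrow down where the softness is coming from before filing that report, you can inspect the raw mask and depth that ARKit hands to the renderer each frame via ARFrame.segmentationBuffer and ARFrame.estimatedDepthData. Below is a minimal diagnostic sketch; the SegmentationInspector class name and the logging and throttling choices are placeholder details for illustration, not anything the API requires. One thing it makes visible is that the mask is typically much lower resolution than the camera image, which contributes to coarse edges.

import ARKit
import CoreVideo

// Standalone diagnostic delegate; the same session(_:didUpdate:) method could
// instead be added to the Coordinator in the sample above.
final class SegmentationInspector: NSObject, ARSessionDelegate {
    private var frameCount = 0

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        frameCount += 1
        // Throttle logging to roughly once per second at 60 fps.
        guard frameCount % 60 == 0, let mask = frame.segmentationBuffer else { return }

        let maskSize = "\(CVPixelBufferGetWidth(mask)) x \(CVPixelBufferGetHeight(mask))"
        let imageSize = "\(CVPixelBufferGetWidth(frame.capturedImage)) x \(CVPixelBufferGetHeight(frame.capturedImage))"
        print("segmentation mask: \(maskSize), captured image: \(imageSize)")

        if let depth = frame.estimatedDepthData {
            print("estimated person depth: \(CVPixelBufferGetWidth(depth)) x \(CVPixelBufferGetHeight(depth))")
        }
    }
}

Comparing those buffers against what you see on screen can tell you whether the ragged edges are already present in ARKit’s mask (in which case Feedback Assistant is the right path) or are being introduced later in compositing.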

-- Greg
