Converting CVPixelBuffer 2VUY to a Metal Texture

I am working on a macOS project in which I take the CVPixelBuffer delivered by an AVCaptureSession and need to convert it into an MTLTexture for rendering. On macOS the pixel format is 2vuy, and there does not seem to be a clear matching format when converting to a Metal texture. I have been able to convert it to a texture, but the colors appear to be off: it renders distorted colors with a double image.

I believe 2vuy is a single-plane pixel format and I have tried to account for that, but I cannot tell what is wrong.
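When debugging a format mismatch like this, it can help to print the buffer's four-character code and confirm that it really is '2vuy'. The following is a minimal sketch; `fourCCString` is a hypothetical helper name, not something from the original project.

```swift
// Hypothetical helper (not part of the original project): turns a
// four-character code, such as a pixel format type, into a readable string.
func fourCCString(_ code: UInt32) -> String {
    // The code packs four ASCII bytes, most significant byte first.
    let shifts: [UInt32] = [24, 16, 8, 0]
    let bytes = shifts.map { UInt8((code >> $0) & 0xFF) }
    return String(decoding: bytes, as: UTF8.self)
}

// 0x32767579 is the numeric value of the characters '2', 'v', 'u', 'y'.
print(fourCCString(0x32767579))
```

In the capture callback this would be called as `fourCCString(CVPixelBufferGetPixelFormatType(cvPixelBuffer))`, since that function returns the format as a 32-bit `OSType`.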

I have attached the CVPixelBuffer and the distorted MTLTexture, along with a laundry list of errors.

On iOS my conversions are fine; it is only the macOS 2vuy pixel format that seems to have issues.

My code for the conversion is also attached.

If there are any suggestions or guidance on how to properly convert a 2vuy CVPixelBuffer to a MTLTexture I would greatly appreciate it.

Many Thanks

public func convert2VUYToRGB(
        cvPixelBuffer: CVPixelBuffer,
        device: MTLDevice
    ) throws -> MTLTexture {
        
        // Create a Metal texture cache if not already created
        var textureCache: CVMetalTextureCache?
        guard CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, device, nil, &textureCache) == kCVReturnSuccess,
              let cache = textureCache else {
            throw MetalScalingErrors.failedToCreateTextureCache
        }
        
        // Lock the base address of the CVPixelBuffer
        guard kCVReturnSuccess == CVPixelBufferLockBaseAddress(cvPixelBuffer, .readOnly) else {
            throw MetalScalingErrors.failedToLockPixelBuffer
        }
        
        defer {
            // Always unlock the CVPixelBuffer
            CVPixelBufferUnlockBaseAddress(cvPixelBuffer, .readOnly)
        }
        
        // Create a Metal texture for the interleaved YUV data
        let yuvTexture = try createMTLTextureForPlane(
            cvPixelBuffer: cvPixelBuffer,
            planeIndex: 0, // Only one plane for 2vuy
            textureCache: cache,
            format: .gbgr422, // Adjust format as needed for your shader
            device: device)
        
        // Create a Metal texture for RGB output
        let rgbTextureDescriptor = MTLTextureDescriptor.texture2DDescriptor(
            pixelFormat: .rgba8Unorm,
            width: CVPixelBufferGetWidth(cvPixelBuffer),
            height: CVPixelBufferGetHeight(cvPixelBuffer),
            mipmapped: false)
        rgbTextureDescriptor.usage = [.shaderRead, .shaderWrite]

        guard let rgbTexture = device.makeTexture(descriptor: rgbTextureDescriptor) else {
            throw MetalScalingErrors.failedToCreateTexture
        }

        // Create a Metal compute pipeline with a shader that converts YUV to RGB
        if twoVUYToRgbKernelPipeline == nil {
            twoVUYToRgbKernelPipeline = try createComputePipeline(device: device, shaderName: "twoVUYToRgb")
        }
        
        // Set up a command buffer and encoder
        guard let commandQueue = device.makeCommandQueue(),
              let commandBuffer = commandQueue.makeCommandBuffer(),
              let computeEncoder = commandBuffer.makeComputeCommandEncoder() else {
            throw MetalScalingErrors.errorSettingUpEncoder
        }
        
        guard let twoVUYToRgbKernelPipeline = twoVUYToRgbKernelPipeline else {
            throw MetalScalingErrors.failedToCreatePipeline
        }
        
        // Set textures and encode the compute shader
        computeEncoder.setComputePipelineState(twoVUYToRgbKernelPipeline)
        computeEncoder.setTexture(yuvTexture, index: 0) // Use the interleaved YUV texture
        computeEncoder.setTexture(rgbTexture, index: 1) // Output RGB texture
        
        // Calculate the threads per threadgroup and the number of threadgroups
        // needed to cover the output texture (rounded up)
        let threadsPerThreadgroup = MTLSize(width: 8, height: 8, depth: 1)
        let threadgroupCount = MTLSize(
            width: (rgbTexture.width + threadsPerThreadgroup.width - 1) / threadsPerThreadgroup.width,
            height: (rgbTexture.height + threadsPerThreadgroup.height - 1) / threadsPerThreadgroup.height,
            depth: 1)
        
        defer {
            computeEncoder.endEncoding()
            commandBuffer.commit()
        }
        
        computeEncoder.dispatchThreadgroups(threadgroupCount, threadsPerThreadgroup: threadsPerThreadgroup)
        
        // Return the RGB texture
        return rgbTexture
    }


    public func createMTLTextureForPlane(
        cvPixelBuffer: CVPixelBuffer,
        planeIndex: Int,
        textureCache: CVMetalTextureCache,
        format: MTLPixelFormat,
        device: MTLDevice
    ) throws -> MTLTexture {
        // Create a Metal texture from the CVPixelBuffer plane
        let width = CVPixelBufferGetWidthOfPlane(cvPixelBuffer, planeIndex)
        let height = CVPixelBufferGetHeightOfPlane(cvPixelBuffer, planeIndex)
        
        var cvTexture: CVMetalTexture?
        guard CVMetalTextureCacheCreateTextureFromImage(
            kCFAllocatorDefault,
            textureCache,
            cvPixelBuffer,
            nil,
            format,
            width,
            height,
            planeIndex,
            &cvTexture) == kCVReturnSuccess else {
            throw MetalScalingErrors.failedToCreateTextureCacheFromImage
        }
        
        if let cache = cvTexture, let metalTexture = CVMetalTextureGetTexture(cache) {
            return metalTexture
        } else {
            throw MetalScalingErrors.failedToGetTexture
        }
    }


// Metal Kernel

kernel void twoVUYToRgb(texture2d<float, access::read> yuvTexture [[texture(0)]],
                        texture2d<float, access::write> rgbTexture [[texture(1)]],
                        uint2 gid [[thread_position_in_grid]]) {

    // Get the width and height of the texture
    uint width = yuvTexture.get_width();
    uint height = yuvTexture.get_height();

    // Ensure the thread is within the bounds of the texture
    if (gid.x >= width || gid.y >= height) {
        return; // Exit if out of bounds
    }

    // Read the Y value (luminance) from the red channel of the texture
    float y = yuvTexture.read(gid).r;

    // Calculate the corresponding index for the U and V samples (subsampled by 2)
    uint uvX = gid.x / 2;  // Divide by 2 for horizontal subsampling
    uint uvY = gid.y / 2;  // Divide by 2 for vertical subsampling

    // Read the interleaved U and V values (U in green, V in blue)
    float2 uv = yuvTexture.read(uint2(uvX, uvY)).gb;
    float u = uv.x - 0.5;  // Offset U by 0.5 for correct color range
    float v = uv.y - 0.5;  // Offset V by 0.5 for correct color range

    // BT.709 YUV to RGB conversion (for HD content)
    float r = y + 1.5748 * v;
    float g = y - 0.1873 * u - 0.4681 * v;
    float b = y + 1.8556 * u;

    // Clamp the RGB values to the range [0, 1]
    r = clamp(r, 0.0, 1.0);
    g = clamp(g, 0.0, 1.0);
    b = clamp(b, 0.0, 1.0);

    // Write the RGB values to the output texture
    rgbTexture.write(float4(r, g, b, 1.0), gid);
}

Hello @NeedleTails,

Could you provide a focused sample project that reproduces the issue?

At a glance, I don't think you are handling the GBGR422 texture correctly in your kernel, but that's just a hunch. It would be best if I could examine a focused sample project.

-- Greg

Thank you for your prompt response, Greg. I have provided a focused sample project as a link. Please look at the methods in PreviewViewRenderer, in particular convertYUVToRGB. Basically, after the capture session delivers the sample buffer, I feed it into my Swift concurrency environment via a stream and then start processing frames on the GPU. Thanks in advance for your consideration.

Sample Project https://web.tresorit.com/l/YTeHU#Q8JUzXpd2KplKj-trr60-g

@NeedleTails,

Thanks for the focused sample, there are a couple of things to note here:

  1. You are receiving '2vuy' pixel buffers from AVCapture. This Core Video pixel format maps to MTLPixelFormat.bgrg422, not MTLPixelFormat.gbgr422.

  2. Your kernel is written to handle the chroma subsampling itself, but because you are using one of the "422" pixel formats, Metal takes care of this for you automatically! From MTLPixelFormat.h: "There is no implicit colorspace conversion from YUV to RGB, the shader will receive (Cr, Y, Cb, 1)."
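To make the second point concrete, here is a CPU-side sketch of the expansion that a "422" texture read performs. It assumes the usual '2vuy' byte order of Cb, Y0, Cr, Y1 for each pair of pixels and simply repeats the shared chroma for both pixels; whether Metal repeats or interpolates chroma is not stated in this thread, so treat that part as an assumption. `YCbCr` and `unpack2vuyRow` are hypothetical names used only for illustration.

```swift
// One full-resolution sample, as a kernel would see it per pixel.
struct YCbCr: Equatable {
    var y: UInt8
    var cb: UInt8
    var cr: UInt8
}

// Hypothetical helper: expands one row of '2vuy' bytes into one sample per pixel.
// Assumed layout: Cb, Y0, Cr, Y1 for every pair of horizontally adjacent pixels.
func unpack2vuyRow(_ row: [UInt8]) -> [YCbCr] {
    precondition(row.count % 4 == 0, "2vuy stores four bytes per pixel pair")
    var pixels: [YCbCr] = []
    pixels.reserveCapacity(row.count / 2)
    for base in stride(from: 0, to: row.count, by: 4) {
        let cb = row[base]
        let y0 = row[base + 1]
        let cr = row[base + 2]
        let y1 = row[base + 3]
        // Both pixels of the pair share the same chroma sample (assumption:
        // plain repetition, no interpolation).
        pixels.append(YCbCr(y: y0, cb: cb, cr: cr))
        pixels.append(YCbCr(y: y1, cb: cb, cr: cr))
    }
    return pixels
}
```

Because every pixel already comes back with its own luma and chroma, the kernel can read at `gid` directly and does not need to halve the coordinates again.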

So, what should your kernel look like instead? Something like this:

// When using .bgrg422, the shader receives (Cr (x), Y (y), Cb (z), 1)
float4 texel = yuvTexture.read(gid);
float y = texel.y;  // Y <-> y
float cb = texel.z; // Cb <-> z
float cr = texel.x; // Cr <-> x
// YCbCr to RGB conversion matrix taken from the ARKit (Metal) template project.
const float4x4 ycbcrToRGBTransform = float4x4(
    float4(+1.0000f, +1.0000f, +1.0000f, +0.0000f),
    float4(+0.0000f, -0.3441f, +1.7720f, +0.0000f),
    float4(+1.4020f, -0.7141f, +0.0000f, +0.0000f),
    float4(-0.7010f, +0.5291f, -0.8860f, +1.0000f)
);
float4 rgba = ycbcrToRGBTransform * float4(y, cb, cr, 1.0);
// Write the RGB values to the output texture
rgbTexture.write(rgba, gid);
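For checking the kernel output against known values, the matrix multiply above can be written out per channel on the CPU. This is only a sketch: `ycbcrToRGB` is a hypothetical helper, the inputs are assumed to be normalized to 0...1 as the shader sees them, and the explicit clamp mirrors the question's original kernel rather than anything in the snippet above.

```swift
// Hypothetical CPU reference for the kernel's matrix multiply.
// Inputs are assumed to be normalized to 0...1, as read from the texture.
func ycbcrToRGB(y: Float, cb: Float, cr: Float) -> (r: Float, g: Float, b: Float) {
    func clamp01(_ value: Float) -> Float { min(max(value, 0), 1) }
    // Same coefficients as the columns of ycbcrToRGBTransform, expanded per channel.
    let r = y + 1.4020 * cr - 0.7010
    let g = y - 0.3441 * cb - 0.7141 * cr + 0.5291
    let b = y + 1.7720 * cb - 0.8860
    // Assumption: clamp to 0...1, as the question's original kernel did.
    return (clamp01(r), clamp01(g), clamp01(b))
}

// Neutral chroma (0.5, 0.5) should leave grey values unchanged.
let white = ycbcrToRGB(y: 1.0, cb: 0.5, cr: 0.5)
print(white)
```

Feeding a few neutral greys through this function and comparing them with pixels read back from the RGB texture is a quick way to confirm that the channels are wired up in the right order.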

That should get you unblocked :)

-- Greg
