Hello
I am testing the new Media Extension API in macOS 15 Beta 4.
Firstly, THANK YOU FOR THIS API!!!!!! This is going to be huge for the video ecosystem on the platform. Seriously!
My understanding is that to support custom container formats you make an MEFormatReader extension, and to support a specific custom codec, you create an MEVideoDecoder for that codec.
OK. I have followed the docs (especially the inline header info) and have gotten quite far:
- A host app which hosts my Media Extension (for MKV files)
- An extension bundle which exposes the UTTypes it supports to the system, plus the plugin class ID, as per the docs
- Entitlements as per the docs
- I'm building for debug, but I have a valid Developer ID / account associated with my team in Xcode
- My plugin is visible to the Media Extension system preference
- My plugin is properly initialized; I get the MEByteReader and can read container-level metadata in the callbacks
- I can instantiate my track readers, validate the track-level information, and provide the callbacks
- I can instantiate my sample cursors and respond to seek requests for samples for the track in question
Now, here is where I hit some issues.
My format reader leverages FFmpeg's libavformat library, and I am testing with MKV files that host 'avc1' H.264 samples, which, as I understand it, VideoToolbox should decode out of the box (i.e., I do not need a separate MEVideoDecoder plugin to handle this codec).
Here is the CMFormatDescription that I vend from my MKV parser to AVFoundation via the track reader:
Made Format Description: <CMVideoFormatDescription 0x11f005680 [0x1f7d62220]> {
    mediaType:'vide'
    mediaSubType:'avc1'
    mediaSpecific: {
        codecType: 'avc1'  dimensions: 1920 x 1080
    }
    extensions: {(null)}
}
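One thing that stands out to me is the empty extensions dictionary. If the 'avc1' description needs its avcC atom attached, I'd expect to build it roughly like this (a Swift sketch for brevity; it assumes the Matroska track's CodecPrivate, i.e. AVCodecParameters.extradata, already holds the AVCDecoderConfigurationRecord, which it does for avc1-in-MKV):

    import CoreMedia

    // Sketch: attach the track's codec private data as the 'avcC' sample
    // description extension atom. `avcCData` is assumed to hold the
    // AVCDecoderConfigurationRecord from the Matroska track's CodecPrivate.
    func makeAVC1FormatDescription(avcCData: Data,
                                   width: Int32,
                                   height: Int32) -> CMVideoFormatDescription? {
        let extensions: [CFString: Any] = [
            kCMFormatDescriptionExtension_SampleDescriptionExtensionAtoms: ["avcC": avcCData]
        ]
        var formatDescription: CMVideoFormatDescription?
        let status = CMVideoFormatDescriptionCreate(
            allocator: kCFAllocatorDefault,
            codecType: kCMVideoCodecType_H264,
            width: width,
            height: height,
            extensions: extensions as CFDictionary,
            formatDescriptionOut: &formatDescription)
        return status == noErr ? formatDescription : nil
    }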
My MESampleCursor implementation implements all of the callbacks, plus some of the 'optional' sample cursor location methods (I'm only sharing the optional ones here):
- (MESampleLocation * _Nullable) sampleLocationReturningError:(NSError *__autoreleasing _Nullable * _Nullable) error
- (MESampleCursorChunk * _Nullable) chunkDetailsReturningError:(NSError *__autoreleasing _Nullable * _Nullable) error
I also populate the AVSampleCursorSyncInfo and AVSampleCursorDependencyInfo structs for each AVPacket* I demux with libavformat.
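Concretely, the per-packet mapping I'm doing amounts to this (a Swift sketch for brevity, assuming an FFmpeg C interop module; my actual cursor is Obj-C):

    // Sketch: deriving sync/dependency flags from an AVPacket obtained via
    // av_read_frame() (FFmpeg interop module assumed).
    func sampleFlags(for packet: UnsafeMutablePointer<AVPacket>) -> (isSync: Bool, isDroppable: Bool) {
        // AV_PKT_FLAG_KEY marks sync samples (IDR frames for H.264);
        // AV_PKT_FLAG_DISPOSABLE marks samples nothing else depends on.
        let isSync = (packet.pointee.flags & AV_PKT_FLAG_KEY) != 0
        let isDroppable = (packet.pointee.flags & AV_PKT_FLAG_DISPOSABLE) != 0
        return (isSync, isDroppable)
    }

    // These feed the cursor structs (with the corresponding "indicates"
    // fields set to true):
    //   AVSampleCursorSyncInfo.sampleIsFullSync                <- isSync
    //   AVSampleCursorSyncInfo.sampleIsDroppable               <- isDroppable
    //   AVSampleCursorDependencyInfo.sampleDependsOnOthers     <- !isSync
    //   AVSampleCursorDependencyInfo.sampleHasDependentSamples <- !isDroppable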
Now my issue:
I get these log messages in my host app:
<<<< VRP >>>> figVideoRenderPipelineSetProperty signalled err=-12852 (kFigRenderPipelineError_InvalidParameter) (sample attachment collector not enabled) at FigStandardVideoRenderPipeline.c:2231
<<<< VideoMentor >>>> videoMentorDependencyStateCopyCursorForDecodeWalk signalled err=-12836 (kVideoMentorUnexpectedSituationErr) (Node not found for target cursor -- it should have been created during videoMentorDependencyStateAddSamplesToGraph) at VideoMentor.c:4982
<<<< VideoMentor >>>> videoMentorThreadCreateSampleBuffer signalled err=-12841 (err) (FigSampleGeneratorCreateSampleBufferAtCursor failed) at VideoMentor.c:3960
<<<< VideoMentor >>>> videoMentorThreadCreateSampleBuffer signalled err=-12841 (err) (FigSampleGeneratorCreateSampleBufferAtCursor failed) at VideoMentor.c:3960
I presume this is telling me that I am not providing the GOP or dependency metadata correctly.
I've included console logs from my extension and host app:
LibAVExtension system logs
And my SampleCursor implementation is here
https://github.com/vade/FFMPEGMediaExtension/blob/main/LibAVExtension/LibAVSampleCursor.m
Any guidance would be very helpful.
Thank you!
tl;dr how can I get raw YUV in a Metal fragment shader from a VideoToolbox 10-bit/BT.2020 HEVC stream without any extra/secret format conversions?
With VideoToolbox and 10-bit HEVC, I've found that it defaults to CVPixelBuffers with the formats kCVPixelFormatType_Lossless_420YpCbCr10PackedBiPlanarFullRange or kCVPixelFormatType_Lossy_420YpCbCr10PackedBiPlanarFullRange. To mitigate this, I added the following snippet of code to my application:
    // We need our pixels unpacked for 10-bit so that the Metal textures actually work
    var pixelFormat: OSType? = nil
    let bpc = getBpcForVideoFormat(videoFormat!)
    let isFullRange = getIsFullRangeForVideoFormat(videoFormat!)
    // TODO: figure out how to check for 422/444, CVImageBufferChromaLocationBottomField?
    if bpc == 10 {
        pixelFormat = isFullRange ? kCVPixelFormatType_420YpCbCr10BiPlanarFullRange : kCVPixelFormatType_420YpCbCr10BiPlanarVideoRange
    }

    let videoDecoderSpecification: [NSString: AnyObject] = [kVTVideoDecoderSpecification_EnableHardwareAcceleratedVideoDecoder: kCFBooleanTrue]
    var destinationImageBufferAttributes: [NSString: AnyObject] = [
        kCVPixelBufferMetalCompatibilityKey: true as NSNumber,
        kCVPixelBufferPoolMinimumBufferCountKey: 3 as NSNumber
    ]
    if let pixelFormat = pixelFormat {
        destinationImageBufferAttributes[kCVPixelBufferPixelFormatTypeKey] = pixelFormat as NSNumber
    }

    var decompressionSession: VTDecompressionSession? = nil
    err = VTDecompressionSessionCreate(
        allocator: nil,
        formatDescription: videoFormat!,
        decoderSpecification: videoDecoderSpecification as CFDictionary,
        imageBufferAttributes: destinationImageBufferAttributes as CFDictionary,
        outputCallback: nil,
        decompressionSessionOut: &decompressionSession)
In short, I need kCVPixelFormatType_420YpCbCr10BiPlanar so that I have a straightforward MTLPixelFormat.r16Unorm/MTLPixelFormat.rg16Unorm texture binding for Y/CbCr. Metal, seemingly, has no direct pixel format for 420YpCbCr10PackedBiPlanar. I'd also rather not use any color conversion in VideoToolbox, in order to save on processing (and to ensure that the color transforms/transfer characteristics match between streamer/client, since I also have a custom transfer characteristic to mitigate blocking in dark scenes).
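For context, the texture binding I'm going for is the usual CVMetalTextureCache route (a sketch; cache creation via CVMetalTextureCacheCreate and error handling are omitted):

    import CoreVideo
    import Metal

    // Sketch: wrap the two planes of a 420YpCbCr10BiPlanar pixel buffer as
    // Metal textures (r16Unorm for Y, rg16Unorm for CbCr). `textureCache` is
    // assumed to come from CVMetalTextureCacheCreate().
    func makeYCbCrTextures(from pixelBuffer: CVPixelBuffer,
                           textureCache: CVMetalTextureCache) -> (luma: MTLTexture, chroma: MTLTexture)? {
        func texture(plane: Int, format: MTLPixelFormat) -> MTLTexture? {
            var cvTexture: CVMetalTexture?
            CVMetalTextureCacheCreateTextureFromImage(
                kCFAllocatorDefault, textureCache, pixelBuffer, nil, format,
                CVPixelBufferGetWidthOfPlane(pixelBuffer, plane),
                CVPixelBufferGetHeightOfPlane(pixelBuffer, plane),
                plane, &cvTexture)
            return cvTexture.flatMap(CVMetalTextureGetTexture)
        }
        guard let luma = texture(plane: 0, format: .r16Unorm),
              let chroma = texture(plane: 1, format: .rg16Unorm) else { return nil }
        return (luma, chroma)
    }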
However, I noticed that in visionOS 2, the CVPixelBuffer I receive is no longer a compressed render target (likely a bug), which caused GPU texture read bandwidth to skyrocket from 2GiB/s to 30GiB/s. More importantly, this implies that VideoToolbox may in fact be doing an extra color conversion step, wasting memory bandwidth.
Does Metal actually have no way to handle 420YpCbCr10PackedBiPlanar? Are there any examples for reading 10-bit HDR HEVC buffers directly with Metal?
My app stores and transports lots of groups of similar PNGs. These aren't compressed well by the built-in algorithms like .lzfse, .lz4, .lzbitmap, or even bz2, but I realized that they are well suited to compression by video codecs, since they're highly similar to one another.
I ran an experiment where I compressed a dozen images into an HEVCWithAlpha .mov via AVAssetWriter, and the compression ratio was fantastic, but when I retrieved the PNGs via AVAssetImageGenerator there were lots of artifacts, which simply weren't acceptable. Maybe I'm doing something wrong, or maybe I'm chasing something that doesn't exist.
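For reference, the writer setup I used was roughly this (simplified; per-frame pixel buffer appending and session start/finish are omitted):

    import AVFoundation

    // Roughly the setup from my experiment. Frames are appended via an
    // AVAssetWriterInputPixelBufferAdaptor (not shown).
    func makeArchiveWriter(outputURL: URL, width: Int, height: Int) throws -> (AVAssetWriter, AVAssetWriterInput) {
        let writer = try AVAssetWriter(outputURL: outputURL, fileType: .mov)
        let settings: [String: Any] = [
            AVVideoCodecKey: AVVideoCodecType.hevcWithAlpha,
            AVVideoWidthKey: width,
            AVVideoHeightKey: height
        ]
        let input = AVAssetWriterInput(mediaType: .video, outputSettings: settings)
        input.expectsMediaDataInRealTime = false
        writer.add(input)
        return (writer, input)
    }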
Is there a way to use video compression like a specialized archive to store and retrieve PNGs losslessly while retaining alpha? I have no intention of using the videos except as condensed storage.
Any suggestions on how to reduce storage size of many large PNGs are also welcome. I also tried using HEVC instead of PNG via the new UIImage.hevcData(), but the decompression/processing times were just insane (5000%+ increase), on top of there being fatal errors when using async.
First of all, I tried MobileVLCKit, but there is too much delay.
Then I wrote a UDPManager class; my code is below. I would be very happy if anyone has information and can point me in the right direction.
Broadcast code
ffmpeg -f avfoundation -video_size 1280x720 -framerate 30 -i "0" -c:v libx264 -preset medium -tune zerolatency -f mpegts "udp://127.0.0.1:6000?pkt_size=1316"
Live View Code (almost 0 delay)
ffplay -fflags nobuffer -flags low_delay -probesize 32 -analyzeduration 1 -strict experimental -framedrop -f mpegts -vf setpts=0 udp://127.0.0.1:6000
OR
mpv udp://127.0.0.1:6000 --no-cache --untimed --no-demuxer-thread --vd-lavc-threads=1
UDPManager
    import Foundation
    import AVFoundation
    import CoreMedia
    import VideoDecoder
    import SwiftUI
    import Network
    import Combine
    import CocoaAsyncSocket
    import VideoToolbox

    class UDPManager: NSObject, ObservableObject, GCDAsyncUdpSocketDelegate {
        private let host: String
        private let port: UInt16
        private var socket: GCDAsyncUdpSocket?
        @Published var videoOutput: CMSampleBuffer?

        init(host: String, port: UInt16) {
            self.host = host
            self.port = port
        }

        func connectUDP() {
            do {
                socket = GCDAsyncUdpSocket(delegate: self, delegateQueue: .global())
                //try socket?.connect(toHost: host, onPort: port)
                try socket?.bind(toPort: port)
                try socket?.enableBroadcast(true)
                try socket?.enableReusePort(true)
                try socket?.beginReceiving()
            } catch {
                print("UDP socket creation error: \(error)")
            }
        }

        func closeUDP() {
            socket?.close()
        }

        func udpSocket(_ sock: GCDAsyncUdpSocket, didConnectToAddress address: Data) {
            print("UDP connected.")
        }

        func udpSocket(_ sock: GCDAsyncUdpSocket, didNotConnect error: Error?) {
            print("UDP socket connection error: \(error?.localizedDescription ?? "unknown error")")
        }

        func udpSocket(_ sock: GCDAsyncUdpSocket, didReceive data: Data, fromAddress address: Data, withFilterContext filterContext: Any?) {
            if !data.isEmpty {
                DispatchQueue.main.async {
                    self.videoOutput = self.createSampleBuffer(from: data)
                }
            }
        }

        func createSampleBuffer(from data: Data) -> CMSampleBuffer? {
            var blockBuffer: CMBlockBuffer?
            var status = CMBlockBufferCreateWithMemoryBlock(
                allocator: kCFAllocatorDefault,
                memoryBlock: UnsafeMutableRawPointer(mutating: (data as NSData).bytes),
                blockLength: data.count,
                blockAllocator: kCFAllocatorNull,
                customBlockSource: nil,
                offsetToData: 0,
                dataLength: data.count,
                flags: 0,
                blockBufferOut: &blockBuffer)
            if status != noErr {
                return nil
            }

            var sampleBuffer: CMSampleBuffer?
            let sampleSizeArray = [data.count]
            status = CMSampleBufferCreateReady(
                allocator: kCFAllocatorDefault,
                dataBuffer: blockBuffer,
                formatDescription: nil,
                sampleCount: 1,
                sampleTimingEntryCount: 0,
                sampleTimingArray: nil,
                sampleSizeEntryCount: 1,
                sampleSizeArray: sampleSizeArray,
                sampleBufferOut: &sampleBuffer)
            if status != noErr {
                return nil
            }
            return sampleBuffer
        }
    }
I didn't know how to convert the Data object to video, so I searched, found the createSampleBuffer code shown above, and wanted to try it.
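From what I can tell, the sample buffers I create carry no CMVideoFormatDescription and no timing info, which may be part of the problem; here is a sketch of the piece I think is missing (a hypothetical helper, assuming the SPS/PPS NAL units can be parsed out of the incoming H.264 stream):

    import CoreMedia

    // Sketch (unverified): AVSampleBufferDisplayLayer can only render sample
    // buffers that carry a video format description; createSampleBuffer above
    // passes nil. `sps` and `pps` are assumed to be parsed from the H.264
    // elementary stream carried in the MPEG-TS packets.
    func makeFormatDescription(sps: [UInt8], pps: [UInt8]) -> CMVideoFormatDescription? {
        var formatDescription: CMVideoFormatDescription?
        let status = sps.withUnsafeBufferPointer { spsBuffer in
            pps.withUnsafeBufferPointer { ppsBuffer in
                CMVideoFormatDescriptionCreateFromH264ParameterSets(
                    allocator: kCFAllocatorDefault,
                    parameterSetCount: 2,
                    parameterSetPointers: [spsBuffer.baseAddress!, ppsBuffer.baseAddress!],
                    parameterSetSizes: [sps.count, pps.count],
                    nalUnitHeaderLength: 4,
                    formatDescriptionOut: &formatDescription)
            }
        }
        // Without timing info, each buffer would also need the
        // kCMSampleAttachmentKey_DisplayImmediately attachment to be shown.
        return status == noErr ? formatDescription : nil
    }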
Then I tried to display the CMSampleBuffer in a player view, but it just shows a white screen and doesn't work.
    struct SampleBufferPlayerView: UIViewRepresentable {
        typealias UIViewType = UIView
        var sampleBuffer: CMSampleBuffer

        func makeUIView(context: Context) -> UIView {
            let view = UIView(frame: .zero)
            let displayLayer = AVSampleBufferDisplayLayer()
            displayLayer.videoGravity = .resizeAspectFill
            view.layer.addSublayer(displayLayer)
            context.coordinator.displayLayer = displayLayer
            return view
        }

        func updateUIView(_ uiView: UIView, context: Context) {
            context.coordinator.sampleBuffer = sampleBuffer
            context.coordinator.updateSampleBuffer()
        }

        func makeCoordinator() -> Coordinator {
            Coordinator()
        }

        class Coordinator {
            var displayLayer: AVSampleBufferDisplayLayer?
            var sampleBuffer: CMSampleBuffer?

            func updateSampleBuffer() {
                guard let displayLayer = displayLayer, let sampleBuffer = sampleBuffer else { return }
                if displayLayer.isReadyForMoreMediaData {
                    displayLayer.enqueue(sampleBuffer)
                } else {
                    displayLayer.requestMediaDataWhenReady(on: .main) {
                        if displayLayer.isReadyForMoreMediaData {
                            displayLayer.enqueue(sampleBuffer)
                            print("isReadyForMoreMediaData")
                        }
                    }
                }
            }
        }
    }
I tried to use it as below, but I couldn't figure it out. Can anyone help me?
    struct ContentView: View {
        // udp://@127.0.0.1:6000
        @ObservedObject var udpManager = UDPManager(host: "127.0.0.1", port: 6000)

        var body: some View {
            VStack {
                if let buffer = udpManager.videoOutput {
                    SampleBufferPlayerView(sampleBuffer: buffer)
                        .frame(width: 300, height: 200)
                }
            }
            .onAppear {
                udpManager.connectUDP()
            }
        }
    }
Recently I've been trying to play some AV1-encoded streams on my iPhone 15 Pro Max. First, I check for hardware support:
VTIsHardwareDecodeSupported(kCMVideoCodecType_AV1); // YES
Then I need to create a CMFormatDescription in order to pass it into a VTDecompressionSession. I've tried the following:
{
    mediaType:'vide'
    mediaSubType:'av01'
    mediaSpecific: {
        codecType: 'av01'  dimensions: 394 x 852
    }
    extensions: {{
        CVFieldCount = 1;
        CVImageBufferChromaLocationBottomField = Left;
        CVImageBufferChromaLocationTopField = Left;
        CVPixelAspectRatio = {
            HorizontalSpacing = 1;
            VerticalSpacing = 1;
        };
        FullRangeVideo = 0;
    }}
}
but VTDecompressionSessionCreate gives me error -8971 (codecExtensionNotFoundErr, I assume).
So it has something to do with the extensions dictionary? I can't find anywhere which set of extensions is necessary for it to work 😿.
VideoToolbox has convenient functions for creating descriptions of AVC and HEVC streams (CMVideoFormatDescriptionCreateFromH264ParameterSets and CMVideoFormatDescriptionCreateFromHEVCParameterSets), but not for AV1.
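By analogy with avcC/hvcC, I would guess the missing piece is the AV1CodecConfigurationBox ('av1C') attached via the sample description extension atoms, though I haven't been able to confirm this anywhere; a sketch of what I mean (the av1C payload is assumed to come from the container or stream):

    import CoreMedia

    // Sketch (unverified): attach the AV1CodecConfigurationBox the way avcC
    // and hvcC are attached for H.264/HEVC. `av1CBox` is assumed to be
    // extracted from the container (e.g. the 'av1C' box in an MP4 sample entry).
    func makeAV1FormatDescription(av1CBox: Data, width: Int32, height: Int32) -> CMVideoFormatDescription? {
        let extensions: [CFString: Any] = [
            kCMFormatDescriptionExtension_SampleDescriptionExtensionAtoms: ["av1C": av1CBox]
        ]
        var formatDescription: CMVideoFormatDescription?
        let status = CMVideoFormatDescriptionCreate(
            allocator: kCFAllocatorDefault,
            codecType: kCMVideoCodecType_AV1,
            width: width,
            height: height,
            extensions: extensions as CFDictionary,
            formatDescriptionOut: &formatDescription)
        return status == noErr ? formatDescription : nil
    }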
As of today I am using Xcode 15.0 with the iOS 17.0 SDK.