Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

All subtopics
Posts under the Machine Learning & AI topic. Each entry below shows the post, followed by its reply, boost, and view counts and its latest activity.

Swift playgrounds (.swiftpm) and CoreML
Hey guys, I've been having difficulties transferring my Xcode project to a Swift playground (.swiftpm) for the Swift Student Challenge. None of my views can find the model in scope, and I keep getting these errors: "TrashDetector 1.mlmodel: No predominant language detected. Set COREML_CODEGEN_LANGUAGE to preferred language." and "Unexpected duplicate tasks: Target 'TrashQuest' (project 'TrashQuest') has write command with output /Users/kmcph3/Library/Developer/Xcode/DerivedData/TrashQuest-glvzskunedgtakfrdmsxdoplondj/Build/Intermediates.noindex/TrashQuest.build/Debug-iphonesimulator/TrashQuest.build/0a4ef2429d66360920ddb4f16e65e233.sb". I've gone through multiple posts describing these exact problems, but they all seem to be about ".playground" files and their "Resources" folder (mind you, I did try exactly what they suggested). Can anyone help? (Quick side note: why does the SSC require a .swiftpm file? Why can't we just submit a zip of our Xcode project?)
Replies: 2 · Boosts: 0 · Views: 736 · Feb ’25
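A note on the post above: in a .swiftpm package Xcode's automatic Core ML class generation is not available, so one common workaround is to bundle the raw .mlmodel as a package resource and compile it at runtime. The sketch below assumes the model file is named as in the post and has been added as a resource; loadTrashDetector is an illustrative helper, not an Apple API.

```swift
import CoreML

// Hypothetical helper for a .swiftpm app playground: load a bundled .mlmodel
// without generated model classes. Assumes "TrashDetector 1.mlmodel" is
// included as a package resource and is visible through Bundle.main.
func loadTrashDetector() async throws -> MLModel {
    guard let url = Bundle.main.url(forResource: "TrashDetector 1", withExtension: "mlmodel") else {
        throw CocoaError(.fileNoSuchFile)
    }
    // Compile the raw .mlmodel on device, then load the compiled model.
    let compiledURL = try await MLModel.compileModel(at: url)
    return try MLModel(contentsOf: compiledURL)
}
```

Predictions then go through MLModel's feature-provider API (or a thin wrapper) instead of a generated class.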
Metal GPU Work Won't Stop
Is there any way to stop GPU work that has been scheduled using Metal? Long shader calculations don't stop when the application is stopped in Xcode; they continue to take up GPU time and affect the display. Why isn't this functionality available when Swift Tasks can be cancelled?
Replies: 2 · Boosts: 0 · Views: 700 · Feb ’25
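Metal has no public call to kill in-flight GPU work, so a common mitigation for cases like the one above is to split long compute jobs into many short command buffers and simply stop enqueuing when cancelled. The sketch below (ChunkedCompute is an illustrative name, not an Apple type) shows the pattern under that assumption.

```swift
import Metal

// Minimal sketch: chunked dispatch with a cancellation flag. "Stopping" means
// no further chunks are committed; already-committed chunks still finish, but
// each one is short.
final class ChunkedCompute {
    private let queue: MTLCommandQueue
    private let pipeline: MTLComputePipelineState
    var isCancelled = false   // set from the UI or on teardown (sketch: not thread-hardened)

    init?(device: MTLDevice, pipeline: MTLComputePipelineState) {
        guard let queue = device.makeCommandQueue() else { return nil }
        self.queue = queue
        self.pipeline = pipeline
    }

    func run(chunks: Int, threadsPerChunk: Int) {
        for _ in 0..<chunks {
            if isCancelled { return }                 // stop scheduling new GPU work
            guard let buffer = queue.makeCommandBuffer(),
                  let encoder = buffer.makeComputeCommandEncoder() else { return }
            encoder.setComputePipelineState(pipeline)
            encoder.dispatchThreads(MTLSize(width: threadsPerChunk, height: 1, depth: 1),
                                    threadsPerThreadgroup: MTLSize(width: 64, height: 1, depth: 1))
            encoder.endEncoding()
            buffer.commit()
            buffer.waitUntilCompleted()               // keep each chunk's GPU occupancy bounded
        }
    }
}
```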
Efficient Clustering of Images Using VNFeaturePrintObservation.computeDistance
Hi everyone, I'm working with VNFeaturePrintObservation in Swift to compute the similarity between images. The computeDistance function allows me to calculate the distance between two images, and I want to cluster similar images based on these distances.

Current approach: Right now, I'm using a brute-force approach where I compare every image against every other image in the dataset. This results in O(n^2) complexity, which quickly becomes a bottleneck. With 5000 images, it takes around 10 seconds to complete, which is too slow for my use case.

Question: Are there any efficient algorithms or data structures I can use to improve performance? If anyone has experience with optimizing feature vector clustering or has suggestions on how to scale this efficiently, I'd really appreciate your insights. Thanks!
Replies: 0 · Boosts: 0 · Views: 497 · Feb ’25
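One way to escape the all-pairs computeDistance loop described above is to pull the raw float vector out of each VNFeaturePrintObservation once, then run a standard clusterer (k-means, or an approximate-nearest-neighbour index) over the vectors. The sketch below assumes the feature prints use float elements; featureVector is an illustrative helper.

```swift
import Vision

// Sketch: extract the underlying float vector from a feature print so it can
// be fed to an O(n·k·iterations) clustering pass instead of O(n²) pairwise
// distance calls.
func featureVector(from observation: VNFeaturePrintObservation) -> [Float]? {
    guard observation.elementType == .float else { return nil }  // assuming float feature prints
    let count = observation.elementCount
    return observation.data.withUnsafeBytes { raw in
        let floats = raw.bindMemory(to: Float.self)
        return Array(floats.prefix(count))
    }
}
```

With the vectors in hand, a plain k-means pass or a vantage-point tree keeps the number of distance evaluations roughly linear in the number of images rather than quadratic.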
missing CreateML frameworks
I have reinstalled everything, including the command line tools, but the CreateML frameworks fail to install. I need the framework so that I can train my auto-categorization model, which predicts a category based on descriptions, and I want to use revision 4. Please advise on how I should proceed.
Replies: 4 · Boosts: 0 · Views: 627 · Feb ’25
What special features does Apple officially have that use ML or AI?
I am an app designer and I am curious about which specific ML or AI technologies Apple used to develop the features in the system. As far as I know, Apple's hand-raising detection, destination recommendations in Maps, and exercise-type detection in Fitness all use ML. Are there more specific examples of ML or AI applications? Does Apple have a document specifically introducing examples of how ML or AI technology is applied in the system?
Replies: 1 · Boosts: 0 · Views: 563 · Feb ’25
Unexpectedly slow CreateML text classifier training (limited GPU/CPU usage)
While training a text classifier model with a few thousand samples completes in seconds, when using 100,000 or 1 million samples, CreateML's training time increases exponentially (to hours or days). During these hours/days, GPU usage is low and almost every CPU core is idle. When using the Swift APIs for model training, resource utilization does not increase. I'm using Xcode 16.2, macOS 15.2 on either an M2 Ultra 64 GB or an M3 Max 48 GB laptop (both using built-in SSD with ~500 GB free) running no other applications. Is there a setting I've missed to allow training to take over more of my computing resources? Is this expected of CreateML (i.e., when looking to exploit a larger corpus, I should move to other tooling)? I'd love to speed up my iteration cycle time.
Replies: 1 · Boosts: 0 · Views: 547 · Feb ’25
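For reference on the "Swift APIs for model training" mentioned in the post above, here is a minimal macOS-side sketch of programmatic Create ML text-classifier training, useful as a baseline when comparing against the Create ML app on a large corpus. The CSV path, column names, and output path are placeholders, not from the post.

```swift
import CreateML
import Foundation

// Sketch: train and evaluate a text classifier from a CSV with "text" and
// "label" columns (paths and column names are illustrative).
let data = try MLDataTable(contentsOf: URL(fileURLWithPath: "/path/to/corpus.csv"))
let (training, validation) = data.randomSplit(by: 0.9, seed: 42)

let classifier = try MLTextClassifier(trainingData: training,
                                      textColumn: "text",
                                      labelColumn: "label")

let metrics = classifier.evaluation(on: validation, textColumn: "text", labelColumn: "label")
print(metrics)

try classifier.write(to: URL(fileURLWithPath: "/path/to/TextClassifier.mlmodel"))
```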
Issues with using ClassifyImageRequest() on an Xcode simulator
Hello, I am developing an app for the Swift Student challenge; however, I keep encountering an error when using ClassifyImageRequest from the Vision framework in Xcode: VTEST: error: perform(_:): inside 'for await result in resultStream' error: internalError("Error Domain=NSOSStatusErrorDomain Code=-1 \"Failed to create espresso context.\" UserInfo={NSLocalizedDescription=Failed to create espresso context.}") It works perfectly when testing it on a physical device, and I saw on another thread that ClassifyImageRequest doesn't work on simulators. Will this cause problems with my submission to the challenge? Thanks
Replies: 5 · Boosts: 1 · Views: 712 · Feb ’25
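Since the failure above is Simulator-specific, one pragmatic pattern is to guard the Vision call with a target-environment check and return a stub result in the Simulator so the rest of the playground still runs. This is a rough sketch assuming the newer Swift-only Vision API's perform(on:) overload for CGImage; classify is an illustrative function name.

```swift
import Vision

// Sketch: skip ClassifyImageRequest in the Simulator (where it reportedly
// fails to create an espresso context) and test classification on device.
func classify(_ image: CGImage) async throws -> [ClassificationObservation] {
    #if targetEnvironment(simulator)
    return []   // placeholder result so the UI flow can still be demonstrated
    #else
    let request = ClassifyImageRequest()
    return try await request.perform(on: image)
    #endif
}
```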
CoreML inference on iOS HW uses only CPU on CoreMLTools imported Pytorch model
I have exported a PyTorch model into a Core ML mlpackage file and imported the model file into my iOS project. The model is a music source separation model: it runs prediction on audio-spectrogram blocks and returns separated audio source spectrograms. The model produces correct results versus the desktop GPU + Python pipeline, but inference on an iPhone 15 Pro Max is really, really slow. Using Xcode's model Performance tool I can see that the inference isn't automatically managed between compute units; all of it runs on the CPU. The Performance tool hints that all ops should be supported by both the GPU and the Neural Engine. One thing to note: when initializing the model with the MLModelConfiguration option .cpuAndGPU or .cpuAndNeuralEngine there is an error in the Xcode console: `Error(s) occurred compiling MIL to BNNS graph: [CreateBnnsGraphProgramFromMIL]: Failed to determine convolution kernel at location at /private/var/containers/Bundle/Application/2E3C4AFF-1FA4-4C95-AAE4-ECEBC0FB0BF9/mymss.app/mymss.mlmodelc/model.mil:2453:12 @ CreateBnnsGraphProgramFromMIL` Before going back to hammering on the model in Python, are there any tips or strategies I could try in the CoreMLTools export phase or in configuring the model for prediction on iOS? My export toolchain is currently Linux with CoreMLTools v8.1, export target iOS 16.
Replies: 2 · Boosts: 0 · Views: 664 · Feb ’25
Feature Request – Support for GS1 DataBar Stacked in Vision Framework
Dear Apple Developer Team, I am writing to request the addition of GS1 DataBar Stacked (both regular and expanded variants) to the barcode symbologies supported by the Vision framework (VNBarcodeSymbology) and VisionKit's DataScannerViewController.

Currently, Vision supports several GS1 DataBar formats, such as:
- VNBarcodeSymbology.gs1DataBar
- VNBarcodeSymbology.gs1DataBarExpanded
- VNBarcodeSymbology.gs1DataBarLimited

However, GS1 DataBar Stacked is widely used in industries such as retail, pharmaceuticals, and logistics, where space constraints prevent the use of the standard GS1 DataBar format. Many businesses rely on this symbology to encode GTINs and other product data, but Apple's barcode scanning API does not explicitly support it.

Why This Feature Matters:
- Essential for Small Packaging: GS1 DataBar Stacked is commonly used on small product labels where a standard linear barcode does not fit.
- Widespread Industry Adoption: Many point-of-sale (POS) systems and inventory management tools require this symbology.
- Improves iOS Adoption for Enterprise Use: Adding support would make Apple’s Vision framework a more viable solution for businesses that currently rely on third-party barcode scanning SDKs.

Feature Request: Please add GS1 DataBar Stacked and GS1 DataBar Expanded Stacked to the recognized symbologies in:
- VNBarcodeSymbology (for the Vision framework)
- DataScannerViewController (for VisionKit)

This addition would enhance the versatility of Apple’s barcode scanning tools and reduce the need for third-party libraries. I appreciate your consideration of this request and would be happy to provide more details or test implementations if needed. Thank you for your time and support! Best regards
Replies: 2 · Boosts: 5 · Views: 548 · Feb ’25
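Before falling back to a third-party scanner for the request above, it can help to enumerate what Vision actually reports as supported on the current OS and check for any stacked DataBar variant. A small sketch; logSupportedSymbologies is an illustrative name.

```swift
import Vision

// Sketch: print the barcode symbologies Vision supports on this OS version.
func logSupportedSymbologies() {
    let request = VNDetectBarcodesRequest()
    if let symbologies = try? request.supportedSymbologies() {
        for symbology in symbologies {
            print(symbology.rawValue)
        }
    }
}
```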
Inquiry About GS1 DataBar Stacked Support in Vision Framework
Hello, I am currently developing an application that requires barcode scanning using Apple’s Vision framework (VNBarcodeSymbology). I noticed that the framework supports several GS1 DataBar symbologies, such as:
- VNBarcodeSymbology.gs1DataBar
- VNBarcodeSymbology.gs1DataBarExpanded
- VNBarcodeSymbology.gs1DataBarLimited

However, I could not find any explicit reference to support for GS1 DataBar Stacked (both regular and expanded variants). Could you confirm whether GS1 DataBar Stacked is currently supported in VisionKit's DataScannerViewController or VNBarcodeObservation? If not, are there any plans to include support for this symbology in a future iOS update? This functionality is critical for my use case, as GS1 DataBar Stacked barcodes are widely used in retail, pharmaceuticals, and logistics, where space constraints prevent the use of standard GS1 DataBar formats. I appreciate any clarification on this matter and would be happy to provide additional details if needed.
Replies: 0 · Boosts: 0 · Views: 369 · Feb ’25
Training Images for Vision Classifier Model - Swift Student Challenge
I'm working on my Swift Student Challenge submission and developing a Vision framework-based image classifier. I want to ensure I'm following best practices for training data and the guidelines for which images I can use to train my image classifier. What types of images can I use for training my model? Are there specific image databases or resources recommended by Apple that are safe to use for Swift Student Challenge submissions? I'm currently considering images from Wikipedia as well as my own photos.
Replies: 1 · Boosts: 0 · Views: 420 · Feb ’25
Create ML Model Shows Wrong Output or Predictions in Xcode
I am working on a Core ML image classification model in Xcode that takes a 299x299 image and attempts to classify hand-drawn sketches. The model was trained using Create ML and works perfectly when tested in the Create ML preview. However, when used in my Xcode app, the classification results are incorrect. I have already verified that the image is correctly resized to 299x299 pixels, matching the input size of the model. The classification always returns incorrect results, even for images that were correctly classified during training. I originally used kCVPixelFormatType_32ARGB, but I read that Core ML typically expects BGRA format. I updated my conversion function to use kCVPixelFormatType_32BGRA and CGImageAlphaInfo.premultipliedLast, but the issue persists. This makes me suspect that either the pixel format is still incorrect or that something went wrong during the .mlmodelc compilation.
Replies: 1 · Boosts: 2 · Views: 402 · Jan ’25
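One way to sidestep pixel-format mismatches like the one described above is to let Vision drive the Core ML model, so scaling and format conversion are handled for you. A hedged sketch; SketchClassifier stands in for whatever class Xcode generated from the poster's .mlmodel, and the crop/scale option is an assumption to match the Create ML preview.

```swift
import Vision
import CoreML
import UIKit

// Sketch: classify a UIImage through VNCoreMLRequest instead of a hand-rolled
// CVPixelBuffer conversion.
func classifySketch(_ image: UIImage) throws -> [VNClassificationObservation] {
    let coreMLModel = try SketchClassifier(configuration: MLModelConfiguration()).model  // hypothetical class name
    let vnModel = try VNCoreMLModel(for: coreMLModel)

    let request = VNCoreMLRequest(model: vnModel)
    request.imageCropAndScaleOption = .centerCrop   // try matching how Create ML previewed the image

    guard let cgImage = image.cgImage else { return [] }
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try handler.perform([request])
    return request.results as? [VNClassificationObservation] ?? []
}
```

If Vision returns the expected labels while the manual pipeline does not, the bug is in the buffer conversion rather than the model.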
How does the extract method from ImagePlaygroundConcept work?
I’m building an app that generates images based on text input from a specific text field. However, I’m encountering a problem: For short prompts like "a cat and a dog", the entire string is sent to the Image Playground, even when I use the extracted method. For longer inputs, the behavior is inconsistent. Sometimes it extracts keywords correctly, but other times it doesn’t extract anything at all. Since my app relies on generating images based on the extracted keywords, this inconsistency negatively impacts the user experience in my app. How can I make sure that keywords are always extracted from the input string?

```swift
Button("Generate", systemImage: "apple.intelligence") {
    isPresented = true
}
.imagePlaygroundSheet(isPresented: $isPresented,
                      concepts: [ImagePlaygroundConcept.extracted(from: text, title: textTitle)]) { url in
    imageURL = url
}
```
Replies: 1 · Boosts: 0 · Views: 486 · Jan ’25
Running a local LLM on Swift Playgrounds
I am trying to run TinyLlama directly using Swift Playgrounds for iOS. I have tried multiple solutions, like libraries (LLM.swift, swift-transformers, ...), which never worked due to import issues, and I also tried importing an exported mlmodel. For the latter, I followed the article about Llama 3.1 on Core ML. It was hard to understand how to do the inference with it, but I was able to export an mlpackage, which I then placed in an Xcode project to generate the mlmodelc (compiled model) and the model class. I had to go with the first version described in the article, without optimizations, as I got errors during model loading with the flexible input shapes. I was able to run the model for one token generation. But my biggest problem is that, though the mlmodelc is only 550 MiB, the model loads 24+ GiB of memory, largely exceeding what I can have on an iOS device. Is there a way to do LLM inference on Swift Playgrounds at a reasonable speed (even 1 token/s would be sufficient)?
Replies: 0 · Boosts: 1 · Views: 1.3k · Jan ’25
Run Time Issues with Swift/Core ML
Hello! I have a Swift program that tracks the location of a ball (through the back camera). It seems to be working fine, but the only issue is the run time, particularly my concatenate, normalize, and argmax functions, which are meant to be a 1-to-1 copy of the PyTorch argmax function and the following Python lines:

```python
imgs = np.concatenate((img, img_prev, img_preprev), axis=2)
imgs = imgs.astype(np.float32)/255.0
imgs = np.rollaxis(imgs, 2, 0)
inp = np.expand_dims(imgs, axis=0)  # used to pass into model
```

However, I need my program to run in real time, and in an ideal world I want it to run way under real time. Below is a rundown of the run times that result from my code:

```
Starting model inference
Setup took: 0.0 seconds
Resize took: 0.03741896152496338 seconds
Concatenation took: 0.3359949588775635 seconds
Normalization took: 0.9906361103057861 seconds
Model prediction took: 0.3425499200820923 seconds
Argmax took: 28.17007803916931 seconds
Postprocess took: 0.054128050804138184 seconds
Model inference took 29.934185028076172 seconds
```

Here are the concatenateBuffers, normalizeBuffer, and argmax functions that I use:

```swift
func concatenateBuffers(_ buffers: [CVPixelBuffer?]) -> CVPixelBuffer? {
    guard buffers.count == 3, let first = buffers[0] else { return nil }
    let width = CVPixelBufferGetWidth(first)
    let height = CVPixelBufferGetHeight(first)
    let targetChannels = 9
    var concatenated: CVPixelBuffer?
    let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue] as CFDictionary
    CVPixelBufferCreate(kCFAllocatorDefault, width, height, kCVPixelFormatType_32BGRA, attrs, &concatenated)
    guard let output = concatenated else { return nil }
    CVPixelBufferLockBaseAddress(output, [])
    defer { CVPixelBufferUnlockBaseAddress(output, []) }
    guard let outputData = CVPixelBufferGetBaseAddress(output) else { return nil }
    let outputPtr = UnsafeMutablePointer<UInt8>(OpaquePointer(outputData))

    // Lock all input buffers at once
    buffers.forEach { buffer in
        guard let buffer = buffer else { return }
        CVPixelBufferLockBaseAddress(buffer, .readOnly)
    }
    defer { buffers.forEach { CVPixelBufferUnlockBaseAddress($0!, .readOnly) } }

    // Process each input buffer
    for (frameIdx, buffer) in buffers.enumerated() {
        guard let buffer = buffer, let inputData = CVPixelBufferGetBaseAddress(buffer) else { continue }
        let inputPtr = UnsafePointer<UInt8>(OpaquePointer(inputData))
        let bytesPerRow = CVPixelBufferGetBytesPerRow(buffer)
        let totalPixels = width * height
        // Process all pixels in one go for this frame
        for i in 0..<totalPixels {
            let y = i / width
            let x = i % width
            let inputOffset = y * bytesPerRow + x * 4
            let outputOffset = i * targetChannels + frameIdx * 3
            // BGR order to match numpy
            outputPtr[outputOffset] = inputPtr[inputOffset + 2]     // B
            outputPtr[outputOffset + 1] = inputPtr[inputOffset + 1] // G
            outputPtr[outputOffset + 2] = inputPtr[inputOffset]     // R
        }
    }
    return output
}

func normalizeBuffer(_ buffer: CVPixelBuffer?) -> MLMultiArray? {
    guard let input = buffer else { return nil }
    let width = CVPixelBufferGetWidth(input)
    let height = CVPixelBufferGetHeight(input)
    let channels = 9
    CVPixelBufferLockBaseAddress(input, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(input, .readOnly) }
    guard let inputData = CVPixelBufferGetBaseAddress(input) else { return nil }
    let shape = [1, NSNumber(value: channels), NSNumber(value: height), NSNumber(value: width)]
    guard let output = try? MLMultiArray(shape: shape, dataType: .float32) else { return nil }
    let inputPtr = inputData.assumingMemoryBound(to: UInt8.self)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(input)
    let ptr = UnsafeMutablePointer<Float>(OpaquePointer(output.dataPointer))
    let totalSize = width * height
    for c in 0..<channels {
        for idx in 0..<totalSize {
            let h = idx / width
            let w = idx % width
            let inputIdx = h * bytesPerRow + w * channels + c
            ptr[c * totalSize + idx] = Float(inputPtr[inputIdx]) / 255.0
        }
    }
    return output
}

func argmax(_ array: MLMultiArray) -> MLMultiArray? {
    let shape = array.shape.map { $0.intValue }
    guard shape.count == 3, shape[0] == 1, shape[1] == 256, shape[2] == 230400 else { return nil }
    guard let output = try? MLMultiArray(shape: [1, NSNumber(value: 230400)], dataType: .int32) else { return nil }
    let ptr = UnsafePointer<Float>(OpaquePointer(array.dataPointer))
    let outputPtr = UnsafeMutablePointer<Int32>(OpaquePointer(output.dataPointer))
    let channelSize = 230400
    for pos in 0..<230400 {
        var maxValue = -Float.infinity
        var maxIndex: Int32 = 0
        for channel in 0..<256 {
            let value = ptr[channel * channelSize + pos]
            if value > maxValue {
                maxValue = value
                maxIndex = Int32(channel)
            }
        }
        outputPtr[pos] = maxIndex
    }
    return output
}
```

Are there any glaring inefficiencies that can be reduced to allow for under-real-time processing while following the same logic as the Python code exactly? Would using Objective-C speed things up for some reason? Are there any tools I can use so I don't have to write these functions myself? Additionally, in the class's init function, I tried to check the compute units being used, since I feel 0.34 seconds for a single model prediction is also far too long, but no print statements are showing for some reason:

```swift
init() {
    guard let loadedModel = try? BallTrackerModel() else {
        fatalError("Could not load model")
    }
    let config = MLModelConfiguration()
    config.computeUnits = .all
    guard let configuredModel = try? BallTrackerModel(configuration: config) else {
        fatalError("Could not configure model")
    }
    self.model = configuredModel
    print("model loaded with compute units \(config.computeUnits.rawValue)")
}
```

Thanks!
Replies: 3 · Boosts: 0 · Views: 649 · Jan ’25
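On the argmax bottleneck in the post above: two things that commonly help are building with optimizations (-O, Release; tight pointer loops in Debug builds are often 10-100x slower) and parallelising across the 230,400 positions. The sketch below keeps the post's own logic and memory layout; fastArgmax is an illustrative name, not a drop-in replacement with guaranteed numbers.

```swift
import CoreML
import Foundation

// Sketch: same channel-major argmax as in the post, but the outer loop over
// positions is spread across cores with concurrentPerform. Each iteration
// writes a distinct index, so there is no overlapping access.
func fastArgmax(_ array: MLMultiArray, channels: Int = 256, positions: Int = 230_400) -> [Int32] {
    let input = UnsafePointer<Float>(OpaquePointer(array.dataPointer))
    var result = [Int32](repeating: 0, count: positions)
    result.withUnsafeMutableBufferPointer { out in
        DispatchQueue.concurrentPerform(iterations: positions) { pos in
            var maxValue = -Float.infinity
            var maxIndex: Int32 = 0
            for channel in 0..<channels {
                let value = input[channel * positions + pos]   // stride matches the post's layout
                if value > maxValue {
                    maxValue = value
                    maxIndex = Int32(channel)
                }
            }
            out[pos] = maxIndex
        }
    }
    return result
}
```

The same concurrentPerform pattern applies to the concatenation and normalization loops, since each output element there also depends only on its own index.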