Getting Progress from long running process

I have been working on updating an old app that makes extensive use of Objective-C's NSTask. Now using Process in Swift, I'm trying to gather updates as the process runs, using readabilityHandler and availableData. However, my process tends to exit before all data has been read. I found this post entitled "Running a Child Process with Standard Input and Output" but it doesn't seem to address gathering output from long-running tasks. Is there a straightforward way to gather ongoing output from a long running task without it prematurely exiting?

Is there a straightforward way to gather ongoing output from a long running task … ?

Yes, although the best path forward depends on the executable you’re running. Is it something that you built? Or are you trying to run an executable built by someone else?

PS The code in Running a Child Process with Standard Input and Output is effectively deprecated in favour of Swift’s shiny-new Subprocess package. That can definitely help with this problem, but the answer still depends on what executable you’re running.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Thanks! I'm afraid I still need to support Ventura, so I can't adopt Swift 6 just yet. I have a tool that verifies volumes using diskutil.

I haven't tested anything Swift on Ventura, but Xcode seems happy enough to build Swift 6 with a Ventura minimum deployment.

In playing with Subprocess, I'm assuming that I'll need to use the custom closure example... Except that the sample code has some errors, particularly, this line:

{ execution in // <--- it wants three more arguments

and:

for try await chunk in execution.standardOutput { //<--- Execution doesn't have a member standardOutput

Not sure where to go from here... Perhaps return to an earlier methodology?
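
For readers following along, here is a minimal sketch of the earlier Process-based approach, written so that it waits for both process exit and end-of-file on the pipe before returning. The function name `runStreaming` and the `/bin/sh` command are made up for this example, and it assumes Swift 5 language mode. It relies on the usual behaviour that readabilityHandler sees empty availableData once the write end of the pipe is closed.

```swift
import Foundation

/// Hypothetical helper: runs a shell command, hands each chunk of stdout to
/// `onChunk` as it arrives, and returns the exit status only after EOF.
func runStreaming(shellCommand: String,
                  onChunk: @escaping (Data) -> Void) throws -> Int32 {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: "/bin/sh")
    process.arguments = ["-c", shellCommand]

    let pipe = Pipe()
    process.standardOutput = pipe

    let sawEOF = DispatchSemaphore(value: 0)
    pipe.fileHandleForReading.readabilityHandler = { handle in
        let data = handle.availableData
        if data.isEmpty {
            // Empty data signals EOF: stop observing and unblock the caller.
            handle.readabilityHandler = nil
            sawEOF.signal()
        } else {
            onChunk(data)
        }
    }

    do {
        try process.run()
    } catch {
        // Nothing will ever arrive, so remove the handler before rethrowing.
        pipe.fileHandleForReading.readabilityHandler = nil
        throw error
    }

    // Process exit alone is not enough: data may still be sitting in the pipe.
    process.waitUntilExit()
    sawEOF.wait()
    return process.terminationStatus
}

let exitStatus = try runStreaming(shellCommand: "echo 0; sleep 1; echo 1") { chunk in
    print("chunk, count: \(chunk.count)")
}
print("exit status: \(exitStatus)")
```

The key point is the second wait: returning as soon as the process terminates is one way to end up missing the tail of the output.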

Accepted Answer
I have a tool that verifies volumes using diskutil.

OK. I think you’re in luck here, because while diskutil writes its progress to stdout, it seems to flush stdout after each write. Consider:

% diskutil verifyVolume /Volumes/Test | cat
Started file system verification on disk64s1 (Test)
…
Finished file system verification on disk64s1 (Test)

Each line of output appears as the operation occurs. If diskutil weren’t flushing stdout, it’d all appear at the end. And when that happens things get more complex.
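
To illustrate the flushing point: if the child is a tool you build yourself, a sketch like the one below keeps progress flowing promptly even when stdout is a pipe. The `progressLine` and `report` helpers and the message wording are made up for this example; only `print` and `fflush` are standard.

```swift
import Foundation

/// Formats one progress line. The wording is invented for this sketch.
func progressLine(step: Int, of total: Int) -> String {
    "step \(step) of \(total)"
}

/// Writes a line to stdout and flushes it right away, so a parent reading
/// from a pipe sees it immediately rather than when the buffer fills or
/// the tool exits.
func report(_ line: String) {
    print(line)
    fflush(stdout)
}

let totalSteps = 3
for step in 1...totalSteps {
    report(progressLine(step: step, of: totalSteps))
    Thread.sleep(forTimeInterval: 0.5) // stand-in for real work
}
```

If the `stdout` global triggers a concurrency diagnostic in Swift 6 language mode, `fflush(nil)`, which flushes all open output streams, is one way around it.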


Based on this assumption, I created a small test script that I can use to verify that I’m getting data promptly:

#! /bin/sh

echo 0
for i in $(seq 4)
do
    sleep 2
    echo ${i}
done

I then started poking at it with Subprocess:

import Foundation
import Subprocess

func main() async throws {
    let config = Subprocess.Configuration(
        executable: .path("/Users/quinn/slow-print.sh")
    )
    let result = try await Subprocess.run(config) { (execution, stdin, stdout, stderr) -> Void in
        for try await chunk in stdout {
            print(chunk.count)
        }
    }
    print(result)
}

try await main()

That’s disappointing. The AsyncBufferSequence approach seems to accumulate all the data until it hits EOF — or, presumably, until the data hits some desired size [1] — so it won’t work for this case.

So I fell back to the file descriptor approach:

let result = try await Subprocess.run(
    config,
    output: FileDescriptorOutput.fileDescriptor(write, closeAfterSpawningProcess: true)
)

The problem then devolves into how to handle the file descriptor. Here’s a quick hack for that:

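// Create the pipe first; `write` is the end passed to Subprocess.run above.
let (read, write) = try FileDescriptor.pipe()
// Read on a dedicated thread so the blocking read(into:) doesn't stall anything else.
Thread.detachNewThread {
    do {
        while true {
            var buffer = [UInt8](repeating: 0, count: 1024)
            // Blocks until at least one byte is available or the write end closes.
            let bytesRead = try buffer.withUnsafeMutableBytes { buffer in
                try read.read(into: buffer)
            }
            if bytesRead == 0 {
                // Zero bytes means every copy of the write end has been closed.
                print("EOF")
                break
            }
            print("chunk, count: \(bytesRead)")
        }
    } catch {
        print("error: \(error)")
    }
}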

In a real program you’d want to come up with something that doesn’t require you to spin up a completely new thread, for example, by using Dispatch I/O.

All in all, I think it’d be reasonable for you to file an issue against Subprocess requesting that AsyncBufferSequence have some better way to control the low- and high-water marks.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

[1] Looking at the code, this seems to be readBufferSize, which is one page.

Since Subprocess is so shiny and new, do I still use Feedback Assistant, or do I need to do so elsewhere?

I've been working on reading the data with:

DispatchIO.read(fromFileDescriptor: readFD.rawValue, maxLength: whatLength, runningHandlerOn: queue)

and also from an instance of DispatchIO like:

let channel = DispatchIO(type: .stream, fileDescriptor: readFD.rawValue, queue: queue) { error in ...}

// and

channel.read(offset: 0, length: 2, queue: queue) { done, data, error in ... }

I'm finding that these don't 'see' the data until the process finishes either... can you point out what I'm missing?

Many thanks!

Since Subprocess is so shiny and new, do I still use Feedback Assistant … ?

No.

Well, it’d probably work, but it’s not the best option. Subprocess is an open source package and you should follow the advice in the read me. The key advantage here is that you get a lot more detail than you would with Feedback Assistant.

I'm finding that these don't 'see' the data until the process finishes either

Right. Dispatch I/O has the concept of high- and low-water marks, and you need to set the low-water mark to 1 for an interactive channel like this. See the setLimit(lowWater:) method.

IMPORTANT Note this text in the doc:

In practice, your handlers should be designed to handle data blocks that are significantly larger than the current low-water mark.

Setting the low-water mark to 1 doesn’t mean that you’ll get bytes one at a time (unless you also set the high-water mark). In most cases you’ll receive big chunks of data, up to the high-water mark. However, it means that, if there’s only 1 byte of data available, Dispatch will deliver it promptly rather than waiting for more.
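
The low-water mark advice can be sketched roughly as follows. This is an illustration rather than a definitive implementation: the function name `streamOutput`, the queue label, and the `/bin/sh` command are assumptions for the example, it uses Process so that it also builds for older deployment targets, and it assumes Swift 5 language mode.

```swift
import Foundation

/// Hypothetical helper: runs a shell command and delivers its stdout to
/// `onChunk` promptly via Dispatch I/O, returning the exit status after EOF.
func streamOutput(ofShellCommand command: String,
                  onChunk: @escaping (Data) -> Void) throws -> Int32 {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: "/bin/sh")
    process.arguments = ["-c", command]
    let pipe = Pipe()
    process.standardOutput = pipe
    // Start the child first; anything it writes waits in the pipe until read.
    try process.run()

    let queue = DispatchQueue(label: "child-output")
    let sawEOF = DispatchSemaphore(value: 0)
    let channel = DispatchIO(type: .stream,
                             fileDescriptor: pipe.fileHandleForReading.fileDescriptor,
                             queue: queue) { _ in
        // Cleanup handler: nothing to do here because Pipe owns the descriptor.
    }

    // Deliver data as soon as a single byte is available.
    channel.setLimit(lowWater: 1)

    // Int.max asks Dispatch to keep reading until EOF.
    channel.read(offset: 0, length: Int.max, queue: queue) { done, data, error in
        if let data, !data.isEmpty {
            // Chunks may be much larger than the low-water mark.
            onChunk(Data(data))
        }
        if done {
            // `done` is set at EOF or on error (non-zero `error`).
            channel.close()
            sawEOF.signal()
        }
    }

    sawEOF.wait()
    process.waitUntilExit()
    return process.terminationStatus
}

let status = try streamOutput(ofShellCommand: "echo 0; sleep 1; echo 1") { chunk in
    print("chunk, count: \(chunk.count)")
}
print("exit status: \(status)")
```

Without the setLimit(lowWater:) call, the handler is free to wait for more data before firing, which matches the behaviour described in the question above.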

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Looks like someone has already posted an Issue regarding adding high/low water marks for AsyncStream. It will certainly be ideal if that option becomes available. Thanks!

Looks like someone has already posted an Issue

I’m posting a link for the sake of those following along at home (but mostly for Future Quinn™ :-)…

Processes with small and infrequent output doesn't get emitted in a timely manner when using sequence to capture the output

Lemme know if I got that wrong.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"
