Thanks for your input.
My idea of file coordination was that it would allow processes to make "atomic" changes to files and folders so that they are left in a consistent state to other processes that use file coordination as well.
Yes, that's generally its goal, particularly given its broader role in file versioning and syncing. However, there are two problems with your particular situation:
- File coordination is inherently "opt in", so it only helps if the writing app (which you don't control) chooses to implement it.
- It looks like you want to operate in the "general case", meaning you're expecting to work with whatever directories the user specifies and with whatever apps/files they happen to be using.
Those two factors mean you simply cannot rely on file coordination. You can certainly choose to implement it, and there are definitely cases where it may be helpful, but you still need a solution that works for all of the other cases where the writing process doesn't implement file coordination.
Unzipping seemed like an ideal fit to me, regardless how long it takes
No, this is not what file coordination is "for". It is NOT acceptable for an app to block inside a file coordination call for an "extended" period of time. File coordination calls are intended to be very brief (<~1s) "low level" I/O calls, not a mechanism for blocking during long-running operations. The problem with doing this:
(after all, the user should be aware that an unzip operation is in progress and shouldn't worry about other scan operations containing the zip archive apparently hanging for the duration of the unzip operation).
...is that you're basically setting up a "trap" for other apps running on the system. Apps expect file coordination calls to block for very limited duration and now those calls will end up blocking for far longer than they were ever expecting.
Note that this means that correctly using file coordination for large operations is more complicated than simply calling coordinate(writingItemAt:) and writing whatever you want. In practice, large operations should generally be implemented using some variation of the following approach (a sketch follows the list):
1. The app uses NSFileManager.url(for:in:appropriateFor:create:) to establish a private location on the same volume as the final destination.
2. The app writes whatever it needs to write out to that location.
3. The app starts a coordinated write, then uses NSFileManager.replaceItemAt(_:withItemAt:backupItemName:options:) to safely replace the existing file with its new contents.
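For illustration, here's a minimal Swift sketch of that three-step approach, under the assumption that the actual content can be produced by a caller-supplied closure; the function and parameter names below are placeholders, and error handling is kept deliberately simple:

```swift
import Foundation

// Sketch of "write privately, then swap inside a brief coordinated write".
// `finalURL` and `produceContent` are illustrative names, not part of any API.
func replaceItemSafely(at finalURL: URL,
                       producingContentWith produceContent: (URL) throws -> Void) throws {
    let fm = FileManager.default

    // 1. Get an item-replacement directory on the same volume as the destination.
    let stagingDir = try fm.url(for: .itemReplacementDirectory,
                                in: .userDomainMask,
                                appropriateFor: finalURL,
                                create: true)
    let stagedURL = stagingDir.appendingPathComponent(finalURL.lastPathComponent)

    // 2. Do the long-running write with no file coordination involved.
    try produceContent(stagedURL)

    // 3. Only the final swap happens inside the coordinated write, so the
    //    coordination call stays brief no matter how large the payload is.
    let coordinator = NSFileCoordinator()
    var coordinationError: NSError?
    var swapError: Error?
    coordinator.coordinate(writingItemAt: finalURL,
                           options: .forReplacing,
                           error: &coordinationError) { coordinatedURL in
        do {
            _ = try fm.replaceItemAt(coordinatedURL,
                                     withItemAt: stagedURL,
                                     backupItemName: nil,
                                     options: [])
        } catch {
            swapError = error
        }
    }
    if let error = coordinationError { throw error }
    if let error = swapError { throw error }
}
```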
FYI, this same issue can also apply when reading data. Particularly on APFS (where file cloning makes file duplication extraordinarily fast), an app doing bulk copying might be better off using a coordinated read to create a "private" copy on the same volume, then using that private copy as the source for its copy operation.
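Along the same lines, a sketch of the coordinated-read variant might look like the following. On APFS, FileManager.copyItem(at:to:) will typically clone the file when source and destination are on the same volume, so the coordinated section stays brief even for large files; the names below are placeholders:

```swift
import Foundation

// Sketch: take a brief coordinated read to make a same-volume "private" copy,
// then use that copy as the source for the long-running bulk operation.
// `sourceURL` and `privateCopyURL` are illustrative placeholders.
func makePrivateCopy(of sourceURL: URL, at privateCopyURL: URL) throws {
    let coordinator = NSFileCoordinator()
    var coordinationError: NSError?
    var copyError: Error?

    coordinator.coordinate(readingItemAt: sourceURL,
                           options: [],
                           error: &coordinationError) { coordinatedURL in
        do {
            // On APFS this is typically a clone, so it's fast and cheap on storage.
            try FileManager.default.copyItem(at: coordinatedURL, to: privateCopyURL)
        } catch {
            copyError = error
        }
    }
    if let error = coordinationError { throw error }
    if let error = copyError { throw error }
    // The bulk copy can now read from `privateCopyURL` with no coordination held.
}
```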
That leads to here:
If file coordination isn't the answer and the unzipped files are not meant to be accessed before the unzip operation has completed, perhaps it would make sense for unzip to write the temporary files to a standard temporary folder and then move them to the final location only at the end.
I won't try to provide a full justification, but here are two specific issues:
- This approach has issues outside of APFS (and HFS+). On APFS and HFS+, the replace operation in #3 is an atomic operation internal to the file system, which means it's both extremely fast and requires no meaningful storage. On other file systems, it's going to require copying and will (temporarily) require double the total storage.
- For some operations (copying and unzip included), there can be value in ensuring that whatever data the operation produces is accessible, even if the full operation never completes. This could be done by recovering data out of the temporary location, but the most straightforward solution is to just write directly to the final target.
As I mentioned, I'm not trying to avoid permission errors in general, but only those caused by temporary operations such as unzipping. My app keeps syncing, sends a notification when an error happens and allows the user to see the list of errors, but the user isn't forced to do anything for the app to keep syncing. But even if the user is aware that the errors are caused by the unzip operation, they would have to go through the list of errors (which could be quite long) to make sure that they haven't overlooked an unrelated error. What I could do to mitigate this is to mark an error as "solved" if a successive sync of the same file is successful.
The other thing I would add here is that simple heuristics can add significant value. For example:
...so if "foo" is set to "420" and in the same directory as "foo.zip", then there is a decent chance that this is an in-progress unzip operation. Not guaranteed of course, but the way I would think about this is that your goal here is to better manage work and present what's actually going on, not "perfection". Related to that point, going back to here:
sends a notification when an error happens and allows the user to see the list of errors,
Unless you've gone out of your way (at significant performance cost) to impose a specific file ordering, the order you process files in isn't going to be meaningful to the user. On HFS+, the catalog ordering behavior makes bulk iteration roughly alphabetical, but on most other file systems (particularly APFS) the order is going to look basically arbitrary/random. That's important here because it means there isn't really any reason your app HAS to tell the user about any particular issue at the moment you encounter it; you could just defer that location/file to later processing and silently keep going.
Now, there's obviously a balancing act there (you don't necessarily want your app to dump all of its errors at the very end), but it can certainly be another tool you use to improve the overall experience.
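Going back to the "foo"/"foo.zip" heuristic above, here's a minimal sketch of what a check like that could look like. It assumes "420" refers to the file's POSIX permission value (as reported via FileAttributeKey.posixPermissions); the function name and the sibling-archive check are illustrative, not a prescription:

```swift
import Foundation

// Hypothetical helper: returns true when a problem reading `fileURL` is
// plausibly caused by an in-progress unzip of a sibling archive.
// The "420" permission value is an assumption drawn from this thread's
// scenario, not a documented guarantee.
func mightBeMidUnzip(_ fileURL: URL) -> Bool {
    let fm = FileManager.default

    // Is there a "foo.zip" sitting next to "foo"?
    let siblingArchive = fileURL.appendingPathExtension("zip")
    guard fm.fileExists(atPath: siblingArchive.path) else { return false }

    // Does the file's permission value match what's seen mid-extraction?
    guard let attributes = try? fm.attributesOfItem(atPath: fileURL.path),
          let permissions = attributes[.posixPermissions] as? NSNumber else {
        return false
    }
    return permissions.intValue == 420
}
```

If a check like that matches, the sync engine could simply requeue the file for a later pass instead of surfacing an error right away, which also lines up with the deferral point above.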
__
Kevin Elliott
DTS Engineer, CoreOS/Hardware