Symbolicating kernel backtraces on Apple Silicon

I am able to symbolicate kernel backtraces for addresses that belong to my kext.

Is it possible to symbolicate kernel backtraces for addresses that lie beyond my kext and reference kernel code?

panic(cpu 4 caller 0xfffffe001d86a708): watchdog timeout: no checkins from watchdogd in 90 seconds (25 total checkins since monitoring last enabled)
Debugger message: panic
Memory ID: 0x6
OS release type: User
OS version: 24E248
Kernel version: Darwin Kernel Version 24.4.0: Wed Mar 19 21:18:03 PDT 2025; root:xnu-11417.101.15~1/RELEASE_ARM64_T8112
Fileset Kernelcache UUID: 10F2AA02A75C57325E92D9C2CB934145
Kernel UUID: 31040A93-B7A4-3B28-8A52-1D311EE62ECC
Boot session UUID: 71B31B6A-B61F-4211-A83A-2D81AEC82D38
iBoot version: iBoot-11881.101.1
iBoot Stage 2 version: iBoot-11881.101.1
secure boot?: YES
roots installed: 0
Paniclog version: 14
KernelCache slide: 0x0000000014608000
KernelCache base:  0xfffffe001b60c000
Kernel slide:      0x0000000014610000
Kernel text base:  0xfffffe001b614000
Kernel text exec slide: 0x0000000015d88000
Kernel text exec base:  0xfffffe001cd8c000
mach_absolute_time: 0x1dd77a623
Epoch Time:        sec       usec
  Boot    : 0x6808b8bc 0x0006328e
  Sleep   : 0x00000000 0x00000000
  Wake    : 0x00000000 0x00000000
  Calendar: 0x6808ba04 0x000b3a84

Zone info:
  Zone map: 0xfffffe136a000000 - 0xfffffe396a000000
  . VM    : 0xfffffe136a000000 - 0xfffffe1936000000
  . RO    : 0xfffffe1936000000 - 0xfffffe1bd0000000
  . GEN0  : 0xfffffe1bd0000000 - 0xfffffe219c000000
  . GEN1  : 0xfffffe219c000000 - 0xfffffe2768000000
  . GEN2  : 0xfffffe2768000000 - 0xfffffe2d34000000
  . GEN3  : 0xfffffe2d34000000 - 0xfffffe3300000000
  . DATA  : 0xfffffe3300000000 - 0xfffffe396a000000
  Metadata: 0xfffffe8386010000 - 0xfffffe838f810000
  Bitmaps : 0xfffffe838f810000 - 0xfffffe8393d8c000
  Extra   : 0 - 0

CORE 0 recently retired instr at 0xfffffe001cf4dfe0
CORE 1 recently retired instr at 0xfffffe001cf4dfe4
CORE 2 recently retired instr at 0xfffffe001cf4dfe4
CORE 3 recently retired instr at 0xfffffe001cf4dfe4
CORE 4 recently retired instr at 0xfffffe001cf4c750
CORE 5 recently retired instr at 0xfffffe001cf4dfe4
CORE 6 recently retired instr at 0xfffffe001cf4dfe4
CORE 7 recently retired instr at 0xfffffe001cf4dfe4
TPIDRx_ELy = {1: 0xfffffe31ffe31fc0  0: 0x0000000000001004  0ro: 0x0000000000000000 }
CORE 0 PVH locks held: None
CORE 1 PVH locks held: None
CORE 2 PVH locks held: None
CORE 3 PVH locks held: None
CORE 4 PVH locks held: None
CORE 5 PVH locks held: None
CORE 6 PVH locks held: None
CORE 7 PVH locks held: None
CORE 0: PC=0xfffffe001ce24e3c, LR=0xfffffe001ce24e38, FP=0xfffffe6bbf8d3ed0
CORE 1: PC=0xfffffe001ce24e3c, LR=0xfffffe001ce24e38, FP=0xfffffe6bbf37bed0
CORE 2: PC=0xfffffe001ce24e3c, LR=0xfffffe001ce24e38, FP=0xfffffe6bbf64bed0
CORE 3: PC=0xfffffe001ce24e3c, LR=0xfffffe001ce24e38, FP=0xfffffe6bbfa73ed0
CORE 4 is the one that panicked. Check the full backtrace for details.
CORE 5: PC=0xfffffe001ce24e3c, LR=0xfffffe001ce24e38, FP=0xfffffe6bc108bed0
CORE 6: PC=0xfffffe001ce24e3c, LR=0xfffffe001ce24e38, FP=0xfffffe6bbfc23ed0
CORE 7: PC=0xfffffe001ce24e3c, LR=0xfffffe001ce24e38, FP=0xfffffe6bc05dbed0
Compressor Info: 3% of compressed pages limit (OK) and 3% of segments limit (OK) with 2 swapfiles and OK swap space
Total cpu_usage: 82569818
Thread task pri cpu_usage
0xfffffe31ffe31fc0 kernel_task 91 3977810
0xfffffe31ffc42f60 SystemUIServer 47 0
0xfffffe31ff0cff80 WallpaperAgent 47 0
0xfffffe31fe1bb7f0 bluetoothd 47 0
0xfffffe31feeb37f0 CoreServicesUIAgent 46 0

Panicked task 0xfffffe31fffeaaf8: 0 pages, 677 threads: pid 0: kernel_task
Panicked thread: 0xfffffe31ffe31fc0, backtrace: 0xfffffe81838478d0, tid: 1597
		  lr: 0xfffffe001cde8ab8  fp: 0xfffffe8183847960
		  lr: 0xfffffe001cf44894  fp: 0xfffffe81838479d0
		  lr: 0xfffffe001cf42a90  fp: 0xfffffe8183847a90
		  lr: 0xfffffe001cd938c0  fp: 0xfffffe8183847aa0
		  lr: 0xfffffe001cde83b0  fp: 0xfffffe8183847e70
		  lr: 0xfffffe001d66d670  fp: 0xfffffe8183847e90
		  lr: 0xfffffe001d86a708  fp: 0xfffffe8183847ee0
		  lr: 0xfffffe001d86941c  fp: 0xfffffe8183847f20
		  lr: 0xfffffe001d866504  fp: 0xfffffe8183847f40
		  lr: 0xfffffe001e234bc8  fp: 0xfffffe8183847fc0
		  lr: 0xfffffe001cf4611c  fp: 0xfffffe8183847fe0
		  lr: 0xfffffe001cd93938  fp: 0xfffffe8183847ff0
		  lr: 0xfffffe001cee0a80  fp: 0xfffffe6bbf6ffc20
		  lr: 0xfffffe001cee0a80  fp: 0xfffffe6bbf6ffd70
		  lr: 0xfffffe001d38e408  fp: 0xfffffe6bbf6ffd90
		  lr: 0xfffffe001d3e9304  fp: 0xfffffe6bbf6ffe80
		  lr: 0xfffffe001ce8c24c  fp: 0xfffffe6bbf6ffe90
		  lr: 0xfffffe001cedf514  fp: 0xfffffe6bbf6fff20
		  lr: 0xfffffe001cd9cc04  fp: 0x0000000000000000
Answered by DTS Engineer in 836958022

Let me step back to the beginning and explain this in a bit more detail:

  1. When using "-l", what atos is actually "getting" there is the load address of the first __TEXT segment in the executable.

  2. On arm64, (very generally) things shifted so that code is loaded out of __TEXT_EXEC, NOT __TEXT.

SO, to correctly symbolicate, you need to properly shift the load address to account for that. Starting with loadable KEXTs, this is fairly easy. Their __TEXT segment is at 0, so the math here is:

(__TEXT_EXEC.vmaddr) - (__TEXT.vmaddr) ->

KEXTs also tend to have a fixed opening __TEXT size, so the value of __TEXT_EXEC.vmaddr ends up being "0x4000"

0x4000 - 0 -> 0x4000

That is where my original command came from:

atos -arch arm64e -o <symbol file path> -l <KEXT load address - 0x4000> <address to symbol>

NOTE: My earlier post was wrong, you need to subtract not add.

For a KEXT, that load address comes directly from the panic log. Here is what that looks like in the panic from the other thread on this issue:

lr: 0xfffffe0013723894 fp: 0x0000000000000000
Kernel Extensions in backtrace:
com.company.product(1.4.21d119)[92BABD94-80A4-3F6D-857A-3240E4DA8009]@0xfffffe001203bfd0->0xfffffe00120533ab
dependency: com.apple.iokit.IOPCIFamily(2.9)[6D6666E6-340F-3A5E-9464-DE05164C0658]@0xfffffe0015e65e90->0xfffffe0015e93b3f
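Plugging the load address from that backtrace line into the formula gives the -l value. As a quick sanity check in C (the 0x4000 shift is the typical KEXT __TEXT_EXEC.vmaddr described above, and should be confirmed with otool -l against your own KEXT):

```c
#include <stdint.h>

/* For a loadable KEXT, __TEXT.vmaddr is 0 and __TEXT_EXEC.vmaddr is
 * typically 0x4000, so the atos -l value is simply the KEXT load
 * address minus 0x4000. */
uint64_t kext_atos_load(uint64_t kext_load_address) {
    return kext_load_address - 0x4000;
}
```

For the backtrace above, kext_atos_load(0xfffffe001203bfd0) yields 0xfffffe0012037fd0, which is the value to pass to -l.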

Now, the math is actually exactly the same for the kernel, but the numbers are totally different. Running otool -l on kernel.release.t8112 gets you:

__TEXT.vmaddr:
...
Load command 0
      cmd LC_SEGMENT_64
  cmdsize 552
  segname __TEXT
   vmaddr 0xfffffe0007004000
   vmsize 0x0000000000100000
...

AND:

__TEXT_EXEC.vmaddr:
...
Load command 2
      cmd LC_SEGMENT_64
  cmdsize 312
  segname __TEXT_EXEC
   vmaddr 0xfffffe00072b8000
   vmsize 0x00000000008f0000
...

(You may notice there are multiple __TEXT_EXEC sections; in this case all we care about is the first.)

SO, plugging that into our math above:

(__TEXT_EXEC.vmaddr) - (__TEXT.vmaddr) ->
0xfffffe00072b8000 - 0xfffffe0007004000 -> 0x2B4000

That number is then subtracted from:

Kernel text exec base: 0xfffffe001cd8c000
0xfffffe001cd8c000 - 0x2B4000 -> 0xFFFFFE001CAD8000
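As a sanity check, the same arithmetic in C, with the values from this otool output and panic log:

```c
#include <stdint.h>

/* The shift is __TEXT_EXEC.vmaddr - __TEXT.vmaddr (from otool -l on the
 * KDK kernel); subtracting it from the "Kernel text exec base" in the
 * panic log yields the load address to pass to atos -l. */
uint64_t kernel_atos_load(uint64_t text_vmaddr,
                          uint64_t text_exec_vmaddr,
                          uint64_t text_exec_base) {
    uint64_t shift = text_exec_vmaddr - text_vmaddr;  /* 0x2b4000 here */
    return text_exec_base - shift;
}
```

With text_vmaddr = 0xfffffe0007004000, text_exec_vmaddr = 0xfffffe00072b8000, and text_exec_base = 0xfffffe001cd8c000, this returns 0xfffffe001cad8000.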

Plugging that value into your atos command:

atos -o kernel.release.t8112 -arch arm64e -l 0xFFFFFE001CAD8000 0xfffffe001cde8ab8 0xfffffe001cf44894 0xfffffe001cf42a90 0xfffffe001cd938c0 0xfffffe001cde83b0 0xfffffe001d66d670 0xfffffe001d86a708 0xfffffe001d86941c 0xfffffe001d866504 0xfffffe001e234bc8 0xfffffe001cf4611c 0xfffffe001cd93938 0xfffffe001cee0a80 0xfffffe001cee0a80 0xfffffe001d38e408 0xfffffe001d3e9304 0xfffffe001ce8c24c 0xfffffe001cedf514 0xfffffe001cd9cc04

Gives us:

handle_debugger_trap (in kernel.release.t8112) (debug.c:1839)
handle_uncategorized (in kernel.release.t8112) (sleh.c:1591)
sleh_synchronous (in kernel.release.t8112) (sleh.c:1477)
fleh_synchronous (in kernel.release.t8112) + 44
panic_trap_to_debugger (in kernel.release.t8112) (debug.c:1400)
panic_with_options_and_initiator (in kernel.release.t8112) (debug.c:1213)
0xfffffe001d86a708
0xfffffe001d86941c
0xfffffe001d866504
0xfffffe001e234bc8
sleh_irq (in kernel.release.t8112) (sleh.c:3840)
fleh_irq (in kernel.release.t8112) + 64
vm_object_upl_request (in kernel.release.t8112) (vm_pageout.c:6213)
vm_object_upl_request (in kernel.release.t8112) (vm_pageout.c:6213)
ubc_create_upl_kernel (in kernel.release.t8112) (ubc_subr.c:2643)
vnode_pageout (in kernel.release.t8112) (vnode_pager.c:407)
vnode_pager_data_return (in kernel.release.t8112) (bsd_vm.c:451)
vm_pageout_iothread_external_continue (in kernel.release.t8112) (vm_pageout.c:4237)
Call_continuation (in kernel.release.t8112) + 196

Note the references to "panic..." which are an easy way to validate that the symbolication is valid. By its nature, the kernel cannot "crash" in the same sense any other process can, as the kernel IS the core component that controls all of the mechanisms that cause OTHER processes to crash. Because of that, "all" kernel panics end in a call to "panic", because that's the function that ends up generating the log data you're looking at and then stopping normal execution.

Anticipating your question, no, I don't know why the other 4 frames aren't symbolicating, but I don't think it's an issue with atos. Apple obviously has our own internal symbolication tools and they're getting exactly the same result. Are they from your KEXT and you just didn't include the KEXT address info?

Having said that, they also don't really matter for this particular panic. That's because of this message at the top of the log:

panic(cpu 4 caller 0xfffffe001d86a708): watchdog timeout: no checkins from watchdogd in 90 seconds (25 total checkins since monitoring last enabled)

This message basically means exactly what it sounds like. There's a system in place (managed by "watchdogd") that requires specific, critical system daemons to "check in" with the kernel. If they fail to do so, then the system times out and the kernel panics. That dynamic also means that the specific stack trace is largely irrelevant, as it simply shows what happened to be executing at the time the panic occurred, NOT what actually caused the panic.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

BTW, how much memory does this machine have? Is it 32g or 16g?

It's 36GB.

% expr `sysctl -n hw.memsize` / 1024 / 1024 / 1024
36

The bottom line here is that you ultimately need to constrain how quickly you complete write calls to be roughly in line with your underlying ability to transfer data.

My attempt at this was through modifying the socket polling algorithm.

The decision on whether it's a read or a write event that's occurred is taken in the sock_upcall callback.

The socket's receive buffer is queried about whether there's any data available via a call to sock_getsockopt(SO_NREAD). If there is, a thread waiting to read data is sent a read event wakeup() call. Otherwise, the algorithm considers this to be a write event and a thread waiting to write data is sent a wakeup() call. This doesn't take into account whether or not there's room in the send buffer.

It's a 'poor man's' socket polling algo, due to functions like sock_setupcalls(), soreadable(), and sowriteable() being part of the private API.

To try and bring the rate the data is being written in line with the rate it's being transferred, I tried modifying the write part of the algorithm by considering it a write event only if there's no more data left in the socket's send buffer via a call to sock_getsockopt(SO_NWRITE). That didn't help remedy the problem.
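Reduced to a pure function, the modified decision logic looks roughly like this (hypothetical names; in the real upcall the two counters come from sock_getsockopt(SO_NREAD) and sock_getsockopt(SO_NWRITE)):

```c
typedef enum { EVENT_READ, EVENT_WRITE, EVENT_NONE } sock_event_t;

/* nread:  bytes available in the receive buffer (SO_NREAD)
 * nwrite: bytes still queued in the send buffer  (SO_NWRITE) */
sock_event_t classify_upcall(int nread, int nwrite) {
    if (nread > 0)
        return EVENT_READ;   /* wakeup() a thread waiting to read */
    if (nwrite == 0)
        return EVENT_WRITE;  /* send buffer drained: wakeup() a writer */
    return EVENT_NONE;       /* data still queued: no wakeup */
}
```

The original algorithm is the same function without the nwrite check, i.e. any upcall with no readable data is treated as a write event.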

I can't think of other ways of achieving this at the moment.

My attempt at this was through modifying the socket polling algorithm.

I don't know enough about the internals of your code to be sure, but my intuition is that this is the wrong place to try and address this.

Moving to here....

The decision on whether it's a read or a write event that's occurred is taken in the sock_upcall callback.

Making things more explicit, my understanding is that your current logic basically looks like this:

  • write() called.
  • VNOP_WRITE is called in your driver.
  • Driver copies input data.
  • Driver returns from VNOP_WRITE
  • write() returns.
  • Driver finishes uploading data "at some point".

That basic approach is how any kind of write cache works. However, unless you're able to transfer data to disk faster than the system can call write()*, you're going to accumulate memory in the kernel and eventually panic. That means that at some point you need to decide to NOT return from "VNOP_WRITE" and instead block.

*Hint, this cannot be done, particularly since user space is allowed to "cheat" and have multiple threads calling write().

Note that this issue doesn't really apply to read(). You can't return from read() until you "have" the data, so a certain amount of blocking is implicit in its design/"thinking". You might get into trouble if you allow your read cache to become too large, but that risk is more obvious and avoidable.
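For illustration, here's a minimal userspace sketch of that kind of back-pressure, with a pthread condition variable standing in for the msleep()/wakeup() pair a kext would use (the names and the high-water mark are hypothetical):

```c
#include <pthread.h>
#include <stdint.h>

/* High-water mark on bytes accepted by VNOP_WRITE but not yet sent.
 * The value is a made-up example; a real driver would tune this. */
#define PENDING_HIGH_WATER (64ULL * 1024 * 1024)

static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  g_room = PTHREAD_COND_INITIALIZER;
static uint64_t        g_pending;  /* bytes copied in but not yet transferred */

/* VNOP_WRITE path: block until the backlog drops below the high-water mark. */
void write_acquire(uint64_t nbytes) {
    pthread_mutex_lock(&g_lock);
    while (g_pending + nbytes > PENDING_HIGH_WATER)
        pthread_cond_wait(&g_room, &g_lock);  /* stand-in for msleep() */
    g_pending += nbytes;
    pthread_mutex_unlock(&g_lock);
}

/* Transfer-completion path: shrink the backlog and wake blocked writers. */
void write_complete(uint64_t nbytes) {
    pthread_mutex_lock(&g_lock);
    g_pending -= nbytes;
    pthread_cond_broadcast(&g_room);          /* stand-in for wakeup() */
    pthread_mutex_unlock(&g_lock);
}

/* Current backlog, for observation. */
uint64_t write_pending(void) {
    pthread_mutex_lock(&g_lock);
    uint64_t n = g_pending;
    pthread_mutex_unlock(&g_lock);
    return n;
}
```

The point of the sketch is only the shape: the write path refuses to return until the unwritten backlog is bounded, which is what keeps kernel memory from growing without limit.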

The socket's receive buffer is queried about whether there's any data available via a call to sock_getsockopt(SO_NREAD). If there is, a thread waiting to read data is sent a read event wakeup() call. Otherwise, the algorithm considers this to be a write event and a thread waiting to write data is sent a wakeup() call. This doesn't take into account whether or not there's room in the send buffer.

Unless I'm missing something, this is already too late, as you've already returned from VNOP_WRITE/"write()".

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Your description of the write operation logic my filesystem implements is fairly close. I thought I'd fill in a few blanks though.

  • write() called.
  • VNOP_WRITE gets called.
  • Input data gets copied from user space into kernel space.
  • Input data equal to the size of uio_resid() gets split up into smaller chunks and is iteratively transmitted over the network until there's no more input data left, with sock_send(MSG_DONTWAIT) returning the number of bytes written in sentlen, which is equal to the value of uio_resid() of the chunk passed into sock_send().
  • VNOP_WRITE returns once all input data has been transmitted.
  • write() returns.
  • Not sure about the 'driver finishes uploading data at some point' part though. Or were you referring to the TCP stack queueing up the input data after sock_send(MSG_DONTWAIT) returns for subsequent transmission?
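The chunked-transmit step above, sketched in userspace with a stub standing in for sock_send(MSG_DONTWAIT) (the chunk size and names are hypothetical, and real code must handle partial sends and EAGAIN):

```c
#include <stddef.h>

/* Stub for sock_send(..., MSG_DONTWAIT, &sentlen): pretend the stack
 * accepts everything offered and report that in *sentlen. */
static int stub_send(const char *buf, size_t len, size_t *sentlen) {
    (void)buf;
    *sentlen = len;
    return 0;
}

#define CHUNK_SIZE 65536  /* hypothetical chunk size */

/* Transmit `resid` bytes (the uio_resid() of the write) chunk by chunk;
 * returns the total number of bytes handed to the stack. */
size_t transmit_all(const char *data, size_t resid) {
    size_t total = 0;
    while (resid > 0) {
        size_t chunk = resid < CHUNK_SIZE ? resid : CHUNK_SIZE;
        size_t sentlen = 0;
        if (stub_send(data + total, chunk, &sentlen) != 0)
            break;
        total += sentlen;
        resid -= sentlen;
    }
    return total;
}
```

Note that because the stub always "accepts" the data, this loop makes visible the core issue discussed in this thread: returning from it says nothing about whether the TCP stack has actually transferred those bytes yet.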

At some point you need to decide to NOT return from "VNOP_WRITE" and instead block

Without first understanding what's using up all that VM, I can't see how such a point may be determined. How would VNOP_WRITE block?

For some reason, IOLog() data doesn't wind up in the output of log show|stream on macOS 15.5 running on M4 Max.

I've added debug print statements to output buffer sizes and memory (de)allocation for uio_t buffers used to store the chunks of input data. I'll run the zip file extraction while collecting the diagnostics via log stream on an M2 Mac where logging is working and see if that reveals anything of interest.

First off, I wanted to add a quick comment about something you said earlier:

Two-machine debugging on Apple Silicon has certain limitations, which are worded in the KDK Readme as follows:

Note: Apple silicon doesn’t support active kernel debugging. You may inspect the current state of the kernel when it is halted due to a panic or NMI. However, you cannot set breakpoints, continue code execution, step into code, step over code, or step out of the current instruction.

x86-64 is better suited for remote kernel debugging. This particular panic doesn't occur on x86-64 though. Only on Apple Silicon.

This is much less of a limitation than it might seem because, in practice, being able to break and resume the kernel has always been far less useful than it might appear. So much of the system's overall behavior is driven by timing that pausing the kernel for any significant length of time is so disruptive to its normal operation that resuming it is of limited value.

One suggestion I do have here is to think of IOPanic() (or, more likely, Debugger()) as an investigative tool, not just an error path. If you're able to broadly identify the point at which the issue starts to occur, you can then use IOPanic() to halt the kernel inside your driver at whatever point will give you the best access to your driver's state, then use the debugger to investigate the specifics in more detail.

Looking at some details:

I thought I'd fill in a few blanks though.

One broad comment here. Looking at things a bit more closely, none of our file systems actually do "direct" I/O through VNOP_READ/VNOP_WRITE. What they actually do is use the cluster I/O functions. Those functions then generate the actual I/O calls by calling into VNOP_STRATEGY. The cluster I/O functions are what control the UBC, so this also means that the system has much more control over what data is living in memory and when it knows it's been flushed to disk.

Input data equal to the size of uio_resid() gets split up into smaller chunks and is iteratively transmitted over the network until there's no more input data left, with sock_send(MSG_DONTWAIT) returning the number of bytes written in sentlen, which is equal to the value of uio_resid() of the chunk passed into sock_send().

My concern here is that the system is actually queuing up far more data here than you realize, allowing a very large backlog of unwritten data to accumulate. Have you looked into how that flow is working? Note that this is another case where manipulating network bandwidth can be helpful, as you'd immediately be able to see the effect, even with relatively small I/O requests.

Without first understanding what's using up all that VM, I can't see how such a point may be determined.

Sure, but the issue here is that if you're not controlling that point, then something else in the system is (or isn't...).

How would VNOP_WRITE block?

Because you blocked waiting on "something"? Typically that would be either an I/O call or a lock.

For some reason, IOLog() data doesn't wind up in the output of log show|stream on macOS 15.5 running on M4 Max.

Is all kernel logging missing, or are your messages being redacted? On the redaction side, DriverKit actually uses the kernel's logging infrastructure, so I think the solution described in this thread would also apply to the kernel.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

I was able to collect some diagnostics which revealed an interesting pattern of things that were happening prior to the kernel panic occurring.

Depending on the amount of memory available, one or more files are able to get extracted from the zip file before the panic occurs.

[File 1: 732.2MB]
767814479  03-27-2025 13:16   25116_CINEGRAPHER_MARCCAIN/01_IN/02_INGEST/2025-03-30/MARC_CAIN_23012024_2200/616919_Smoke_Overlay_Mysterious_Fog_By_Kino_Pravda_Artlist_4K.mov

vnop_write: ArchiveService-1872
vnop_write: -> cluster_write(616919_Smoke_Overlay_Mysterious_Fog_By_Kino_Pravda_Artlist_4K.mov, uio_resid:9314, originalEOF:0, newEOF:9314, zero_head_off:0, zero_tail_off:9314, lflag:100001)
vnop_write: <- cluster_write(616919_Smoke_Overlay_Mysterious_Fog_By_Kino_Pravda_Artlist_4K.mov, uio_resid:0, originalEOF:0, newEOF:9314, zero_head_off:0, zero_tail_off:9314, lflag:100001) -> 0
[...]
vnop_write: ArchiveService-1872
vnop_write: -> cluster_write(616919_Smoke_Overlay_Mysterious_Fog_By_Kino_Pravda_Artlist_4K.mov, uio_resid:9965, originalEOF:767804514, newEOF:767814479, zero_head_off:0, zero_tail_off:767814479, lflag:100001)
vnop_write: <- cluster_write(616919_Smoke_Overlay_Mysterious_Fog_By_Kino_Pravda_Artlist_4K.mov, uio_resid:0, originalEOF:767804514, newEOF:767814479, zero_head_off:0, zero_tail_off:767814479, lflag:100001) -> 0

vnop_pageout: ArchiveService-1872
vnop_pageout error, file is not mmapped
[...]
vnop_strategy: ArchiveService-1872
vnop_strategy: -> do_write(616919_Smoke_Overlay_Mysterious_Fog_By_Kino_Pravda_Artlist_4K.mov, uio_resid:8388608, 0, pid:1872)
vnop_strategy: <- do_write(616919_Smoke_Overlay_Mysterious_Fog_By_Kino_Pravda_Artlist_4K.mov, uio_resid:0, 0, pid:1872) -> 8388608
[...]
vnop_strategy: ArchiveService-1872
vnop_strategy: -> do_write(616919_Smoke_Overlay_Mysterious_Fog_By_Kino_Pravda_Artlist_4K.mov, uio_resid:4451151, 0, pid:1872)
vnop_strategy: <- do_write(616919_Smoke_Overlay_Mysterious_Fog_By_Kino_Pravda_Artlist_4K.mov, uio_resid:0, 0, pid:1872) -> 4451151

% awk '/do_write\(616919_Smoke_Overlay_Mysterious_Fog_By_Kino_Pravda_Artlist_4K.mov.*->/{sz+=$NF}END{print sz}' unzip-64Gb-zip-log-stream-current.log
767814479

cluster_write() gets called repeatedly until all file data is stored in VM.

Then VNOP_PAGEOUT gets called. I thought that VNOP_PAGEOUT was only called for mmapped files. But there are no VNOP_MMAP/VNOP_MNOMAP calls being made for the files being extracted.

The current VNOP_PAGEOUT implementation ends up calling err_pageout() for files that didn't get tagged as being memory mapped by VNOP_MMAP.

After a number of VNOP_PAGEOUT calls, VNOP_STRATEGY is called which commits file data to remote storage via a do_write() rpc.

This pattern is repeated until the system ultimately runs out of VM, with a kernel panic ensuing.

My intuition is telling me that those VNOP_PAGEOUT calls are being made for a reason. Possibly, to free up the memory pages used up by the cluster API.

I tried calling cluster_pageout() from VNOP_PAGEOUT despite VNOP_MMAP never being called. But that resulted in VNOP_STRATEGY being called through two separate paths, VNOP_PAGEOUT and cluster API, which resulted in I/O stalling.

Any further pointers would be much appreciated.

Then VNOP_PAGEOUT gets called. I thought that VNOP_PAGEOUT was only called for mmapped files.

No, that's not correct, that's definitely not correct. The idea behind the UBC ("Universal Buffer Cache") is to unify the I/O and VM systems into a common architecture. In concrete terms, these two cases:

  1. Reading the first 16kb of a file.

  2. Mapping the first 16kb of the same file.

...should both return a VM page that's actually backed by EXACTLY the same page in the UBC system. Similarly, writing to the first 16kb of that file and modifying a mapped file both have EXACTLY the same result; that is, they both modify the same page in the UBC.

But there are no VNOP_MMAP/VNOP_MNOMAP calls being made for the files being extracted.

No, you called cluster_write(), which basically means "please modify these pages in the UBC".

The current VNOP_PAGEOUT implementation ends up calling err_pageout() for files that didn't get tagged as being memory mapped by VNOP_MMAP.

I don't think that's going to end very well. Have you looked at what it's actually asking to pageout and how that relates to the data you've returned?

My intuition is telling me that those VNOP_PAGEOUT calls are being made for a reason. Possibly, to free up the memory pages used up by the cluster API.

Trying to sort out exactly how this works is more than an afternoon's work, but what I think is going on here is that the system is actually dealing with two different concerns:

After a number of VNOP_PAGEOUT calls, VNOP_STRATEGY is called which commits file data to remote storage via a do_write() rpc.

VNOP_STRATEGY-> I think this is the system flushing dirty pages to disk as part of its normal "let's get stuff to disk" strategy.

VNOP_PAGEOUT-> Yes, I think this is the system letting you know that it's trying to reclaim pages. What exactly it's doing in your particular case isn't clear, but a comment in the SMB source does hint at one possibility:

/*
* One case where the file is not currently open and we get called.
* vnode_vput->vnode_reclaim_internal->vgone->vclean will call
* ubc_msync_internal which calls vnode_pageout even though all dirty
* data was pushed out on the last vnop_close.
*
* If file is not open, no way to push out data, so just return an
* error.
*/

But that resulted in VNOP_STRATEGY being called through two separate paths, VNOP_PAGEOUT and cluster API, which resulted in I/O stalling.

This makes me very nervous, as my intuition is basically "yes, VNOP_STRATEGY can be called on a whole lot of different paths". Setting all other issues aside, I think you could make this happen by having multiple processes open the same file and then shoving I/O at it (particularly if mapping and direct I/O were both being used).

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Further diagnostics have shown that ubc_msync(UBC_PUSHDIRTY|UBC_SYNC|UBC_INVALIDATE) was being called, for the file being written, through the VNOP_GETATTR path on each change of the file size while cluster_write() of the same file was in progress.

The change I made now calls ubc_msync() only if there's no cluster_write() in progress when getting file attributes.
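A minimal sketch of that guard (hypothetical names; in the real kext the flag would live in the vnode's private data and be read under the node lock):

```c
#include <stdbool.h>

/* Sketch-level stand-in for per-vnode state. */
bool cluster_write_in_progress;

/* Decide whether the VNOP_GETATTR path should push dirty pages with
 * ubc_msync(UBC_PUSHDIRTY | UBC_SYNC | UBC_INVALIDATE). */
bool should_msync_on_getattr(void) {
    /* Skip the push/invalidate while cluster_write() is still filling
     * the UBC; otherwise every file-size change triggers a full msync. */
    return !cluster_write_in_progress;
}
```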

To test the patch I extracted the zip file a few times in a row with the system no longer crashing and the following metrics being reported by vm_stat(1):

% vm_stat 30 | awk '/Statistics/; !/Statistics/{print $3,"|",$13}'
Mach Virtual Memory Statistics: (page size of 16384 bytes)
specul | file-backed
165368 | 653350
65896 | 888301
66724 | 920760
65872 | 907261
63935 | 898648
67795 | 885978
63295 | 871778
59696 | 863204
3309 | 807539
188881 | 859100
60996 | 1018163
63041 | 1015598
59276 | 1013489
57392 | 1013524
60394 | 1013068
56646 | 1009342
634 | 953419
333949 | 1003401
59119 | 1132207
60277 | 1131932
56650 | 1128616
59147 | 1124974
56620 | 1124759
57165 | 1123408
55714 | 1122087

Thanks very much for all your help.

Are you at liberty to disclose how you obtained the backtraces from the full panic log I'd supplied?

Are you at liberty to disclose how you obtained the backtraces from the full panic log I'd supplied?

Well, to be honest, I tossed the log into an internal tool that did the symbolication for me. In terms of doing the process yourself, it isn't actually all that complicated (in principle), it would just be REALLY painful to do "by hand", as the format was basically designed to be as SMALL as possible, not to be at all readable. If you haven't already, please file a bug asking us to provide a tool/tools for this process and then post the bug number back here.

In terms of how you'd do it yourself, if you look at the unsymbolicated crash log* you'll find a section that starts with:

"processByPid" : {
"0" : {"timesThrottled":0,"pageIns":0,"rawFlags":"0x20800001","timesDidThrottle":3,"procname":"kernel_task",....
"1" : {"timesThrottled":0,"codeSigningAuxiliaryInfo":139586437120,"pageIns":2297,"systemTimeTask":2.57985825,...
"103" : {"timesThrottled":0,"codeSigningAuxiliaryInfo":288230376151711744,"pageIns":75,"systemTimeTask":0.111604833,...
...

*Make sure you use a text editor that can disable line wrapping, otherwise the log is an undecipherable mess.

Each of those lines contains "all" of the information about that particular PID. Going back to the symbolicated "chunk" I sent earlier, it included this unsymbolicated chunk:

13524 Kernel Frames
kernel.release.t8112 0xfffffe0008acbaf4 thread_invoke + 1180 (sched_prim.c:3566)
kernel.release.t8112 0xfffffe0008aca59c thread_block_reason + 244 (sched_prim.c:4183)
kernel.release.t8112 0xfffffe0008ba0ce4 vm_page_wait + 436 (vm_resident.c:5443)
kernel.release.t8112 0xfffffe0008ba29bc vm_page_alloc_list + 96 (vm_resident.c:8581)
kernel.release.t8112 0xfffffe0008b562f8 kmem_alloc_guard_internal + 404 (vm_kern.c:908)
kernel.release.t8112 0xfffffe0008aa7298 kalloc_large + 360 (kalloc.c:2364)
kernel.release.t8112 0xfffffe0008fdbe40 _MALLOC + 68 (kern_malloc.c:119)
[15, 181932]
[15, 184316]
[15, 317620]
[15, 316216]
[15, 71024]
[15, 70676]
[15, 236844]
kernel.release.t8112 0xfffffe0008c7b994 cluster_io + 3416 (vfs_cluster.c:2336)
kernel.release.t8112 0xfffffe0008c8549c advisory_read_ext + 756 (vfs_cluster.c:6258)

Searching for "181932" will get you to this line:

"1413" : {"timesThrottled":0,"codeSigningAuxiliaryInfo":0,"pageIns":3922,"systemTimeTask":37.890620249999998,"rawFlags":"0x4C790801",...

Which, after replacing "]," with "],\n", will let you find this:

"user_usec":3287993,"kernelFrames":[[0,604916],
[0,599452],
[0,1477860],
[0,1485244],
[0,1172216],
[0,455320],
[0,5914176],
[15,181932],
[15,184316],
[15,317620],
[15,316216],
[15,71024],
[15,70676],
[15,236844],
[0,2374036],
[0,2413724],

Those are the actual frames you're symbolicating. The second value is the address, offset (I think) from the start of the library. The first value is an index into the "binaryImages" list which follows "processByPid":

  "binaryImages" : [["31040a93-b7a4-3b28-8a52-1d311ee62ecc",18446741874828394496,"T"],["d097e7da-1aa7-319f-bfb0-703c46b031b9",18446741874884756224,"T"]...

SO "0" is:

["31040a93-b7a4-3b28-8a52-1d311ee62ecc",18446741874828394496,"T"]

Note this is also the kernel UUID earlier in the log:

Kernel UUID: 31040A93-B7A4-3B28-8A52-1D311EE62ECC

And "15" is:

["1473bf7b-1487-3fe3-94bb-649224e65f21",18446741874883219200,"T"],

...which I assume is the UUID of your KEXT.

The second value is then the load address of the library.

I'd need to experiment a bit to sort out the exact symbolication from there, but the sequence would basically be:

On that last step, I think you could just use the offset from the original frame and a load address of "0", but you might need to add the offset to the load address and then symbolicate. That's something I'd need to experiment with to be 100% sure about.
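If you want to experiment with that, the arithmetic itself is simple. This sketch assumes (without confirmation, per the caveat above) that a frame's address is the image's decimal load value plus the frame offset:

```c
#include <stdint.h>

/* binaryImages stores each image's load address as a decimal uint64;
 * e.g. 18446741874828394496 is 0xfffffe0008784000. A frame like
 * [0, 604916] would then, under this assumption, resolve to
 * load + offset, which could be fed to atos. */
uint64_t frame_address(uint64_t image_load, uint64_t frame_offset) {
    return image_load + frame_offset;
}
```

For the first kernel frame above, frame_address(18446741874828394496, 604916) works out to 0xfffffe0008817af4; whether that is the right address to symbolicate, or whether the raw offset with -l 0 is the right move, is exactly the experiment described above.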

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

If you haven't already, please file a bug asking for us to provide a tool/tools for this process and then post the bug number back here.

That's very kind of you to suggest that. Please see FB17733912. Please advise if I should close my previous bug report, FB17497530, which you've already looked into.

That's very kind of you to suggest that. Please see FB17733912.

Perfect. Just to clarify, we probably won't release "our" tool as such, but this is something Xcode should address.

Please advise if I should close my previous bug report, FB17497530, which you've already looked into.

I think it's headed back to you as "3rd party to resolve", but you're welcome to close it yourself.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Thanks for clarifying that point. Hopefully, Xcode will soon gain these symbolication capabilities on Apple Silicon, which will certainly be much appreciated.

I have closed FB17497530. Thanks.
