hi,
I have offered to help port a custom debug tool that "revives" a process from a core file. It currently works on Linux and Windows and I would like to help port it to macOS.
On Linux, prelink is used to load a dynamic library at a specific address (to match its location in the core file). On Windows, editbin is used.
Is there an off the shelf tool that loads a dylib on macOS at a specified address?
I tried to research this topic and I see:
- dylibs on macOS are position independent, though apparently it is possible to build a position dependent lib (but the note doesn't say how)
- there is a slide value that adjusts the base address of a dylib (but I cannot find much actual info on how exactly to use it)
- prebinding (deprecated?)
I feel like I am starting to veer off into fun topics, like dylib hijacking and implementing custom dylib loaders (DyldDeNeuralyzer).
As much as I enjoy going off main path sometimes, can someone help set me back on the main path?
thanks!
running fragments of code from the mapped image should work.
That’s going to be tricky because of macOS’s limitations on executing data, especially on Apple silicon.
Can you elaborate on why we need Mach?
See my Mach Musings discussion below.
IMPORTANT: Thinking about this more, I suspect that path is infeasible due to code signing restrictions on Apple silicon. I was gonna delete all of that text but there’s some useful background info in there so I just moved it out of the way.
To offer further advice I need to dig deeper into your requirements. Let me start by confirming what I understand so far:
- You want to support Intel and Apple silicon.
- This is a developer tool, so you can reasonably require a recent version of macOS. That is, you don’t care if your solution doesn’t support macOS 10.15, or something of that era.
- You are also prepared to accept some level of compatibility risk.
- You have a core file that represents the state of a process.
- Your goal is to create a process from that core file.
- You need this to be a process, rather than just an abstract representation of the memory, because you want to execute the code from the core file.
- You don’t care about reviving IPC connections, including both Mach IPC and BSD file descriptors.
Is that all right?
Now some questions:
- How is this core file created? Is it a custom mechanism? Or do you want to work with standard macOS core files?
- Do you want to load all the libraries? Or just some subset?
- And if it’s a subset, how do you want to handle dependencies? Specifically, if library A is in the subset and it imports library B that isn’t, what happens?
Note: That last case is important because all dynamic libraries eventually depend on libSystem and libSystem is in the dynamic linker shared cache.
Share and Enjoy
—
Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"
Mach Musings
Apple’s dynamic linker expects to be in charge of the libraries loaded into your process. It doesn’t have the ability to load a library at a specified address. Even if it did, there’d still be lots of obstacles:
- Apple doesn’t support statically linked executables [1]. Our system call interface is libSystem, not the system call trap instruction.
- So there’s no way to call the dynamic linker without first loading a bunch of dynamic libraries.
- And you can’t control where those libraries get loaded.
- And they’ll likely load from the shared cache, and there’s no guarantee that those locations are aligned with the locations of the equivalent libraries in your core file.
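To make the alignment problem in that last point concrete, here is a minimal sketch in C of the slide arithmetic involved. The `image_placement` struct, the function names, and the addresses in the usage note are all hypothetical, made up for illustration. On macOS the live values could be read with `_dyld_get_image_vmaddr_slide` from `<mach-o/dyld.h>`, but this sketch deliberately sticks to plain arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of where one image sat in one process. */
typedef struct {
    uint64_t preferred_vmaddr; /* link-time address recorded in the binary */
    uint64_t load_address;     /* address the image actually occupied */
} image_placement;

/* The slide is the distance between where the image was linked to live
 * and where it actually ended up. */
int64_t image_slide(image_placement p) {
    return (int64_t)(p.load_address - p.preferred_vmaddr);
}

/* Absolute pointers saved in the core file stay valid only if the same
 * image lands with the same slide in the new process. */
bool placements_match(image_placement in_core, image_placement in_live) {
    assert(in_core.preferred_vmaddr == in_live.preferred_vmaddr); /* same binary */
    return image_slide(in_core) == image_slide(in_live);
}

/* Translate an address inside the image from the core file's layout to
 * the live process's layout. */
uint64_t rebase_address(uint64_t addr_in_core,
                        image_placement in_core,
                        image_placement in_live) {
    return addr_in_core - in_core.load_address + in_live.load_address;
}
```

For example, an image linked at 0x100000000 that sat at 0x104a2c000 in the core file but loads at 0x102e10000 in the new process fails `placements_match`, so every absolute pointer into it that was captured in the core file would be stale unless it were rebased.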
The reason I was thinking about Mach is that it gives you complete control over the memory layout of a task. Now, we don’t support ‘naked’ Mach tasks, but you might be able to fake that by spawning an executable with the POSIX_SPAWN_START_SUSPENDED flag. That stops at the first instruction, before the dynamic linker runs, and so none of its address layout policies have been applied.
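As a rough illustration of the spawning step only, here is a minimal sketch in C. The helper names `spawn_suspended` and `discard_child` are hypothetical. `POSIX_SPAWN_START_SUSPENDED` is an Apple extension, so the sketch defines it as 0 on other platforms purely so that it still compiles there; in that case the child is not suspended at all. None of this addresses the later steps (getting at the child's task and mapping memory into it), which is where I'd expect the real obstacles to be.

```c
#include <assert.h>
#include <signal.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

#ifndef POSIX_SPAWN_START_SUSPENDED
/* Assumption: this flag only exists on Apple platforms. Elsewhere it is
 * defined as 0 so the sketch compiles, but the child then runs at once. */
#define POSIX_SPAWN_START_SUSPENDED 0
#endif

extern char **environ;

/* Spawns `path` so that, on macOS, it is stopped before the dynamic
 * linker has run. Returns the child's pid, or -1 on failure. */
pid_t spawn_suspended(const char *path, char *const argv[]) {
    assert(path != NULL && argv != NULL);

    posix_spawnattr_t attr;
    if (posix_spawnattr_init(&attr) != 0) {
        return -1;
    }

    pid_t pid = -1;
    if (posix_spawnattr_setflags(&attr, POSIX_SPAWN_START_SUSPENDED) == 0) {
        if (posix_spawn(&pid, path, NULL, &attr, argv, environ) != 0) {
            pid = -1;
        }
    }
    posix_spawnattr_destroy(&attr);
    return pid;
}

/* Kills and reaps a child that is no longer needed.
 * Returns 0 if the child was terminated by the signal, -1 otherwise. */
int discard_child(pid_t pid) {
    if (kill(pid, SIGKILL) != 0) {
        return -1;
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFSIGNALED(status) ? 0 : -1;
}
```

On macOS a child started this way stays stopped until it is sent SIGCONT, which is the window in which its address space could in principle be rearranged.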
From there you can start mapping chunks of memory from your core file into your ‘empty’ process, allowing you to re-create its memory map.
Finally, two words of warning. The first is that I’ve never tried this, so I’m outlining a theoretical possibility, not a guaranteed path to success.
The second is that the Mach APIs are somewhat brittle. While they are APIs, and we do try to maintain binary compatibility for them, they are tightly bound to the platform’s implementation, and sometimes binary compatibility just isn’t possible.
For folks building a product that they want to ship to a wide range of end users, I recommend steering away from Mach APIs as much as possible. However, you’re building a developer tool and that gives you more latitude.
[1] Except for the dynamic linker itself.