prelink-like tool on macOS?

hi,

I have offered to help port a custom debug tool that "revives" a process from a core file. It currently works on Linux and Windows and I would like to help port it to macOS.

On Linux, prelink is used to load a dynamic library at a specific address (to match its location in the core file). On Windows, editbin is used.

Is there an off the shelf tool that loads a dylib on macOS at a specified address?

I tried to research this topic and I see:

  1. dylibs on macOS are position independent, though apparently it is possible to build a position dependent lib (but the note doesn't say how)
  2. there is a slide value that adjusts the base address of a dylib (but I cannot find much actual info on how exactly to use it)
  3. prebinding (deprecated?)
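To make the slide in item 2 concrete, here is a minimal C sketch. It assumes the usual relationship: runtime address = link-time address + slide. The helper names `apply_slide` and `slide_for` are made up for illustration. The macOS-only part uses the dyld introspection calls from `<mach-o/dyld.h>`, which only report the slide dyld picked and do not let you choose it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Runtime address of something linked at link_vmaddr, given the image's slide. */
static uint64_t apply_slide(uint64_t link_vmaddr, int64_t slide) {
    return link_vmaddr + (uint64_t)slide;
}

/* Slide that would be needed to place an image linked at link_base at
 * wanted_base (hypothetical helper; dyld offers no way to request this). */
static int64_t slide_for(uint64_t link_base, uint64_t wanted_base) {
    int64_t slide = (int64_t)(wanted_base - link_base);
    /* Round-trip sanity check: sliding the link base must give wanted_base. */
    assert(apply_slide(link_base, slide) == wanted_base);
    return slide;
}

#ifdef __APPLE__
#include <mach-o/dyld.h>

/* Print the slide dyld chose for every image in the current process. */
static void print_image_slides(void) {
    uint32_t count = _dyld_image_count();
    for (uint32_t i = 0; i < count; i++) {
        printf("%#lx  %s\n",
               (unsigned long)_dyld_get_image_vmaddr_slide(i),
               _dyld_get_image_name(i));
    }
}
#endif
```

Comparing the output of `print_image_slides()` across two runs of the same program should show the slides changing, which is the mismatch with the core file that a prelink-like tool would have to remove.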

I feel like I am starting to veer off into fun topics, like dylib hijacking and implementing custom dylib loaders (DyldDeNeuralyzer).

As much as I enjoy going off main path sometimes, can someone help set me back on the main path?

thanks!

If you haven’t already found it, please read An Apple Library Primer.

But before we start talking about the dynamic linker, I want to chat about the big picture. You wrote:

I have offered to help port a custom debug tool that "revives" a process from a core file.

What sort of fidelity are you looking for here?

This is impractical in the general case because macOS leans heavily into Mach IPC. The vast bulk of system services aren’t provided by the kernel but are instead provided by daemons and agents via IPC. A process typically interacts with those via Mach IPC, and specifically XPC, and rebuilding those connections is pretty much impossible.

So, if you’re OK with limiting this to Unix-y APIs then it might be worth continuing down this path. But if you want to get this working for apps, you should rethink your life choices )-:

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Thank you Quinn for your response.

This is impractical in the general case because macOS leans heavily into Mach IPC. The vast bulk of system services aren’t provided by the kernel but are instead provided by daemons and agents via IPC. A process typically interacts with those via Mach IPC, and specifically XPC, and rebuilding those connections is pretty much impossible. So, if you’re OK with limiting this to Unix-y APIs then it might be worth continuing down this path. But if you want to get this working for apps, you should rethink your life choices )-:

Yes, I am aware of the limitations - the purpose is not to fully "resurrect" the process, just to get enough running to do light poking. And yes, we are talking about POSIX APIs here.

I read An Apple Library Primer, but I'm not 100% sure which way I should proceed.

I think the easiest would be to create a location-dependent dylib and load that to make everything match. Are there any other paths? Is there any documentation describing how to build a location-dependent dylib?

the purpose is not to fully "resurrect" the process, just to get enough running to do light poking.

OK, cool, that’s well feasible.

In fact, LLDB supports something similar using the --core option. Did you look at that already?

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Quinn said:

In fact, LLDB supports something similar using the --core option. Did you look at that already?

I had the exact same thought, and when I asked the original author of the code about it, he said:

So with a debugger like gdb, attaching live it uses ptrace or similar to read memory that's already loaded. On a core, it isn't really loading libraries into a live process. It builds its own process model in memory but doesn't need to use the regular loader to get it there (as I understand it!)...

So, based on his understanding, studying gdb will not help. Is he wrong?
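For reference, the "own process model in memory" idea can be sketched in a few lines of C. This is only an illustration, not how gdb or LLDB actually do it. It assumes a standard 64-bit Mach-O core file (file type `MH_CORE`) read into a buffer on a little-endian host, and it re-declares the two structures it needs so it compiles outside macOS. The type names `hdr64_t` and `seg64_t` and the function `core_addr_to_offset` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Constants as defined in <mach-o/loader.h>, renamed to avoid clashes. */
#define CORE_MH_MAGIC_64   0xfeedfacfu
#define CORE_MH_CORE       0x4u
#define CORE_LC_SEGMENT_64 0x19u

/* Same layout as struct mach_header_64. */
typedef struct {
    uint32_t magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags, reserved;
} hdr64_t;

/* Same layout as struct segment_command_64. */
typedef struct {
    uint32_t cmd, cmdsize;
    char     segname[16];
    uint64_t vmaddr, vmsize, fileoff, filesize;
    int32_t  maxprot, initprot;
    uint32_t nsects, flags;
} seg64_t;

/* Translate a virtual address from the dead process into an offset in the
 * core file. Returns -1 if the address is not backed by any segment. */
static int64_t core_addr_to_offset(const uint8_t *buf, size_t len, uint64_t addr) {
    assert(sizeof(hdr64_t) == 32 && sizeof(seg64_t) == 72); /* layout check */
    hdr64_t h;
    if (len < sizeof h) return -1;
    memcpy(&h, buf, sizeof h);
    if (h.magic != CORE_MH_MAGIC_64 || h.filetype != CORE_MH_CORE) return -1;

    size_t off = sizeof h;
    for (uint32_t i = 0; i < h.ncmds; i++) {
        uint32_t cmd, cmdsize;
        if (len - off < 8) return -1;
        memcpy(&cmd, buf + off, 4);
        memcpy(&cmdsize, buf + off + 4, 4);
        if (cmdsize < 8 || cmdsize > len - off) return -1;
        if (cmd == CORE_LC_SEGMENT_64 && cmdsize >= sizeof(seg64_t)) {
            seg64_t s;
            memcpy(&s, buf + off, sizeof s);
            /* Only the first filesize bytes of a segment are in the file. */
            if (addr >= s.vmaddr && addr - s.vmaddr < s.filesize)
                return (int64_t)(s.fileoff + (addr - s.vmaddr));
        }
        off += cmdsize;
    }
    return -1;
}
```

With a lookup like this, reading memory "from the process" becomes reading bytes from the core file at the returned offset, with no loader and no live process involved.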

Right, but why is a process necessary? It’s not like you’ll be able to run this code, right? So what are you planning to do with this process that can’t be done with your own internal representation?

ps I suspect it would be possible to create a process and then map relevant bits of data into it from your core file. However, that’s gonna require some gnarly low-level programming (it may even involve… Mach *gasp* :-). Your life will be easier if you can avoid that.

Oh, and if you do need to do that then it’s gonna be important to know why, because that determines how much of the process you need running.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Right, but why is a process necessary? It’s not like you’ll be able to run this code, right? So what are you planning to do with this process that can’t be done with your own internal representation?

ps I suspect it would be possible to create a process and then map relevant bits of data into it from your core file. However, that’s gonna require some gnarly low-level programming (it may even involve… Mach gasp :-). Your life will be easier if you can avoid that.

Oh, and if you do need to do that then it’s gonna be important to know why, because that determines how much of the process you need running.

We acknowledge that the whole process won't be fully functional, but getting the memory mapped in for inspection is a goal, and running fragments of code from the mapped image should work.

Can you elaborate on why we need Mach? We are not against learning new technology, we just could use a little help pointing us in the right direction.

Any more insight on how we should proceed?

Thank you.

Accepted Answer
running fragments of code from the mapped image should work.

That’s going to be tricky because of macOS’s limitations on executing data, especially on Apple silicon.

Can you elaborate on why we need Mach?

See my Mach Musings discussion below.

IMPORTANT Thinking about this more, I suspect that path is infeasible due to code signing restrictions on Apple silicon. I was gonna delete all of that text but there’s some useful background info in there so I just moved it out of the way.

To offer further advice I need to dig deeper into your requirements. Let me start by confirming what I understand so far:

  • You want to support Intel and Apple silicon.

  • This is a developer tool, so you can reasonably require a recent version of macOS. That is, you don’t care if your solution doesn’t support macOS 10.15, or something of that era.

  • You are also prepared to accept some level of compatibility risk.

  • You have a core file that represents the state of a process.

  • Your goal is to create a process from that core file.

  • You need this to be a process, rather than just an abstract representation of the memory, because you want to execute the code from the core file.

  • You don’t care about reviving IPC connections, including both Mach IPC and BSD file descriptors.

Is that all right?

Now some questions:

  • How is this core file created? Is it a custom mechanism? Or do you want to work with standard macOS core files?

  • Do you want to load all the libraries? Or just some subset?

  • And if it’s a subset, how do you want to handle dependencies? Specifically, if library A is in the subset and it imports library B that isn’t, what happens?

Note That last case is important because all dynamic libraries eventually depend on libSystem and libSystem is in the dynamic linker shared cache.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"


Mach Musings

Apple’s dynamic linker expects to be in charge of the libraries loaded into your process. It doesn’t have the ability to load a library at a specified address. Even if it did, there’d still be lots of obstacles:

  • Apple doesn’t support statically linked executables [1]. Our system call interface is libSystem, not the system call trap instruction.

  • So there’s no way to call the dynamic linker without first loading a bunch of dynamic libraries.

  • And you can’t control where those libraries get loaded.

  • And they’ll likely load from the shared cache, and there’s no guarantee that those locations are aligned with the locations of the equivalent libraries in your core file.

The reason I was thinking about Mach is that it gives you complete control over the memory layout of a task. Now, we don’t support ‘naked’ Mach tasks, but you might be able to fake that by spawning an executable with the POSIX_SPAWN_START_SUSPENDED flag. That stops at the first instruction, before the dynamic linker runs, and so none of its address layout policies have been applied.

From there you can start mapping chunks of memory from your core file into your ‘empty’ process, allowing you to re-create its memory map.
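As a rough illustration of those two steps, here is an untested C sketch. It rests on several assumptions: that `task_for_pid` succeeds (which typically needs elevated privileges or a debugger entitlement), that the target range is free in the new task, and that the page size of the target matches the caller's. The helper names `page_align`, `spawn_suspended` and `map_region` are hypothetical. Per the warning above, executable mappings may well be rejected on Apple silicon, so treat this as a way to get data in place for inspection at best.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t addr; uint64_t size; } region_t;

/* Grow [addr, addr + size) outward to page boundaries, since VM
 * allocations are page granular. `page` must be a power of two. */
static region_t page_align(uint64_t addr, uint64_t size, uint64_t page) {
    assert(page != 0 && (page & (page - 1)) == 0);
    uint64_t start = addr & ~(page - 1);
    uint64_t end = (addr + size + page - 1) & ~(page - 1);
    region_t r = { start, end - start };
    return r;
}

#ifdef __APPLE__
#include <spawn.h>
#include <sys/types.h>
#include <unistd.h>
#include <mach/mach.h>
#include <mach/mach_vm.h>

extern char **environ;

/* Spawn `path` stopped before dyld runs. Returns 0 or an errno value. */
static int spawn_suspended(const char *path, pid_t *pid) {
    posix_spawnattr_t attr;
    int err = posix_spawnattr_init(&attr);
    if (err != 0) return err;
    err = posix_spawnattr_setflags(&attr, POSIX_SPAWN_START_SUSPENDED);
    if (err == 0) {
        char *argv[] = { (char *)path, NULL };
        err = posix_spawn(pid, path, NULL, &attr, argv, environ);
    }
    posix_spawnattr_destroy(&attr);
    return err;
}

/* Copy `size` bytes from the core file (already in `bytes`) to address
 * `addr` in the suspended process, then apply the final protection. */
static kern_return_t map_region(pid_t pid, uint64_t addr, const void *bytes,
                                uint64_t size, vm_prot_t prot) {
    mach_port_t task = MACH_PORT_NULL;
    kern_return_t kr = task_for_pid(mach_task_self(), pid, &task);
    if (kr != KERN_SUCCESS) return kr;

    region_t r = page_align(addr, size, (uint64_t)getpagesize());
    mach_vm_address_t dst = r.addr;
    /* VM_FLAGS_FIXED fails rather than relocating if the range is taken. */
    kr = mach_vm_allocate(task, &dst, r.size, VM_FLAGS_FIXED);
    if (kr == KERN_SUCCESS)
        kr = mach_vm_write(task, addr, (vm_offset_t)bytes,
                           (mach_msg_type_number_t)size);
    if (kr == KERN_SUCCESS)
        kr = mach_vm_protect(task, r.addr, r.size, FALSE, prot);
    mach_port_deallocate(mach_task_self(), task);
    return kr;
}
#endif
```

A caller would invoke `spawn_suspended` once and then `map_region` for each segment recovered from the core file, leaving the process suspended the whole time.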

Finally, two words of warning. The first is that I’ve never tried this, so I’m outlining a theoretical possibility, not a guaranteed path to success.

The second is that the Mach APIs are somewhat brittle. While they are APIs, and we do try to maintain binary compatibility for them, they are tightly bound to the platform’s implementation, and sometimes binary compatibility just isn’t possible.

For folks building a product that they want to ship to a wide range of end users, I recommend steering away from Mach APIs as much as possible. However, you’re building a developer tool and that gives you more latitude.

[1] Except for the dynamic linker itself.

Thank you!
