Early introspection #98

Open
stephenrkell opened this issue Nov 13, 2024 · 5 comments

Comments

@stephenrkell
Owner

Although liballocs queries cannot be issued before liballocs is initialized, which is some way into process start-up (but we aim for as early as possible), we would like the ability to introspect later on allocations made during early phases, e.g. by the ld.so itself. In general that requires allocsld to interfere with the ld.so, e.g. to instrument its memory allocation functions. And of course we must craft allocsld itself so that introspection on it, and anything it allocates, will later work correctly... there are some meta-completeness gaps at the moment (hence a link to #16).

I'm currently proof-of-concepting a way to instrument the ld.so's malloc functions, via binary instrumentation (hence some progress on #11).

This is related to #97.

@stephenrkell
Owner Author

How to inform liballocs of the ld.so's private heap area, and any bitmap we've already built from it, is not entirely clear. Perhaps a quick hack for now: probe for __minimal_malloc in ld.so, make a call to it if it exists, and find the bitmap info etc at a known offset from its load address? This will be the same offset used by allocsld to map the space for storing them, so it's not so brittle.

For validation it'd be nice if we also had caller address information for the memory mapping that made this space. That'd require us to trap the mmaps early during the ld.so execution (#97).

@stephenrkell
Owner Author

(Recall: the reason this needs special-casing is that the heap area has already been created, and we weren't able to catch that at the time. So we don't need to generalise this into a "scan any loaded object for mallocs and probe where they are allocating" style of hack.)

@stephenrkell
Owner Author

This is now done, mostly in 115be11. We use a Detours-style binary instrumentation to catch early calls to malloc and friends as defined within the ld.so (really __minimal_malloc). These are indexed in a very stupid sorted-array data structure, which is OK because this malloc is very lightly used. The instrumentation logic is in allocsld and it creates the trampolines. We pick these up in liballocs proper using (at present) a pretty ad-hoc convention for where to find the allocsld mapping.
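For readers unfamiliar with the idea, here is a minimal sketch of what a sorted-array allocation index can look like. It is not the code from 115be11: the struct layout, the names (`early_chunk`, `early_index`, `early_insert`, `early_lookup`) and the fixed capacity are all assumptions made for illustration. Insertion shifts the tail of the array, which is O(n) but harmless when the allocator is lightly used; lookup is a binary search that also resolves interior pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record for one early allocation; not liballocs's real layout. */
struct early_chunk {
    uintptr_t base;   /* start address returned by the early malloc */
    size_t size;      /* requested size in bytes */
    uintptr_t caller; /* call site, as a trampoline might record it */
};

#define EARLY_MAX 64  /* assumed small capacity: this malloc is lightly used */

struct early_index {
    struct early_chunk chunks[EARLY_MAX]; /* kept sorted by base */
    size_t n;
};

/* Binary search: position of the first chunk whose base is > addr. */
size_t early_upper_bound(const struct early_index *ix, uintptr_t addr)
{
    size_t lo = 0, hi = ix->n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (ix->chunks[mid].base <= addr) lo = mid + 1;
        else hi = mid;
    }
    return lo;
}

/* Insert a chunk, keeping the array sorted. Returns 0, or -1 if full. */
int early_insert(struct early_index *ix, uintptr_t base, size_t size,
                 uintptr_t caller)
{
    assert(ix != NULL);
    if (ix->n == EARLY_MAX) return -1;
    size_t pos = early_upper_bound(ix, base);
    /* Shift the tail up by one slot to open a gap at pos. */
    memmove(&ix->chunks[pos + 1], &ix->chunks[pos],
            (ix->n - pos) * sizeof ix->chunks[0]);
    ix->chunks[pos] = (struct early_chunk){ base, size, caller };
    ix->n++;
    return 0;
}

/* Find the chunk containing addr (interior pointers allowed), or NULL. */
const struct early_chunk *early_lookup(const struct early_index *ix,
                                       uintptr_t addr)
{
    assert(ix != NULL);
    size_t pos = early_upper_bound(ix, addr);
    if (pos == 0) return NULL;               /* addr is below every chunk */
    const struct early_chunk *c = &ix->chunks[pos - 1];
    return (addr - c->base < c->size) ? c : NULL;
}
```

A trampoline would call something like `early_insert` on each intercepted allocation, and a later introspection query would call `early_lookup` with an arbitrary pointer into the early heap.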

For meta-completeness (#16) it would be nice if the allocsld memory image could morph into the dlbind image for the process. That way, it would remain transparent to all introspection and the trampolines could be included too. Other data we collect 'early' could fall under the same approach (#97). That would require libdlbind to work with a somewhat-arbitrary initial memory image, rather than its own carefully crafted zygote as at present. I think that's doable but have to think about it. How we dlopen it, i.e. get it into the inferior link map, is tricky... we have to do the usual libdlbind trick of turning a MAP_PRIVATE into a MAP_SHARED, but we don't create it. It might be necessary to unrelocate, so that we get re-relocated correctly when dlopen occurs. Perhaps try memfd_create, copying the unrelocated allocsld image into it, and then dlopening the resulting /proc/self/fd link?
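The memfd idea at the end of that paragraph can be sketched as follows, assuming Linux with procfs mounted. This shows only the copy-and-name step: the helper `image_to_memfd` and the memfd name `"allocsld-image"` are hypothetical, no unrelocation is done, and the `dlopen` call itself is left out because it only makes sense on a real ELF image. The raw `syscall` is used to keep the sketch portable across libc versions; glibc 2.27 and later also provide a `memfd_create()` wrapper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy len bytes at image into a fresh memfd and format a /proc/self/fd
 * path for it into path. Returns the fd (caller closes it), or -1. */
int image_to_memfd(const void *image, size_t len, char *path, size_t path_len)
{
    assert(image != NULL && path != NULL);
    /* "allocsld-image" is only a debugging label; the name is made up. */
    int fd = (int) syscall(SYS_memfd_create, "allocsld-image", 0u);
    if (fd < 0) return -1;

    /* write() may be partial, so loop until the whole image is copied. */
    const char *p = image;
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);
        if (n <= 0) { close(fd); return -1; }
        done += (size_t) n;
    }

    /* This is the path one would hand to dlopen() for a real image. */
    int r = snprintf(path, path_len, "/proc/self/fd/%d", fd);
    if (r < 0 || (size_t) r >= path_len) { close(fd); return -1; }
    return fd;
}
```

The fd has to stay open at least until the path has been opened, since the `/proc/self/fd` entry disappears when the descriptor is closed.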

@stephenrkell
Owner Author

This would be quite pleasingly testable: implement unrelocation via a write() loop that reads from the existing raw memory image, unrelocating as it goes. Then dlopen, ensuring the big mmap gets placed precisely over the top of where the memory image already was. There should be no change in the contents of memory. A test case could dump pre- and post-images to temporary files and print the diff of their hexdumps (hd), hopefully empty.
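The round-trip property can be sketched in miniature, under strong simplifying assumptions: the only relocations are "relative" ones whose in-memory word equals the on-file word plus the load base, and their offsets are known and sorted. Real ELF relocation handling (RELA addends, other relocation types, finding the table via the dynamic section) is not modelled, and the function names are hypothetical. The copy goes to a buffer here rather than through write(), but the loop has the same streaming shape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy len bytes of a relocated image to out, undoing relative relocations
 * on the way. reloc_offs holds ascending byte offsets of the words that the
 * loader adjusted by adding load_base (a simplifying assumption). */
void unrelocate_copy(const unsigned char *image, size_t len,
                     uintptr_t load_base, const size_t *reloc_offs,
                     size_t nrelocs, unsigned char *out)
{
    size_t pos = 0;
    for (size_t i = 0; i < nrelocs; i++) {
        size_t off = reloc_offs[i];
        assert(off >= pos && off + sizeof(uintptr_t) <= len);
        memcpy(out + pos, image + pos, off - pos);  /* untouched bytes */
        uintptr_t word;
        memcpy(&word, image + off, sizeof word);    /* alignment-safe read */
        word -= load_base;                          /* undo the relocation */
        memcpy(out + off, &word, sizeof word);
        pos = off + sizeof word;
    }
    memcpy(out + pos, image + pos, len - pos);      /* tail after last reloc */
}

/* Stand-in for what the loader does at dlopen time: add load_base to each
 * relocated word, in place. */
void relocate_in_place(unsigned char *image, uintptr_t load_base,
                       const size_t *reloc_offs, size_t nrelocs)
{
    for (size_t i = 0; i < nrelocs; i++) {
        uintptr_t word;
        memcpy(&word, image + reloc_offs[i], sizeof word);
        word += load_base;
        memcpy(image + reloc_offs[i], &word, sizeof word);
    }
}
```

Unrelocating a loaded image and then relocating the copy at the same base should reproduce the original bytes exactly, which is the in-memory analogue of the empty diff described above.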

What is this primitive? It's close to: shareably 'forking' a loaded DSO (not in the sense of Unix fork of a process, of course). The gap is that allocsld is not a real DSO at the start of this process. So we're just adopting a range of raw memory as a DSO... however, the unrelocation feature brings it very close to what we'd need to create a shareable modification of a loaded (system, unmodifiable) DSO. This is a potentially useful image-style primitive.

@stephenrkell
Owner Author

Another almost-correct way to think of this is that we "prototypify" the underlying ELF image, so that our own copy is morally delegating to it. There's no continuing delegation though... we don't continue to share state with the prototype, and updates made later to it have no effect on us. So maybe we are "freezing" it also.

What are the next moves, in workflows that use this primitive? In the current scenario, one answer is to do dlbind-y things to it, i.e. adding new symbols and writing new code. In other scenarios we might patch existing code. And we might save it out to the filesystem (roughly by copying the memory image, noting the use of MAP_SHARED). So we're gaining the ability to modify ELF files "at the base level", by updating their in-memory data and code, rather than the usual meta-level (using APIs like libelf).
