New service: stable allocation identities #86

Open
stephenrkell opened this issue Mar 3, 2024 · 1 comment


A useful primitive for a lot of tool and diagnostic applications would be "stable allocation identities": instead of a trace or dump that has inscrutable 0xdeadbeef-esque numbers that change each time, a stable identity could say something like site-myfunc-2-thread-3-1+0x22. This identifier would be the same each execution, for the allocation created at the same point in the dynamic instruction trace.

I'm anticipating that

  • there would be multiple schemes, roughly one per allocator or group of allocators; here site means the identifier is based on the allocation site, which is appropriate for heap allocations
  • allocation sites are identified in a function-local way, e.g. here it's site number 2 within myfunc
  • dynamic program points are counted in a thread-local way, e.g. thread-3-1 means the second hit by thread 3 (counting from zero)
  • threads are (recursively) identified by the stable identity of their own control block
  • offsets can be tacked on to the end, to identify a byte/position within an allocation
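
As a rough illustration of how the pieces listed above might compose, here is a minimal Python model of the site scheme. It is not liballocs code: `SiteIdentityScheme`, `new_allocation` and `with_offset` are hypothetical names, and it assumes hits are counted per (thread, function, site) starting from zero. The thread is passed in as a ready-made string, whereas in the proposal it would itself be the stable identity of the thread's control block.

```python
from collections import defaultdict


class SiteIdentityScheme:
    """Toy model of the 'site' scheme (hypothetical, not the liballocs API).

    Names are built from a function-local site number and a per-thread
    hit count, so they do not depend on addresses.
    """

    def __init__(self):
        # (thread name, function, site number) -> hits seen so far
        self._hits = defaultdict(int)

    def new_allocation(self, thread, func, site):
        """Return the identity of the next allocation made by `thread`
        at site number `site` within `func`."""
        key = (thread, func, site)
        hit = self._hits[key]
        self._hits[key] += 1
        return f"site-{func}-{site}-{thread}-{hit}"


def with_offset(identity, offset):
    """Name a byte position within an allocation by appending an offset."""
    return f"{identity}+{offset:#x}"


scheme = SiteIdentityScheme()
first = scheme.new_allocation("thread-3", "myfunc", 2)
second = scheme.new_allocation("thread-3", "myfunc", 2)
print(with_offset(second, 0x22))  # site-myfunc-2-thread-3-1+0x22
```

Because the counters are keyed by thread, interleaving between threads does not change any name, which is what makes the identities repeatable across runs.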

A good test would be an ltrace-like tracer, implemented in-process, but where we print addresses as stable identifiers. It should give the same output for any run, even for multithreaded programs if we discard ordering (e.g. sort the output lines).
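
The comparison step of such a test could look roughly like the following Python sketch. The trace line format is invented for illustration, and `traces_equivalent` is a hypothetical helper; the only idea taken from the text above is that two runs should match once line ordering is discarded.

```python
def traces_equivalent(run_a, run_b):
    """Compare two trace outputs while ignoring the order of lines."""
    return sorted(run_a.splitlines()) == sorted(run_b.splitlines())


# Two runs of the same program with different thread interleavings
# (the line format is made up for this example).
run_1 = "\n".join([
    "malloc(16) = site-main-0-thread-0-0",
    "malloc(32) = site-worker-1-thread-1-0",
    "free(site-main-0-thread-0-0)",
])
run_2 = "\n".join([
    "malloc(32) = site-worker-1-thread-1-0",
    "malloc(16) = site-main-0-thread-0-0",
    "free(site-main-0-thread-0-0)",
])

print(traces_equivalent(run_1, run_2))
```

With raw addresses in place of the identities, the same comparison would typically fail from one run to the next.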

Note that the liballocs API already allows for allocations to have names. However, these names are not the same thing -- they are (1) optional (only for subobjects, symbols etc) and (2) non-unique (directly using the symbol name, subobject name, etc). By contrast, a stable allocation identity should be globally unique across a whole execution, and every allocation should have one (if asked for one).


stephenrkell commented Mar 4, 2024

This is related to the need faced by my editable-assembly-generating example client (see examples/ or #88) to generate symbolic names for any referenced position. That is interesting because what's being named there is a position in the allocation containment tree, not just an address. If we have an address and a pointed-to type, we can identify a particular tree node, not just an address-equal slice through the tree. (This is the insight behind my as-yet-unpublished bounds-checking work.)

So maybe the service could name a tree node, not just an address? Some defaulting logic could break ties if the client doesn't know which node it wants.
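
To make the tree-node idea concrete, here is a small Python sketch over a toy containment tree. Everything in it is an assumption made for illustration: the `Node` layout, the dotted path syntax, and in particular the default of picking the outermost node when no pointee type is given. The issue does not say what the defaulting logic should be.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str    # path component, e.g. a field name
    type: str    # type name of this (sub)object
    start: int   # byte offset from the start of the allocation
    size: int    # size in bytes
    children: list = field(default_factory=list)


def nodes_starting_at(node, offset, path=()):
    """Yield (path, node) for every node beginning exactly at `offset`,
    outermost first. These nodes form the address-equal slice."""
    path = path + (node.name,)
    if node.start == offset:
        yield path, node
    for child in node.children:
        if child.start <= offset < child.start + child.size:
            yield from nodes_starting_at(child, offset, path)


def name_node(root, offset, pointee_type=None):
    """Name one node at `offset`. A pointee type selects the node of that
    type; without one, fall back to the outermost candidate (an arbitrary
    default chosen for this sketch)."""
    candidates = list(nodes_starting_at(root, offset))
    if pointee_type is not None:
        candidates = [c for c in candidates if c[1].type == pointee_type]
    if not candidates:
        return None
    path, _ = candidates[0]
    return ".".join(path)


# Models: struct outer { struct inner { int x; } in; int y; };
x = Node("x", "int", 0, 4)
inner = Node("in", "struct inner", 0, 4, [x])
y = Node("y", "int", 4, 4)
root = Node("site-main-0-thread-0-0", "struct outer", 0, 8, [inner, y])

print(name_node(root, 0, "int"))  # site-main-0-thread-0-0.in.x
print(name_node(root, 0))         # site-main-0-thread-0-0
```

Offset 0 is shared by three nodes here (the whole struct, `in`, and `in.x`), so the address alone is ambiguous and the pointee type is what singles out one of them.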

Incidentally, a better API for navigating these tree slices would replace find_matching_subobject and first_subobject_spanning and walk_subobjects_spanning (others?).
