-
Notifications
You must be signed in to change notification settings - Fork 2
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Signed-off-by: innocentzero <[email protected]>
- Loading branch information
1 parent
b111320
commit 7516508
Showing
1 changed file
with
79 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
+ | ||
title = "Understanding ELFs, part 3" | ||
date = 2025-01-30 | ||
authors = ["InnocentZero"] | ||
+++ | ||
|
||
## On relocations, loading binaries, and more | ||
|
||
The reason we need relocations is because of a simple fact, the existence of shared libraries. | ||
|
||
One question anyone may ask is the necessity of having shared libraries. That is done to avoid | ||
repitition of pages in memory, a thing which was critical in older days because of low memory. | ||
Another thing to note is that there is separation of the library and the binary. The library can | ||
be updated without updating the binary as such. | ||
|
||
This is dealt with by using _relocation sections_. These contain the info needed to do the | ||
relocation of the symbol within the binary's context. The section usually links to an additional | ||
section where the relocation is going to happen. | ||
|
||
There are two ways in which object files may be linked: statically and dynamically. | ||
|
||
Static linking is fairly straightforward, the linker takes in all the object files and archive | ||
files (=libc.a=) and creates a single self-contained binary containing all the required | ||
functionality. This is done at the end of compilation itself. | ||
|
||
Dynamic linking is a slightly more complex and involved process. It defers the linking part from | ||
compile time to runtime. The binary contains the information about its choice of runtime linker | ||
(also referred to as an _interpreter_) and the dynamic symbols and how to obtain them. | ||
|
||
|
||
## Loading an ELF on the memory | ||
|
||
The system first executes the file's "interpreter" before handing over execution to the binary. | ||
Over here, the interpreter is obtained from the `.interp` section in the `PT_INTERP` segment in | ||
memory. This can be read using `readelf -p .interp example`. | ||
|
||
``` | ||
$ readelf -p .interp example | ||
String dump of section '.interp': | ||
[ 0] /lib64/ld-linux-x86-64.so.2 | ||
``` | ||
|
||
The interpreter loads the binary into memory first. | ||
|
||
The interpreter sets up the environment using the `.dynamic` section of the binary. This can be | ||
seen using `readelf -d executable`. | ||
|
||
In this, the interpreter will recursively begin visiting all the **NEEDED** dynamic libraries to be | ||
loaded into memory. For each dependency, the following steps are executed: | ||
|
||
- The ELF is mapped into memory. | ||
- Relocations are performed, in the original binary we patch all the absolute addresses and | ||
resolve references to other object files. | ||
- Its dynamic table is parsed and dependencies loaded. | ||
- Run `dl_init`, which executes all the functions from `INIT`, and `INIT_ARRAY` for the just loaded | ||
libraries. | ||
|
||
Now the control is handed over to `_start` in the ELF binary. That gets the pointer to `_dl_fini` | ||
in `rdx`. This prepares the stack with a few arguments and calls `_libc_start_main`. | ||
|
||
`_libc_start_main` receives a function pointer to `main`, `init`, `fini`, and `rtld_fini` (this is the | ||
same as `dl_fini`). | ||
|
||
This function has a bunch of things going on, such as setting up of thread local storage and | ||
such. Here we only care about two things: | ||
|
||
- `__cxa_atexit__` which sets up `_dl_fini` as the destructor after the program is done. | ||
|
||
- A call to `call_init` that run the constructors in the `INIT` and `INIT_ARRAY` dynamic table | ||
entries. Note that `dl_init` was for the entries in the shared libraries themselves, but this | ||
is for the binary. | ||
|
||
- Finally, control after this is handed over to `main`. | ||
|
||
- Immediately after `main`, `exit` is called. This only transfers the control to | ||
`__run_exit_handlers`. | ||
|
||
- This runs all the functions registered in `__exit_funcs` which also contains `_dl_fini`. |