Tracing programs cannot be attached to non-unique kernel symbols #894

Open
ti-mo opened this issue Dec 22, 2022 · 3 comments
Labels
bug Something isn't working

Comments

Collaborator

ti-mo commented Dec 22, 2022

Describe the bug

While hacking on #890, I decided to try and attach a fentry prog to all of my machine's symbols. Many of them fail with:

    attach Tracing/TraceFEntry: find target for fentry update_persistent_clock64 in vmlinux: type update_persistent_clock64: multiple candidates for *btf.Func

[1096] STRUCT 'timespec64' size=16 vlen=2
        'tv_sec' type_id=1095 bits_offset=0
        'tv_nsec' type_id=90 bits_offset=64
..
[28302] FUNC_PROTO '(anon)' ret_type_id=69 vlen=1
        'now64' type_id=1096
[28303] FUNC 'update_persistent_clock64' type_id=28302 linkage=static
..
[62944] FUNC_PROTO '(anon)' ret_type_id=69 vlen=1
        'now' type_id=1096
[62945] FUNC 'update_persistent_clock64' type_id=62944 linkage=static

update_persistent_clock64 is a weak vmlinux symbol that has an arch-specific implementation:
https://elixir.bootlin.com/linux/v6.1.1/source/kernel/time/ntp.c#L568
https://elixir.bootlin.com/linux/v6.1.1/source/arch/x86/kernel/rtc.c#L103

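The weak symbol names its argument now64, while the arch-specific implementation names it now. In the dump above, both prototypes have the same return type (ret_type_id=69) and the same parameter type (type_id=1096, struct timespec64), so they differ only in the parameter name.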
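To get a feel for how many Funcs are affected, the dump can be scanned for names that occur more than once. Below is a minimal Go sketch, assuming the text format printed by `bpftool btf dump` as shown above. The names funcLine and funcsByName are made up for illustration and are not part of ebpf-go.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// funcLine matches FUNC entries in the text output of `bpftool btf dump`, e.g.
// [28303] FUNC 'update_persistent_clock64' type_id=28302 linkage=static
// FUNC_PROTO lines do not match because of the space required after FUNC.
var funcLine = regexp.MustCompile(`^\[(\d+)\] FUNC '([^']+)'`)

// funcsByName groups the BTF IDs of all FUNC entries by function name.
func funcsByName(dump string) map[string][]int {
	out := make(map[string][]int)
	sc := bufio.NewScanner(strings.NewReader(dump))
	for sc.Scan() {
		m := funcLine.FindStringSubmatch(sc.Text())
		if m == nil {
			continue // STRUCT, FUNC_PROTO, member lines, ...
		}
		id, err := strconv.Atoi(m[1])
		if err != nil {
			continue
		}
		out[m[2]] = append(out[m[2]], id)
	}
	return out
}

func main() {
	// Sample input copied from the dump in this issue.
	dump := `[28302] FUNC_PROTO '(anon)' ret_type_id=69 vlen=1
        'now64' type_id=1096
[28303] FUNC 'update_persistent_clock64' type_id=28302 linkage=static
[62944] FUNC_PROTO '(anon)' ret_type_id=69 vlen=1
        'now' type_id=1096
[62945] FUNC 'update_persistent_clock64' type_id=62944 linkage=static`

	for name, ids := range funcsByName(dump) {
		if len(ids) > 1 {
			fmt.Printf("%s: %d candidates %v\n", name, len(ids), ids)
		}
	}
}
```

On a real machine the input would be the full output of `bpftool btf dump id 1` rather than the inline sample.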

Expected behavior

The library should either go with the first candidate, or additionally verify binary compatibility between all candidates' function signatures.

Or, we might be stuck between a rock and a hard place if the kernel expects the function's specific BTF ID to be given. In that case, we may need to try loading a prog pointing at each candidate until one is accepted.

Looks similar to #723 and #466.

@ti-mo ti-mo added the bug Something isn't working label Dec 22, 2022
Collaborator

lmb commented Dec 22, 2022

Can both 28303 and 62945 be attached to? Why is linkage=static on both funcs? Shouldn't one be weak (maybe a clang / pahole version issue)?

Collaborator Author

ti-mo commented Dec 23, 2022

I assume one should be weak, but it doesn't look like we can rely on this being set correctly. Even if it's fixed on master, older kernel/pahole versions will have it wrong.

@ti-mo ti-mo changed the title from "findTargetInKernel() must be able to resolve multiple candidates" to "Tracing programs cannot be attached to non-unique kernel symbols" Jan 13, 2023
Collaborator Author

ti-mo commented Jan 13, 2023

As it turns out, trying to attach Tracing programs to kernel symbols with overlapping names is currently subtly broken in Linux up until at least 6.1. Here's what I found:

λ  ~  sudo bpftool btf dump id 1 | grep "'type_show'"
[24381] FUNC 'type_show' type_id=3718 linkage=static
[28217] FUNC 'type_show' type_id=3798 linkage=static
[64201] FUNC 'type_show' type_id=64196 linkage=static
[73167] FUNC 'type_show' type_id=10388 linkage=static
[76730] FUNC 'type_show' type_id=76710 linkage=static
[108946] FUNC 'type_show' type_id=108945 linkage=static

λ  ~  grep -E "\stype_show$" /proc/kallsyms
0000000000000000 t type_show
0000000000000000 t type_show
... 18 symbols

In this kernel:

  • 6 BTF Funcs named type_show exist, each with a different type_id. They are not (necessarily) equivalent in terms of function signature, implementation or semantics, so they can do totally different things.
  • /proc/kallsyms shows 18 symbols named type_show

When loading a program from section fentry/type_show:

  • ebpf-go will reject the Func name -> btf_id lookup with a 'multiple candidates' error caused by TypeByName()
  • libbpf will pick the first-found Func, so 24381 in this case

When the kernel receives the program, it:

  • Looks up the Func in vmlinux BTF for the given btf_id.
  • Verifies the program taking into account the FuncProto corresponding to the given btf_id.

Finally, when attaching the program, the kernel takes the Func.Name of the given btf_id and looks up a kernel symbol by that name using kallsyms_lookup_name. Note that the returned symbol address doesn't necessarily correspond to the given btf_id. BTF goes through dedup, and BTF IDs are allocated by the compiler in order of declaration, so a btf_id cannot be mapped back to one specific symbol among several with the same name.

This means:

  • the program is unlikely to be attached to the function intended by the user, and the more candidates there are, the less likely it gets
  • the program is verified against a signature that often doesn't match the target function, allowing unsafe memory access

As such, we'll keep rejecting program loads for ambiguous attach targets. PR coming to make the error a bit more helpful.

ti-mo added a commit to ti-mo/ebpf that referenced this issue Jan 13, 2023
See cilium#894 for more context. Explicitly refuse loading programs with
multiple AttachTo candidates. BTF needs to carry more information to allow
disambiguating between them. The kernel API likely needs to be extended to
allow specifying which candidate to pick.

Signed-off-by: Timo Beckers <[email protected]>
ti-mo added a commit that referenced this issue Jan 13, 2023
See #894 for more context. Explicitly refuse loading programs with
multiple AttachTo candidates. BTF needs to carry more information to allow
disambiguating between them. The kernel API likely needs to be extended to
allow specifying which candidate to pick.

Signed-off-by: Timo Beckers <[email protected]>