Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate eBPF files as part of the build process #1666

Merged
merged 21 commits into from
Feb 27, 2025
Merged

Conversation

rafaelroquetto
Copy link
Contributor

@rafaelroquetto rafaelroquetto commented Feb 14, 2025

Instead of committing eBPF binaries and generated code, this PR adds the required changes to allow for the eBPF binaries to be generated as part of the Beyla build process. In particular, it adds the necessary boilerplate to allow projects that vendor Beyla to be able to produce these binaries without requiring a clang/llvm toolchain.

The main motivations are

  • prevent malicious pre-compiled binaries from being injected into the Beyla tree
  • removing the binaries ensures that they remain consistent with the source code, preventing discrepancies.
  • repos importing Beyla (i.e. Alloy) won’t need to manage git-lfs, as the removal of pre-built binaries eliminates that requirement and reduces repository overhead, and the requirement for special branches - git-lfs has been removed.

The go module vendoring system, via go mod can be quite pedantic. In order to force the contents of the bpf/ directory to be distributed, a few placeholder files had to be added. Projects vendoring Beyla then need to add the following import:

Vendoring Beyla

import _ "github.com/grafana/beyla/bpf"

Additionally, because go build is not suitable for building non-go source code, projects vendoring Beyla are required to explicitly trigger the eBPF file generation as a fair trade off by issuing:

go generate vendor/github.com/grafana/beyla/bpf/build_ebpf.go

or more generically:

go generate $(go list -tags beyla_bpf -f '{{.Dir}}' github.com/grafana/beyla/v2/bpf)/build_ebpf.go  

Changes to the generator image

The docker generator image now uses a Go / Alpine base image. This has massively simplified the Dockerfile and shrunk the image size in more than 1GB.

Another benefit of the new image is the ability of generating the eBPF files in parallel. It's significantly faster than make generate. As a result, it's now the preferred way of building eBPF images, and has been reflected in the Makefile. Finally, this promotes docker to being an official build requirement, but at the same time, the requirement for having clang/llvm is now dropped. I find this to be a fair trade-off, given most of our user base is more likely to have docker installed rather than clang.

Note

The image has been renamed from beyla-generator to beyla-bpf-generator to prevent any clashes during development

Epilogue

For Beyla itself, nothing changes apart from the fact that make dev now defaults to docker-generate. You can still compile things locally by running make generate, as customary.

Finally, after experimenting with multiple approaches, I believe the other feasible alternative would be to simply commit go binaries as done by most other projects, without git-lfs - and live with the set of problems this causes.

There's no free lunch.

See related Alloy branch: grafana/alloy#2734

Closes #1243

@rafaelroquetto rafaelroquetto force-pushed the ebpf_gen branch 2 times, most recently from c31901a to 9027cd2 Compare February 15, 2025 00:07
Copy link

codecov bot commented Feb 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69.89%. Comparing base (2ded441) to head (4100d20).
Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1666      +/-   ##
==========================================
+ Coverage   67.34%   69.89%   +2.55%     
==========================================
  Files         209      209              
  Lines       21065    21065              
==========================================
+ Hits        14186    14724     +538     
+ Misses       6144     5632     -512     
+ Partials      735      709      -26     
Flag Coverage Δ
integration-test 54.32% <ø> (?)
k8s-integration-test 54.43% <ø> (-0.03%) ⬇️
oats-test 34.89% <ø> (-0.21%) ⬇️
unittests 46.24% <ø> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rafaelroquetto rafaelroquetto force-pushed the ebpf_gen branch 2 times, most recently from 553afd6 to 2dfd519 Compare February 15, 2025 00:57
@rafaelroquetto rafaelroquetto added documentation Improvements or additions to documentation do-not-merge WIP or something that musn't be merged labels Feb 15, 2025
@rafaelroquetto rafaelroquetto marked this pull request as ready for review February 15, 2025 02:17
@rafaelroquetto rafaelroquetto requested a review from a team as a code owner February 15, 2025 02:17
@rafaelroquetto rafaelroquetto marked this pull request as draft February 15, 2025 02:25
@rafaelroquetto rafaelroquetto force-pushed the ebpf_gen branch 10 times, most recently from fe7f9f6 to 0d40b99 Compare February 25, 2025 21:23
@rafaelroquetto rafaelroquetto changed the title WIP: Generate eBPF files as part of the build process Generate eBPF files as part of the build process Feb 26, 2025
@rafaelroquetto rafaelroquetto marked this pull request as ready for review February 26, 2025 22:51
@rafaelroquetto rafaelroquetto removed the do-not-merge WIP or something that musn't be merged label Feb 26, 2025
Copy link
Contributor

@mariomac mariomac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@@ -1,3 +1,5 @@
//go:build beyla_bpf_ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to add this from now on in all our C files?

Is this tag specified later in any other go command invocation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct - it's never specified in any other go command invocation though - the purpose of this tag is so that does not try to build these .c files as part of package bpf (and then, in our case, complain there's no CGO). In other words, it prevents go from building those, as they are actually built by clang/bpf2go.


parser := newargParser(line)

// yank //go-generate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: what are these comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is parsing a string like this one:

//go:generate $BPF2GO -cc $BPF_CLANG -cflags $BPF_CFLAGS -target amd64,arm64 bpf ../../../../bpf/tc_http_tp.c -- -I../../../../bpf/headers

the way the parser works is by "shifting" the arguments left - so here, we are no interested in the //go:generate and the $BPF2GO args, so we just discard them.

Comment on lines +126 to +147
if arg == "-target" {
if arg, ok = parser.shift(); ok {
ctx.goArchs = strings.Split(arg, ",")
}
} else if arg == "-output-stem" {
if arg, ok = parser.shift(); ok {
ctx.stem = strings.ToLower(arg)
}
} else if arg == "-output-dir" {
if arg, ok = parser.shift(); ok {
ctx.outDir = arg
}
} else if arg == "--" {
break
} else if arg[0] == '-' {
parser.shift()
} else if ctx.stem == "" {
ctx.stem = strings.ToLower(arg)
} else if ctx.srcFile == "" {
ctx.srcFile = filepath.Join(ctx.srcDir, arg)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe now is too late, but do CLI in a cleaner way, is usually better to use https://github.com/spf13/cobra, which is widely used in Grafana

Copy link
Contributor

@mariomac mariomac Feb 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good observation, Marc! Actually, even if you don't use Cobra, Golang standard library allows simple handling of command line arguments. The above arg parsing would be simpler as:

package main

import (
	"flag"
	"fmt"
)

func main() {
    target := flag.String("target", "", "the target architecture for the generated code")
    outputStem := flag.String("output-stem", "", "the stem for the generated code")
    outputDir := flag.String("output-dir", "", "the output directory for the generated code")
    flag.Parse()
    fmt.Println("target", *target)
    fmt.Println("outputStem", *outputStem)
    fmt.Println("outputDir", *outputDir)
    fmt.Printf("args %#v\n", flag.Args())
}

Examples:

$ go run main.go -h
Usage of /..../main:
  -output-dir string
    	the output directory for the generated code
  -output-stem string
    	the stem for the generated code
  -target string
    	the target architecture for the generated code

$ go run main.go --output-dir foo --target=the-target -output-stem out-stem -- some extra args --unparsed-flag
target the-target
outputStem out-stem
outputDir foo
args []string{"some", "extra", "args", "--unparsed-flag"}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few reasons why I don't want to go down the route of using cobra. First, it's not really parsing a real command line here, but just string parsing, i.e. the following line:

//go:generate $BPF2GO -cc $BPF_CLANG -cflags $BPF_CFLAGS -target amd64,arm64 bpf ../../../../bpf/tc_http_tp.c -- -I../../../../bpf/headers` line.

Second, the "parser" is written in a fashion that there are no allocations - you will see that shift() returns a sub-slice of original string.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate what's is parser.shift() is doing and why it has to be that way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure - this is basically tokenizing the string - given a string that contains token separated by any number of white space:

"foo bar     baz abcdef 124 --   yxz"

shift() returns the next token.

so in this case:

parser := newParser("foo bar     baz abcdef 124 --   yxz")
a := parser.shift() // returns foo as a substring/slice - no copy/alloc
b := parser.shift() // returns bar a substring/slice - no copy/alloc
c := parser.shift() // returns baz a substring/slice - no copy/alloc
...

It enables me to parse things with a well defined order (i.e. as a stream of tokens), which then lets me infer the order, or the next expected argument for correctness, as in this case:

arg, ok := parser.shift()

if !ok { return }
if arg == "-out-dir" {
   // I expect another token that is the out dir
   outdir, ok := parser.shift()
}

it's basically a poor man's state machine...

This is inspired by Perl's shift operator and, to a lesser extent, strtok

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for completion, this is analogous to string.Fields(...) - but that returns an array of strings, and iterating over an array leads to more cumbersome code in this case, because you need to be able to advance the index inside the loop, i.e.

args := strings.Split("//go:generate $BPF2GO -cc $BPF_CLANG -cflags $BPF_CFLAGS -target amd64,arm64 bpf ../../../../bpf/tc_http_tp.c -- -I../../../../bpf/headers` line.")

for i := 0; i < len(args); ++i {
   if args[i] == "-out-dir" {
      i++
      if i >= len(args) {
        break; // error
      }
      outdir := args[i]
      continue
   }
}

compare with

arg, ok := parser.shift()

if !ok { return }
if arg == "-out-dir" {
   // I expect another token that is the out dir
   outdir, ok := parser.shift()
}

so I prefer the kind of semantics that shift() offers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marctc I would like to get this in (it's been a bit tedious keeping this up to date/rebases). If anything, we can discuss this particular code snippet separately and I can raise a new PR if push comes to shove.

@rafaelroquetto rafaelroquetto merged commit 998ab56 into main Feb 27, 2025
14 checks passed
@rafaelroquetto rafaelroquetto deleted the ebpf_gen branch February 27, 2025 19:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Build ebpf objects from a standardised container
3 participants