attacher: try different kernel sources if there is none in default locations #733

Merged 7 commits on Jun 15, 2023
Changes from 2 commits
6 changes: 6 additions & 0 deletions build/Dockerfile
@@ -32,4 +32,10 @@ ADD https://raw.githubusercontent.com/sustainable-computing-io/kepler-model-serv

ADD https://raw.githubusercontent.com/sustainable-computing-io/kepler-model-server/main/tests/test_models/AbsComponentModelWeight/Full/KerasCompWeightFullPipeline/KerasCompWeightFullPipeline.json /var/lib/kepler/data/KerasCompWeightFullPipeline.json

# pre install kernel sources
RUN mkdir -p /usr/share/kepler/kernel_sources

COPY --from=quay.io/sustainable_computing_io/kepler_kernel_source_images:ubi8 /usr/src/kernels /usr/share/kepler/kernel_sources
Review comment (Collaborator): Will this be kept in the released image? How big are the copied files?

Review comment (Contributor Author): Each kernel-devel package is about 5 MB.

COPY --from=quay.io/sustainable_computing_io/kepler_kernel_source_images:ubi9 /usr/src/kernels /usr/share/kepler/kernel_sources

ENTRYPOINT ["/usr/bin/kepler"]
4 changes: 4 additions & 0 deletions build/kernel-source-images/Dockerfile.ubi
@@ -0,0 +1,4 @@
FROM ImageName

ARG ARCH=amd64
Review comment (Collaborator): @SamYuan1990 it seems we need to add s390x as well?

RUN yum install -y kernel-devel
14 changes: 14 additions & 0 deletions build/kernel-source-images/build-kernel-source-images.sh
@@ -0,0 +1,14 @@
#!/bin/bash
set -x
IMAGE_BASE="quay.io/sustainable_computing_io/kepler_kernel_source_images"
for i in "8" "9"; do
Image="registry.access.redhat.com/ubi${i}/ubi"
echo "Building $i"
# podman doesn't support --build-arg
# replace ImageName with the actual image name using sed
sed "s|ImageName|${Image}|g" Dockerfile.ubi > Dockerfile.ubi.${i}

docker build --build-arg ImageName=${Image} -t ${IMAGE_BASE}:ubi${i} -f Dockerfile.ubi.${i} .
docker push ${IMAGE_BASE}:ubi${i}
rm Dockerfile.ubi.${i}
done
2 changes: 2 additions & 0 deletions cmd/exporter.go
@@ -60,6 +60,7 @@ var (
profileDuration = flag.Int("profile-duration", 60, "duration in seconds")
enabledMSR = flag.Bool("enable-msr", false, "whether MSR is allowed to obtain energy data")
enabledBPFBatchDelete = flag.Bool("enable-bpf-batch-del", true, "bpf map batch deletion can be enabled for backported kernels older than 5.6")
kernelSourceDirPath = flag.String("kernel-source-dir", "", "path to the kernel source directory")
)

func healthProbe(w http.ResponseWriter, req *http.Request) {
@@ -151,6 +152,7 @@ func main() {
config.SetEnabledHardwareCounterMetrics(*exposeHardwareCounterMetrics)
config.SetEnabledGPU(*enableGPU)
config.EnabledMSR = *enabledMSR
config.SetKernelSourceDir(*kernelSourceDirPath)

// the ebpf batch deletion operation was introduced in linux kernel 5.6, which provides better performance to delete keys.
// but the user can enable it if the kernel has backported this functionality.
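The new `-kernel-source-dir` flag defaults to the empty string, which leaves the fallback list empty. The following is a minimal sketch of that flag on its own, not Kepler's actual `main`: it uses a private `flag.FlagSet` so it can run in isolation, the helper name `parseKernelSourceDir` is hypothetical, and pairing the flag with the path from build/Dockerfile is an assumption about intended usage.

```go
package main

import (
	"flag"
	"fmt"
)

// parseKernelSourceDir is a hypothetical helper: it parses args with a private
// FlagSet and returns the value of -kernel-source-dir ("" when the flag is absent).
func parseKernelSourceDir(args []string) (string, error) {
	fs := flag.NewFlagSet("exporter-sketch", flag.ContinueOnError)
	dir := fs.String("kernel-source-dir", "", "path to the kernel source directory")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *dir, nil
}

func main() {
	// Assumed invocation; the path mirrors the COPY destination in build/Dockerfile.
	args := []string{"-kernel-source-dir=/usr/share/kepler/kernel_sources"}
	dir, err := parseKernelSourceDir(args)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("kernel source dir: %q\n", dir)
}
```

In the PR the parsed value is handed to `config.SetKernelSourceDir`, which expands it into per-kernel subdirectories.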
18 changes: 17 additions & 1 deletion pkg/bpfassets/attacher/bcc_attacher.go
@@ -21,6 +21,7 @@ package attacher

import (
"fmt"
"os"
"runtime"
"strconv"

Expand Down Expand Up @@ -138,9 +139,24 @@ func AttachBPFAssets() (*BpfModuleTables, error) {
}
// TODO: verify if ebpf can run in the VM without hardware counter support, if not, we can disable the HC part and only collect the cpu time
m, err := loadModule(objProg, options)
if err != nil {
klog.Infof("failed to attach perf module with options %v: %v, from default kernel source.\n", options, err)
dirs := config.GetKernelSourceDir()
Review comment (Collaborator): nit: better to rename it to SourceDirs if we allow multiple dirs.

Review comment (Collaborator): +1 to the suggestion above. If you follow the https://go.dev/doc/effective_go#Getters idiom, the recommended name would be config.KernelSourceDirs() 👼

for _, dir := range dirs {
klog.Infof("try to load eBPF module with kernel source dir %s\n", dir)
os.Setenv("BCC_KERNEL_SOURCE", dir)
m, err = loadModule(objProg, options)
if err != nil {
klog.Infof("failed to attach perf module with options %v: %v, from kernel source %q\n", options, err, dir)
} else {
klog.Infof("Successfully load eBPF module with option: %s from kernel source %q", options, dir)
break
}
}
}
if err != nil {
klog.Infof("failed to attach perf module with options %v: %v, not able to load eBPF modules\n", options, err)
- return nil, err
+ return nil, fmt.Errorf("failed to attach perf module with options %v: %v, not able to load eBPF modules", options, err)
}

tableId := m.TableId(tableProcessName)
27 changes: 27 additions & 0 deletions pkg/config/config.go
@@ -87,6 +87,9 @@ var (

configPath = "/etc/kepler/kepler.config"

// dir of kernel sources for bcc
kernelSourceDir = []string{}
Review comment (Collaborator): Seems like this should be plural?


////////////////////////////////////
ModelServerEnable = getBoolConfig("MODEL_SERVER_ENABLE", false)
ModelServerEndpoint = SetModelServerReqEndpoint()
Expand Down Expand Up @@ -148,6 +151,30 @@ func getConfig(configKey, defaultValue string) (result string) {
return
}

// SetKernelSourceDir sets the directory for all kernel source. This is used for bcc. Only the top level directory is needed.
func SetKernelSourceDir(dir string) {
// read all the kernel source directories
if dir != "" {
Review comment (Collaborator): with the validation proposed above, we shouldn't need to check if dir != "".

// list all the directories under `dir` and store in kernelSourceDir
klog.V(4).Infoln("kernel source dir is set to", dir)
files, err := os.ReadDir(dir)
if err != nil {
klog.Warning("failed to read kernel source dir", err)
}
kernelVers := fmt.Sprintf("%v", getKernelVersion(c))
for _, file := range files {
// only store directories and the directory name should contain the kernel version
if strings.Contains(file.Name(), kernelVers) && file.IsDir() {
kernelSourceDir = append(kernelSourceDir, filepath.Join(dir, file.Name()))
}
}
}
}

func GetKernelSourceDir() []string {
return kernelSourceDir
}

func SetModelServerReqEndpoint() (modelServerReqEndpoint string) {
modelServerURL := getConfig("MODEL_SERVER_URL", modelServerService)
if modelServerURL == modelServerService {