RHOAIENG-9707: chore(tests/containers): check shared objects with ldd #871

jiridanek · 2025-01-29T07:46:09Z

This sanity check will report any ELF files with dynamic linking that have unsatisfied dependencies.

* apply the `"LD_LIBRARY_PATH": os.path.dirname(dlib)` fix/workaround

  cuda image already has some LD_LIBRARY_PATH, this is to be preserved

      (app-root) sh-5.1$ echo $LD_LIBRARY_PATH
      /usr/local/nvidia/lib:/usr/local/nvidia/lib64

* use subtests to handle multiple libs not found
* skip known issue and false positives
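The `LD_LIBRARY_PATH` handling listed above can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: `ldd_env` and `missing_dependencies` are hypothetical helper names, appending (rather than prepending) the library's directory is an assumption, and the parsing assumes the usual glibc `ldd` output, where an unresolved dependency is printed as `name => not found`.

```python
import os


def ldd_env(dlib: str, base_env: dict[str, str]) -> dict[str, str]:
    """Keep any LD_LIBRARY_PATH already set and append the library's own directory."""
    existing = base_env.get("LD_LIBRARY_PATH", "")
    parts = [p for p in existing.split(":") if p]
    parts.append(os.path.dirname(dlib))
    return {**base_env, "LD_LIBRARY_PATH": ":".join(parts)}


def missing_dependencies(ldd_output: str) -> list[str]:
    """Collect the names that ldd reports as '=> not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing
```

Each name returned by `missing_dependencies` could then be reported as its own subtest, so that one unresolved library does not hide the others.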

Description

Running ldd on every file, one at a time, is slow on macOS because of cross-architecture emulation:

(app-root) bash-5.1$ time bash -c 'i=0; while [[ $i -lt 100 ]]; do ldd /bin/bash >/dev/null; i=$((i+1)); done'

real    0m5.684s
user    0m4.527s
sys     0m0.795s

Compare that with a native Linux VM, without the cross-architecture translation:

[jdanek@lima-default workbenches]$ time bash -c 'i=0; while [[ $i -lt 100 ]]; do ldd /bin/bash >/dev/null; i=$((i+1)); done'

real    0m0.220s
user    0m0.130s
sys     0m0.061s

Since there are 1000 to 2000 dynamic libraries to check in each of our images, the ldd-based check can take 70 seconds or more on macOS.
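As a rough back-of-the-envelope check of that estimate, the two timings above can be scaled per library. The sketch assumes that the cost of running `ldd` on `/bin/bash` is representative of the other files, which is only an approximation; `estimate_seconds` is a name made up for this illustration.

```python
def estimate_seconds(total_for_100_runs: float, n_libs: int) -> float:
    """Scale the measured wall-clock time for 100 ldd runs to n_libs runs."""
    return total_for_100_runs / 100 * n_libs


EMULATED = 5.684  # 'real' time for 100 runs under cross-arch emulation on macOS
NATIVE = 0.220    # 'real' time for 100 runs in the native Linux VM

for n_libs in (1000, 2000):
    emulated = estimate_seconds(EMULATED, n_libs)
    native = estimate_seconds(NATIVE, n_libs)
    print(f"{n_libs} libs: emulated ~{emulated:.0f}s, native ~{native:.1f}s")
```

This gives roughly 57 to 114 seconds under emulation against about 2 to 4 seconds natively, which is in line with the 70 or more seconds observed.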

When I mentioned this at a daily meeting, people weren't very concerned, so I'm not going to optimize it now.

For the future, I have already tried out one approach: using the debug/elf package in Go's standard library to simply read the DT_NEEDED entries out of the libraries and check those. It's quite a lot of code, but it seems to work, and it is fast.
check_elf_file_2.go.txt
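The Go code in the attachment is not reproduced here. Purely as an illustration of the DT_NEEDED idea, a comparable reader can be sketched in Python with the `struct` module. The function name `needed_libraries` is hypothetical, and the sketch assumes a 64-bit little-endian ELF image that still has its section header table. It only lists the needed names; resolving them against the library search path would be a separate step.

```python
import struct

DT_NULL, DT_NEEDED = 0, 1
SHT_DYNAMIC = 6


def needed_libraries(data: bytes) -> list[str]:
    """Return the DT_NEEDED names from a 64-bit little-endian ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("sketch only handles ELF64 little-endian")

    # ELF64 header: section header table offset, entry size, and entry count.
    (e_shoff,) = struct.unpack_from("<Q", data, 0x28)
    e_shentsize, e_shnum = struct.unpack_from("<HH", data, 0x3A)

    def section(index: int) -> tuple[int, int, int, int]:
        # Returns (sh_type, sh_offset, sh_size, sh_link) of one section header.
        base = e_shoff + index * e_shentsize
        (sh_type,) = struct.unpack_from("<I", data, base + 4)
        sh_offset, sh_size, sh_link = struct.unpack_from("<QQI", data, base + 24)
        return sh_type, sh_offset, sh_size, sh_link

    for i in range(e_shnum):
        sh_type, offset, size, link = section(i)
        if sh_type != SHT_DYNAMIC:
            continue
        # sh_link of the dynamic section points at its string table section.
        _, str_offset, str_size, _ = section(link)
        strtab = data[str_offset:str_offset + str_size]
        names = []
        # Each dynamic entry is a (d_tag, d_val) pair of 8 bytes each.
        for pos in range(offset, offset + size, 16):
            d_tag, d_val = struct.unpack_from("<qQ", data, pos)
            if d_tag == DT_NULL:
                break
            if d_tag == DT_NEEDED:
                end = strtab.index(b"\0", d_val)
                names.append(strtab[d_val:end].decode())
        return names
    return []  # no dynamic section, e.g. a statically linked binary
```

Reading a file into memory and passing the bytes, for example `needed_libraries(open(path, "rb").read())`, avoids starting a process per file, which is where the emulation overhead of `ldd` comes from.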

Alternatively, it should be possible to skip ldd and load the shared libraries directly from the Python script, to see whether each one loads or not:

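    from ctypes import CDLL

    def try_load(lib_path: str) -> bool:
        """Try to dlopen() the file and report whether it loaded."""
        try:
            CDLL(lib_path)
        except OSError as e:  # the file or one of its dependencies failed to load
            print(f"Error loading {lib_path}: {e}")
            return False
        print(f"Successfully loaded: {lib_path}")
        return True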

The question is whether executables can be casually dlopened like this. I did this in skupper-router (dlopening the current executable ;) so it is possible sometimes; I just don't know if it always is, and I worry that the binary has to be compiled with some special options.

How Has This Been Tested?

https://github.com/jiridanek/notebooks/actions/runs/13026586540

I checked everything it found by hand and populated an allowlist to suppress the false positives.

Overall, the usefulness of this check will be best seen when we do version upgrades, especially the significant ones for the 2025.a images. Is this going to be full of false positives for trivialities, or is it going to save us from DLL hell? So far it has found one true positive, https://issues.redhat.com/browse/RHOAIENG-18904, but I do worry that maintaining the allowlist is not going to be worth it.

Possibly, running `import xyz` for every Python package we have would be more useful. Adding a small hello-world script on top of that to check functionality would be even more powerful.
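The import-based idea could look something like the sketch below. It is an assumption about how such a check might be written, not something this PR contains: `failed_imports` is a hypothetical helper, and the list of module names would have to come from somewhere, for example from the packages installed in the image.

```python
import importlib


def failed_imports(module_names: list[str]) -> dict[str, str]:
    """Try to import each module and map the ones that fail to their error text."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # broken native extensions may raise more than ImportError
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


print(failed_imports(["json", "no_such_module_for_demo"]))
```

A package whose native extension cannot find one of its shared libraries would typically show up here as an `ImportError`, so this covers part of what the ldd check looks for, but only for code that Python actually imports.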

Merge criteria:

The commits are squashed in a cohesive manner and have meaningful messages.
Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
The developer has manually tested the changes and verified that the changes work

openshift-ci · 2025-01-29T07:46:17Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from jiridanek. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

jstourac · 2025-01-29T10:26:45Z

tests/containers/base_image_test.py

+                    isdirectory = stat.S_ISDIR(s.st_mode)
+                    isfile = stat.S_ISREG(s.st_mode)
+                    executable = bool(s.st_mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))
+                    if isdirectory or not executable or not isfile:


I wonder whether we should log/print the skipped/ignored files here with the explanation?

Maybe, but for /opt/app-root this will produce a lot of output that nobody in their right mind will have the energy to read through.

cc @opendatahub-io/notebook-devs wdyt? Do you like your pytest tests to be vvvverbose or not?

Logged a ticket for improving the output:

Improve both the clarity and verbosity of pytest outputs #876

jstourac · 2025-01-29T10:27:11Z

tests/containers/base_image_test.py

+                        continue
+                    with open(dlib, mode='rb') as fp:
+                        magic = fp.read(4)
+                    if magic != b'\x7fELF':


Similar to the above: do we want to log/print files like these for convenience?
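Both review comments concern files that the test skips silently. One possible middle ground, sketched here under assumptions, is to compute a reason for each skip and log it at debug level only, so that the default output stays short. The helper names `skip_reason` and `should_check` are hypothetical, and the use of `os.stat` is an assumption, since the hunks above do not show how `s` is obtained.

```python
import logging
import os
import stat
from typing import Optional

log = logging.getLogger(__name__)


def skip_reason(path: str) -> Optional[str]:
    """Return why a path is not checked, or None if it is an executable ELF file."""
    s = os.stat(path)
    if stat.S_ISDIR(s.st_mode):
        return "directory"
    if not stat.S_ISREG(s.st_mode):
        return "not a regular file"
    if not s.st_mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        return "not executable"
    with open(path, mode="rb") as fp:
        if fp.read(4) != b"\x7fELF":
            return "no ELF magic"
    return None


def should_check(path: str) -> bool:
    """Log the reason at debug level when a file is skipped."""
    reason = skip_reason(path)
    if reason is not None:
        log.debug("skipping %s: %s", path, reason)
    return reason is None
```

With pytest, such messages stay hidden unless the log level is raised, for example with `--log-level=DEBUG`.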

jstourac · 2025-01-29T10:31:06Z

/lgtm

openshift-ci · 2025-01-29T10:58:23Z

@jiridanek: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/images | `6c707cb` | link | true | `/test images` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci bot requested review from dibryant and jstourac January 29, 2025 07:46

openshift-ci bot added size/l and removed size/l labels Jan 29, 2025

jstourac reviewed Jan 29, 2025

openshift-ci bot assigned jstourac Jan 29, 2025

openshift-ci bot added the lgtm label Jan 29, 2025

jiridanek mentioned this pull request Jan 29, 2025

Improve both the clarity and verbosity of pytest outputs #876
