Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: disregard RPM module build number in version comparison #2375

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

willmurphyscode
Copy link
Contributor

Previously, different build numbers of the same RPM version release string were compared lexicographically, leading to incorrect comparisons between RedHat and RedHat clone RPMs, resulting in FPs or FNs on centOS.

Comment on lines 119 to 128
aParts := strings.Split(a, "+")
bParts := strings.Split(b, "+")
if len(aParts) > 2 {
aParts = aParts[:len(aParts)-2]
}
if len(bParts) > 2 {
bParts = bParts[:len(bParts)-2]
}
trimmedA := strings.Join(aParts, "+")
trimmedB := strings.Join(bParts, "+")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this takes the presumption that the last two + should be ignored, but in reality any number of +s could be supplied... shouldn't we really be ignoring anything after the first +?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there arbitrarily many + separators? Is there like a spec to follow or something here @wagoodman?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see some version strings with 4 + in them in the grype database (apparently Oracle Linux RPMs). Other vendors seem only to ever have 3 + in the RPM.

Copy link
Contributor Author

@willmurphyscode willmurphyscode Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wagoodman it doesn't seem to work to remove everything after the first +, because we are sometimes comparing versions with a different number of +:

This query run in Grype DB:

select id, fixed_in_versions from vulnerability 
where version_format = "rpm" and fixed_in_versions like "%+%" and fixed_in_versions like "%module+el%" and namespace like "%red%" 
order by random() limit 10;
id fixed_in_versions
CVE-2019-2808 ["0:8.0.17-3.module+el8.0.0+3898+e09bb8de"]
CVE-2020-14350 ["0:9.6.20-1.module+el8.3.0+8938+7f0e88b6"]
CVE-2024-21051 ["0:8.0.36-1.module+el8.9.0+21207+6c20cb3d"]
CVE-2020-2898 ["0:8.0.21-1.module+el8.2.0+7855+47abd494"]
CVE-2019-9516 ["1:10.16.3-2.module+el8.0.0+4214+49953fda"]
CVE-2019-2481 ["0:8.0.17-3.module+el8.0.0+3898+e09bb8de"]
CVE-2021-23214 ["0:12.9-1.module+el8.5.0+13373+4554acc4"]
CVE-2022-21638 ["0:8.0.30-1.module+el8.6.0+16523+5cb0e868"]
CVE-2021-2305 ["0:8.0.26-1.module+el8.4.0+12359+b8928c02"]
CVE-2024-21896 ["1:20.11.1-1.module+el9.3.0+21385+bac43d5a"]

produces plenty of rows. But then

❯ syft -q -o json \
docker.io/anchore/test_images@sha256:524ff8a75f21fd886ec7ed82387766df386671e8b77e898d05786118d5b7880b | \
jq -r '.artifacts[] | .purl' | rg -e module_el -e module+el | \
python3 -c "import sys, urllib.parse; from packageurl import PackageURL; print('\n'.join(PackageURL.from_string(line.strip()).version for line in sys.stdin))"

10.3.28-1.module_el8.3.0+757+d382997d
10.3.28-1.module_el8.3.0+757+d382997d
2.066-4.module_el8.4.0+517+be1595ff
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
12.11-2.module_el8.6.0+1153+eb826827

Notice that in grype-db we see the substring module+el8.3.0 and in Syft PURLs we see the SBOM module_el8.3.0 (_ not +), so if we take the first plus, these compare incorrectly because the el8.3.0 is dropped from one but not the other. (edit: better shell oneliner, but no change in output)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know, but I don't think we can always assume we can operate on the last two + fields. What about parsing known fields from the start of the string through [_+]el\d and dropping everything after the first + found from that point on?

Previously, different build numbers of the same RPM version release
string were compared lexicographically, leading to incorrect comparisons
between RedHat and RedHat clone RPMs, resulting in FPs or FNs on centOS.

Signed-off-by: Will Murphy <[email protected]>
Also, add more test cases from different distros.

Signed-off-by: Will Murphy <[email protected]>
@willmurphyscode willmurphyscode force-pushed the fix-rpm-release-without-build-no branch from 12dcb51 to 736220a Compare February 13, 2025 18:37
@willmurphyscode willmurphyscode marked this pull request as draft February 14, 2025 16:32
@willmurphyscode
Copy link
Contributor Author

Based on some experimentation, this change is insufficient as is:

Consider two version numbers:

[root@19ac959dacf7 /]# rpmdev-vercmp 3:10.3.28-1.module_el8.3.0+757+d382997d 3:10.3.28-1.module+el8.3.0+10472+7adc332a
3:10.3.28-1.module_el8.3.0+757+d382997d < 3:10.3.28-1.module+el8.3.0+10472+7adc332a

This is correct! build 757 is earlier than build 10472. However, if we're comparing Centos8 artifacts against RHEL8 data (which we do), than it is incorrect, because the build numbers are not comparable.

This change, as is, would make us worse at comparing RHEL8 packages to RHEL8 vuln data, in exchange for making us (probably) better at comparing Centos8 packages to RHEL8 data. Instead, we should add logic to detect whether we're comparing across similar OSes, and use fuzzier version comparison only then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants