-
Notifications
You must be signed in to change notification settings - Fork 607
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: disregard RPM module build number in version comparison #2375
base: main
Are you sure you want to change the base?
Conversation
grype/version/rpm_version.go
Outdated
aParts := strings.Split(a, "+") | ||
bParts := strings.Split(b, "+") | ||
if len(aParts) > 2 { | ||
aParts = aParts[:len(aParts)-2] | ||
} | ||
if len(bParts) > 2 { | ||
bParts = bParts[:len(bParts)-2] | ||
} | ||
trimmedA := strings.Join(aParts, "+") | ||
trimmedB := strings.Join(bParts, "+") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this takes the presumption that the last two +
should be ignored, but in reality any number of +
s could be supplied... shouldn't we really be ignoring anything after the first +
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there arbitrarily many +
separators? Is there like a spec to follow or something here @wagoodman?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see some version strings with 4 +
in them in the grype database (apparently Oracle Linux RPMs). Other vendors seem only to ever have 3 +
in the RPM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wagoodman it doesn't seem to work to remove everything after the first +
, because we are sometimes comparing versions with a different number of +
:
This query run in Grype DB:
select id, fixed_in_versions from vulnerability
where version_format = "rpm" and fixed_in_versions like "%+%" and fixed_in_versions like "%module+el%" and namespace like "%red%"
order by random() limit 10;
id | fixed_in_versions |
---|---|
CVE-2019-2808 | ["0:8.0.17-3.module+el8.0.0+3898+e09bb8de"] |
CVE-2020-14350 | ["0:9.6.20-1.module+el8.3.0+8938+7f0e88b6"] |
CVE-2024-21051 | ["0:8.0.36-1.module+el8.9.0+21207+6c20cb3d"] |
CVE-2020-2898 | ["0:8.0.21-1.module+el8.2.0+7855+47abd494"] |
CVE-2019-9516 | ["1:10.16.3-2.module+el8.0.0+4214+49953fda"] |
CVE-2019-2481 | ["0:8.0.17-3.module+el8.0.0+3898+e09bb8de"] |
CVE-2021-23214 | ["0:12.9-1.module+el8.5.0+13373+4554acc4"] |
CVE-2022-21638 | ["0:8.0.30-1.module+el8.6.0+16523+5cb0e868"] |
CVE-2021-2305 | ["0:8.0.26-1.module+el8.4.0+12359+b8928c02"] |
CVE-2024-21896 | ["1:20.11.1-1.module+el9.3.0+21385+bac43d5a"] |
produces plenty of rows. But then
❯ syft -q -o json \
docker.io/anchore/test_images@sha256:524ff8a75f21fd886ec7ed82387766df386671e8b77e898d05786118d5b7880b | \
jq -r '.artifacts[] | .purl' | rg -e module_el -e module+el | \
python3 -c "import sys, urllib.parse; from packageurl import PackageURL; print('\n'.join(PackageURL.from_string(line.strip()).version for line in sys.stdin))"
10.3.28-1.module_el8.3.0+757+d382997d
10.3.28-1.module_el8.3.0+757+d382997d
2.066-4.module_el8.4.0+517+be1595ff
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
7.4.30-1.module_el8.7.0+1190+d11b935a
12.11-2.module_el8.6.0+1153+eb826827
Notice that in grype-db we see the substring module+el8.3.0
and in Syft PURLs we see the SBOM module_el8.3.0
(_
not +
), so if we take the first plus, these compare incorrectly because the el8.3.0
is dropped from one but not the other. (edit: better shell oneliner, but no change in output)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know, but I don't think we can always assume we can operate on the last two + fields. What about parsing known fields from the start of the string through [_+]el\d
and dropping everything after the first + found from that point on?
Previously, different build numbers of the same RPM version release string were compared lexicographically, leading to incorrect comparisons between RedHat and RedHat clone RPMs, resulting in FPs or FNs on centOS. Signed-off-by: Will Murphy <[email protected]>
Also, add more test cases from different distros. Signed-off-by: Will Murphy <[email protected]>
Signed-off-by: Will Murphy <[email protected]>
12dcb51
to
736220a
Compare
Based on some experimentation, this change is insufficient as is: Consider two version numbers:
This is correct! build 757 is earlier than build 10472. However, if we're comparing Centos8 artifacts against RHEL8 data (which we do), than it is incorrect, because the build numbers are not comparable. This change, as is, would make us worse at comparing RHEL8 packages to RHEL8 vuln data, in exchange for making us (probably) better at comparing Centos8 packages to RHEL8 data. Instead, we should add logic to detect whether we're comparing across similar OSes, and use fuzzier version comparison only then. |
Previously, different build numbers of the same RPM version release string were compared lexicographically, leading to incorrect comparisons between RedHat and RedHat clone RPMs, resulting in FPs or FNs on centOS.