-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pip] Don't record all dependencies as binary packages in the SBOM when allow_binary: true
is used
#813
Comments
Unless explicitly instructed otherwise, pip will prefer wheels over sdists during installation. If we have downloaded both the wheel(s) and a sdist for a package, reporting it as binary will be the right thing to do in most cases.
What would make more sense to me would be to let |
Except we wouldn't download both as mentioned in the description (admittedly, not the best wording):
...
This isn't backwards compatible though, because the field is a boolean. That's why I proposed to change the default internal implementation such that we would only download wheels for packages that don't distribute otherwise, hence recording the majority of packages as sdists which is a completely transparent change to the end user and still backwards compatible with our current JSON interface. |
Is this issue related to #730 ? |
Only if you drop support for booleans. We can support both. Code-wise, it could be a pydantic hook in the input model that converts
This sounds like a bigger breaking change. Builds that previously worked with |
Fair enough, that's a good argument.
I'd refrain from abusing pydantic validators to change field types on the go, but yeah, the idea behind the proposal is valid. |
This could be addressed by adding a new argument, something like |
Separating the options would allow for nonsensical inputs:
What's worse, the first of these nonsensical inputs would be easy to encounter. As a user, the obvious way to tell hermeto to allow wheels for |
When it comes to input options this is IMO a matter of clear documentation, it is common to have supplementary CLI options, so if this is properly documented how these options work together, I have no problem with this approach because it's cleaner and less work, less maintenance burden than the counterproposal IMO. |
Yeah, with good docs it should be fine |
Is addressed by the following rule:
As for the second example it is a degenerate case, but exactly what a user wants (apparently) -- someone went out of their way to add an argument and then block everything with it. |
Yeah, I wasn't worried about what the implementation would do but about the user experience. Maybe input validation could help here, to give users earlier feedback that they forgot to set the second option |
Yes, these situations do warrant a warning as possible anomalies. |
Whenever a user provides the
allow_binary
input JSON option, all dependencies are then recorded in the SBOM as binary packages even though many/most of the dependencies are actually distributed as tarballs as well. In fact, for dependencies that distribute both wheels and sdists we download both, but prefer the former over the latter when it comes to SBOM information recording. That may cause unnecessary friction for users with a ton of Python dependencies where they only need a handful of wheels (AI/ML related stuff), leading to potential problems with further policy checks (i.e. usage of binary packages vs building from sources) based on the SBOM we provided.When
allow_binary
is provided, our default behaviour should be to only record packages as binary ones when the resource isn't provided as an sdist (we should have that information at hand already). By doing this we can also clean up the prefetch cache significantly if we don't propagate a ton of wheels for packages that distribute as sdists as well.Conversely, if a consumer wishes to build out of wheels only and they are okay with the way how these are recorded in our SBOM then we should introduce a new option
enforce_binary
that would skip ignore all sdists unless a wheel isn't available in which case sdist should be fetched.The text was updated successfully, but these errors were encountered: