Ledger of software, versions, commit hashes in worker images #476
Comments
For research and reference purposes, this is often referred to as a "software bill of materials" (SBOM). Since we control most of the ecosystem in question, and the pieces have many uses outside the container images, I think a good initial approach to this would be to make sure that the component packages (especially ngen, ngen-cal, and t-route) can easily identify/describe themselves.
@PhilMiller, thanks for the suggestion! I brought up SBOMs in the past in the context of NextGen when there was conversation about getting rid of the To your second point, I agree. There are conventions in the Python community for reporting version and system information that I will use as inspiration. To my knowledge, those conventions rely on semantic versioning, which we use and update loosely at best across the organization. I will investigate a short-term solution, at least for our Python packages, that does not rely solely on semver.
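As a rough illustration of reporting version information without leaning only on semver, here is a minimal sketch using the standard library's `importlib.metadata`. It is not something proposed in this thread: the function names `describe_distribution` and `show_versions` are hypothetical, and the commit lookup assumes the package was installed from a direct VCS URL so that pip recorded a `direct_url.json` file (PEP 610). For ordinary index installs the commit is simply reported as `None`.

```python
import json
import platform
import sys
from importlib import metadata


def describe_distribution(name):
    """Return the installed version and, if recorded, the VCS commit of one distribution."""
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        # Not installed in this environment: report it explicitly instead of failing.
        return {"name": name, "version": None, "commit": None}

    commit = None
    # pip writes direct_url.json (PEP 610) only for direct-URL installs,
    # e.g. `pip install git+https://...`; it is absent for index installs.
    raw = dist.read_text("direct_url.json")
    if raw:
        commit = json.loads(raw).get("vcs_info", {}).get("commit_id")
    return {"name": name, "version": dist.version, "commit": commit}


def show_versions(names):
    """Collect interpreter, platform, and per-package details in one dict."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": [describe_distribution(n) for n in names],
    }


if __name__ == "__main__":
    # The distribution names below are assumptions; substitute the real ones.
    print(json.dumps(show_versions(["ngen-cal", "t-route"]), indent=2))
```

Because the commit comes from install metadata rather than the version string, this kind of report stays useful even when version numbers are not bumped consistently.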
Two things:
Got around to investigating solutions for capturing version and version control information in Python packages. The main criterion was to find a long-term solution that can be added easily to our existing Python repositories, causes little friction, and works well with namespace packages. Below are the solutions I investigated:
I agree that we eventually need a long-term solution, as you stated in your criteria, but I question it as a hard requirement for shorter-term improvements. We could expediently modify our scripts to capture the simple stuff, and then start work on integrating something more robust and adapting other bits (
Completely agree. I just opened a proposal PR NOAA-OWP/ngen-cal#92 that could serve as a pathway for adoption if we choose to go this route.
I do like the
@robertbartel, thanks for bringing up that point. If I'm understanding you correctly, that would imply that we need a way to scope which files contribute to a package and compute the version relative to the last change in that scope, right?
So, more plainly, compute package A's version relative to package A's top level repo directory. I've got a working prototype of scoped versioning using
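To make the idea of scoping concrete, here is a minimal sketch that asks plain `git log` for the last commit touching a given directory. This is not the prototype mentioned above, only an illustration of computing a revision relative to one package's directory; the function name `last_commit_for_path` and the example paths are hypothetical.

```python
import subprocess


def last_commit_for_path(repo_root, package_dir):
    """Return the hash of the most recent commit that touched package_dir, or None.

    package_dir is interpreted relative to repo_root because of `git -C`.
    """
    cmd = ["git", "-C", str(repo_root), "log", "-1", "--format=%H", "--", str(package_dir)]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or repo_root is not inside a git repository.
        return None
    # Empty output means no commit has touched that path.
    return result.stdout.strip() or None


if __name__ == "__main__":
    # Hypothetical layout: one repository holding several packages under python/lib/.
    print(last_commit_for_path(".", "python/lib/package_a"))
```

With this scoping, a commit that only changes package B leaves the revision reported for package A unchanged.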
@aaraney, I think so, but it occurs to me that, while this is likely a worthwhile improvement, it is a solution to a different (and less pressing) problem. I.e., we have a working mechanism for versioning DMOD packages in the worker image. It isn't optimal, but it is sufficient, and we should be able to figure out exactly what DMOD package code is used by a container. I suppose we could also record the versions to a central file for convenience, but that's still separate from most of what we've been discussing. The real problem is other stuff: ngen, t-route, manually compiled dependencies like MPI, etc. We can't easily reverse engineer exactly what versions of those are running. We need to start recording those items, ideally centrally, but at least somewhere that allows visibility after the image is built.
Exactly! I should have elaborated on the scope. I was thinking other OWP Python projects would adopt this approach too, to unify the ecosystem. At the minimum the
In debugging #472, I have found myself chasing down the version / commit hash of software built and installed in the NextGen image. IMO it would be invaluable to record the software, version (if possible), and git hash (if applicable) in a file in the final image. For example, in debugging an issue with t-route, I was not able to determine the git hash of the code installed in the image. This made it difficult to know if the issue has already been addressed in a newer version of the code, report on the origin of the faulty code, or set up an isolated environment to try and reproduce the issue.

This is a hard problem if you go down the seemingly endless rabbit hole. I don't think it is the best use of our time to try and completely solve the problem, but instead provide a reasonable amount of information that will primarily be used for debugging purposes.
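One way the ledger file described above could look is sketched below: a small helper that appends a record of name, version, and git hash to a JSON file each time a component is built or installed. Everything here is an assumption for illustration: the helper name `append_ledger_entry`, the JSON layout, and the ledger path are hypothetical, and the values in the usage example are placeholders rather than real versions or hashes.

```python
import json
from pathlib import Path


def append_ledger_entry(ledger_path, name, version=None, git_hash=None):
    """Append one software record to a JSON ledger file and return all entries."""
    path = Path(ledger_path)
    # Start a new ledger if the file does not exist yet.
    entries = json.loads(path.read_text()) if path.exists() else []
    entries.append({"name": name, "version": version, "git_hash": git_hash})
    path.write_text(json.dumps(entries, indent=2) + "\n")
    return entries


if __name__ == "__main__":
    # Hypothetical ledger location; in an image build this would be a fixed, documented path.
    ledger = "software_ledger.json"
    # Placeholder values only; a build script would pass what it actually checked out.
    append_ledger_entry(ledger, "t-route", git_hash="<commit hash>")
    append_ledger_entry(ledger, "ngen", version="<version>", git_hash="<commit hash>")
```

Leaving `version` or `git_hash` as `None` when unknown matches the "if possible" and "if applicable" wording: the entry still shows that the component is present in the image.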