create and release a Docker image to run validations #57
I removed the assignees from this issue, as I don't think this is being actively worked on.
I will investigate this today.
Initial container in https://github.com/hubverse-org/test-docker-hubValidations. I will follow up tomorrow by experimenting with incorporating this into a fork of a hub.
I've tested it in zkamvar/variant-nowcast-hub#1, which was created by duplicating the output in reichlab/variant-nowcast-hub#246 and adding an error to the value column. Some observations:
My feeling is that we should not be wrapping the validation code in scripts inside the container. Using an image that contains all the necessary software and packages, and running the rest of the workflow steps with it, is preferable for a number of reasons:
In addition, before deploying any of this, I feel we need a clear system to ensure images stay up to date, ideally as automated as possible. Given all this, and how important validation is to hub functionality, I feel this issue requires an RFC and a decision document before moving forward.
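For illustration, here is roughly what the vanilla validation workflow could look like when run inside a prebuilt image. The image name/tag and the exact `validate_pr()` arguments are assumptions for the sketch, not settled choices:

```yaml
name: Hubverse validations (sketch)

on:
  pull_request:

jobs:
  validate:
    runs-on: ubuntu-latest
    # Hypothetical image name/tag; whatever we actually publish goes here.
    container: ghcr.io/hubverse-org/hubvalidations:latest
    steps:
      - uses: actions/checkout@v4
      # No install-packages step: the image already ships hubValidations
      # and its dependencies.
      - name: Run validations
        env:
          PR_NUMBER: ${{ github.event.number }}
          GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
        run: |
          Rscript -e 'hubValidations::validate_pr(
            gh_repo = Sys.getenv("GITHUB_REPOSITORY"),
            pr_number = Sys.getenv("PR_NUMBER")
          )'
```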
Sorry! I think I confused what was going on when I looked at it last week, but now that all the docs are in the repo, it's clear that the way you've designed this can already be used as described above 🚀. Given that the step to install packages will no longer be required in the vanilla validation workflow, once we've decided to roll the above out in our validation workflows, it would be good to include an example of how hub admins could amend their workflow to include additional dependencies. I imagine the vignette here in
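For example, a hub could layer extra dependencies on top of the image with a single additional step; the package names below are placeholders, not recommendations:

```yaml
    steps:
      - uses: actions/checkout@v4
      # Hub-specific extras installed at runtime inside the container;
      # these package names are just placeholders.
      - name: Install additional dependencies
        run: Rscript -e 'install.packages(c("data.table", "arrow"))'
      - name: Run validations
        env:
          PR_NUMBER: ${{ github.event.number }}
        run: |
          Rscript -e 'hubValidations::validate_pr(
            gh_repo = Sys.getenv("GITHUB_REPOSITORY"),
            pr_number = Sys.getenv("PR_NUMBER")
          )'
```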
One last question: should we be installing the released hubverse packages from r-universe instead of from GitHub? And would it be useful to maintain a dev version of the image with dev versions of the packages?
To the first question: no. We are using {renv} to control the versioning, and it expects an archival repository. Theoretically, {renv} should fall back to GitHub when a package comes from a universe repository, but I've found in rstudio/renv#2068 that this is still not the case.
To the second: I think so.
I agree, and I want to make it clear that my exercise here was to explore the space and come up with a prototype/minimal viable product. There are a lot of different layers that can be optimized, and I wanted to explore them sufficiently before proposing a project.
That's a good point. I'll have to look at https://rstudio.github.io/renv/articles/docker.html#dynamically-provisioning-r-libraries-with-renv to make sure our implementation can be extended.
I know this is still WIP, but I think some of the functionality is really useful and might be better as official hubValidations functionality. It could be that we add it to the print method but only deploy it through an argument (e.g.
I can take a stabby-stab at it. I agree that the functionality should live in hubValidations.
Back on thread: I'm now realizing that the renv.lock is over-complicating things a bit. The Docker container could be built from a simple Dockerfile. I think it might be possible for the Dockerfiles to actually live inside the repository and be ignored in
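As a sketch of how simple it could be (the base image, R version, and installation source here are assumptions, not decisions):

```dockerfile
# Sketch only: base image, R version, and install source are assumptions.
FROM rocker/r-ver:4.4.1

# System libraries commonly needed by R packages doing HTTP and XML work
RUN apt-get update && apt-get install -y --no-install-recommends \
      libcurl4-openssl-dev libssl-dev libxml2-dev \
    && rm -rf /var/lib/apt/lists/*

# Install the package straight from GitHub; pak resolves dependencies.
RUN Rscript -e 'install.packages("pak", repos = "https://cloud.r-project.org")' \
 && Rscript -e 'pak::pak("hubverse-org/hubValidations")'

CMD ["R"]
```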
We could have a GitHub workflow that builds the dev container on each push and on a schedule (to capture updates from dependencies). The release container would be built when a release is created. I've demonstrated this in my fork of hubUtils: https://github.com/zkamvar/hubUtils/. The container with the released version is https://github.com/zkamvar/hubUtils/pkgs/container/hubutils/339039092?tag=main and the container with the dev version is
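The trigger block for that scheme could look something like this (the cron cadence is arbitrary here):

```yaml
on:
  push:
    branches: [main]       # rebuild the dev container on each push
  schedule:
    - cron: "0 8 * * 1"    # periodic rebuild to pick up dependency updates
  release:
    types: [published]     # build the release container when a release is created
```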
In past investigations, essentially independently replicated by @matthewcornell and me, we've landed on building R-based containers using renv in order to get the installation of specific R package versions that renv provides, which we've found somewhat difficult to achieve without renv. @matthewcornell has written up the procedures we've used here: https://github.com/reichlab/operational-models/blob/main/flu_ar2/..%2FREADME.md#requirementstxt-and-renvlock-details. I (and I'm sure Matt) would be interested to hear whether you're envisioning something else specific to deal with this?
I guess it really depends on what the container is for. If it's a drop-in replacement for our current GitHub workflow setup, then a simple Dockerfile that generates a container at release time is sufficient. If, however, we need it for future reproducibility, then the renv.lock file is an important element, because we would be able to rebuild the container from an old renv.lock. For a container that should mirror the release and development versions of the packages, I think using a renv.lock might be a bit too prescriptive: we rebuild the containers every time we do a release, so each one contains a snapshot of the R package environment at the time of release, which better reflects the testing environment. It also reduces the burden of maintaining and synchronizing a renv.lock file across the dependent package containers (e.g. the renv.lock for hubUtils would serve as the base renv.lock for hubValidations).
All that being said, we might even be able to have the Dockerfile generate the renv.lock as an artifact when the image is built.
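Something along these lines might work, assuming renv is installed in the image (an untested sketch; the lockfile path and `type = "all"` are guesses):

```yaml
      - name: Build image
        run: docker build -t hubvalidations:ci .
      - name: Generate renv.lock from the built image
        # Snapshot everything installed in the image's library and
        # write the lockfile to the mounted working directory.
        run: |
          docker run --rm -v "$PWD:/out" hubvalidations:ci \
            Rscript -e 'renv::snapshot(lockfile = "/out/renv.lock", type = "all", prompt = FALSE)'
      - name: Upload lockfile as a build artifact
        uses: actions/upload-artifact@v4
        with:
          name: renv-lockfile
          path: renv.lock
```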
Makes sense. I guess there's a plan to distribute the built images via Docker Hub or similar, and that's what a user would effectively interact with?
Yes. The plan is to use GitHub Workflows to build and deploy the containers to GitHub's container registry (ghcr.io). This is how https://github.com/hubverse-org/hub-dash-site-deployer and https://github.com/hubverse-org/hubPredEvalsData-docker are deployed.
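The deploy step is the standard login/build/push pair against ghcr.io (the job needs the `packages: write` permission; the tag below is illustrative):

```yaml
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ghcr.io/${{ github.repository_owner }}/hubvalidations:latest
```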
This would likely live in a different repo, but I am filing the issue here for now.