Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix docker build #416

Merged
merged 2 commits into from
Feb 21, 2024
Merged

Fix docker build #416

merged 2 commits into from
Feb 21, 2024

Conversation

victorlin
Copy link
Member

@victorlin victorlin commented Aug 11, 2023

Description of proposed changes

Fix the Docker build by switching away from Pipenv and removing some unused dependencies.

Related issue(s)

Fixes #413

Testing

@victorlin victorlin requested a review from a team August 11, 2023 16:31
@victorlin victorlin self-assigned this Aug 11, 2023
@victorlin victorlin force-pushed the victorlin/fix-docker-build branch from 48346aa to 04577fa Compare August 14, 2023 16:54
Copy link
Member

@tsibley tsibley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for ditching Pipenv!

@victorlin victorlin force-pushed the victorlin/fix-docker-build branch from 04577fa to 2d94dc9 Compare August 19, 2023 00:03
@victorlin victorlin marked this pull request as draft September 21, 2023 23:06
@corneliusroemer
Copy link
Member

Any progress here? Our CI is still broken 🙃

@victorlin
Copy link
Member Author

@corneliusroemer not yet. I'll prioritize this for myself since I'm now realizing that ncov-ingest has been stuck using old versions of Nextclade and Augur. Since Nextclade 3 is now the default in the newer nextstrain/base images, we may have to bundle some Nextclade-related changes with this PR to at least use nextclade2 explicitly.

@joverlee521
Copy link
Contributor

@corneliusroemer not yet. I'll prioritize this for myself since I'm now realizing that ncov-ingest has been stuck using old versions of Nextclade and Augur. Since Nextclade 3 is now the default in the newer nextstrain/base images, we may have to bundle some Nextclade-related changes with this PR to at least use nextclade2 explicitly.

ncov-ingest currently downloads Nextclade executable as part of the pipeline, so it's not using Nextclade version from the nextstrain/base image. @corneliusroemer is already migrating this to Nextclade V3 in #435.

@victorlin
Copy link
Member Author

Oh thanks for clarifying! Is it common to have a Snakemake rule for env setup? I would expect it to be in the Dockerfile - that used to be the case but it was removed in 2ec2a6b. My guess is that Docker caching made the "latest" download outdated. Either an exact version needs to be specified or cache should be invalidated for the "latest" download (this is what we do for nextstrain/base).

@joverlee521
Copy link
Contributor

Is it common to have a Snakemake rule for env setup?

I think this is left over from historical use of Nextclade before it was part of nextstrain/base. Previous conversation on this makes it seem like we can remove this from the Snakemake workflow itself and just use the Nextclade version available in the nextstrain/base image.

@corneliusroemer
Copy link
Member

Yes, we can drop in the future if we have a new image, but right now it obviously doesn't work yet 😀

@victorlin
Copy link
Member Author

I see it's on the task list for #375. Thanks for filling me in here! I'll still look into fixing the image build.

@victorlin
Copy link
Member Author

victorlin commented Feb 20, 2024

I'm going to rebase this onto latest master, run GenBank fetch and ingest (on branch), then merge if everything succeeds.

EDIT: GenBank fetch and ingest running: GitHub workflow, AWS Batch

EDIT 2: Oops, I did the test wrong. Need to run the workflow on the PR image using aab1f5f, not the PR source branch. Running: GitHub workflow, AWS Batch

The previous setup using Pipenv started failing on `pipenv sync` without
any obvious changes on our end¹.

Instead of trying to fix the Pipenv setup, I chose to move the
dependencies from the Pipfile into a simpler requirements.txt. This lets
pip do its own dependency resolution which is sufficient for the needs
here.

¹ #413
The last usage of fsspec was removed in efd1a31.

s3fs and aiobotocore were only necessary as runtime deps of fsspec¹, so
those can be removed.

¹ nextstrain/cli@d1e54c3
@victorlin victorlin force-pushed the victorlin/fix-docker-build branch from 2d94dc9 to caf1be4 Compare February 20, 2024 19:28
@victorlin victorlin marked this pull request as ready for review February 20, 2024 23:16
@victorlin victorlin dismissed tsibley’s stale review February 20, 2024 23:16

updated to use requirements.txt

@victorlin victorlin merged commit e6c904f into master Feb 21, 2024
4 checks passed
@victorlin victorlin deleted the victorlin/fix-docker-build branch February 21, 2024 18:50
Copy link
Contributor

@joverlee521 joverlee521 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the GitHub workflow was still using the nextstrain/ncov-ingest image rather than the branch image added in aab1f5f.

I just ran the GH Action for #437 and ran into an error on AWS Batch:

[batch] [2024-02-21T13:26:07-08:00] Traceback (most recent call last):
[batch] [2024-02-21T13:26:07-08:00]   File "/usr/local/lib/python3.10/site-packages/snakemake/__init__.py", line 610, in snakemake
[batch] [2024-02-21T13:26:07-08:00]     workflow = Workflow(
[batch] [2024-02-21T13:26:07-08:00]   File "/usr/local/lib/python3.10/site-packages/snakemake/workflow.py", line 235, in __init__
[batch] [2024-02-21T13:26:07-08:00]     self._sourcecache = SourceCache()
[batch] [2024-02-21T13:26:07-08:00]   File "/usr/local/lib/python3.10/site-packages/snakemake/sourcecache.py", line 353, in __init__
[batch] [2024-02-21T13:26:07-08:00]     os.makedirs(self.cache, exist_ok=True)
[batch] [2024-02-21T13:26:07-08:00]   File "/usr/local/lib/python3.10/os.py", line 215, in makedirs
[batch] [2024-02-21T13:26:07-08:00]     makedirs(head, exist_ok=exist_ok)
[batch] [2024-02-21T13:26:07-08:00]   File "/usr/local/lib/python3.10/os.py", line 215, in makedirs
[batch] [2024-02-21T13:26:07-08:00]     makedirs(head, exist_ok=exist_ok)
[batch] [2024-02-21T13:26:07-08:00]   File "/usr/local/lib/python3.10/os.py", line 225, in makedirs
[batch] [2024-02-21T13:26:07-08:00]     mkdir(name, mode)
[batch] [2024-02-21T13:26:07-08:00] PermissionError: [Errno 13] Permission denied: '/nextstrain/.cache/snakemake'

@joverlee521
Copy link
Contributor

Ah, error comes from the removal of rm -rf ~/.cache, see details in fd2b922.

Will make a follow-up PR shortly.

@victorlin
Copy link
Member Author

Oh sorry! In aab1f5f I changed fetch-and-ingest-genbank-master.yml instead of fetch-and-ingest-genbank-branch.yml 🤦

@joverlee521 joverlee521 mentioned this pull request Feb 21, 2024
1 task
@joverlee521
Copy link
Contributor

joverlee521 commented Feb 21, 2024

No worries @victorlin, the set up of GH Action workflows in this repo is a bit outdated.
I'm planning on doing updates to reduce potential confusion like this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

Unable to build Docker image
4 participants