Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a Docker image for topic generator #505

Merged
merged 30 commits into from
Oct 21, 2024

Conversation

coverbeck
Copy link
Contributor

@coverbeck coverbeck commented Oct 4, 2024

Description
Creates a Docker image for the topic generator. We'll use that image for the scheduled task in Fargate.

Depends on companion workflow-actions PR

Review Instructions
See review steps in dockstore/workflow-actions#2

Issue
SEAB-6653

Security
If there are any concerns that require extra attention from the security team, highlight them here.

Please make sure that you've checked the following before submitting your pull request. Thanks!

  • Check that you pass the basic style checks and unit tests by running mvn clean install in the project that you have modified (until https://ucsc-cgl.atlassian.net/browse/SEAB-5300 adds multi-module support properly)
  • Ensure that the PR targets the correct branch. Check the milestone or fix version of the ticket.
  • If you are changing dependencies, check with dependabot to ensure you are not introducing new high/critical vulnerabilities
  • If this PR is for a user-facing feature, create and link a documentation ticket for this feature (usually in the same milestone as the linked issue). Style points if you create a documentation PR directly and link that instead.

@coverbeck coverbeck self-assigned this Oct 4, 2024
Dockerfile Outdated
RUN apt-get update \
&& apt-get upgrade -y \
&& apt-get install -y --no-install-recommends
COPY topicgenerator/target/topicgenerator*[^s].jar /home
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we could add other JARs from this repo in there, but the metrics aggregator and github delivery both require s3 permissions and documented to be run on the deployer, where, AFAIK, we don't have Docker installed. So for now, I'm only doing this one, but we could more in the future.

Copy link

codecov bot commented Oct 4, 2024

Codecov Report

Attention: Patch coverage is 28.57143% with 5 lines in your changes missing coverage. Please review.

Project coverage is 19.47%. Comparing base (b964bfb) to head (048dc9e).
Report is 1 commits behind head on develop.

Files with missing lines Patch % Lines
...opicgenerator/client/cli/TopicGeneratorClient.java 0.00% 5 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##             develop     #505      +/-   ##
=============================================
+ Coverage      15.91%   19.47%   +3.56%     
- Complexity       154      178      +24     
=============================================
  Files             57       57              
  Lines           2897     2901       +4     
  Branches         228      228              
=============================================
+ Hits             461      565     +104     
+ Misses          2400     2297     -103     
- Partials          36       39       +3     
Flag Coverage Δ
tooltester 15.89% <0.00%> (-0.03%) ⬇️
topicgenerator 19.47% <28.57%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@coverbeck coverbeck force-pushed the feature/seab-6653/topicUpdater branch from 5ad260d to 7a4d732 Compare October 10, 2024 22:14
cd /home

echo "Creating a CSV file of AI topic candidates from Dockstore"
java -jar $APP_JAR -c $CONFIG get-topic-candidates
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This command doesn't exist yet, it's in PR #504.

@coverbeck coverbeck force-pushed the feature/seab-6653/topicUpdater branch from a686116 to ee11fd3 Compare October 14, 2024 17:33
CLAUDE_3_5_SONNET("anthropic.claude-3-5-sonnet-20240620-v1:0", 0.003, 0.015, 200000),
CLAUDE_3_HAIKU("anthropic.claude-3-haiku-20240307-v1:0", 0.00025, 0.00125, 200000),
CLAUDE_3_5_SONNET("us.anthropic.claude-3-5-sonnet-20240620-v1:0", 0.003, 0.015, 200000),
CLAUDE_3_HAIKU("us.anthropic.claude-3-haiku-20240307-v1:0", 0.00025, 0.00125, 200000),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed to change these to be cross-region inferences, so QA would work.

@coverbeck coverbeck force-pushed the feature/seab-6653/topicUpdater branch from 52fb5fa to d3c477a Compare October 15, 2024 16:25
final String mainScriptPath = NextflowUtilities.grabConfig(nextflowConfigFileContent).getString("manifest.mainScript", "main.nf");
final String mainScriptPath;
try {
mainScriptPath = NextflowUtilities.grabConfig(nextflowConfigFileContent).getString("manifest.mainScript", "main.nf");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First time I ran this, I believe I got a NextflowParsingException, which caused the program to exit early. Put this in a try catch.

Kathy thinks that it may have just been logging noise that I saw, so I could be wrong. But the method can throw a NextFlowParsingException, so we should catch it anyway.


cd /home

echo "Generating topics for all AI topic candidates from Dockstore"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"for all AI topic candidates" -> for 500

topicgenerator/scripts/updateDockstoreTopicsFromDocker.sh Outdated Show resolved Hide resolved
RUN apt-get update \
&& apt-get upgrade -y \
&& apt-get install -y --no-install-recommends
COPY topicgenerator/target/topicgenerator*[^s].jar /home/topic-generator.jar

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the [^s] do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a build, there are 2 JARs in the target directory:

topicgenerator-1.16.0-SNAPSHOT-sources.jar
topicgenerator-1.16.0-SNAPSHOT.jar

The [^s] is what I came up with to skip the sources JAR.

@kathy-t
Copy link
Contributor

kathy-t commented Oct 18, 2024

Added two additional things to this PR:

  • In the maven.yml GitHub Action, added steps to build and push an image to quay.io/dockstore/dockstore-support if the branch being pushed to is develop. This is to support the auto generation of the SupportVersion property in the properties file and so that we don't need to do a tagged release every time we need a new image in QA.
  • Installed additional dependencies in the Docker image to support uploading files to S3

Re-requesting reviews

Copy link

sonarcloud bot commented Oct 18, 2024

echo "S3_FOLDER=${S3_FOLDER}-$(echo $GITHUB_SHA | cut -c -7)" >> $GITHUB_ENV

# neat, quay itself uses manual github actions https://github.com/quay/quay/blob/master/.github/workflows/build-and-publish.yaml
- if: github.ref == 'refs/heads/develop'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point in the future, especially if we end up creating additional Docker images, we might want to consider moving this code to dockstore/workflows. But looks good now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I noticed we create images and upload checksums for images in dockstore/dockstore and dockstore-deploy as well, so it'll be good to create a re-usable workflow. I'll create a ticket for this or maybe this can be tackled when we implement uploading the image digest for this image

@coverbeck
Copy link
Contributor Author

I can't approve this because I'm still the assignee -- not sure if that's intentional or not, so signaling my approval here.

@kathy-t kathy-t merged commit 0db8b4b into develop Oct 21, 2024
10 checks passed
@kathy-t kathy-t deleted the feature/seab-6653/topicUpdater branch October 21, 2024 19:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants