From a6c14751de1f9cff74059f96ff27449545f3b75e Mon Sep 17 00:00:00 2001
From: "Anna (Anya) Parker" <50943381+anna-parker@users.noreply.github.com>
Date: Thu, 29 Aug 2024 21:28:46 +0200
Subject: [PATCH] Add local debug instructions

---
 ena-submission/README.md | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/ena-submission/README.md b/ena-submission/README.md
index 189f183d78..6aee6a8b6a 100644
--- a/ena-submission/README.md
+++ b/ena-submission/README.md
@@ -74,3 +74,22 @@ Then run snakemake using `snakemake` or `snakemake {rule}`.
 micromamba activate loculus-ena-submission
 python3 scripts/test_ena_submission.py
 ```
+
+### Testing submission locally
+
+ENA submission is currently only triggered after manual approval.
+
+`get_ena_submission_list` runs as a cron job. It queries Loculus for new sequences to submit to ENA (these are sequences that are in state OPEN, were not submitted by the INSDC_INGEST_USER, do not include ena external_metadata fields and are not yet in the submission_table of the ena-submission schema). If it finds new sequences, it sends a Slack notification listing all of them.
+
+It is then the reviewer's turn to review these sequences. [TODO: define review criteria] If these sequences meet our criteria they should be uploaded to [pathoplexus/ena-submission](https://github.com/pathoplexus/ena-submission/blob/main/approved/approved_ena_submission_list.json) (currently we read data from the [test folder](https://github.com/pathoplexus/ena-submission/blob/main/test/approved_ena_submission_list.json) - but this will be changed to the `approved` folder in production). The `trigger_submission_to_ena` rule continuously checks this folder for new sequences and adds them to the submission_table if they are not already there. Note that we cannot yet handle revisions, so these should not be added to the approved list [TODO: do not allow submission of revised sequences in `trigger_submission_to_ena`]; revisions will still have to be performed manually.
+
+If you would like to test `trigger_submission_to_ena` while running locally you can also use the `trigger_submission_to_ena_from_file` rule, which reads data from `results/approved_ena_submission_list.json` (see the test folder for an example). You can also upload data to the [test folder](https://github.com/pathoplexus/ena-submission/blob/main/test/approved_ena_submission_list.json). Note that if you add fake data with a non-existent group-id, the project creation will fail. Additionally, the `upload_to_loculus` rule will fail if these sequences do not actually exist in your Loculus instance.
+
+All other rules query the `submission_table` for projects/samples and assemblies to submit. Once successful, they add accessions to the `results` column in dictionary format. Finally, once the entire process has succeeded, the new external metadata will be uploaded to Loculus.
+
+Note that ENA's dev server does not always finish processing and you might not receive a gcaAccession for your dev submissions. If you would like to test the full submission cycle on the ENA dev instance it makes sense to manually alter the gcaAccession in the database using `ERZ24784470`. You can connect to a preview instance via port forwarding to make these changes with a local database tool such as pgAdmin:
+
+1. Apply the preview `~/.kube/config`
+2. Find the database pod using `kubectl get pods -A | grep database`
+3. Connect via port forwarding: `kubectl port-forward $POD -n $NAMESPACE 5432:5432`
+4. If necessary, find the password using `kubectl get secret`
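
The port-forwarding steps above could be scripted roughly as follows. This is a minimal sketch and not part of the repository: the function names `find_database_pod` and `forward_database_port` are hypothetical, and it assumes the default column layout of `kubectl get pods -A` (namespace first, pod name second) and that the first pod whose name contains `database` is the one you want.

```shell
# Hypothetical helper: read `kubectl get pods -A` output on stdin and print
# "NAMESPACE POD" for the first pod whose name contains "database".
find_database_pod() {
  awk '$2 ~ /database/ { print $1, $2; exit }'
}

# Hypothetical helper: look up the database pod and forward local port 5432
# to it. Assumes the preview kube config is already active (step 1).
forward_database_port() {
  # Word splitting is intentional: the helper prints two fields.
  set -- $(kubectl get pods -A | find_database_pod)
  if [ "$#" -ne 2 ]; then
    echo "no database pod found" >&2
    return 1
  fi
  kubectl port-forward "$2" -n "$1" 5432:5432
}
```

After sourcing the snippet, running `forward_database_port` keeps the tunnel open so that pgAdmin can connect to `localhost:5432`. The password still has to be looked up with `kubectl get secret` as in step 4.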