-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(ena-submission): Upload ena submission results to Loculus #2417
Conversation
bc7da3e
to
6353a76
Compare
039867d
to
61826a5
Compare
4d0d5d9
to
ebd05d6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that there is at least one possible way for an SQL injection.
There are multiple places where a function constructs an SQL query via string concatenation using values passed into the function without escaping and validation. This is very dangerous and even for places where an attack is currently not possible because the input values are all under our control (e.g., the organism
field), this could potentially lead to future vulnerabilities.
Thanks @anna-parker for this huge piece of work!! I read (rather briefly) through the code and commented whenever I noticed something. I found a few vulnerabilities that should be fixed. Otherwise, it looks great for now! |
be659f1
to
c8f8530
Compare
4b9f73e
to
c75fa88
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll review as we go - let's try this out
…hub url is not found
SQL injection has been prevented with subsequent code changes - I am not sure why I need to dismiss this review it seems I should be able to do it automatically.
@chaoran-chen sorry I had to dismiss your review to merge this (but I did address and fix the issues you found), from discussion with @rneher and @corneliusroemer we will merge this now and work on #2832 in a separate PR. |
resolves #2424
preview URL: https://upload-ena-submission-res.loculus.org/
Summary
This PR encompasses the entire ena-submission process, for details see: ena-submission/README.md.
It adds database connection pooling, otherwise I get this error:
caused by Exhaustion of Local Ports: "When your Python application makes connections to the database, it uses a local (ephemeral) port on your machine. If your application opens many connections in a short period and doesn’t properly close them, you might run out of available local ports." - it might be possible to fix this just with better closing of port connections... I checked and the postgresDB can support
SHOW max_connections;
= 100. After the ena-submission-pod jobs were killed there were stillSELECT count(*) FROM pg_stat_activity;
= 20 connections to the db. Probably from the backend. I wasn't closing db updates (this would explain ~20-30 additional connections).It also fixes an issue in the backend when merging external metadata with other metadata fields, an alternative solution could be:
and use this like:
Screenshot
PR Checklist
Note Biosamples was down so I had to artificially add the values in.
(as biosamples is currently down), and then did the same for the erz_accession (this was successful but to not have to wait I set this to an existing erz accession). This finishes the submission cycle and then the results are sent to loculus:
and are seen on the website (with links that work except for sra... it seems like this isn't actually an sra accession??):
now without sra_run_accession: