EVA-3748 - Release from DB #468

Merged (3 commits) on Feb 18, 2025

Conversation

nitin-ebi
Contributor

No description provided.

nitin-ebi self-assigned this on Feb 13, 2025
Comment on lines +228 to +232
/**
 * The query performed in mongo can retrieve more variants than the actual ones because in some cases the same
 * clustered variant is mapped against multiple locations. So we need to check that the clustered variant we are
 * processing only appears in the VCF release file with the alleles from submitted variants matching the location.
 */
Member

I'm wondering how much of this is still true. We should keep it for now but add a ticket to review this.

Contributor Author

I will create a ticket.

Comment on lines +144 to +148
private List<SubmittedVariantEntity> getSubmittedVariantEntities(Set<Long> cveAccs) {
    Query query = query(where(SVE_RS_FIELD).in(cveAccs).and(SVE_ASSEMBLY_FIELD).is(assembly).and(SVE_TAX_FIELD).is(taxonomy));
    List<SubmittedVariantEntity> evaResults = mongoTemplate.find(query, SubmittedVariantEntity.class);
    List<DbsnpSubmittedVariantEntity> dbsnpResults = mongoTemplate.find(query, DbsnpSubmittedVariantEntity.class);
    return Stream.concat(evaResults.stream(), dbsnpResults.stream()).collect(Collectors.toList());
Member

It might never be a problem, but this function can in theory return a very large number of variants, which could cause a memory issue. This is especially true for the merged and deprecated use cases, less so for active variants.

Contributor Author

We can refactor this as part of the next ticket, when we write the queries for the operations.
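
For reference, one possible shape for that refactor is to split the accession set into fixed-size batches and handle each batch's results before fetching the next, so the full result list is never held at once. This is only a sketch: `BatchedLookup`, `forEachBatch`, and the batch size are hypothetical names that are not part of this PR, and the `fetch` function stands in for the two `mongoTemplate.find(...)` calls. Batching limits how many accessions go into each query; it does not strictly cap the number of submitted variants returned per batch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the PR: processes keys in fixed-size batches
// so that only one batch of results is held in memory at a time.
final class BatchedLookup {

    static <K, V> void forEachBatch(Collection<K> keys, int batchSize,
                                    Function<List<K>, List<V>> fetch,
                                    Consumer<List<V>> handler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<K> batch = new ArrayList<>(batchSize);
        for (K key : keys) {
            batch.add(key);
            if (batch.size() == batchSize) {
                // Fetch and handle this batch, then start a fresh one.
                handler.accept(fetch.apply(batch));
                batch = new ArrayList<>(batchSize);
            }
        }
        // Handle the final, possibly smaller, batch.
        if (!batch.isEmpty()) {
            handler.accept(fetch.apply(batch));
        }
    }

    public static void main(String[] args) {
        List<Long> accessions = List.of(1L, 2L, 3L, 4L, 5L);
        // The lambda below is a stand-in for the real database lookup (assumption).
        BatchedLookup.<Long, String>forEachBatch(accessions, 2,
                batch -> batch.stream().map(acc -> "variant-for-rs" + acc).collect(Collectors.toList()),
                results -> System.out.println(results));
    }
}
```

In `getSubmittedVariantEntities`, the handler could write each batch to the release output instead of returning a single concatenated list, but whether that fits the surrounding reader/writer design is something the follow-up ticket would need to confirm.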

nitin-ebi merged commit 95d0199 into EBIvariation:master on Feb 18, 2025
1 check passed
3 participants