Is getSequences endpoint really used in full by anything? It seems no longer fit for purpose: should it be replaced? #3280

Open
corneliusroemer opened this issue Nov 23, 2024 · 2 comments
Labels
backend related to the loculus backend component discussion Open questions refactoring Code requires refactoring website Tasks related to the web application

Comments

@corneliusroemer
Contributor

The get-sequences endpoint currently does all of the following:

  • A) for a given group and organism: return status of all sequence entries (received/processing/processed/released)
  • B) for a given group and organism: return processing results for sequences in status "processed" (no issues/warnings/errors)
  • C) for a given group and organism, optionally filtered by status and/or processing result and optionally paginated: return the following columns for each sequence entry:
        SequenceEntriesView.accessionColumn,
        SequenceEntriesView.versionColumn,
        SequenceEntriesView.submissionIdColumn,
        SequenceEntriesView.statusColumn,
        SequenceEntriesView.isRevocationColumn,
        SequenceEntriesView.groupIdColumn,
        SequenceEntriesView.submitterColumn,
        SequenceEntriesView.organismColumn,
        SequenceEntriesView.submittedAtTimestampColumn,
        SequenceEntriesView.errorsColumn,
        SequenceEntriesView.warningsColumn,
        SequenceEntriesView.processingResultColumn,
        DataUseTermsTable.dataUseTermsTypeColumn,
        DataUseTermsTable.restrictedUntilColumn,

Parts of it are used in the following places by the website (afaict):

I couldn't find anything that uses the other columns. In particular, the data use terms seem to be joined for no reason: nothing appears to consume them.

It looks like we could serve the current website much more leanly by:

  • Having an endpoint that simply serves aggregate stats for a given group and organism and optionally returns accessions but nothing else
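To make the shape of such a lean response concrete, here is a minimal TypeScript sketch. Everything in it is an assumption for illustration: the type names, field names, and the `aggregateStats` function are hypothetical and not part of the existing backend or website code. It only shows counts per status, counts per processing result for processed entries, and an optional list of accessions.

```typescript
// Hypothetical types for illustration only; not the actual Loculus API.
type Status = "received" | "processing" | "processed" | "released";
type ProcessingResult = "noIssues" | "warnings" | "errors";

interface SequenceEntry {
  accession: string;
  status: Status;
  // Assumed to be set only once an entry has reached status "processed".
  processingResult?: ProcessingResult;
}

interface GroupStats {
  statusCounts: Record<Status, number>;
  processingResultCounts: Record<ProcessingResult, number>;
  accessions?: string[]; // only present when explicitly requested
}

// Aggregates counts for one group and organism; the caller is assumed to
// have already restricted `entries` to that group and organism.
function aggregateStats(
  entries: SequenceEntry[],
  includeAccessions = false,
): GroupStats {
  const stats: GroupStats = {
    statusCounts: { received: 0, processing: 0, processed: 0, released: 0 },
    processingResultCounts: { noIssues: 0, warnings: 0, errors: 0 },
  };
  for (const entry of entries) {
    stats.statusCounts[entry.status] += 1;
    if (entry.status === "processed" && entry.processingResult !== undefined) {
      stats.processingResultCounts[entry.processingResult] += 1;
    }
  }
  if (includeAccessions) {
    stats.accessions = entries.map((entry) => entry.accession);
  }
  return stats;
}
```

A response like this stays small regardless of how many columns the underlying view has, which is the point of splitting it off.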

It's better to make a new dedicated endpoint if we need more stuff rather than having huge beasts that do lots of things but are slow and complicated as a result.

@corneliusroemer corneliusroemer added website Tasks related to the web application discussion Open questions backend related to the loculus backend component refactoring Code requires refactoring labels Nov 23, 2024
@corneliusroemer
Contributor Author

I looked into it and we do use almost all the fields, including data use terms. Current situation is not a problem anymore now that we've got #3279 - if we find it getting slow again we can reconsider.

@corneliusroemer corneliusroemer closed this as not planned Nov 23, 2024
@corneliusroemer
Contributor Author

If we separated the counts and the data to show on the cards, we could cache the data on the cards. That way we wouldn't have to repeatedly transfer large amounts of data every second, only do it initially.

Would be easy to use caching similar to what we've got already:

  • no need to refresh if there haven't been any database changes (table update tracker)
  • if there has been a change, one can just calculate a hash of the result to save bandwidth
  • sequence entries could return all metadata (but not sequence data - as we don't show that)
  • that would be enough to create all the cards in one go without 50 requests
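The first two points above could look roughly like the following TypeScript sketch. All names here (`respond`, `Reply`, `dbVersion`, `fnv1a`) are hypothetical, and the version counter is only a stand-in for the table update tracker; the small FNV-1a hash is likewise a placeholder for whatever digest would really be used. This is one possible arrangement under those assumptions, not how the existing caching is implemented.

```typescript
// Hypothetical names throughout; this is not the existing Loculus caching code.
type Reply<T> =
  | { kind: "notModified"; dbVersion: number }
  | { kind: "fresh"; dbVersion: number; hash: string; data: T };

// Small non-cryptographic FNV-1a hash, used here as a stand-in for a real digest.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// `currentDbVersion` plays the role of the table update tracker.
// `client` is what the caller remembered from its previous response, if any.
function respond<T>(
  currentDbVersion: number,
  loadData: () => T,
  client?: { dbVersion: number; hash: string },
): Reply<T> {
  // 1. No database change since the client's copy: skip the query entirely.
  if (client !== undefined && client.dbVersion === currentDbVersion) {
    return { kind: "notModified", dbVersion: currentDbVersion };
  }
  // 2. Something changed: recompute, but only send the body if the result differs.
  //    Assumes loadData() serializes deterministically (stable key order).
  const data = loadData();
  const hash = fnv1a(JSON.stringify(data));
  if (client !== undefined && client.hash === hash) {
    return { kind: "notModified", dbVersion: currentDbVersion };
  }
  return { kind: "fresh", dbVersion: currentDbVersion, hash, data };
}
```

In this arrangement the client keeps the last `dbVersion` and `hash` it saw and sends them with each poll, so the full card data is only transferred when it has actually changed.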

50 requests aren't that bad, but they are also not great.

Overall this is now a small issue, not a big one as we've solved perf.
