Delete processed data of no-longer-current pipeline versions after upgrade to save on storage #3240

Open
corneliusroemer opened this issue Nov 19, 2024 · 1 comment
Labels: backend (related to the loculus backend component), feature (Feature proposal), performance

corneliusroemer (Contributor) commented Nov 19, 2024

After we bump the current processing pipeline version, processed data from previous pipeline versions no longer serves a purpose. We should delete it so that it doesn't accumulate as waste.

Obsolete processed data is currently the main consumer of database storage in production Pathoplexus, and it already noticeably slows down cloning from prod to staging.

Proposed feature:

  • When bumping the processing pipeline version, delete all rows belonging to outdated versions once the bump has succeeded.
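
A minimal sketch of what the proposed step could look like, written in Python against an in-memory SQLite database purely for illustration. The table names `current_pipeline` and `processed_data`, their columns, and the function `bump_and_prune` are hypothetical and are not taken from the Loculus schema or backend; the real implementation would live in the backend and target the production database.

```python
import sqlite3


def bump_and_prune(conn: sqlite3.Connection, new_version: int) -> int:
    """Bump the current pipeline version, then delete rows of older versions.

    Both statements run in one transaction, so the delete only takes
    effect if the version bump succeeded. Returns the number of rows deleted.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("UPDATE current_pipeline SET version = ?", (new_version,))
        cur = conn.execute(
            "DELETE FROM processed_data WHERE pipeline_version < ?",
            (new_version,),
        )
        return cur.rowcount


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny hypothetical schema with two processed versions per entry."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE current_pipeline (version INTEGER NOT NULL)")
    conn.execute(
        "CREATE TABLE processed_data ("
        "accession TEXT, pipeline_version INTEGER, data TEXT)"
    )
    conn.execute("INSERT INTO current_pipeline VALUES (1)")
    conn.executemany(
        "INSERT INTO processed_data VALUES (?, ?, ?)",
        [("A", 1, "old"), ("A", 2, "new"), ("B", 1, "old"), ("B", 2, "new")],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = make_demo_db()
    deleted = bump_and_prune(demo, new_version=2)
    print(f"deleted {deleted} outdated rows")
```

Running the bump and the delete in the same transaction is one possible design choice: if the bump fails, no data is removed.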

See this comment for an analysis of db storage: #3232 (comment)

The alternative is to delete old versions manually, but I see no reason to do that by hand when it can be automated.

@corneliusroemer added the backend, feature, and performance labels Nov 19, 2024
chaoran-chen (Member) commented

I wonder whether we should keep current - 1 around, just in case someone wants to roll back, and delete only the older ones.
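
The "keep current - 1" variant could be sketched as a retention parameter on the delete, again in Python with SQLite for illustration only. The table `processed_data`, its `pipeline_version` column, and the function `prune_old_versions` with its `keep_previous` parameter are hypothetical names, not part of the Loculus codebase.

```python
import sqlite3


def prune_old_versions(
    conn: sqlite3.Connection, current_version: int, keep_previous: int = 1
) -> int:
    """Delete processed rows older than the retained window of versions.

    With keep_previous=1, versions `current_version` and `current_version - 1`
    are kept and everything older is deleted. With keep_previous=0, only the
    current version is kept. Returns the number of rows deleted.
    """
    if keep_previous < 0:
        raise ValueError("keep_previous must be non-negative")
    oldest_kept = current_version - keep_previous
    with conn:  # single transaction: commit on success, roll back on error
        cur = conn.execute(
            "DELETE FROM processed_data WHERE pipeline_version < ?",
            (oldest_kept,),
        )
        return cur.rowcount
```

The trade-off is that keeping one previous version retains roughly one extra copy of the processed data in exchange for a quick rollback path.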
