reduce weather forecast page storage #205

Open
peterdudfield opened this issue Oct 23, 2024 · 9 comments
@peterdudfield (Contributor)

Currently the weather page downloads a lot of data, which causes the machine to run out of memory. Some options:

  1. Download less data, with a default of 2 timepoints
  2. Store the data in S3; this requires a bit of thinking about how we do that, i.e. whether we can put the cache data on S3
  3. Remove old cached data
  4. Other options?
@peterdudfield (Contributor Author)

The current workaround is to terminate the EC2 instance, which causes a new one to start.

@akshayw1
Assign

@peterdudfield (Contributor Author)

Thanks @akshayw1, let us know how it goes

@akshayw1
I’d like to discuss the proposed solution, as there could be multiple approaches to consider. @peterdudfield, could you also provide access to the Notion database via the shared URL? I had some university exams last week, but I’m now fully available to sync on the issues assigned to me. Let’s aim to complete them soon!

@peterdudfield (Contributor Author)

Sorry, I won't be able to give access to the Notion page. It has too much sensitive information on it. Where was the link to that?

Please discuss on here.

@akshayw1
These are some of my thoughts, @peterdudfield.

First, we can reduce the initial data load by setting the default view to display only two timepoints instead of the full dataset, allowing users to load more data on demand if needed. This would significantly decrease the initial memory footprint.
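
A minimal sketch of that default, assuming the forecast arrives as a time-ordered list of records (the function name and data shape are illustrative, not the project's actual code):

```python
from typing import Any, Dict, List

def limit_timepoints(forecast: List[Dict[str, Any]], n_timepoints: int = 2) -> List[Dict[str, Any]]:
    """Return only the first `n_timepoints` entries of a time-ordered forecast.

    Defaulting to 2 keeps the initial page load small; callers can ask for more.
    """
    return forecast[:n_timepoints]

# Example: 24 hourly records, but only two are returned by default.
forecast = [{"time": f"2024-10-23T{h:02d}:00", "temp_c": 12.0 + h} for h in range(24)]
print(len(limit_timepoints(forecast)))                   # 2
print(len(limit_timepoints(forecast, n_timepoints=6)))   # 6
```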

Another option is to integrate S3 for caching. Cached weather data can be moved to S3 storage, which would require setting up an appropriate bucket structure, implementing efficient read/write operations, managing data versioning and updates, and potentially integrating a CDN for faster access.
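
One possible bucket layout for that cache, sketched with boto3; the bucket name, key scheme, and JSON payload format are assumptions for illustration:

```python
import json
from typing import Any, Dict, Optional

import boto3

s3 = boto3.client("s3")
BUCKET = "weather-cache"  # hypothetical bucket name

def cache_key(location_id: str, run_time: str) -> str:
    """Partition keys by location and forecast run, e.g. forecasts/site-123/2024-10-23T06:00.json."""
    return f"forecasts/{location_id}/{run_time}.json"

def write_cache(location_id: str, run_time: str, payload: Dict[str, Any]) -> None:
    s3.put_object(
        Bucket=BUCKET,
        Key=cache_key(location_id, run_time),
        Body=json.dumps(payload).encode("utf-8"),
        ContentType="application/json",
    )

def read_cache(location_id: str, run_time: str) -> Optional[Dict[str, Any]]:
    try:
        obj = s3.get_object(Bucket=BUCKET, Key=cache_key(location_id, run_time))
    except s3.exceptions.NoSuchKey:
        return None
    return json.loads(obj["Body"].read())
```

Keeping one object per location and forecast run keeps reads cheap and makes prefix-based cleanup or lifecycle rules straightforward.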

Additionally, implementing automated cache cleanup can help manage storage. A retention policy based on data age, usage patterns, and storage constraints can be defined, and a background job can handle regular cleanup of older cached data.
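
A minimal version of such a background job, again assuming boto3 and the hypothetical "weather-cache" bucket; it deletes cached objects older than a retention window:

```python
from datetime import datetime, timedelta, timezone

import boto3

def delete_stale_cache(bucket: str, prefix: str = "forecasts/", max_age_hours: int = 24) -> int:
    """Delete cached objects whose LastModified is older than `max_age_hours`."""
    s3 = boto3.client("s3")
    cutoff = datetime.now(timezone.utc) - timedelta(hours=max_age_hours)
    deleted = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        stale = [{"Key": obj["Key"]} for obj in page.get("Contents", []) if obj["LastModified"] < cutoff]
        if stale:
            s3.delete_objects(Bucket=bucket, Delete={"Objects": stale, "Quiet": True})
            deleted += len(stale)
    return deleted
```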

Other considerations include implementing progressive data loading, exploring data compression techniques, optimizing the data structure to reduce memory overhead, and evaluating the use of streaming for larger datasets. These combined efforts should effectively mitigate the memory issues.
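
As one example of the compression idea, gzipping the JSON payload before it is cached typically shrinks repetitive forecast data considerably. This is a generic sketch, not tied to the project's data model:

```python
import gzip
import json
from typing import Any, Dict

def compress_payload(payload: Dict[str, Any]) -> bytes:
    return gzip.compress(json.dumps(payload).encode("utf-8"))

def decompress_payload(blob: bytes) -> Dict[str, Any]:
    return json.loads(gzip.decompress(blob).decode("utf-8"))

payload = {"times": ["2024-10-23T00:00", "2024-10-23T01:00"], "temp_c": [11.2, 10.8]}
blob = compress_payload(payload)
assert decompress_payload(blob) == payload
```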

@peterdudfield (Contributor Author)

I like the S3 option. Would you be interested in doing that?

@akshayw1 commented Jan 23, 2025

For the weather data caching flow:

  1. User requests weather data.
  2. Check the S3 bucket for cached data for that location.
  3. If a valid cache exists (within the TTL), return it.
  4. If there is no cache or it has expired, fetch fresh data from the weather API and store only 2 timepoints in S3 with an expiry timestamp.
  5. Schedule a daily cleanup to remove caches older than 24 h.

This reduces memory usage by limiting data points and offloading storage to S3. Does this match your requirements for implementation? @peterdudfield
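
A sketch of that flow, assuming boto3, a hypothetical "weather-cache" bucket, and a stand-in for the real weather API call:

```python
import json
import time
from typing import Any, Dict, List

import boto3

s3 = boto3.client("s3")
BUCKET = "weather-cache"        # hypothetical bucket name
TTL_SECONDS = 24 * 60 * 60      # 24 h, per the proposal above

def fetch_from_weather_api(location_id: str) -> List[Dict[str, Any]]:
    """Stand-in for the real upstream API call."""
    return [{"time": f"t{i}", "temp_c": 10.0 + i} for i in range(24)]

def get_forecast(location_id: str, n_timepoints: int = 2) -> List[Dict[str, Any]]:
    key = f"forecasts/{location_id}.json"
    try:
        obj = s3.get_object(Bucket=BUCKET, Key=key)
        cached = json.loads(obj["Body"].read())
        if time.time() < cached["expires_at"]:
            return cached["data"]                    # valid cache within TTL
    except s3.exceptions.NoSuchKey:
        pass                                         # no cache yet for this location
    # Cache miss or expired: fetch fresh data, keep only 2 timepoints, store with expiry.
    data = fetch_from_weather_api(location_id)[:n_timepoints]
    body = json.dumps({"data": data, "expires_at": time.time() + TTL_SECONDS})
    s3.put_object(Bucket=BUCKET, Key=key, Body=body.encode("utf-8"), ContentType="application/json")
    return data
```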

@peterdudfield (Contributor Author)

Sounds good. S3 has some lifecycle rules we can put on the bucket that will do the tidying up.
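
For reference, a lifecycle rule along these lines (bucket name and prefix are assumptions) would let S3 expire cached objects automatically, so no separate cleanup job is needed:

```python
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="weather-cache",                 # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-cached-forecasts",
                "Filter": {"Prefix": "forecasts/"},
                "Status": "Enabled",
                "Expiration": {"Days": 1},  # S3 removes objects about a day after creation
            }
        ]
    },
)
```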
