Memory fix #23

Merged
3 commits merged into main from memory-fix on Dec 16, 2024


lucas-diedrich
Owner

Changed the logic used to create the image Dask array

Addresses issues in #4

Before:

  1. _chunk_factory: Read in tiles as numpy arrays (delayed)
  2. Assemble chunks (delayed)
  3. Create array (end of delayed)

The result was that tiles were loaded into memory and kept there, which led to extremely high memory utilization.

Now:

  1. Create the nested list of dask.array objects right away in _chunk_factory, using the delayed paradigm to parallelize the tile reads
  2. Assemble chunks

This led to a significant speed-up and, hopefully, to better memory management.
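
The "Now" approach can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `read_tile`, `chunk_factory`, the tile shape, and the dtype are hypothetical stand-ins. It assumes every tile has a known shape and dtype, so each delayed read can be wrapped with `dask.array.from_delayed` and the nested list assembled with `dask.array.block`.

```python
import dask
import dask.array as da
import numpy as np

# Assumed for illustration: all tiles share one shape and dtype.
TILE_SHAPE = (4, 4)
DTYPE = np.uint16


def read_tile(row, col):
    """Hypothetical tile reader; a real one would read from the image file."""
    return np.full(TILE_SHAPE, row * 10 + col, dtype=DTYPE)


def chunk_factory(n_rows, n_cols):
    """Return a nested list of lazy dask arrays, one per tile.

    Each read is wrapped in dask.delayed, then immediately turned into a
    dask array, so no tile is loaded until it is actually computed.
    """
    lazy_read = dask.delayed(read_tile)
    return [
        [
            da.from_delayed(lazy_read(r, c), shape=TILE_SHAPE, dtype=DTYPE)
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]


# Assemble the nested list of tiles into one lazy 2D array.
chunks = chunk_factory(n_rows=2, n_cols=3)
image = da.block(chunks)

# Only the tiles overlapping the requested region need to be read.
corner = image[4:8, 8:12].compute()
```

In this sketch each tile is its own chunk, so slicing `image` should trigger reads only for the tiles that overlap the requested region rather than for the whole image.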

Lucas Diedrich added 3 commits December 16, 2024 18:46
…dentally) generated, which is extremely memory intensive and not fully parallelized. Now, dask arrays are generated immediately after reading in the image, which reduces the memory burden and allows dask to use lazy loading
@lucas-diedrich lucas-diedrich mentioned this pull request Dec 16, 2024
2 tasks
@lucas-diedrich lucas-diedrich merged commit 9e93dd8 into main Dec 16, 2024
4 checks passed
@lucas-diedrich lucas-diedrich deleted the memory-fix branch February 3, 2025 18:05