Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deploy documentation and document time series api limits #42

Merged
merged 19 commits into from
Jan 2, 2025

Conversation

hrodmn
Copy link
Contributor

@hrodmn hrodmn commented Dec 3, 2024

This PR adds changes for a) deploying the documentation to GH pages, b) adds an API benchmarking process (and report in the docs) for the time series endpoints. After talking about it some more with the team I think we can be more surgical in the benchmark approach (i.e. predict limits based on Lambda configuration parameters), but the exercise was still helpful for understanding when things fall apart and which knobs you can turn to get time series requests that return a response before the Lambda breaks down. There are definitely some modifications or alternative approaches that are worth considering to improve the capabilities of this application!

A preview of the docs for this PR are available here: https://developmentseed.org/titiler-cmr/pr-previews/pr-42/

I'm not sure if I'll keep the PR preview feature alive since it is so easy to deploy the docs locally (uv sync && uv run mkdocs serve -o) but I'll leave it up for now so others can take a look at the rendered site.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@hrodmn hrodmn changed the title Deploy docs and add performance benchmark WIP: Deploy docs and add performance benchmark Dec 3, 2024
@hrodmn hrodmn force-pushed the feat/test-performance branch 8 times, most recently from e4e6813 to acc239f Compare December 3, 2024 15:40
github-actions bot pushed a commit that referenced this pull request Dec 3, 2024
@hrodmn hrodmn force-pushed the feat/test-performance branch 11 times, most recently from 7ea2bb8 to abe6317 Compare December 3, 2024 17:11
github-actions[bot]

This comment was marked as resolved.

@hrodmn hrodmn force-pushed the feat/test-performance branch 4 times, most recently from 049c9a7 to f709c65 Compare December 4, 2024 02:03
github-actions bot pushed a commit that referenced this pull request Dec 4, 2024
@hrodmn hrodmn force-pushed the feat/test-performance branch from f709c65 to 156f54a Compare December 4, 2024 02:16
Copy link

github-actions bot commented Dec 4, 2024

📚 Documentation preview will be available at: https://developmentseed.github.io/titiler-cmr/pr-previews/pr-42/

Status: ✅ Preview is ready!

@hrodmn hrodmn force-pushed the feat/test-performance branch from e408669 to e52d5b2 Compare December 16, 2024 17:41
github-actions bot pushed a commit that referenced this pull request Dec 16, 2024
github-actions bot pushed a commit that referenced this pull request Dec 18, 2024
github-actions bot pushed a commit that referenced this pull request Dec 20, 2024
github-actions bot pushed a commit that referenced this pull request Dec 21, 2024
@hrodmn hrodmn changed the title Deploy docs and add performance benchmark Deploy documentation and document time series api limits Jan 2, 2025
@hrodmn hrodmn merged commit 49192c9 into develop Jan 2, 2025
7 checks passed
@hrodmn hrodmn deleted the feat/test-performance branch January 2, 2025 12:56
github-actions bot pushed a commit that referenced this pull request Jan 2, 2025
github-actions bot pushed a commit that referenced this pull request Jan 2, 2025
@abarciauskas-bgse
Copy link
Contributor

@hrodmn thanks for this, this guidance is super helpful. 2 questions:

  1. Do you mind if I change pixels to grid_cells in all places where we're referring to the original data? When I read pixels I think of the resolution of the output image but I think we are mostly referring to the input data in this guidance.
  2. Do you have any sense why the individual image size matters aside from it's contribution to the total image size?

@hrodmn
Copy link
Contributor Author

hrodmn commented Jan 2, 2025

@hrodmn thanks for this, this guidance is super helpful. 2 questions:

  1. Do you mind if I change pixels to grid_cells in all places where we're referring to the original data? When I read pixels I think of the resolution of the output image but I think we are mostly referring to the input data in this guidance.

Sure that makes sense to me - is that a convention for netcdf datasets? Most of my experience is with raster datasets so I usually think about cells as pixels.

  1. Do you have any sense why the individual image size matters aside from it's contribution to the total image size?

I have only done a small amount of profiling in the Lambda context, but I think there are two ways large images can affect time series requests:

  • Each lower-level request requires more resources (time + memory) which increases the probability that not all requests will succeed within 30 seconds or overflow the Lambda memory limit. I did some rough testing to figure out the maximum size for a single image that the Lambda could process without crashing and used that to set the cap for individual image sizes.
  • Some kind of resource limitation within the initial Lambda function call (e.g. /timeseries/bbox) makes the request for large images take longer (this is only a half-formed thought). I have tried using my local environment as a 'scheduler' by running the docker network locally with lower-level requests redirected to the actual deployed API (instead of the local API). When I do this I can make larger requests work, so I think there is something specific to the Lambda deployment that is limiting the capacity.

@abarciauskas-bgse
Copy link
Contributor

@hrodmn
re grid cells vs pixels - I have come across pixels being used in place of cells for raster datasets and always found it confusing. I think grid cell is common term for netcdf data but can't say declaratively. This all may just be a matter of personal preference so if you prefer pixels, perhaps just including a line about terminology may be helpful. Looking at some of the docs for NetCDF I just see references to values and cells...

And thanks for the responses to question 2 - those reasons make sense to me. I don't think it's worth further investigation at this point and it's great that the individual image size limits are included in the documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants