-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create test notebooks to test expected functionality after image updates #27
Create test notebooks to test expected functionality after image updates #27
Comments
For the last new image I just ran through these quickstart notebooks: https://github.com/NASA-IMPACT/veda-docs/tree/main/notebooks/quickstarts. But they do require AWS creds. Is it feasible to set up CI to have credentials? Or should we be trying to create notebooks that only rely on public data? |
Also want to make sure that this ticket and NASA-IMPACT/veda-backend#357 are aware of each other since to goals are potentially overlapping |
@jsignell I would have a pretty strong preference for relying on public data and not requiring the tests to run in an environment with privileged access - I'd go even further and "if possible", to include very truncated data to allow us to test the bits we need just locally / in the repo, so there aren't externalities like network requests which could cause tests to fail. (Although, if the data is fetched from public s3 buckets or locations we assume to be pretty stable, that's totally fine). @jsignell does that seem okay? If you think we do need privileged access within the tests, I can check how we might make that possible, but does that seem like it would be a lot more complicated than being able to run the tests stand-alone without privileged access. Thanks again for looking into this! |
Yes I totally agree that it would be preferable to only rely on public data and APIs. My understanding is that VEDA doesn't have any public buckets right now and the expectation is that external usage will be via one of the APIs. So maybe the easiest first step would be to simplify some of the API notebooks while we think over how to have public data. I do think it is important to exercise reading data from s3 with fsspec, but maybe we start with partial tests and go from there. |
Yup - starting with simple / partial tests and growing out the complexity as we go sounds like a good approach to me. Sorry for the late response here @jsignell - @sunu is just working on moving the Thanks again! |
Apologies for the delay here - we now have the repository up that aims to simplify the Dockerfile and build process for our custom pangeo image, as well as allows us to run tests as part of CI. https://github.com/NASA-IMPACT/pangeo-notebook-veda-image . We will work on over-all documentation in the next days and migrate the pangeo images used on hubs. In that repository, you will see an We are using @jsignell @wildintellect - would you be able to work on the test notebooks? Can just make PRs to https://github.com/NASA-IMPACT/pangeo-notebook-veda-image Let me know if it helps to chat or if there are any questions or concerns. Thanks @sunu for putting this together! |
Oh neat! Yeah I can take a crack at adding some test notebooks. Should I just run them on the pangeo env on vedahub? |
I think this should be fine (but being aware that the environment the CI tests will run in won't have access to things like private s3 buckets). You should also be able to run the image built in https://github.com/NASA-IMPACT/pangeo-notebook-veda-image/ (which is based on the pangeo base image) locally. We'll add some documentation on how to do that. Filed a ticket for it here: NASA-IMPACT/pangeo-notebook-veda-image#3 Thanks so much @jsignell ! Please let us know however we can help. |
Ok! I just put up a PR with some silly little tests. I tried to include a lot of rendered outputs, but just let me know if you think the notebook is too big. I can certainly trim things down and do fewer viz tests. I was just kind of intrigued by the potential of testing that the rendered output works properly. |
@jsignell this looks really great to me! I don't see any issues with the notebook size, and testing visualization outputs sounds good to me! Unless @sunu has any concerns. I left a small comment on the PR about the STAC query that I'm a bit worried that could break our tests if the STAC catalog we are querying has any updates. This is exciting! Once we have this merged, I'd really like to do an update to the latest |
Ok great! I just made the recommended update on the PR so it is all good from my perspective. |
Part of Objective 1
Relates to:
We need a few "test notebooks" to use as part of CI tests when we do image updates. The tests would test that that when these notebooks are run in the updated environment, the outputs remain the same, and provide some interface to update the expected snapshot if the changes in output are expected.
These notebooks would ideally be simple enough that they are intentional about specific things being tested, and yet cover a broad amount of use-cases to catch potential regressions in upstream libraries. A good example of the kind of thing it should catch is #14 . It would be great if we can make the notebooks self contained (i.e. not dependent on external data sources), and deterministic in their outputs, avoiding outputs based on things that would then need to be mocked, like
datetime.now()
.If we have a few of these notebooks, we can easily run them as part of CI.
@wildintellect @jsignell @abarciauskas-bgse - would any of you be able to help with the creation of these test notebooks? Happy to have a chat about specifics - it can likely be some slightly simplified form of the
veda-docs
notebooks or so. I could also give it a stab just looking at thedocs
notebooks, but likely don't have as good a sense of the breadth of things that need testing as you all do.The text was updated successfully, but these errors were encountered: