This issue was moved to a discussion.

Create a NASA jupyterhub/Colab instance to do data analysis in a cloud framework #203

Closed
swnesbitt opened this issue Aug 11, 2022 · 4 comments

Comments

@swnesbitt

swnesbitt commented Aug 11, 2022

To ensure equity of data access and computation, as well as reproducible workflows, create a framework where users can log in, have a computational framework like jupyterhub/Google Colab and commonly used software (and the ability to install other GitHub packages), access NASA data in the cloud or in file systems available to the user, and develop, share and publish workflows using NASA data. Installed software versioning could be preserved and published along with products (data How-tos, publications, reports).

Additional computational resources (GPUs) could be made available to NASA-funded researchers or other users (university-affiliated students/staff). Security and abuse would have to be managed.
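
To illustrate the point above about preserving installed software versions alongside published products, here is a minimal sketch. It assumes a Python environment and uses only the standard library (`importlib.metadata`); the function name `snapshot_environment` and the output layout are hypothetical choices for illustration, not part of any NASA or JupyterHub tooling.

```python
import json
import sys
from importlib import metadata


def snapshot_environment():
    """Return the Python version and installed package versions as a dict.

    Hypothetical helper: the name and the output layout are illustrative.
    """
    packages = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        # Skip distributions whose metadata is missing or has no name.
        name = meta.get("Name") if meta is not None else None
        if name:
            packages[name] = dist.version
    return {
        "python": sys.version.split()[0],
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    # Print the snapshot as JSON so it can be saved next to a notebook,
    # report, or data product.
    print(json.dumps(snapshot_environment(), indent=2))
```

A hub could run something like this at the end of a workflow and publish the resulting JSON with the product, so that others can rebuild a comparable environment.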

@hmjbarbosa

I think Steve's suggestion is great. As someone at a university, I can say it is hard to pull large volumes of data from NASA, NOAA, and other sources just to work locally. It would be a tremendous contribution to the scientific community if something along the lines of this suggestion were to be implemented.

@cgentemann
Contributor

Yes, we are exploring how this might work. For now, there are services like MyBinder that provide ephemeral cloud access to a small machine. The Pangeo community used to provide access to larger machines, but they were attacked by bitcoin miners (see discussion here). There is a need for both public access to cloud resources and access for NASA-funded researchers. Right now, until a longer-term solution is developed, science teams appear to be contracting out for cloud-based access to pre-configured JupyterHubs.

@choldgraf

Hey all 👋 I'm the director of 2i2c, a non-profit that manages and supports customized JupyterHubs for communities of practice in research and education. We've run several hubs that are similar to what @swnesbitt is describing in the top comment.

For example, we've been running a hub for the OpenScapes team to use in their community workshops, as well as for the OceanHackWeek team to use during their workshops. We also now run all of Pangeo's cloud-based interactive computing infrastructure, and we are working on getting a Binder up and running for Pangeo as well. We have made some progress on the crypto-mining front, though that is an ongoing challenge in this space.

If I can be of any help either in brainstorming and discussing, or if you think that 2i2c could help provide infrastructure such as this, I'd be happy to speak further! I agree with @swnesbitt's post above that these kinds of hosted environments can be a huge help in making the infrastructure more accessible to beginners and veterans alike.

@swnesbitt
Author

Great to hear that, @cgentemann. Yes, security is a challenge, and right now ephemerality seems to be the primary way to get around that. Increasing the cloud accessibility (availability and ease of use/authentication) of NASA datasets across the DAACs is another piece of this. There has been progress in some areas, which has made it a lot easier to work with these data (cloud services, API tools and authentication help), while others are more challenging and require visiting HTTP sites or using “shopping carts”. Thanks!

@nasa nasa locked and limited conversation to collaborators Sep 15, 2022
@cgentemann cgentemann converted this issue into discussion #240 Sep 15, 2022
