Repo for Authenticated Clients and Applications for ICICLE CI Services
The Artificial Intelligence (AI) Institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE) is funded by the NSF to build the next generation of cyberinfrastructure, making AI more accessible to everyone and driving its further democratization in society at large. ICICLE aims to develop intelligent cyberinfrastructure with transparent, high-performance execution on diverse and heterogeneous environments, and to advance plug-and-play AI that is easy for scientists across a wide range of domains to use. The deep AI infrastructure that ICICLE plans to use to sift through data relies on knowledge graphs (KGs), in which information is stored in a graph database that uses a graph-structured data model to represent a network of entities and the relationships between them.
Making KGs easily accessible and functional on high performance computing (HPC) systems is an important step toward democratizing HPC. A large focus of this project was therefore contributing to the body of knowledge needed to host live, dynamic, and interactive services that interface with HPC systems hosting KGs for ICICLE-based resources and services.
In this project, we develop Jupyter Notebooks and Python command line clients that access ICICLE resources and services using ICICLE authentication mechanisms. To connect our clients, we used Tapis, a framework that supports computational research by enabling scientists to access, use, and manage multi-institution resources and services. We used Neo4j to organize data into a knowledge graph (KG). We then hosted the KG on a Tapis Pod, which offers persistent data storage with a template made specifically for Neo4j KGs.
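The flow described above (authenticate with Tapis, then reach a Neo4j KG hosted on a Tapis Pod) might look roughly like the sketch below. It is not the code shipped in this repository. The helper names (`make_tapis_client`, `pod_bolt_url`, `count_nodes`), the `icicle.tapis.io` host, the `<pod_id>.pods.<tenant host>:443` address pattern, and the `bolt+ssc` scheme are assumptions made for illustration; check them against your own tenant and pod. The third-party packages `tapipy` and `neo4j` are imported lazily, so the module loads even when they are not installed.

```python
def make_tapis_client(base_url, username, password):
    """Create a Tapis client and fetch an access token.

    Requires the third-party package `tapipy` (pip install tapipy).
    """
    from tapipy.tapis import Tapis  # lazy import: only needed when called

    client = Tapis(base_url=base_url, username=username, password=password)
    client.get_tokens()  # exchanges the credentials for an access token
    return client


def pod_bolt_url(pod_id, tenant_host="icicle.tapis.io", port=443):
    """Build a Bolt URL for a Neo4j pod.

    ASSUMPTION: pods are exposed as <pod_id>.pods.<tenant_host>:<port>
    and accept the bolt+ssc scheme; adjust for your deployment.
    """
    return f"bolt+ssc://{pod_id}.pods.{tenant_host}:{port}"


def count_nodes(bolt_url, db_user, db_password):
    """Run a trivial Cypher query and return the number of nodes in the KG.

    Requires the third-party package `neo4j` (pip install neo4j).
    The database credentials are the ones issued for the pod.
    """
    from neo4j import GraphDatabase  # lazy import: only needed when called

    with GraphDatabase.driver(bolt_url, auth=(db_user, db_password)) as driver:
        with driver.session() as session:
            record = session.run("MATCH (n) RETURN count(n) AS n").single()
            return record["n"]
```

A typical session would call `make_tapis_client("https://icicle.tapis.io", username, password)` first, then pass `pod_bolt_url("<your pod id>")` together with the pod's database credentials to `count_nodes`. Reading the password with `getpass.getpass()` rather than hard-coding it keeps credentials out of notebooks and shell history.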
For this software release, we focused on developing authenticated connections to Kubernetes pods hosted on any Tapis service. These applications are stand-alone and can be installed separately. For details, see:
Our CLIs are production software intended for use as interfaces to Tapis services hosted on HPC systems. They are ready to install and use, provided the requirements are fulfilled.
The Jupyter notebooks in this repository are primarily demonstrators for working Tapis code, written in Python. We also made an extensible template notebook that has Tapis authentication prebuilt and can be easily modified to carry out specific Tapis-related tasks.
The notebooks and the CLIs each have their own directory and software requirements, which are described here:
This software was developed as part of the SDSC/UCSD Summer 2022 REHS Project, titled "Developing Interactive Jupyter Notebooks to run on the SDSC HPC Expanse System", and the "AI Institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment" (ICICLE) project.
- Project Lead: Mary Thomas, Ph.D., SDSC HPC Training lead, and Computational Data Scientist in the Data-Enabled Scientific Computing Division.
- REHS Students:
  - Sahil Samar, Del Norte High School, San Diego, CA
  - Mia Chen, Westview High School, San Diego, CA
  - Jack Karpinski, San Diego High School, San Diego, CA
  - Michael Ray, JSerra Catholic High School, San Juan Capistrano, CA
  - Archita Sarin, Mission San Jose High School, Fremont, CA
- Collaborators/Mentors:
  - Christian Garcia, Engineering Scientist Associate (Texas Advanced Computing Center [5])
  - Matthew Lange, Ph.D., CEO, International Center for Food Ontology Operability Data and Semantics (IC-FOODS [4])
  - Joe Stubbs, Ph.D., Manager, Cloud & Interactive Computing (Texas Advanced Computing Center [5])
This work has been funded by grants from the National Science Foundation, including:
- The AI Institute for Intelligent CyberInfrastructure with Computational Learning in the Environment (ICICLE) (#2112606)
- The SDSC Expanse project (#1928224)
- The TACC Stampede System (#1663578)
- Tapis projects (#1931439)
- The NSF Track 3 Award: COre National Ecosystem for CyberinfrasTructure (CONECT) (#2138307) and the Extreme Science and Engineering Discovery Environment (XSEDE) (ACI-1548562)
NOTES:
- YAML file: See
- Component Data Yaml file: https://github.com/ICICLE-ai/CI-Components-Catalog/blob/master/components-data.yaml