diff --git a/README.md b/README.md index bbe842c5b..b8c80ce72 100644 --- a/README.md +++ b/README.md @@ -1,5 +1,3 @@ -[![CircleCI](https://dl.circleci.com/status-badge/img/gh/berkeley-dsep-infra/datahub/tree/staging.svg?style=svg)](https://dl.circleci.com/status-badge/redirect/gh/berkeley-dsep-infra/datahub/tree/staging) - # Berkeley JupyterHubs Contains a fully reproducible configuration for JupyterHub on datahub.berkeley.edu, @@ -9,6 +7,13 @@ as well as the single user images. [UC Berkeley CDSS](https://cdss.berkeley.edu) +## Single-user server images +All user images are located in their own repositories located in the +[Berkeley DSEP infra repo](https://github.com/berkeley-dsep-infra). You can +find them either by [searching there](https://github.com/orgs/berkeley-dsep-infra/repositories?language=&q=image&sort=&type=all) +or from links in the deployment's `image/README.md` +([eg: Datahub's](https://github.com/berkeley-dsep-infra/datahub/tree/staging/deployments/datahub/image)). + ## Branches The `staging` branch always reflects the state of the [staging JupyterHub](http://staging.datahub.berkeley.edu), @@ -107,7 +112,7 @@ branch of this repo while the choice for `head` is your fork. Once this is complete and if there are no problems, you can request that someone review the PR before merging, or you can merge yourself if you are -confident. This merge will trigger a CircleCI process which upgrades the +confident. This merge will trigger a Github Actions workflow which upgrades the helm deployment on the staging site. When this is complete, test your changes there. For example if you updated a library, make sure that a new user server instance has the new version. 
If you spot any problems you can diff --git a/deployments/biology/hubploy.yaml b/deployments/biology/hubploy.yaml index 440a86da9..7d5dfe522 100644 --- a/deployments/biology/hubploy.yaml +++ b/deployments/biology/hubploy.yaml @@ -1,5 +1,5 @@ images: - image_name: us-central1-docker.pkg.dev/ucb-datahub-2018/user-images/biology-user-image:c6aa3725c360 + image_name: us-central1-docker.pkg.dev/ucb-datahub-2018/user-images/biology-user-image:a81b11cbb998 cluster: provider: gcloud diff --git a/docs/.gitignore b/docs/.gitignore new file mode 100644 index 000000000..4d7fb32e9 --- /dev/null +++ b/docs/.gitignore @@ -0,0 +1,2 @@ +/.quarto/ +_site diff --git a/docs/admins/howto/clusterswitch.qmd b/docs/admins/howto/clusterswitch.qmd index 5d4d54d06..233f03b61 100644 --- a/docs/admins/howto/clusterswitch.qmd +++ b/docs/admins/howto/clusterswitch.qmd @@ -10,8 +10,8 @@ You might find it easier to switch to a new cluster if you're running a [very ol 1. Create a new cluster using the specified [configuration](../cluster-config.qmd). 2. Set up helm on the cluster according to the instructions here: http://z2jh.jupyter.org/en/latest/setup-helm.html - - Make sure the version of helm you're working with matches the version CircleCI is using. - For example: https://github.com/berkeley-dsep-infra/datahub/blob/staging/.circleci/config.yml#L169 + - Make sure the version of helm you're working with matches the version Github Actions is using. + For example: https://github.com/berkeley-dsep-infra/datahub/blob/staging/.github/workflows/deploy-support.yaml#L66 3. Re-create all existing node pools for hubs, support and prometheus deployments in the new cluster. If the old cluster is still up and running, you will probably run out of CPU quota, as the new node pools will immediately default to three nodes. Wait ~15m for the new pools to wind down to zero, and then continue. ## Setting the 'context' for kubectl and work on the new cluster. 
@@ -110,13 +110,21 @@ for x in $(cat hubs.txt); do hubploy deploy ${x} hub prod; done When done, add the modified configs to your feature branch (and again, don't push yet). -## Update CircleCI -Once you've successfully deployed the clusters manually via `hubploy`, it's time to update CircleCI to point to the new cluster. +## Update Github Actions +Once you've successfully deployed the clusters manually via `hubploy`, it's time to update the Github Actions to point to the new cluster. -All you need to do is `grep` for the old cluster name in `.circleci/config.yaml` and change this to the name of the new cluster. There should just be four entries: two for the `gcloud get credentials `, and two in comments. Make these changes and add them to your existing feature branch, but don't commit yet. +All you need to do is `grep` for the old cluster name in `.github/workflows/` and change this to the name of the new cluster. +There should just be two entries, one each in the support and node placeholder deploy workflows. +Make these changes and add them to your existing feature branch, but don't commit yet. + +``` +$ grep -ir spring .github/workflows +.github/workflows/deploy-node-placeholder.yaml: get-credentials spring-2024 +.github/workflows/deploy-support.yaml: get-credentials spring-2024 +``` ## Create and merge your PR! -Now you can finally push your changes to github. Create a PR, merge to `staging` and immediately kill off the deploy jobs for `node-placeholder`, `support` and `deploy`. +Now you can finally push your changes to github. Create a PR, merge to `staging` and immediately kill off the [deploy jobs](https://github.com/berkeley-dsep-infra/datahub/actions) for `node-placeholder`, `support` and `deploy`. Create another PR to merge to `prod` and that deploy should work just fine. 
diff --git a/docs/admins/howto/delete-hub.qmd b/docs/admins/howto/delete-hub.qmd index d61fbcb0f..880680050 100644 --- a/docs/admins/howto/delete-hub.qmd +++ b/docs/admins/howto/delete-hub.qmd @@ -28,8 +28,7 @@ gcloud filestore backups create -backup-YYYY-MM-DD --file-share=shares ``` 4. Log in to `nfsserver-01` and unmount filestore from nfsserver: `sudo umount /export/-filestore` -5. Comment out the hub build steps out in `.circleci/config.yaml` - (deploy and build steps) +5. Comment out the hub's image repo entry (if applicable) in `scripts/user-image-management/repos.txt` 6. Comment out GitHub label action for this hub in `.github/labeler.yml` 7. Comment hub entries out of `datahub/node-placeholder/values.yaml` diff --git a/docs/admins/howto/documentation.qmd b/docs/admins/howto/documentation.qmd index cf392e233..a0cb29d5b 100644 --- a/docs/admins/howto/documentation.qmd +++ b/docs/admins/howto/documentation.qmd @@ -5,7 +5,7 @@ title: Documentation ## Overview Documentation is managed under the `docs/` folder, and is generated with -[Quarto](https://quarto/). It is published to this site, +[Quarto](https://quarto.org/). It is published to this site, , hosted at GitHub Pages. Content is written in [markdown](https://quarto.org/docs/authoring/markdown-basics.html). diff --git a/docs/admins/howto/new-hub.qmd b/docs/admins/howto/new-hub.qmd index 727362a24..d6579c42b 100644 --- a/docs/admins/howto/new-hub.qmd +++ b/docs/admins/howto/new-hub.qmd @@ -35,7 +35,7 @@ Proper access to the following systems: - Google Cloud IAM: *owner* - Write access to the [datahub repo](https://github.com/berkeley-dsep-infra/datahub) - - CircleCI account linked to our GitHub organization. + - Owner or admin access to the [berkeley-dsep-infra organization](https://github.com/berkeley-dsep-infra/) ## Configuring a New Hub @@ -247,71 +247,48 @@ bcourses. Please reach out to Jonathan Felder to set this up, or if he is not available. 
-### CircleCI +### CI/CD and single-user server image -The CircleCI configuration file `.circleci/config.yml` will need to -include directives for building and deploying your new hub at several -phases of the CircleCI process. Generally speaking, an adequate manual -strategy for this is to pick the name of an existing hub, find each -occurrence of that name, and add analogous entries for your new hub -alongside your example existing hub. Please order new entries for your -new hub in alphabetical order amongst the entries for existing hubs. +CI/CD is managed through GitHub Actions, and the relevant workflows are located +in `.github/workflows/`. Deployment of all hubs is managed via pull request +labels, which are applied automatically on PR creation. -Here is a partial (but incomplete) sampling of some of the relevant -sections of the CircleCI configuration file: +To ensure the new hub is deployed, all you need to do is add a new entry +(alphabetically) in `.github/labeler.yml` under the `# add hub-specific labels +for deployment changes` stanza: ``` yaml -- run: - name: Deploy - command: | - hubploy deploy hub ${CIRCLE_BRANCH} - -- hubploy/build-image: - deployment: - name: image build - filters: - branches: - ignore: - - staging - - prod - - - - hubploy/build-image: - deployment: - name: image build - push: true - filters: - branches: - only: - - staging - - - - image build +"hub: ": + - "deployments//**" ``` -Review hubploy.yaml file inside your project directory and update the -images section. Example from a11y hub: +#### Hubs using a custom single-user server image -``` yaml -images: - images: - - name: us-central1-docker.pkg.dev/ucb-datahub-2018/user-images/a11y-user-image - path: image/ - repo2docker: - base_image: docker.io/library/buildpack-deps:jammy -``` +If this hub will be using its own image, then follow the +[instructions here](https://docs.datahub.berkeley.edu/admins/howto/new-image.html) +to create the new image and repository.
In this case, the image tag will be +`PLACEHOLDER` and will be updated AFTER your PR to datahub is merged. + +*NOTE:* The changes to the `datahub` repo are required to be merged BEFORE the new +image configuration is pushed to `main` in the image repo. This is due to +the image building/pushing workflow requiring this deployment's +`hubploy.yaml` to be present in the `deployments//` subdirectory, as +it updates the image tag. + +#### Hubs inheriting an existing single-user server image -### Add hub to the github labeler workflow +If this hub will inherit an existing image, you can just copy `hubploy.yaml` +from an existing deployment, which will contain the latest image hash. -The new hub will now need to be added to the github labeler workflow. +#### Review the deployment's `hubploy.yaml` -Edit the file `.github/labeler.yml` and add an entry for this hub -(alphabetically) in the -`# add hub-specific labels for deployment changes` block: +Next, review `hubploy.yaml` inside your project directory to confirm that +it looks correct. An example from the `a11y` hub: ``` yaml -"hub: ": - - "deployments//**" +images: + images: + - name: us-central1-docker.pkg.dev/ucb-datahub-2018/user-images/a11y-user-image: ``` ### Create placeholder node pool @@ -399,25 +376,37 @@ events](calendar-scaler.qmd). This is useful for large courses which can have placeholder nodes set aside for predicatable periods of heavy ramp up. -### Commit and deploy staging +### Commit and deploy to `staging` Commit the hub directory, and make a PR to the the `staging` branch in the GitHub repo. + +#### Hubs using a custom single-user server image + +If this hub is using a custom image, and you're using `PLACEHOLDER` for the +image tag in `hubploy.yaml`, be sure to remove the hub-specific GitHub +label that is automatically attached to this pull request.
It will look +something like `hub: `. If you don't do this the deployment will +fail as the image sha of `PLACEHOLDER` doesn't exist. + +After this PR is merged, perform the `git push` in your image repo. This will +trigger the workflow that builds the image, pushes it to the Artifact Registry, +and finally creates a commit that updates the image hash in `hubploy.yaml` and +pushes to the datahub repo. Once this is merged in to `staging`, the +deployment pipeline will run and your hub will finally be deployed. + +#### Hubs inheriting an existing single-user server image + +Your hub's deployment will proceed automatically through the CI/CD pipeline. + +It might take a few minutes for HTTPS to work, but after that you can log into it at \-staging.datahub.berkeley.edu. Test it out and make sure things work as you think they should. -1. Make a PR from the `staging` branch to the `prod` branch. When this - PR is merged, it'll deploy the production hub. It might take a few - minutes for HTTPS to work, but after that you can log into it at - \.datahub.berkeley.edu. Test it out and make - sure things work as you think they should. -2. You may want to customize the docker image for the hub based on your - unique requirements. Navigate to deployments/'Project Name'/image - and review environment.yml file and identify packages that you want - to add from the `conda repository` \<\>. You - can copy the image manifest files from another deployment. It is - recommended to use a repo2docker-style image build, without a - Dockerfile, if possible. That format will probably serve as the - basis for self-service user-created images in the future. -3. All done. +### Commit and deploy to `prod` + +Make a PR from the `staging` branch to the `prod` branch. When this +PR is merged, it'll deploy the production hub. It might take a few +minutes for HTTPS to work, but after that you can log into it at +\.datahub.berkeley.edu. Test it out and make +sure things work as you think they should. 
diff --git a/docs/admins/howto/new-image.qmd b/docs/admins/howto/new-image.qmd index 21b582d6a..03c014693 100644 --- a/docs/admins/howto/new-image.qmd +++ b/docs/admins/howto/new-image.qmd @@ -19,61 +19,103 @@ As always, create a feature branch for your changes, and submit a PR when done. ## Use an existing image as a template -Browse through our `deployments/` directory to find a hub that is similar to -the one you are trying to create. This will give you a good starting point. +Browse through our [image repos](https://github.com/orgs/berkeley-dsep-infra/repositories?language=&q=image&sort=&type=all) +to find a hub that is similar to the one you are trying to create. This will +give you a good starting point. -## Create the image directory +## Create the image repos -Create a new `image/` directory in the deployment. Then copy the contents (and -any subdirectories) of the source `image/` directory in to the new directory. +Create a new image repo from the [hub-user-image-template](https://github.com/berkeley-dsep-infra/hub-user-image-template). +Click "Use this template" > "Create a new repository". -## Modify `hubploy.yaml` for the hub +Be sure to follow convention and name the repo `-user-image`, and the +owner needs to be `berkeley-dsep-infra`. When that is done, create your own +fork of the new repo. -In the deployment\'s `hubploy.yaml` file, -add or modify the `name`, `path` and `base_image` fields to configure -the image build and where it\'s stored in the Google Artifcat Registry. +### Configuring the root image repo -`name` should contain the path to the image in the Google Artifact -Registry and the name of the image. `path` points to the directory -containing the image configuration (typically :file::`image/`. `base_image` is -the base Docker image to use for the image build. +There are now a few steps to set up the CI/CD for the new image repo. 
In the +`berkeley-dsep-infra` image repo, click on `Settings`, and under `General`, +scroll down to `Pull Requests` and check the box labeled `Automatically delete +head branches`. -For example, `hubploy.yaml` for the data100 image looks like this: +Scroll back up to the top of the settings, and in the left menu bar, click on +`Secrets and variables`, and then `Actions`. -``` yaml +From there, click on the `Variables` tab and then `New repository variable`. We +will be adding two new variables: + +1. `HUB`: the name of the hub (eg: datahub) + +1. `IMAGE`: the Google Artifact Registry path and image name. The path will +always be `ucb-datahub-2018/user-images/` and the +image name will always be the same as the repo: `-user-image`. + +### Configure your fork + +Now you will want to disable Github Actions for your fork of the image repo. +If you don't, whenever you push PRs to the root repo the workflows *in your +fork* will attempt to run, but don't have the proper permissions to +successfully complete. This will then send you a nag email about a workflow +failure. + +To disable this for your fork, click on `Settings`, `Actions` and `General`. +Check the `Disable actions` box and click save. + +### Add the root image repo to the list of allowed repos in the `berkeley-dsep-infra` secrets. + +Now, go to the `berkeley-dsep-infra` [Secrets and Variables](https://github.com/organizations/berkeley-dsep-infra/settings/secrets/actions). +You will need to give your repo permissions to push to the Artifact Registry, +as well as to push a branch to the [datahub repo](https://github.com/berkeley-dsep-infra/datahub). + +Edit both `DATAHUB_CREATE_PR` and `GAR_SECRET_KEY`, and click on the gear icon, +search for your repo name, check the box and save. + +### Update your deployment's `hubploy.yaml` and add the image to the primary list of repos. + +You need to let `hubploy` know the specifics of the image. 
Change the `name` of the image in +`deployments//hubploy.yaml` to point to your new image name, and after the name add +`:PLACEHOLDER` in place of the image sha. This will be automatically updated after your new image +is built and pushed to the Artifact Registry. + +Example: + +``` images: - images: - - name: us-central1-docker.pkg.dev/ucb-datahub-2018/user-images/data100-user-image - path: image/ - repo2docker: - base_image: docker.io/library/buildpack-deps:jammy - registry: - provider: gcloud - gcloud: - project: ucb-datahub-2018 - service_key: gcr-key.json + images: + - name: us-central1-docker.pkg.dev/ucb-datahub-2018/user-images/fancynewhub-user-image:PLACEHOLDER cluster: -provider: gcloud -gcloud: - project: ucb-datahub-2018 - service_key: gke-key.json - cluster: spring-2024 - zone: us-central1 + provider: gcloud + gcloud: + project: ucb-datahub-2018 + service_key: gke-key.json + cluster: spring-2024 + zone: us-central1 ``` -## Modify the image configuration as necessary +Next, add the ssh clone path of the root image repo to [repos.txt](https://github.com/berkeley-dsep-infra/datahub/blob/staging/scripts/user-image-management/repos.txt). + +Create a PR and merge to staging. You can cancel the +[`Deploy staging and prod hubs` job in Actions](https://github.com/berkeley-dsep-infra/datahub/actions/workflows/deploy-hubs.yaml), +or just let it fail. + +## Add a github bot notification in Slack + +Go to the #ucb-datahubs-bots channel, and run the following command: + +``` +/github subscribe berkeley-dsep-infra/ +``` -This step is straightforward: edit/modify/delete/add any files in the -`image/` directory to configure the image -as needed. +## Modify the image configuration as necessary -## Update CI/CD configuration +This step is straightforward: create a feature branch, edit/modify/delete/add +any files in the image repo to configure the image as needed. 
-Next, ensure that this image will be built and deployed by updating the -`.circleci/config.yml` file in the root -of the repository. Add new steps under the `jobs/deploy:`, -`workflows/test-build-images:` and `workflows/deploy:` stanzas. +We also strongly recommend copying `README-template.md` over the default +`README.md`, and modifying it to replace all occurrences of `` with +the name of your image. ## Submitting a pull request @@ -84,27 +126,30 @@ fork of the [datahub staging branch](https://github.com/berkeley-dsep-infra/datahub). 1. Set up your git/dev environment by [following the instructions - here](https://github.com/berkeley-dsep-infra/datahub/#setting-up-your-fork-and-clones). - -2. Create a new branch for this PR. + here](https://github.com/berkeley-dsep-infra/hub-user-image-template/blob/main/CONTRIBUTING.md). + : - This guide is also located in your image repo! -3. +2. Test the changes locally using `repo2docker`, then submit a PR to `staging`. - Test the changes locally using `repo2docker`, then submit a PR to `staging`. + : - To use `repo2docker`, be sure that you are inside the image + repo directory on your device, and then run `repo2docker .`. - : - To use `repo2docker`, you have to point it at the correct - image directory. For example, to build the data100 image, - you would run `repo2docker deployments/data100/image` from - the base datahub directory. - -4. Commit and push your changes to your fork of the datahub repo, and +3. Commit and push your changes to your fork of the image repo, and create a new pull request at - . - -5. Once the PR is merged to staging and the new image is built and - pushed to Artifact Registry, you can test it out on - `-staging.datahub.berkeley.edu`. - -6. Changes are only deployed to prod once the relevant CI job is - completed. See - to view CircleCI job statuses. + https://github.com/berkeley-dsep-infra/. + +4. 
After the build passes, merge your PR into `main` and the image will + be built again and pushed to the Artifact Registry. If that succeeds, + then a commit will be crafted that will update the `PLACEHOLDER` field in + `hubploy.yaml` with the image's SHA and pushed to the datahub repo. + You can check on the progress of this workflow in your root image repo's + `Actions` tab. + +5. After step 4 is completed successfully, go to the Datahub repo and click on + the [New pull request](https://github.com/berkeley-dsep-infra/datahub/compare) + button. Next, click on the `compare: staging` drop-down, and you should see + a branch named something like `update--image-tag-`. Select that, + and create a new pull request. + +6. Once the checks have passed, merge to `staging` and your new image will be + deployed! You can watch the progress [here](https://github.com/berkeley-dsep-infra/datahub/actions/workflows/deploy-hubs.yaml). diff --git a/docs/admins/howto/new-packages.qmd b/docs/admins/howto/new-packages.qmd index 782380d31..06e83a697 100644 --- a/docs/admins/howto/new-packages.qmd +++ b/docs/admins/howto/new-packages.qmd @@ -30,7 +30,8 @@ To avoid complicated errors, make sure you always specify a version. You can find the latest version by searching on [pypi.org](https://pypi.org). -Find current version of a python package =============================== +Find current version of a python package +=============================== To find the current version of a particular installed package, you can run the following in a notebook. @@ -74,48 +75,52 @@ its current version. Familiarize yourself with [pull requests](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests) and [repo2docker](https://github.com/jupyter/repo2docker) , and create a -fork of the [datahub staging -branch](https://github.com/berkeley-dsep-infra/datahub). +fork of the image repo. 1. 
Set up your git/dev environment by [following the instructions - here](https://github.com/berkeley-dsep-infra/datahub/#setting-up-your-fork-and-clones). + here](https://github.com/berkeley-dsep-infra/hub-user-image-template/blob/main/CONTRIBUTING.md). 2. Create a new branch for this PR. 3. Find the correct `environment.yml`{.interpreted-text role="file"} - file for your class. This should be under - `datahub/deployments//image` + file for your class. This should be in the root of the image repo. 4. In `environment.yml`{.interpreted-text role="file"}, packages listed under `dependencies` are installed using `conda`, while packages under `pip` are installed using `pip`. Any packages that need to be installed via `apt` must be added to either - `datahub/deployments//image/apt.txt` or - `datahub/deployments//image/Dockerfile`. + `apt.txt` or + `Dockerfile`. 5. Add any packages necessary. We typically prefer using `conda` packages, and `pip` only if necessary. Please pin to a specific version (no wildards, etc). - Note that package versions for `conda` are specified using `=`, while in `pip` they are specified using `==` -6. Test the changes locally using `repo2docker`, then submit a PR to `staging`. +6. Test the changes locally using `repo2docker`, then submit a PR to `main`. - - To use `repo2docker`, you have to point it at the right - Dockerfile for your class. For example, to test the data100 - datahub, you would run `repo2docker deployments/data100/image` from the - base datahub directory. + - To use `repo2docker`, be sure that you are inside the image + repo directory on your device, and then run `repo2docker .`. -7. Commit and push your changes to your fork of the datahub repo, and +7. Commit and push your changes to your fork of the image repo, and create a new pull request at - . - -8. Once the PR is merged to staging, you can test it out on - `class-staging.datahub.berkeley.edu`. - -9. 
Changes are only deployed to datahub once the relevant Travis CI job - is completed. See - to view Travis - CI job statuses. + https://github.com/berkeley-dsep-infra/``. + +8. After the build passes, merge your PR into `main` and the image will + be built again and pushed to the Artifact Registry. If that succeeds, + then a commit will be crafted that will update the `PLACEHOLDER` field in + `hubploy.yaml` with the image's SHA and pushed to the datahub repo. + You can check on the progress of this workflow in your root image repo's + `Actions` tab. + +9. After step 8 is completed successfully, go to the Datahub repo and click on + the [New pull request](https://github.com/berkeley-dsep-infra/datahub/compare) + button. Next, click on the `compare: staging` drop-down, and you should see + a branch named something like `update--image-tag-`. Select that, + and create a new pull request. + +10. Once the checks have passed, merge to `staging` and your new image will be + deployed! You can watch the progress [here](https://github.com/berkeley-dsep-infra/datahub/actions/workflows/deploy-hubs.yaml). ## Tips for Upgrading Package