Merge pull request #21 from shaneknapp/update-readme
[DH-301] update the repo docs
shaneknapp authored Oct 17, 2024
2 parents ffe820a + 43e53fb commit 46ddfce
Showing 7 changed files with 141 additions and 55 deletions.
81 changes: 69 additions & 12 deletions CONTRIBUTING.md
@@ -1,16 +1,16 @@
# How to contribute and make changes to your user image

## Setting up your fork and clones
First, go to your [github profile settings](https://github.com/settings/keys)
First, go to your [GitHub profile settings](https://github.com/settings/keys)
and make sure you have an SSH key uploaded.

Next, go to the github repo of the image that you'd like to work on and create
Next, go to the GitHub repo of the image that you'd like to work on and create
a fork. To do this, click on the `fork` button and then `Create fork`.

![Forking](images/create-fork.png)


After you create your fork of the new image repository, you should disable Github Actions **only for your fork**. To do this, navigate to `Settings` --> `Actions` --> `General` and select `Disable actions`. Then click `Save`:
After you create your fork of the new image repository, you should disable GitHub Actions **only for your fork**. To do this, navigate to `Settings` --> `Actions` --> `General` and select `Disable actions`. Then click `Save`:

![Disable fork actions](images/disable-fork-actions.png)

@@ -32,13 +32,15 @@ Now `cd` in to `<image-name>` and set up your local repo to point both at the primary
image repo (`upstream`) and your fork (`origin`). After the initial clone,
`origin` will be pointing to the main repo and we'll need to change that.

```
cd <image-name>
git remote rename origin upstream # rename origin to upstream
git remote add origin git@github.com:<your github username>/<image-name>.git # add your fork as origin
```

To confirm these changes, run `git remote -v` and see if everything is correct:
```
$ cd <image-name>
$ git remote -v # confirm that origin points to the primary repo
origin git@github.com:berkeley-dsep-infra/<image-name>.git (fetch)
origin git@github.com:berkeley-dsep-infra/<image-name>.git (push)
$ git remote rename origin upstream # rename origin to upstream
$ git remote add origin git@github.com:<your github username>/<image-name>.git # add your fork as origin
$ git remote -v # confirm the settings
origin git@github.com:<your github username>/<image-name>.git (fetch)
origin git@github.com:<your github username>/<image-name>.git (push)
@@ -81,7 +83,7 @@ what's been modified and check out the diffs: `git status` and `git diff`.

### Building the image locally

You should use [repo2docker](https://repo2docker.readthedocs.io/en/latest/) to build and use/test the image on your own device before you push and create a PR. It's better (and typically faster) to do this first before using CI/CD. There's no need to waste Github Action minutes to test build images when you can do this on your own device!
You should use [repo2docker](https://repo2docker.readthedocs.io/en/latest/) to build and use/test the image on your own device before you push and create a PR. It's better (and typically faster) to do this first before using CI/CD. There's no need to spend GitHub Actions minutes on test builds when you can do them on your own device!

Run `repo2docker` from inside the cloned image repo. To run it in a Linux or WSL2 shell:
```
@@ -100,11 +102,13 @@ jupyter-repo2docker --user-id=1000 --user-name=jovyan \

If you just want to see if the image builds, but not automatically launch the server, add `--no-run` to the arguments (before the final `.`).
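
As a minimal sketch of a build-only check, the helper below wraps `jupyter-repo2docker --no-run`. The function name `build_only` and the `DRY_RUN` switch are hypothetical conveniences, not part of repo2docker, and the sketch deliberately leaves out the other flags used in the full command above. It assumes `jupyter-repo2docker` is installed and Docker is running.

```shell
# build_only: hypothetical helper that builds the image without launching
# the server. With DRY_RUN=1 it only prints the command it would run.
build_only() {
  repo_dir="${1:-.}"   # directory containing the repo2docker config files
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "jupyter-repo2docker --no-run $repo_dir"
  else
    # --no-run: build the image, but do not start a container from it
    jupyter-repo2docker --no-run "$repo_dir"
  fi
}
```

Running `DRY_RUN=1 build_only .` from inside the cloned image repo prints the command without building anything; drop `DRY_RUN=1` to perform the actual build.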

### Pushing the modified files to your fork

When you're ready to push these changes, first you'll need to stage them for a
commit:

```
git add <file1> <file2> <etc>
git add <file1> <file2> ...
```

Commit these changes locally:
@@ -119,6 +123,8 @@ Now push to your fork:
git push origin <branch name>
```

### Creating a pull request

Once you've pushed to your fork, you can go to the image repo and there should
be a big green button on the top that says `Compare and pull request`.
Click on that, check out the commits and file diffs, edit the title and
@@ -128,16 +134,18 @@ description if needed and then click `Create pull request`.

![Create PR](images/create-pr.png)

If you're having issues, you can refer to the [github documentation for pull
If you're having issues, you can refer to the [GitHub documentation for pull
requests](https://help.github.com/articles/about-pull-requests/).
In the GitHub PR user interface, keep the choice for `base` as it is, and
make sure the choice for `head` is your fork.

Once this is complete and if there are no problems, a github action will
Once this is complete and if there are no problems, a GitHub action will
automatically [build and test](https://github.com/berkeley-dsep-infra/hub-user-image-template/blob/main/.github/workflows/build-test-image.yaml)
the image. If this fails, please check the output of the workflow in the
action, and make any changes required to get the build to pass.

### Code reviews and merging the pull request

Once the image build has completed successfully, you can request that
someone review the PR before merging, or you can merge yourself if you are
confident. This merge will trigger a [second GitHub workflow](https://github.com/berkeley-dsep-infra/hub-user-image-template/blob/main/.github/workflows/build-push-image-commit.yaml)
@@ -146,5 +154,54 @@ Google Artifact Registry and finally creates and pushes a commit to the
[Datahub](https://github.com/berkeley-dsep-infra/datahub) repo updating the
image hash of the deployment to point at the newly built image.

### Creating a pull request in the Datahub repository

You will now need to create a pull request in the Datahub repo to merge these changes
and deploy them to `staging` for testing.

After the image is built and pushed to the Artifact Registry, and the commit
that modifies that deployment's `hubploy.yaml` is pushed to the Datahub repo,
there are two ways you can create the pull request.

#### Via the workflow output

The workflow that builds and pushes has two jobs, and the second job is called
`update-deployment-image-tag`. When completed, it will display the output from
the `git push` command. This contains a clickable link that takes you directly
to the page to create a pull request. You can navigate to it by clicking on
`Actions` in the image repo (not your fork!), and clicking on the latest
completed job.

![Navigating to the workflow](images/navigate-to-workflow.png)

After you've clicked on the appropriate workflow run, select the
`update-deployment-image-tag` job, and expand the
`Create feature branch, add, commit and push changes` step. You will see a
link there that will direct you to the Datahub repo to create a pull request.

![Create PR from workflow output](images/create-pr-from-workflow.png)

#### Via the Datahub repo

You can also visit the Datahub repo and manually create the pull request there.
Go to the
[Pull Request tab](https://github.com/berkeley-dsep-infra/datahub/pulls) and
click on `New pull request`. You should see a new branch in the list that
will be from your most recent build. The branch will be named based on
the image that was updated, and will look something like this:

```
update-<hubname>-image-tag-<new hash of the image>
```

![Create PR from Datahub repo](images/create-pr-from-datahub-repo.png)

Click on that link, and then on `Create pull request`.

#### Notify Datahub staff
Let the Datahub staff know that you've created this pull request and they will review and merge it
into the `staging` branch. You can notify them via a corresponding GitHub Issue, or on the UCTech
#datahubs Slack channel.

Once it's been merged to `staging`, it will automatically deploy the new image to the hub's
staging environment in a few minutes and you'll be able to test it out!
115 changes: 72 additions & 43 deletions README.md
@@ -1,82 +1,111 @@
# hub-user-image-template :paperclip:
# hub-user-image-template

This is a template repository for creating dedicated user images for UC Berkeley hubs.
This is a template repository for creating dedicated single-user server images
for UC Berkeley JupyterHubs.

## Overall workflow :gear:
## Overall workflow

The overall workflow is to:
The basic workflow for creating a new hub user image is as follows:

1. Create a new repository using this one as a template. Be sure to set the owner as `berkeley-dsep-infra`.
1. Create a new repository using this one as a template. Be sure to set the
owner as `berkeley-dsep-infra`.

2. Fork that repository to create your image repository (optional, but recommended).
2. In the new repo, set the appropriate values in the Actions repository
variables for `HUB` and `IMAGE`.

3. Set the appropriate values in the Actions environment variables for `HUB` and `IMAGE`.
3. Give the new repo access to the `berkeley-dsep-infra` organization-level
secrets: `GAR_SECRET_KEY` and `DATAHUB_CREATE_PR`.

4. Customize the image by editing repo2docker files in your image repository.
4. Fork that repository to create your image repository.

Changes can either be done by direct commits to main on your image repository, or through a pull request from a fork of your image repository. Direct commits will build the image and push it to Google Artifact Registry (GAR) on merge. PRs will also build the image and offer a link to test it using Binder (currently disabled). Merging the PR will also create and push a commit to the [datahub repo](https://github.com/berkeley-dsep-infra/datahub/), which requires a human to open a PR to merge said commit and deploy that image to the proper hub(s).
5. Configure your Hub to use this new image by modifying that deployment's
`hubploy.yaml` and adding the parent repo's git information to
[`repos.txt`](https://github.com/berkeley-dsep-infra/datahub/blob/staging/scripts/user-image-management/repos.txt).

5. Configure your Hub to use this new image
6. Customize the image by editing `repo2docker` configuration files in your
fork of the image repository, and then open a pull request to merge these
changes to the `main` branch of the parent repo in the
`berkeley-dsep-infra` organization.
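
As an illustration of the values that step 2 asks for, the sketch below derives an `IMAGE` value from a `HUB` short name. It assumes the registry path pattern `ucb-datahub-2018/user-images/<hubname>-user-image` given as an example elsewhere in this document; the helper name `image_path` and the hub `data100` are only illustrative, and your deployment's path may differ.

```shell
# image_path: hypothetical helper that builds the Artifact Registry path
# for a hub, following the example pattern
# ucb-datahub-2018/user-images/<hubname>-user-image (an assumption).
image_path() {
  hub="$1"
  printf '%s\n' "ucb-datahub-2018/user-images/${hub}-user-image"
}

HUB="data100"                   # short name of the hub (example value)
IMAGE="$(image_path "$HUB")"    # path to the image in the Artifact Registry

echo "HUB=$HUB"
echo "IMAGE=$IMAGE"

# If the GitHub CLI is installed and authenticated (an assumption), the
# values could then be stored as Actions repository variables, e.g.:
#   gh variable set HUB --body "$HUB" --repo berkeley-dsep-infra/<image-name>
#   gh variable set IMAGE --body "$IMAGE" --repo berkeley-dsep-infra/<image-name>
```

The same values can also be entered by hand in the repository's Actions variables settings; the script only shows how the two variables relate to each other.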

More detailed instructions are [located below](https://github.com/berkeley-dsep-infra/hub-user-image-template/#in-depth-guide).
These steps are just a summary, and much more detailed instructions are
[located here](https://docs.datahub.berkeley.edu/admins/howto/new-image.html).

### Modifying the new image

Detailed instructions showing the workflow to modify an image and push it
through the CI/CD workflow are located in the [contribution guide](CONTRIBUTING.md).

### In-depth guide
In addition, we also provide a template for a simplified `README.md`
[here](https://github.com/berkeley-dsep-infra/hub-user-image-template/blob/main/README-template.md).

Check out the 2i2c docs for an in-depth guide on how to use this template repository to create a custom user image and use it for your hub :arrow_right: https://infrastructure.2i2c.org/howto/update-env/#split-up-an-image-for-use-with-the-repo2docker-action.
### Modifying the new image

Here's a rough guide on how to create your own fresh user image :arrow_right: https://docs.datahub.berkeley.edu/en/latest/admins/howto/new-image.html.
The process to modify and push an image to the Google Artifact Registry via the
CI/CD pipeline is described in the [contribution guide](CONTRIBUTING.md).

After creating a new image repo from here as a template, and bringing in the commit history (if any) of the image, you will need to set two [Github Actions Repository Variables](https://docs.github.com/en/actions/learn-github-actions/variables) for the image: `HUB` and `IMAGE`.
### Moving an existing image into the `berkeley-dsep-infra` organization

`HUB` is the short name of the hub (eg: `data100`, `datahub`, etc).
`IMAGE` is the path to the image in the Artifact Registry (eg: `ucb-datahub-2018/user-images/<hubname>-user-image`)
If you have an existing image repository, and would like to bring it in to the
`berkeley-dsep-infra` organization and retain the `git` history, please refer
to our documentation :arrow_right:
https://docs.datahub.berkeley.edu/admins/howto/transition-image.html

Next, you will need to give the newly created repo access to two organizational-level [secrets in the berkeley-dsep-infra repo](https://github.com/organizations/berkeley-dsep-infra/settings/secrets/actions): `GAR_SECRET_KEY` (to allow pushes to the Artifact Registry) and `DATAHUB_USER_IMAGE_BRANCH_PUSH` (to allow commits to be pushed to the [datahub](https://github.com/berkeley-dsep-infra/datahub) repo).
Our documentation is based on a helpful guide put together by 2i2c :arrow_right:
https://infrastructure.2i2c.org/howto/update-env/#split-up-an-image-for-use-with-the-repo2docker-action

## About this template repository :information_source:
## About this template repository

This template repository enables [jupyterhub/repo2docker-action](https://github.com/jupyterhub/repo2docker-action).
This GitHub action builds a Docker image using the contents of this repo and pushes it to the [Google Artifact Registry](https://cloud.google.com/artifact-registry) registry.
This template repository uses the
[jupyterhub/repo2docker-action](https://github.com/jupyterhub/repo2docker-action)
to build a Docker image using the contents of this repo, and pushes it to our
[Google Artifact Registry](https://cloud.google.com/artifact-registry) when
a pull request is merged to `main`.

### The environment

It provides an example of an `environment.yml` conda configuration file for repo2docker to use.
This file can be used to list all the conda packages that need to be installed by `repo2docker` in your environment.
The `repo2docker-action` will update the [base repo2docker](https://github.com/jupyterhub/repo2docker/blob/HEAD/repo2docker/buildpacks/conda/environment.yml) conda environment with the packages listed in this `environment.yml` file.
The repo provides a default `environment.yml` conda configuration file that
`repo2docker` uses to define and build a single-user server image. This file
defines the Python packages that will be installed during the image
build process, either via `conda` or `pip`.

**Note:**
A complete list of possible configuration files that can be added to the repository and be used by repo2docker to build the Docker image, can be found in the [repo2docker docs](https://repo2docker.readthedocs.io/en/latest/config_files.html#configuration-files).
A complete list of configuration files that can be added to the
repository and used by `repo2docker` to build the Docker image can be found in
the [repo2docker documentation](https://repo2docker.readthedocs.io/en/latest/config_files.html#configuration-files).

### Making changes to a single user server image

Once you've created the new image repo from this template, please refer to [the contribution instructions](CONTRIBUTING.md) located in the repo for detailed instructions.
Once you've created the new image repo from this template, please refer to
[the contribution instructions](CONTRIBUTING.md) located in the repo for
detailed instructions.

### The GitHub Action workflows

This template repository provides some GitHub Action workflows that can build and push the image to Google Artifact Registry when configured, and test the image on Binder.

![Workflows](images/workflows.png)
This template repository provides GitHub Action workflows that can build
and push the image to Google Artifact Registry when configured, and push a
commit to the [datahub](https://github.com/berkeley-dsep-infra/datahub)
repository that updates `hubploy.yaml` with the new SHA tag for any hubs
using this image.

#### 1. Build and test container image :arrow_right: [test.yaml](https://github.com/berkeley-dsep-infra/hub-user-image-template/blob/main/.github/workflows/test.yaml)

This workflow is triggered when a Pull Request is opened against the default branch (`main`).
During PR builds, the image is **only** built and **not** pushed, unless explicitly configured to do so.
This workflow is triggered when a pull request is opened against the default
branch (`main`). During PR builds, the image is **only** built and **not**
pushed to the Google Artifact Registry.

#### 2. Test this PR on Binder Badge :arrow_right: [binder.yaml](https://github.com/berkeley-dsep-infra/hub-user-image-template/blob/main/.github/workflows/binder.yaml.disable)
Please note that the image will not be built for documentation changes
(markdown files or any graphic images in the `images/` subdirectory).

*Temporarily disabled*
#### 2. YAML linting :arrow_right: [yaml-lint.yaml](https://github.com/berkeley-dsep-infra/hub-user-image-template/blob/main/.github/workflows/yaml-lint.yaml)

Since our images are typically large and take > 10m to build, this means that Binderhub builds will currently time out.
This workflow is triggered when a pull request is opened against the default
branch (`main`). It uses [yamllint](https://yamllint.readthedocs.io/en/stable/)
to check all YAML files in the repo for correctness.

This workflow posts a comment inside a pull request, every time a pull request gets opened. The comment contains a "Test this PR on Binder" badge, which can be used to access the image defined by the PR in [mybinder.org](https://mybinder.org/).
#### 3. **Temporarily disabled:** Test this PR on Binder Badge :arrow_right: [binder.yaml](https://github.com/berkeley-dsep-infra/hub-user-image-template/blob/main/.github/workflows/binder.yaml.disable)

![Test this PR on Binder](images/binder-badge.png)
Our images are typically large and take more than 10 minutes to build, so
BinderHub builds currently time out.

#### 3. Build, test and push container image :arrow_right: [build-push-open-pr.yaml](https://github.com/berkeley-dsep-infra/hub-user-image-template/blob/main/.github/workflows/build-push-image-commit.yaml)
#### 4. Build, test and push container image :arrow_right: [build-push-open-pr.yaml](https://github.com/berkeley-dsep-infra/hub-user-image-template/blob/main/.github/workflows/build-push-image-commit.yaml)

After a PR is merged to `main`, this workflow builds the image again, pushes to the Artifact Registry and will create a push to the [Datahub repo](https://github.com/berkeley-dsep-infra/datahub) to update the image tag for any hubs that use this image. The PR there will need to be created manually.
After a PR is merged to `main`, this workflow builds the image again, pushes it
to the Google Artifact Registry and then creates a commit that updates the image tag
for any hubs that use this image. That commit is then pushed to the
[Datahub repo](https://github.com/berkeley-dsep-infra/datahub), and you will
then need to manually create a pull request to merge and deploy the new image.
Binary file modified images/create-fork.png
Binary file added images/create-pr-from-datahub-repo.png
Binary file added images/create-pr-from-workflow.png
Binary file modified images/disable-fork-actions.png
Binary file added images/navigate-to-workflow.png