Github action deploy to kubeflow (kubeflow#730)
* Updated the azurepipeline example. 

I believe there is a small bug in the script, use tmp variable to solve the issue.

* updated with github actions example

* Update README.md

Updated the readme further.

* Update README.md

* Update README.md

* Update data.py

* specifying version of ubuntu and updating text

* fixing spelling mistake

* update the linting

* updated yaml

* Update data.py

Co-authored-by: JohanWork <[email protected]>
NikeNano and JohanWork authored Feb 18, 2020
1 parent cc93a80 commit d4f7845
Showing 7 changed files with 218 additions and 0 deletions.
117 changes: 117 additions & 0 deletions pipelines/github_action/README.md
@@ -0,0 +1,117 @@
# Compile, deploy and run a Kubeflow Pipeline using GitHub Actions

This tutorial goes through how to use [GitHub Actions](https://github.com/features/actions) together with Kubeflow for MLOps. The goal of this setup is to shorten setup time, simplify deployments, and improve versioning and reproducibility.

The tutorial is based on [this](https://github.com/marketplace/actions/kubeflow-compile-deploy-and-run) GitHub Action.

## Initial setup
Before starting this tutorial, the following things have to be in place:
- A GCP account.
- [Kubeflow set up on GKE](https://www.kubeflow.org/docs/gke/deploy/deploy-cli/) using [IAP](https://www.kubeflow.org/docs/gke/deploy/oauth-setup/).
- A service account with access to your Kubeflow deployment; see the section "Setup and configuration" [here](https://github.com/kubeflow/examples/blob/cookbook/cookbook/pipelines/notebooks/kfp_remote_deploy-IAP.ipynb) for an example and the required permissions.
- The source code in a GitHub repository.

## Add secrets to the GitHub repository

For the GitHub Action to have access to the Kubeflow deployment, [secrets have to be added to the GitHub repository](https://help.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets).

The following secrets have to be added:
- KUBEFLOW_URL - The URL of your Kubeflow deployment
- ENCODED_GOOGLE_APPLICATION_CREDENTIALS - The key of a service account with access to Kubeflow and rights to deploy, see [here](http://amygdala.github.io/kubeflow/ml/2019/08/22/remote-deploy.html) for an example. The credentials need to be base64 encoded. Note that the example workflow below reads this value from `secrets.GKE_KEY`, so the name of the secret and the reference in the workflow have to match:

``` bash
cat path-to-key.json | base64
```
- CLIENT_ID - The IAP OAuth client ID.

See [the GitHub documentation](https://help.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets) for how to add secrets.
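The encoding step above can also be sketched in Python, which makes the round trip explicit. This is a minimal sketch under stated assumptions: `encode_key`, `decode_key` and the example key contents are hypothetical and stand in for the real `path-to-key.json`. It is not taken from the action's source.

```python
import base64
import json


def encode_key(raw_key: bytes) -> str:
    """Return the key as a single-line base64 string, suitable for a secret."""
    return base64.b64encode(raw_key).decode("ascii")


def decode_key(encoded_key: str) -> bytes:
    """Reverse encode_key. b64decode ignores embedded newlines by default,
    so output wrapped by a command-line base64 tool decodes as well."""
    return base64.b64decode(encoded_key)


# Hypothetical stand-in for the contents of path-to-key.json.
example_key = json.dumps(
    {"type": "service_account", "project_id": "my-project"}
).encode("utf-8")

encoded = encode_key(example_key)
restored = decode_key(encoded)
print(json.loads(restored)["type"])
```

GNU `base64` wraps its output at 76 columns by default, so the string pasted into the secret may contain line breaks. The decoder above tolerates them.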

## GitHub Action

To run the GitHub Action, a GitHub workflow file has to be added to the following folder, relative to the root of the repository:
```
.github/workflows/your_github_action_file.yml
```

This file should follow the [GitHub workflow syntax](https://help.github.com/en/actions/reference/workflow-syntax-for-github-actions).

The following is an example of a workflow file (it can also be found in the file "example_workflow.yml").

```yaml
name: Compile, Deploy and Run on Kubeflow
on: [push]

# Set environmental variables

jobs:
  build:
    runs-on: ubuntu-18.04
    steps:
    - name: checkout files in repo
      uses: actions/checkout@master

    - name: Submit Kubeflow pipeline
      id: kubeflow
      uses: NikeNano/kubeflow-github-action@master
      with:
        KUBEFLOW_URL: ${{ secrets.KUBEFLOW_URL }}
        ENCODED_GOOGLE_APPLICATION_CREDENTIALS: ${{ secrets.GKE_KEY }}
        GOOGLE_APPLICATION_CREDENTIALS: /tmp/gcloud-sa.json
        CLIENT_ID: ${{ secrets.CLIENT_ID }}
        PIPELINE_CODE_PATH: "example_pipeline.py"
        PIPELINE_FUNCTION_NAME: "flipcoin_pipeline"
        PIPELINE_PARAMETERS_PATH: "parameters.yaml"
        EXPERIMENT_NAME: "Default"
        RUN_PIPELINE: True
        VERSION_GITHUB_SHA: False

```

GitHub workflows can be given names; in the example the name is set to "Compile, Deploy and Run on Kubeflow". This is the name shown for the workflow when it runs on GitHub.

The `on` argument defines which events on GitHub trigger the workflow. For more info see [here](https://help.github.com/en/actions/reference/workflow-syntax-for-github-actions#on).

`runs-on` defines which type of machine the workflow is executed on. In this case the choice matters little, since the action used (NikeNano/kubeflow-github-action@master) is containerized.

A GitHub workflow is split into steps, where each step can run a command (python, bash, or whatever is installed on the machine) or an action. In this example the first step checks out the code, which is needed in order to access the source code from the repository. The first step uses the action `actions/checkout@master`, where `master` refers to the master branch of [the repository](https://github.com/actions/checkout) where this action is open sourced.

The following step, named "Submit Kubeflow pipeline", is the most interesting part of this tutorial. Within this step the connection to Kubeflow is set up, and the pipeline is compiled, deployed and, depending on the user-specified values (see `with`), run. If you would like to check the source code of the action used in this step, you can find it [here](https://github.com/NikeNano/kubeflow-github-action) (more info on how to build actions is available [here](https://help.github.com/en/actions/building-actions)).
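Values given under `with` reach a Docker-based action as strings: GitHub exposes each input to the container as an environment variable named `INPUT_<NAME>`. The sketch below shows one way a flag such as `RUN_PIPELINE` could be read under that convention. The helper `read_flag` and the set of accepted spellings are assumptions made for illustration, not the action's actual code.

```python
import os


def read_flag(name: str, environ=os.environ, default: bool = False) -> bool:
    """Interpret the action input `name` as a boolean.

    Inputs arrive as strings, so "True", "true" and "TRUE" are all
    treated as true here (an assumption of this sketch).
    """
    raw = environ.get("INPUT_" + name.upper())
    if raw is None:
        return default
    return raw.strip().lower() in {"true", "1", "yes"}


# Simulated environment, standing in for what the container would see.
fake_environ = {"INPUT_RUN_PIPELINE": "True", "INPUT_VERSION_GITHUB_SHA": "False"}

run_pipeline = read_flag("RUN_PIPELINE", fake_environ)
version_sha = read_flag("VERSION_GITHUB_SHA", fake_environ)
print(run_pipeline, version_sha)
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.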

For the action you need to specify the following values in the `with` part:
- KUBEFLOW_URL: The URL of your Kubeflow deployment
- ENCODED_GOOGLE_APPLICATION_CREDENTIALS: The key of a service account with access to Kubeflow and rights to deploy (read from the secret `GKE_KEY` in the example), see [here](http://amygdala.github.io/kubeflow/ml/2019/08/22/remote-deploy.html) for an example. The credentials need to be base64 encoded:

``` bash
cat path-to-key.json | base64
```
- GOOGLE_APPLICATION_CREDENTIALS: The path where the decoded credentials should be stored; they are decoded from ENCODED_GOOGLE_APPLICATION_CREDENTIALS
- CLIENT_ID: The IAP OAuth client ID
- PIPELINE_CODE_PATH: The full path to the Python file containing the pipeline
- PIPELINE_FUNCTION_NAME: The name of the pipeline function in the PIPELINE_CODE_PATH file
- PIPELINE_PARAMETERS_PATH: The path to the YAML file with the pipeline parameters, see the file parameters.yaml for an example
- EXPERIMENT_NAME: The name of the Kubeflow experiment within which the pipeline should run
- RUN_PIPELINE: Set to "True" if you would also like to run the pipeline
- VERSION_GITHUB_SHA: Whether the pipeline containers are versioned with the GitHub commit hash. Set to False for now; an example will be added later.
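To make the roles of PIPELINE_CODE_PATH and PIPELINE_FUNCTION_NAME concrete, the sketch below shows one way a pipeline function could be located from a file path and a function name using the standard library. It is an illustration under stated assumptions: `load_pipeline_function` and the throwaway demo file are hypothetical, and the action itself may resolve the function differently.

```python
import importlib.util
import os
import tempfile


def load_pipeline_function(code_path: str, function_name: str):
    """Import the Python file at code_path and return the named callable."""
    spec = importlib.util.spec_from_file_location("user_pipeline", code_path)
    if spec is None or spec.loader is None:
        raise ImportError("Cannot load a module from %s" % code_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    function = getattr(module, function_name, None)
    if not callable(function):
        raise AttributeError(
            "%s is not a callable in %s" % (function_name, code_path)
        )
    return function


# Demo with a throwaway file, so the sketch runs without kfp installed.
with tempfile.TemporaryDirectory() as workdir:
    demo_path = os.path.join(workdir, "demo_pipeline.py")
    with open(demo_path, "w") as handle:
        handle.write("def demo_pipeline():\n    return 'pipeline built'\n")
    pipeline_function = load_pipeline_function(demo_path, "demo_pipeline")
    demo_result = pipeline_function()

print(demo_result)
```

With the files from this example, the call would be `load_pipeline_function("example_pipeline.py", "flipcoin_pipeline")`, which requires `kfp` to be installed because the file imports it.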


## Usage

If you use the GitHub workflow defined above, the workflow is triggered on every push. You can see the workflow running under the "Actions" tab.

![Alt text](actions_ower_view.png?raw=true "Title")
_Figure 1_

Figure 1 shows the GitHub Actions view, where all historical and current runs are listed. Selecting an individual run takes you to the view presented in Figure 2.


![Alt text](check_action.png?raw=true "Title")
_Figure 2_

Figure 2 shows the steps of the workflow and their execution; a green checkmark indicates that a step was successful. Each step in the workflow can be explored further, see Figure 3, in which the "Submit Kubeflow pipeline" step was selected.


![Alt text](deep_dive.png?raw=true "Title")
_Figure 3_

Figure 3 presents the output from the step. Here you can see some of the logging for the executed action.
Binary file added pipelines/github_action/actions_ower_view.png
Binary file added pipelines/github_action/check_action.png
Binary file added pipelines/github_action/deep_dive.png
69 changes: 69 additions & 0 deletions pipelines/github_action/example_pipeline.py
@@ -0,0 +1,69 @@
#!/usr/bin/env python3
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from kfp import dsl


def random_num_op(low, high):
    """Generate a random number between low and high."""
    return dsl.ContainerOp(
        name='Generate random number',
        image='python:alpine3.6',
        command=['sh', '-c'],
        arguments=['python -c "import random; print(random.randint($0, $1))" | tee $2',
                   str(low), str(high), '/tmp/output'],
        file_outputs={'output': '/tmp/output'}
    )


def flip_coin_op():
    """Flip a coin and output heads or tails randomly."""
    return dsl.ContainerOp(
        name='Flip coin',
        image='python:alpine3.6',
        command=['sh', '-c'],
        arguments=['python -c "import random; result = \'heads\' if random.randint(0,1) == 0 '
                   'else \'tails\'; print(result)" | tee /tmp/output'],
        file_outputs={'output': '/tmp/output'}
    )


def print_op(msg):
    """Print a message."""
    return dsl.ContainerOp(
        name='Print',
        image='alpine:3.6',
        command=['echo', msg])


@dsl.pipeline(
    name='Conditional execution pipeline',
    description='Shows how to use dsl.Condition().'
)
def flipcoin_pipeline():
    flip = flip_coin_op()
    with dsl.Condition(flip.output == 'heads'):
        random_num_head = random_num_op(0, 9)
        with dsl.Condition(random_num_head.output > 5):
            print_op('heads and %s > 5!' % random_num_head.output)
        with dsl.Condition(random_num_head.output <= 5):
            print_op('heads and %s <= 5!' % random_num_head.output)

    with dsl.Condition(flip.output == 'tails'):
        random_num_tail = random_num_op(10, 19)
        with dsl.Condition(random_num_tail.output > 15):
            print_op('tails and %s > 15!' % random_num_tail.output)
        with dsl.Condition(random_num_tail.output <= 15):
            print_op('tails and %s <= 15!' % random_num_tail.output)
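The branching in `flipcoin_pipeline` above can be mirrored in plain Python to see which `print_op` message each outcome leads to. This is only a local sketch of the control flow: `flipcoin_branch` and `simulate` are hypothetical helpers, nothing here runs on Kubeflow, and in the real pipeline each `dsl.Condition` is evaluated at run time against the outputs of containerized steps.

```python
import random


def flipcoin_branch(flip: str, number: int) -> str:
    """Return the message the matching print_op branch would receive."""
    if flip == 'heads':
        # Mirrors random_num_op(0, 9) with a threshold of 5.
        if number > 5:
            return 'heads and %s > 5!' % number
        return 'heads and %s <= 5!' % number
    if flip == 'tails':
        # Mirrors random_num_op(10, 19) with a threshold of 15.
        if number > 15:
            return 'tails and %s > 15!' % number
        return 'tails and %s <= 15!' % number
    raise ValueError('flip must be "heads" or "tails", got %r' % flip)


def simulate(rng: random.Random) -> str:
    """Draw a coin flip and a number the way the pipeline steps do."""
    flip = 'heads' if rng.randint(0, 1) == 0 else 'tails'
    number = rng.randint(0, 9) if flip == 'heads' else rng.randint(10, 19)
    return flipcoin_branch(flip, number)


print(simulate(random.Random()))
```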
24 changes: 24 additions & 0 deletions pipelines/github_action/example_workflow.yml
@@ -0,0 +1,24 @@
name: Compile, Deploy and Run on Kubeflow
on: [push]

jobs:
  build:
    runs-on: ubuntu-18.04
    steps:
    - name: Checkout files in repo
      uses: actions/checkout@master

    - name: Submit Kubeflow pipeline
      id: kubeflow
      uses: NikeNano/kubeflow-github-action@master
      with:
        KUBEFLOW_URL: ${{ secrets.KUBEFLOW_URL }}
        ENCODED_GOOGLE_APPLICATION_CREDENTIALS: ${{ secrets.GKE_KEY }}
        GOOGLE_APPLICATION_CREDENTIALS: /tmp/gcloud-sa.json
        CLIENT_ID: ${{ secrets.CLIENT_ID }}
        PIPELINE_CODE_PATH: "example_pipeline.py"
        PIPELINE_FUNCTION_NAME: "flipcoin_pipeline"
        PIPELINE_PARAMETERS_PATH: "parameters.yaml"
        EXPERIMENT_NAME: "Default"
        RUN_PIPELINE: True
        VERSION_GITHUB_SHA: False
8 changes: 8 additions & 0 deletions pipelines/github_action/parameters.yaml
@@ -0,0 +1,8 @@
gcp_bucket:
  github_action
project:
  kubeflow-github
train_data:
  train_data.csv
forecast_data:
  forecat_data.csv
