diff --git a/rsts/community/contribute.rst b/rsts/community/contribute.rst index 942fee29b1..5640cb1475 100644 --- a/rsts/community/contribute.rst +++ b/rsts/community/contribute.rst @@ -284,68 +284,434 @@ To understand how the below components interact with each other, refer to :ref:` * - **Guidelines**: Refer to the `FlyteCTL Contribution Guide `__ -🔮 Recommended iteration cycle -============================== - -As you may have already read in other parts of the documentation, the `Flyte repository `__ includes Go code -that integrates all backend components (admin, propeller, data catalog, console) into a single executable. -The Flyte team is currently working on consolidating the core backend repositories into one repository, which is expected to be completed by 2023. -In the meantime, you can contribute to the individual repositories and then merge your changes into the `Flyte repository `__. -This setup is suitable for Go-based backend development, but it has not been tested for Flyteconsole development, which has a different development cycle. -Nonetheless, this setup allows you to run the Flyte binary from your IDE, enabling you to debug your code effectively by setting breakpoints. -Additionally, this setup connects you to all other resources in the demo environment, such as PostgreSQL and RDS. - -Dev mode cluster -**************** +🔮 Development Environment Setup Guide +====================================== + +This guide provides a step-by-step approach to setting up a local development environment for +`flyteidl `_, `flyteadmin `_, +`flyteplugins `_, `flytepropeller `_, +`flytekit `_, `flyteconsole `_, +`datacatalog `_, and `flytestdlib `_. + +Requirements +************ + +This guide has been tested and used on AWS EC2 with an Ubuntu 22.04 +image. 
The following tools are required: + +- `Docker `__ +- `Kubectl `__ +- `Go `__ + +Content +******* + +- `How to setup dev environment for flyteidl, flyteadmin, flyteplugins, + flytepropeller, datacatalog and flytestdlib? <#how-to-setup-dev-environment-for-flyteidl-flyteadmin-flyteplugins-flytepropeller-datacatalog-and-flytestdlib>`__ + +- `How to setup dev environment for + flytekit? <#how-to-setup-dev-environment-for-flytekit>`__ + +- `How to setup dev environment for + flyteconsole? <#how-to-setup-dev-environment-for-flyteconsole>`__ + +- `How to access Flyte UI, minio, postgres, k3s, and endpoints? + <#how-to-access-flyte-ui-minio-postgres-k3s-and-endpoints>`__ + +How to setup dev environment for flyteidl, flyteadmin, flyteplugins, flytepropeller, datacatalog and flytestdlib? +****************************************************************************************************************************** + +**1. Install flytectl** + + +`Flytectl `__ is a portable and lightweight command-line interface to work with Flyte. + +.. code:: shell + + # Step1: Install the latest version of flytectl + curl -sL https://ctl.flyte.org/install | bash + # flyteorg/flytectl info checking GitHub for latest tag + # flyteorg/flytectl info found version: 0.6.39 for v0.6.39/Linux/x86_64 + # flyteorg/flytectl info installed ./bin/flytectl + + # Step2: Export flytectl path based on the previous log "flyteorg/flytectl info installed ./bin/flytectl" + export PATH=$PATH:/home/ubuntu/bin # replace with your path + +**2. Build a k3s cluster that runs minio and postgres Pods.** + + +| `Minio `__ is an S3-compatible object store that will be used later to store task output, input, etc. +| `Postgres `__ is an open-source object-relational database that will later be used by flyteadmin/dataCatalog to + store all Flyte information. + +.. code:: shell + + # Step1: Start k3s cluster, create Pods for postgres and minio. Note: We cannot access Flyte UI yet! but we can access the minio console now. 
+ flytectl demo start --dev + # 👨‍💻 Flyte is ready! Flyte UI is available at http://localhost:30080/console 🚀 🚀 🎉 + # ❇️ Run the following command to export demo environment variables for accessing flytectl + # export FLYTECTL_CONFIG=/home/ubuntu/.flyte/config-sandbox.yaml + # 🐋 Flyte sandbox ships with a Docker registry. Tag and push custom workflow images to localhost:30000 + # 📂 The Minio API is hosted on localhost:30002. Use http://localhost:30080/minio/login for Minio console + + # Step2: Export FLYTECTL_CONFIG as the previous log indicated. + export FLYTECTL_CONFIG=/home/ubuntu/.flyte/config-sandbox.yaml + + # Step3: The kubeconfig will be automatically copied to the user's main kubeconfig (default is `~/.kube/config`) with "flyte-sandbox" as the context name. + # Check that we can access the K3s cluster. Verify that postgres and minio are running. + kubectl get pod -n flyte + # NAME READY STATUS RESTARTS AGE + # flyte-sandbox-docker-registry-85745c899d-dns8q 1/1 Running 0 5m + # flyte-sandbox-kubernetes-dashboard-6757db879c-wl4wd 1/1 Running 0 5m + # flyte-sandbox-proxy-d95874857-2wc5n 1/1 Running 0 5m + # flyte-sandbox-minio-645c8ddf7c-sp6cc 1/1 Running 0 5m + # flyte-sandbox-postgresql-0 1/1 Running 0 5m + + +**3. Run all Flyte components (flyteadmin, flytepropeller, datacatalog, flyteconsole, etc.) in a single binary.** + +The `Flyte repository `__ includes Go code +that integrates all Flyte components into a single binary. + +.. code:: shell + + # Step1: Clone flyte repo + git clone https://github.com/flyteorg/flyte.git + cd flyte + + # Step2: Build a single binary that bundles all the Flyte components. + # The versions of the components/libraries used to build the single binary are defined in `go.mod`. + sudo apt-get -y install jq # You may need to install jq + go mod tidy + make compile + + # Step3: Edit the config file: ./flyte-single-binary-local.yaml. + # Replace occurrences of $HOME with the actual path of your home directory. 
+ sed -i "s|\$HOME|${HOME}|g" ./flyte-single-binary-local.yaml + + # Step4: Running the single binary. + # The POD_NAMESPACE environment variable is necessary for the webhook to function correctly. + # You may encounter an error due to `ERROR: duplicate key value violates unique constraint`. Running the command again will solve the problem. + POD_NAMESPACE=flyte ./flyte start --config flyte-single-binary-local.yaml + # All logs from flyteadmin, flyteplugins, flytepropeller, etc. will appear in the terminal. + + +**4. Build single binary with your own code.** + + +The following instructions provide guidance on how to build single binary with your customized code, using ``flyteadmin`` as an example. + + +- **Note** Although we'll use ``flyteadmin`` as an example, these steps can be applied to other Flyte components or libraries as well. + ``{flyteadmin}`` below can be substituted with other Flyte components/libraries: ``flyteidl``, ``flyteplugins``, ``flytepropeller``, ``datacatalog``, or ``flytestdlib``. + +- **Note** If modifications are needed in multiple components/libraries, the steps will need to be repeated for each component/library. + +.. code:: shell + + # Step1: Fork and clone the {flyteadmin} repository, modify the source code accordingly. + git clone https://github.com/flyteorg/flyteadmin.git + cd flyteadmin + + # Step2.1: {Flyteadmin} uses Go 1.19, so make sure to switch to Go 1.19. + export PATH=$PATH:$(go env GOPATH)/bin + go install golang.org/dl/go1.19@latest + go1.19 download + export GOROOT=$(go1.19 env GOROOT) + export PATH="$GOROOT/bin:$PATH" + + # Step2.2: You may need to install goimports to fix lint errors. + # Refer to https://pkg.go.dev/golang.org/x/tools/cmd/goimports + go install golang.org/x/tools/cmd/goimports@latest + export PATH=$(go env GOPATH)/bin:$PATH + + # Step 3.1: Review the go.mod file in the {flyteadmin} directory to identify the Flyte components/libraries that {flyteadmin} relies on. 
+ # If you have modified any of these components/libraries, use `go mod edit -replace` in the {flyteadmin} repo to replace the original components/libraries with your customized ones. + # For instance, if you have also modified `flytepropeller`, run the following command: + go mod edit -replace github.com/flyteorg/flytepropeller=/home/ubuntu/flytepropeller # replace with your own local path to flytepropeller + + # Step 3.2: Generate code, fix lint errors and run unit tests for {flyteadmin}. + # Note, flyteidl does not have unit tests, so you can skip the `make test_unit` command. + # Note, flytestdlib only has the `make generate` command. + make generate + make lint + make test_unit + + # Step4: Now, you can build the single binary. Go back to the flyte directory and run `go mod edit -replace` to replace the {flyteadmin} code with your own. + go mod edit -replace github.com/flyteorg/flyteadmin=/home/ubuntu/flyteadmin # replace with your own local path to {flyteadmin} + + # Step5: Rebuild and rerun the single binary based on your own code. + go mod tidy + make compile + POD_NAMESPACE=flyte ./flyte start --config flyte-single-binary-local.yaml + +**5. Test by running a hello world workflow.** + + +.. code:: shell + + # Step1: Install flytekit + pip install flytekit && export PATH=$PATH:/home/ubuntu/.local/bin + + # Step2: The flytesnacks repository provides a lot of useful examples. + git clone https://github.com/flyteorg/flytesnacks && cd flytesnacks + + # Step3: Before running the hello world workflow, create the flytesnacks-development namespace. + # This is necessary because, by default (without creating a new project), task Pods will run in the flytesnacks-development namespace. + kubectl create namespace flytesnacks-development + + # Step4: Run a hello world example + pyflyte run --remote examples/basics/basics/hello_world.py my_wf + # Go to http://localhost:30080/console/projects/flytesnacks/domains/development/executions/fd63f88a55fed4bba846 to see execution in the console. 
+ +**6. Tear down the k3s cluster after you finish developing.** + + +.. code:: shell + + flytectl demo teardown + # context removed for "flyte-sandbox". + # 🧹 🧹 Sandbox cluster is removed successfully. + # ❇️ Run the following command to unset sandbox environment variables for accessing flytectl + # unset FLYTECTL_CONFIG + +How to setup dev environment for flytekit? +******************************************* + +**1. Set up a local Flyte cluster.** + + +If you are also modifying the code for flyteidl, flyteadmin, flyteplugins, flytepropeller, datacatalog, or flytestdlib, +refer to the instructions in the `previous section <#how-to-setup-dev-environment-for-flyteidl-flyteadmin-flyteplugins-flytepropeller-datacatalog-and-flytestdlib>`__ to set up a local Flyte cluster. + +If not, you can start the backend with a single command. + +.. code:: shell + + # Step1: Install the latest version of flytectl, a portable and lightweight command-line interface to work with Flyte. + curl -sL https://ctl.flyte.org/install | bash + # flyteorg/flytectl info checking GitHub for latest tag + # flyteorg/flytectl info found version: 0.6.39 for v0.6.39/Linux/x86_64 + # flyteorg/flytectl info installed ./bin/flytectl + + # Step2: Export flytectl path based on the previous log "flyteorg/flytectl info installed ./bin/flytectl" + export PATH=$PATH:/home/ubuntu/bin # replace with your path + + # Step3: Start the Flyte demo cluster. This will set up a k3s cluster running minio, postgres Pods, and all Flyte components: flyteadmin, flyteplugins, flytepropeller, etc. + # See https://docs.flyte.org/projects/flytectl/en/latest/gen/flytectl_demo_start.html for more details. + flytectl demo start + # 👨‍💻 Flyte is ready! 
Flyte UI is available at http://localhost:30080/console 🚀 🚀 🎉 + # ❇️ Run the following command to export demo environment variables for accessing flytectl + # export FLYTECTL_CONFIG=/home/ubuntu/.flyte/config-sandbox.yaml + # 🐋 Flyte sandbox ships with a Docker registry. Tag and push custom workflow images to localhost:30000 + # 📂 The Minio API is hosted on localhost:30002. Use http://localhost:30080/minio/login for Minio console + +**2. Run workflow locally.** + + +.. code:: shell + + # Step1: Build a virtual environment for developing Flytekit. This will allow your local changes to take effect when the same Python interpreter runs `import flytekit`. + git clone https://github.com/flyteorg/flytekit.git # replace with your own repo + cd flytekit + virtualenv ~/.virtualenvs/flytekit + source ~/.virtualenvs/flytekit/bin/activate + make setup + pip install -e . + pip install gsutil awscli + # If you are also developing the plugins, execute the following: + cd plugins + pip install -e . + + # Step2: Modify the source code for flytekit, then run unit tests and lint. + make lint + make test + + # Step3: Run a hello world sample to test locally + git clone https://github.com/flyteorg/flytesnacks + cd flytesnacks + python3 examples/basics/basics/hello_world.py my_wf + # Running my_wf() hello world + +**3. Run workflow in sandbox.** + + +Before running your workflow in the sandbox, make sure you're able to successfully run it locally. +To deploy the workflow in the sandbox, you'll need to build a Flytekit image. +Create a Dockerfile in your Flytekit directory with the minimum required configuration to run a task, as shown below. +If your task requires additional components, such as plugins, you may find it useful to refer to the construction of the `official flytekit image `__. + +.. 
code:: Dockerfile + + FROM python:3.9-slim-buster + USER root + WORKDIR /root + ENV PYTHONPATH /root + RUN apt-get update && apt-get install build-essential -y + RUN apt-get install git -y + # The following line is an example of how to install your modified plugins. In this case, it demonstrates how to install the 'deck' plugin. + # RUN pip install -U git+https://github.com/Yicheng-Lu-llll/flytekit.git@"demo#egg=flytekitplugins-deck-standard&subdirectory=plugins/flytekit-deck-standard" # replace with your own repo and branch + RUN pip install -U git+https://github.com/Yicheng-Lu-llll/flytekit.git@demo # replace with your own repo and branch + # Replace with your own image name and tag. The comment sits on its own line because Dockerfile ENV treats a trailing "# ..." as part of the value. + ENV FLYTE_INTERNAL_IMAGE "localhost:30000/flytekit:demo" + +The instructions below explain how to build the image, push the image to +the Flyte cluster, and finally submit the workflow. + +.. code:: shell + + # Step1: Ensure you have pushed your changes to the remote repo + # In the flytekit folder + git add . && git commit -s -m "develop" && git push + + # Step2: Build the image + # In the flytekit folder + export FLYTE_INTERNAL_IMAGE="localhost:30000/flytekit:demo" # replace with your own image name and tag + docker build --no-cache -t "${FLYTE_INTERNAL_IMAGE}" -f ./Dockerfile . + + # Step3: Push the image to the Flyte cluster + docker push "${FLYTE_INTERNAL_IMAGE}" + + # Step4: Submit a hello world workflow to the Flyte cluster + git clone https://github.com/flyteorg/flytesnacks + cd flytesnacks + # Note, create the flytesnacks-development namespace if it does not exist yet. + # This is necessary because, by default (without creating a new project), task Pods will run in the flytesnacks-development namespace. 
+ # kubectl create namespace flytesnacks-development + pyflyte run --image "${FLYTE_INTERNAL_IMAGE}" --remote examples/basics/basics/hello_world.py my_wf + # Go to http://localhost:30080/console/projects/flytesnacks/domains/development/executions/f5c17e1b5640c4336bf8 to see execution in the console. + +How to setup dev environment for flyteconsole? +********************************************** + +**1. Set up a local Flyte cluster.** + +Depending on your needs, refer to one of the following guides to set up the Flyte cluster: + +- If you do not need to change the backend code, refer to the section on `How to setup dev environment for flytekit? <#how-to-setup-dev-environment-for-flytekit>`__ +- If you need to change the backend code, refer to the section on `How to setup dev environment for flyteidl, flyteadmin, flyteplugins, flytepropeller, datacatalog and flytestdlib? <#how-to-setup-dev-environment-for-flyteidl-flyteadmin-flyteplugins-flytepropeller-datacatalog-and-flytestdlib>`__ + + +**2. Start flyteconsole.** + + +.. code:: shell + + # Step1: Clone the repo and navigate to the Flyteconsole folder + git clone https://github.com/flyteorg/flyteconsole.git + cd flyteconsole + + # Step2: Install Node.js 18. Refer to https://github.com/nodesource/distributions/blob/master/README.md#using-ubuntu-2. + curl -fsSL https://deb.nodesource.com/setup_18.x | sudo -E bash - &&\ + sudo apt-get install -y nodejs + + # Step3: Install yarn. Refer to https://classic.yarnpkg.com/lang/en/docs/install/#debian-stable. 
+ curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add - + echo "deb https://dl.yarnpkg.com/debian/ stable main" | sudo tee /etc/apt/sources.list.d/yarn.list + sudo apt update && sudo apt install yarn + + # Step4: Add environment variables + export BASE_URL=/console + export ADMIN_API_URL=http://localhost:30080 + export DISABLE_AUTH=1 + export ADMIN_API_USE_SSL="http" + + # Step5: Generate SSL certificate + # Note, since we will use HTTP, SSL is not required. However, a missing SSL certificate causes an error when starting Flyteconsole. + make generate_ssl + + # Step6: Install node packages + yarn install + yarn build:types # It is fine if you see the error `Property 'at' does not exist on type 'string[]'` + yarn run build:prod + + # Step7: Start flyteconsole + yarn start + +**3. Install the Chrome plugin:** `Moesif Origin & CORS Changer `__. + + +We need to disable `CORS `__ to load resources. + +:: + + 1. Activate the plugin (toggle to "on") + 2. Open 'Advanced Settings' + 3. Set Access-Control-Allow-Credentials: true + +**4. Go to** http://localhost:3000/console/. + + +How to access Flyte UI, minio, postgres, k3s, and endpoints? +************************************************************************* + + +This section presumes a local Flyte cluster is already set up. If it isn't, refer to either: + +- `How to setup dev environment for flytekit? <#how-to-setup-dev-environment-for-flytekit>`__ +- `How to setup dev environment for flyteidl, flyteadmin, flyteplugins, flytepropeller, datacatalog and flytestdlib? <#how-to-setup-dev-environment-for-flyteidl-flyteadmin-flyteplugins-flytepropeller-datacatalog-and-flytestdlib>`__ + + +**1. Access the Flyte UI.** + + +`Flyte UI `__ is a web-based user interface for Flyte that lets you interact with Flyte objects and build directed acyclic graphs (DAGs) for your workflows. + +You can access it via http://localhost:30080/console. + +**2. 
Access the minio console.** + + +Core Flyte components, such as admin, propeller, and datacatalog, as well as user runtime containers, rely on an object store (in this case, minio) to hold files. +During development, you might need to examine files such as `input.pb/output.pb `__ or `deck.html `__ stored in minio. + +Access the minio console at: http://localhost:30080/minio/login. The default credentials are: + +- Username: ``minio`` +- Password: ``miniostorage`` + + +**3. Access postgres.** + + +FlyteAdmin and datacatalog use postgres to store persistent records, and you can interact with postgres on port ``30001``. Here is an example of using ``psql`` to connect: + +.. code:: shell + + # Step1: Install the PostgreSQL client. + sudo apt-get update + sudo apt-get install postgresql-client -To launch the dependencies, teardown any old sandboxes you may have, and then run: + # Step2: Connect to the PostgreSQL server. The password is "postgres". + psql -h localhost -p 30001 -U postgres -d flyte -.. code-block:: - flytectl demo start --dev +**4. Access the k3s dashboard.** -This command will launch the demo environment without running Flyte. By doing so, developers can run Flyte later on their host machine. -Set up Flyte configuration -************************** +Access the k3s dashboard at: http://localhost:30080/kubernetes-dashboard. -#. Copy the file ``flyte-single-binary-local.yaml`` to ``~/.flyte/local-dev-config.yaml``. -#. Replace occurrences of ``$HOME`` with the actual path of your home directory. +**5. Access the endpoints.** -Cluster resources -***************** -One of the configuration entries you will notice is ``cluster_resources.templatePath``. -This folder should contain the templates that the cluster resource controller will use. -To begin, you can create a file called ``~/.flyte/cluster-resource-templates/00_namespace.yaml`` with the following content: +Service endpoints are defined in the ``flyteidl`` repository under the ``service`` directory. 
You can browse them `here `__. -.. literalinclude:: ../../charts/flyte-binary/eks-production.yaml - :lines: 81-87 +For example, the endpoint for the `ListTaskExecutions `__ API is: -Pull console artifacts -********************** +.. code:: shell -Run the following command from the base folder of the Flyte repository to pull in the static assets for Flyteconsole: + /api/v1/task_executions/{node_execution_id.execution_id.project}/{node_execution_id.execution_id.domain}/{node_execution_id.execution_id.name}/{node_execution_id.node_id} -.. code-block:: +You can access this endpoint at: - make cmd/single/dist +.. code:: shell -Build and iterate -***************** + # Replace with your specific task execution parameters + http://localhost:30080/api/v1/task_executions/flytesnacks/development/fe92c0a8cbf684ad19a8/n0?limit=10000 -To bring in the code of the component you are testing, use the command go get ``github.com/flyteorg/&gitsha``. -Once you have done that, you can run the following command: -.. code-block:: - POD_NAMESPACE=flyte go run -tags console cmd/main.go start --config ~/.flyte/local-dev-config.yaml -The ``POD_NAMESPACE`` environment variable is necessary for the webhook to function correctly. -You can also create a build target in your IDE with the same command. -Once it is up and running, you can access Flyte hosted by your local machine by going to ``localhost:30080/console``. -The Docker host mapping is used to obtain the correct IP address for your local host. 🐞 File an issue ================