Merge image and process tutorial into adv cluster tutorial
carolineechen committed Dec 30, 2024
1 parent e7d3cf5 commit dcd0817
Showing 4 changed files with 103 additions and 60 deletions.
3 changes: 1 addition & 2 deletions docs/index.rst
@@ -112,9 +112,8 @@ Table of Contents
:caption: API Basics

tutorials/api-clusters
tutorials/api-clusters-adv
tutorials/api-modules
tutorials/api-process
tutorials/api-images
tutorials/api-folders
tutorials/api-secrets
tutorials/api-resources
@@ -1,11 +1,74 @@
Processes
=========
Clusters - Advanced
===================

.. raw:: html

<p><a href="https://colab.research.google.com/github/run-house/notebooks/blob/stable/docs/api-process.ipynb">
<p><a href="https://colab.research.google.com/github/run-house/notebooks/blob/stable/docs/api-clusters-adv.ipynb">
<img height="20px" width="117px" src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a></p>

This tutorial assumes that you are already familiar with the cluster
basics in the `Cluster API
tutorial <https://www.run.house/docs/tutorials/api-clusters>`__, which
covers Runhouse cluster creation, running a basic function, and some
utility commands for running on the cluster.

This tutorial covers more advanced features, such as setting up
cluster state with a Runhouse Image, using processes on the cluster, and
other ways to interact with the cluster.

Base Image
----------

As you saw in the Cluster API tutorial, Runhouse clusters expose various
functions that let you set up state, dependencies, and other
configuration on all nodes of your cluster, including
``install_packages``, ``rsync``, ``set_env_vars``, and ``run_bash``.

A Runhouse “Image” is an abstraction that lets you run these setup
steps *before* Runhouse is installed and the Runhouse daemon and
initial setup are brought up on your cluster’s nodes. You can also
specify a machine image or Docker image ID as the base of the Runhouse
image.

.. code:: ipython3

    import runhouse as rh

    # Setup steps declared on the image run before the Runhouse daemon starts.
    image = (
        rh.Image(name="sample_image")
        .from_docker("python:3.12.8-bookworm")
        .install_packages(["numpy", "pandas"])
        .sync_secrets(["huggingface"])
        .set_env_vars({"RH_LOG_LEVEL": "debug"})
    )

    # Launch the cluster with the image, or reuse it if it is already up.
    cluster = rh.cluster(name="ml_ready_cluster", image=image, instance_type="CPU:2+", provider="aws").up_if_not()

.. parsed-literal::
    :class: code-output

    I 12-17 12:04:55 provisioner.py:560] Successfully provisioned cluster: ml_ready_cluster
    I 12-17 12:04:57 cloud_vm_ray_backend.py:3402] Run commands not specified or empty.
    Clusters
    AWS: Fetching availability zones mapping...NAME LAUNCHED RESOURCES STATUS AUTOSTOP COMMAND
    ml_ready_cluster a few secs ago 1x AWS(m6i.large, image_id={'us-east-1': 'docker:python:3.12.8-bookwor... UP (down) /Users/rohinbhasin/minico...

The example above launches a cluster with the base Docker image
``python:3.12.8-bookworm``, installs the given packages, syncs over your
local Hugging Face token, and sets the Runhouse log level env var, all
prior to starting the Runhouse daemon. To continue installing packages,
running commands, etc. after the Runhouse server has started, you can
use the cluster commands directly.
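
As a minimal sketch of post-startup setup, the helper below applies
further steps to a cluster that is already up, using only the
``install_packages`` and ``run_bash`` cluster methods named in this
tutorial. The helper name ``continue_setup`` and the package chosen are
illustrative assumptions, not part of Runhouse, and exact method
signatures should be checked against the Cluster API reference.

```python
def continue_setup(cluster):
    """Apply extra setup steps to an already-running cluster.

    `cluster` is assumed to be a Runhouse cluster object, such as the one
    returned by `rh.cluster(...).up_if_not()`. The function name and the
    package below are hypothetical, chosen only for illustration.
    """
    # Install an additional package after the Runhouse server has started.
    cluster.install_packages(["scikit-learn"])
    # Run a shell command on the cluster and hand back whatever it returns.
    return cluster.run_bash(["pip freeze | grep scikit-learn"])
```

Calling ``continue_setup(cluster)`` on a running cluster would install
the package and then run the check command. Steps that must happen
before the daemon starts still belong on the image.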

The growing list of setup steps for Runhouse images is available in the
`API
Reference <https://www.run.house/docs/main/en/api/python/image>`__.

Processes
---------

On your Runhouse cluster, whether you have one node or multiple nodes,
you may want to run things in different processes on the cluster.

@@ -213,3 +276,20 @@ running functions in a process.
'compute': {'GPU': 1},
'runtime_env': {},
'env_vars': {'LOG_LEVEL': 'DEBUG'}}}
Interacting with the Cluster
----------------------------

Beyond interacting with the cluster through Python APIs, Runhouse also
provides other ways of working with the cluster, including SSHing
directly onto the cluster or creating a notebook tunnel, which lets you
develop locally in a notebook that runs on the cluster.

To SSH, you can use either the Python API ``cluster.ssh()`` or the CLI
command ``runhouse cluster ssh <cluster_name>``.

To create a notebook, run ``cluster.notebook()``, optionally providing
the notebook port. This will tunnel into and launch a notebook from the
cluster, and provide a link to use for local development.
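
The two entry points above can be wrapped in a small helper, shown here
as a sketch. It assumes ``cluster`` is a Runhouse cluster object and
relies only on the ``cluster.ssh()`` and ``cluster.notebook()`` calls
named in this section. The function name ``open_dev_session`` and its
``mode`` argument are hypothetical, introduced just for this example.

```python
def open_dev_session(cluster, mode="ssh"):
    """Open an interactive session on a running cluster.

    `cluster` is assumed to be a Runhouse cluster object. `mode` is a
    hypothetical switch introduced only for this sketch.
    """
    if mode == "ssh":
        # Python alternative to `runhouse cluster ssh <cluster_name>`.
        return cluster.ssh()
    if mode == "notebook":
        # Tunnels into the cluster and launches a notebook from it.
        return cluster.notebook()
    raise ValueError(f"unknown mode: {mode!r}")
```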
20 changes: 19 additions & 1 deletion docs/tutorials/api-clusters.rst
@@ -247,9 +247,19 @@ the machine.
Useful Cluster Functions
------------------------

There are many actions that can be performed on the cluster directly
through its APIs, for instance:

* running commands - ``run_bash`` (over HTTP server),
  ``run_bash_over_ssh`` (over SSH), ``run_python``
* installing packages - ``install_packages``
* setting env vars - ``set_process_env_vars``
* syncing local files up or down - ``rsync``

We show a few examples below, and for a more comprehensive list of these
functions and example usage, please refer to the `Cluster Python
API <https://www.run.house/docs/api/python/cluster>`__.

.. code:: ipython3
tls_cluster.run(['pip install numpy && pip freeze | grep numpy'])
tls_cluster.run_bash(['pip install numpy && pip freeze | grep numpy'])
.. parsed-literal::
@@ -293,3 +303,11 @@ Useful Cluster Functions
:class: code-output
[(0, '1.26.4\n', '')]
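
The output above reads as a list with one tuple per command, which
looks like ``(exit_code, stdout, stderr)``. Under that assumption (it is
inferred from the printed output, not a documented contract), a small
helper can separate successful commands from failed ones. The name
``summarize_results`` is hypothetical.

```python
def summarize_results(results):
    """Split command results into successes and failures.

    Assumes each entry is an (exit_code, stdout, stderr) tuple, as the
    printed output suggests. This layout is an assumption.
    """
    succeeded, failed = [], []
    for exit_code, stdout, stderr in results:
        if exit_code == 0:
            # Keep the trimmed stdout of commands that exited cleanly.
            succeeded.append(stdout.strip())
        else:
            # Keep the trimmed stderr of commands that failed.
            failed.append(stderr.strip())
    return succeeded, failed


ok, errors = summarize_results([(0, "1.26.4\n", "")])
print(ok, errors)
```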
Dig Deeper
----------

For more advanced cluster usage, see the
`Clusters -
Advanced <https://www.run.house/docs/tutorials/api-clusters-adv>`__ tutorial.
54 changes: 0 additions & 54 deletions docs/tutorials/api-images.rst

This file was deleted.
