Dask aurora #583
In this example, we will [estimate Pi using a Monte Carlo method](https://en.wikipedia.org/wiki/Pi#Monte_Carlo_methods). | ||
Paste the following python script into a file called `pi_dask_gpu.py`. |
Paste the following python script into a file called `pi_dask_gpu.py`. | |
Paste the following Python script into a file called `pi_dask_gpu.py`. |
- generate random points uniformly inside the unit square | ||
- return the number of points that are inside the unit circle | ||
1. When the results from the workers are ready, they are aggregated to compute Pi. | ||
1. A total of 5 Pi calculations are performed and timed (the very first iterations will incur in initialization and warmup costs). |
1. A total of 5 Pi calculations are performed and timed (the very first iterations will incur in initialization and warmup costs). | |
1. A total of 5x Pi calculations are performed and timed. Note, the very first iterations will incur initialization and warmup costs. |
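For reference, the algorithm described in the list above can be sketched in plain Python, with no Dask or GPU dependency. This is only an illustration under stated assumptions: the function names, seeds, and sample sizes are hypothetical and are not taken from `pi_dask_gpu.py`, and each "worker" here is just a sequential function call.

```python
import random
import time


def count_points_in_circle(n_points, seed):
    """Stand-in for one worker task (hypothetical helper).

    Draws points uniformly in the unit square [0, 1) x [0, 1) and
    returns how many fall inside the unit circle.
    """
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        x = rng.random()
        y = rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return inside


def estimate_pi(n_workers=4, points_per_worker=100_000):
    """Aggregate the per-worker counts into an estimate of Pi."""
    counts = [
        count_points_in_circle(points_per_worker, seed=w)
        for w in range(n_workers)
    ]
    total_points = n_workers * points_per_worker
    # The quarter circle covers pi/4 of the unit square.
    return 4.0 * sum(counts) / total_points


def time_runs(n_runs=5, n_workers=4, points_per_worker=100_000):
    """Repeat the calculation and record (estimate, seconds) per run."""
    results = []
    for _ in range(n_runs):
        start = time.perf_counter()
        pi_estimate = estimate_pi(n_workers, points_per_worker)
        results.append((pi_estimate, time.perf_counter() - start))
    return results
```

Calling `time_runs()` returns one `(estimate, seconds)` pair per run. In the Dask version the first runs would also include initialization and warmup costs, which this sequential sketch does not reproduce.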
conda activate dask | ||
jupyter lab --no-browser --port=23456 | ||
``` | ||
- Copy the line starting with `http://localhost:23456/lab?token=<TOKEN>` at the end of the jupyter command's output. |
- Copy the line starting with `http://localhost:23456/lab?token=<TOKEN>` at the end of the jupyter command's output. | |
- Copy the line starting with `http://localhost:23456/lab?token=<TOKEN>` at the end of the Jupyter command's output. |
jupyter lab --no-browser --port=23456 | ||
``` | ||
- Copy the line starting with `http://localhost:23456/lab?token=<TOKEN>` at the end of the jupyter command's output. | ||
- **On your local machine**, open a browser window and go to that url. |
- **On your local machine**, open a browser window and go to that url. | |
- **On your local machine**, open a browser window and go to that URL. |
- Then, [start a Dask cluster](#start-a-cluster-with-gpu-workers) and wait about 10 seconds for the cluster to start. | ||
- **On your local machine**, open a ssh tunnel to the compute node (`COMPUTE_NODE` is the compute node's hostname and `YOUR_ALCF_USERNAME` is your ALCF username): | ||
```bash | ||
ssh -t -L 23456:localhost:23456 -L 8787:localhost:8787 [email protected] ssh -t -L 23456:localhost:23456 -L 8787:localhost:8787 login.aurora.alcf.anl.gov ssh -t -L 23456:localhost:23456 -L 8787:localhost:8787 COMPUTE_NODE |
I get nervous when I see this multi-hop SSH tunnel line in our tutorials and docs, since there are several pitfalls, as we saw during the scikit-learn hands-on exercise back in October: argonne-lcf/ALCF_Hands_on_HPC_Workshop#56
Are any of those pitfalls possible here? For example, should we suggest adding the SSH public key to `~/.ssh/authorized_keys` for the compute node jump?
```bash | ||
ssh -t -L 23456:localhost:23456 -L 8787:localhost:8787 [email protected] ssh -t -L 23456:localhost:23456 -L 8787:localhost:8787 login.aurora.alcf.anl.gov ssh -t -L 23456:localhost:23456 -L 8787:localhost:8787 COMPUTE_NODE | ||
``` | ||
- **On the compute node** where you land with the above ssh command, start JupyterLab: |
- **On the compute node** where you land with the above ssh command, start JupyterLab: | |
- **On the compute node** where you land with the above SSH command, start JupyterLab: |
or
- **On the compute node** where you land with the above ssh command, start JupyterLab: | |
- **On the compute node** where you land with the above `ssh` command, start JupyterLab: |
No description provided.