Fixes and clarifications for scikit-learn tutorial #56

Open
2 tasks
felker opened this issue Oct 31, 2024 · 0 comments

Comments

Member

felker commented Oct 31, 2024

Before tagging this year's version of the repo at the conclusion of the workshop, we should fix some of these problems and clarify any ambiguities that led to user issues during the walkthrough.

(mostly copied from Slack workspace)

Some pitfalls of the Dask-RAPIDS scikit-learn tutorial, specifically with the ./open_jupyterlab_polaris.sh script:

  • If you get the error username@x3008c0s19b0n0: Permission denied (publickey,keyboard-interactive,hostbased)., it is likely because you have no SSH keypair created on Polaris and installed in the ~/.ssh/authorized_keys list, so the compute node rejects your direct SSH jump from the login node. From a Polaris login node, run ssh-keygen -t ed25519 and then ssh-copy-id polaris.
  • If you get past that and then see no control path specified for "-O" command, the problem may be the multiplexed connection socket set up in the earlier step, ssh -M -S ~/.ssh/multiplex:polaris.rapids [email protected]. DNS resolution and other default settings can be very OS dependent and error prone. A suggested alternative is given below.
  • If you get through all of that and then see an error like port already in use, try changing PORTD=8787 in the script to some other number.
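For the third pitfall, a rough way to pick a replacement port is to probe for a free one before editing the script. The sketch below is only an illustration: the function names port_in_use and first_free_port are hypothetical and not part of open_jupyterlab_polaris.sh, and it assumes bash (it relies on bash's /dev/tcp redirection) and only detects listeners reachable on 127.0.0.1 of the machine where it is run.

```shell
# Hypothetical helpers, not part of the tutorial script.

# Return 0 if something accepts TCP connections on 127.0.0.1:<port>.
port_in_use() {
    local port="$1"
    # In bash, opening /dev/tcp/HOST/PORT attempts a TCP connect.
    # The subshell keeps file descriptor 3 from leaking into the caller.
    (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null
}

# Print the first port at or above <start> that nothing is listening on.
first_free_port() {
    local port="$1"
    while port_in_use "$port"; do
        port=$((port + 1))
    done
    echo "$port"
}

# Example (run manually): first_free_port 8787
```

The printed number can then be used in place of 8787 for PORTD. Run it on whichever machine reported the conflict, since a port that is free on one host may be taken on another.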

For that second pitfall, I prefer a different, global SSH multiplexing setup for all Polaris connections. First create the socket directory:
mkdir ~/.ssh/cm_socket/
Then edit ~/.ssh/config:

Host polaris
    HostName polaris.alcf.anl.gov
    User <INSERT username>
    ControlMaster auto
    ControlPath ~/.ssh/cm_socket/%r@%h:%p
    ControlPersist 10m

Then the second step changes from ssh -M -S ~/.ssh/multiplex:polaris.rapids [email protected] to simply ssh polaris.
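To confirm the multiplexing setup works before rerunning the tutorial script, something like the following may help. It is a sketch under assumptions: the function names are hypothetical, the Host alias polaris and the ~/.ssh/cm_socket directory are taken from the config above, and ssh -O check and ssh -O exit are the standard OpenSSH control commands. With ControlPath set in ~/.ssh/config, the -O commands should be able to locate the socket without an explicit -S.

```shell
# Hypothetical helpers for checking the ControlMaster setup.

# Create the socket directory with owner-only permissions.
setup_cm_socket_dir() {
    local dir="${1:-$HOME/.ssh/cm_socket}"
    mkdir -p "$dir" && chmod 700 "$dir"
}

# Ask the master connection for the given Host alias whether it is alive.
check_master() {
    ssh -O check "${1:-polaris}"
}

# Ask the master connection to shut down, e.g. to reset a stale socket.
stop_master() {
    ssh -O exit "${1:-polaris}"
}

# Example (run manually):
#   setup_cm_socket_dir
#   ssh polaris        # log in once in another terminal to start the master
#   check_master       # reports whether the master is running
```

If check_master reports no running master, logging in again with ssh polaris should recreate the socket; stop_master can be used to clear a connection that has gone stale.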

May need to fix

since that environment variable is unused.

Also:
