How to Use TLS in Dask-Gateway? #344

Open
rileyhun opened this issue Nov 2, 2020 · 8 comments

rileyhun commented Nov 2, 2020

I understand that the docs recommend using TLS in a production environment, so I'm trying to set that up. Here are the steps I followed:

[1] Added the paths to the self-signed certificate and key files in a Dockerfile
[2] Pushed that Docker image to Google Cloud's image registry
[3] Replaced the image name "daskgateway/dask-gateway-server" in the helm config file with the image in the Google Cloud registry
[4] Added the paths to the self-signed key and cert files in the extraConfig field

What happened:
Nothing changed. No errors. The internal load balancer for the traefik proxy server is still using HTTP.

What you expected to happen:
I expected the internal load balancer to use HTTPS.

Minimal Complete Verifiable Example:

Dockerfile:

FROM daskgateway/dask-gateway-server:latest

ADD certs /certs/

Helm Config:

extraConfig:
    security: |
      c.Proxy.tls_cert = "/certs/myca.pem"
      c.Proxy.tls_key = "/certs/mykey.pem"
    clusteroptions: |
      from dask_gateway_server.options import Options, Integer, Float, String

      c.KubeClusterConfig.idle_timeout = 3600

      def option_handler(options):
        return {
          "worker_cores": options.worker_cores,
          "worker_memory": "%fG" % options.worker_memory,
          "image": options.image,
        }

      c.Backend.cluster_options = Options(
        Integer("worker_cores", 2, min=1, max=8, label="Worker Cores"),
        Float("worker_memory", 4, min=1, max=16, label="Worker Memory (GiB)"),
        String("image", default="daskgateway/dask-gateway:latest", label="Image"),
        handler=option_handler,
      )

Environment:

  • Helm chart version: 0.8.0
jcrist (Member) commented Nov 4, 2020

The c.Proxy.* settings don't affect k8s users, as the proxy used on k8s is different (we should update our docs to better reflect this). Currently we don't expose configuring HTTPS for the traefik proxy - most users run with JupyterHub and piggyback on JupyterHub's TLS proxy by registering dask-gateway as a JupyterHub service. This obviously doesn't help users not running with JupyterHub.
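(For readers unfamiliar with that setup, a minimal sketch, assuming a zero-to-jupyterhub values.yaml: registering dask-gateway under hub.services makes JupyterHub proxy /services/dask-gateway/ through its own TLS-terminating proxy. The API token and the in-cluster Traefik URL below are placeholders, not chart defaults.)

hub:
  services:
    dask-gateway:
      # JupyterHub routes /services/dask-gateway/ to this URL behind its
      # TLS-terminating proxy. Service/namespace names here are assumptions.
      url: http://traefik-dask-gateway.dask-gateway
      apiToken: "<32-char-hex-token>"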

There are a few ways we could enable configuring TLS for use with k8s. I'd probably mimic how JupyterHub exposes things:

  • User configures a secret to mount in the traefik pods that contains preloaded TLS credentials
  • User configures the certs directly in the helm values.yaml, the helm chart manages the secret directly
  • User selects letsencrypt, enabling automatic https. The free version of traefik doesn't support this for multi-pod deployments, but we could rig up support ourselves with some effort.

The first two are the easiest to set up; they just require some helm chart munging.
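Purely as an illustration of the second option, a hypothetical values.yaml shape modeled on how zero-to-jupyterhub exposes manual HTTPS; none of these keys exist in the dask-gateway chart today:

traefik:
  https:
    enabled: true
    type: manual  # hypothetically: "secret" or "letsencrypt" for the other options
    manual:
      # PEM contents inlined here, with the chart managing the secret itself.
      cert: |
        -----BEGIN CERTIFICATE-----
        ...
      key: |
        -----BEGIN PRIVATE KEY-----
        ...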

rileyhun (Author) commented Nov 4, 2020

> The c.Proxy.* settings don't affect k8s users, as the proxy used on k8s is different (we should update our docs to better reflect this). Currently we don't expose configuring HTTPS for the traefik proxy - most users run with JupyterHub and piggyback on JupyterHub's TLS proxy by registering dask-gateway as a JupyterHub service. This obviously doesn't help users not running with JupyterHub.

Thanks Jim,

This is very helpful information.

I should clarify that I was going to set up Dask Gateway as a service of JupyterHub, since that simplifies a lot of things from an authentication perspective. But we also have AI Platform Notebooks, which our users are used to, so adding JupyterHub seems kind of redundant and would most likely confuse them.

cdibble (Contributor) commented May 27, 2021

I've got a similar use case with Dask Gateway, k8s, and JupyterHub. My k8s is deployed on AWS EKS.

I've set up Dask Gateway using a JupyterHub API token for auth. My JupyterHub deployment uses TLS and is behind a VPN. Everything with Dask Gateway deployed via the helm chart works great.

However (re: this thread), I can't figure out how to set up TLS certificates for the Dask dashboards that are generated when clusters are deployed. Additionally, traffic to those dashboards ends up not being protected by the VPN.

I can use the ClusterIP traefik service type (see #304), but then I can't get to the Dask dashboards at all.

Ideally, I'd like to have both TLS and an internal IP, so that I can still get Dask dashboards but they'll only be exposed to internal network traffic. One of the two wouldn't be bad either.

Is this possible by configuring, perhaps, gateway.backend.scheduler/extraPodConfig? I've tried setting the following annotations on various services in the values.yaml:


service:
    annotations:
      kubernetes.io/ingress.class: alb
      alb.ingress.kubernetes.io/target-type: ip
      alb.ingress.kubernetes.io/scheme: internal

Any thoughts or direction would be appreciated and I'd be happy to include more info. I'm not sure this warrants a new Issue, but if so, I'll open one.

droctothorpe (Contributor) commented May 27, 2021

Easy fix: use annotations plus your cloud provider to provision an ELB for Dask Gateway's Traefik service, and in the annotations configure that ELB to terminate HTTPS. Granted, requests are unencrypted past the ELB, but the important stuff (client-to-scheduler comms) relies on mTLS and is encrypted end to end anyway.
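For instance, with the standard in-tree AWS cloud-provider service annotations (a sketch; the ACM certificate ARN is a placeholder):

traefik:
  service:
    type: LoadBalancer
    annotations:
      # Terminate TLS at the ELB using an ACM certificate.
      service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:<region>:<account-id>:certificate/<cert-id>"
      service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
      # Traffic from the ELB to Traefik stays plain HTTP, per the caveat above.
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"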

droctothorpe (Contributor):
Also, all of the REST traffic can stay in-cluster (instead of routing out through the ELB) if you leverage the K8s DNS names for the services. The dashboard will be the only exception: you'll need to pass the public_address keyword to the Gateway object, or the client will display a widget with the internal, inaccessible URL. But it will use HTTPS.
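A minimal client-side sketch under those assumptions (the in-cluster service DNS name and the external dashboard host are placeholders):

from dask_gateway import Gateway

# REST traffic stays in-cluster by targeting the service's K8s DNS name;
# public_address only affects the URLs (e.g. dashboard links) shown to users.
gateway = Gateway(
    address="http://traefik-dask-gateway.dask-gateway",  # in-cluster DNS name
    public_address="https://dask-gateway.example.com",   # externally reachable, HTTPS
    auth="jupyterhub",
)
cluster = gateway.new_cluster()
print(cluster.dashboard_link)  # built from public_address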

cdibble (Contributor) commented May 27, 2021

Thanks @droctothorpe for the suggestions. I'll see if I can get the ELB to use HTTPS termination; it looks pretty straightforward if these docs are enough.

As for your second comment: would you mind clarifying? I'm not sure how to take advantage of k8s DNS names to ensure that the dashboard is the only service exposed. Right now, the only service with an External-IP is the dask-gateway traefik service, which leads me to think that all other traffic is already limited to internal paths. But if there is some further configuration that can ensure this, I'd definitely want to implement it. Any links, resources, or suggestions are always appreciated.

droctothorpe (Contributor):
The other services will be exposed (via the same ELB that serves the dashboard). The advantage of using the K8s DNS names is that requests from the in-cluster Dask Gateway clients to the Gateway API don't leave the cluster for no reason. It also helps with environment-file consistency if you're provisioning to multiple discrete environments. It's a nice-to-have, not strictly necessary, and kind of a tangent from your original question, heh.

cdibble (Contributor) commented May 28, 2021

Thanks for the tips! I'm still pretty new to kubernetes and looking into the DNS names has been edifying.

I wanted to post this snippet as a reference for others. I was able to hide Dask Gateway, including the Dask dashboards, behind my VPN using the following annotations on the traefik service. Note that these annotations are specific to AWS EKS with AWS Elastic Load Balancers (and the AWS Load Balancer Controller as the ingress controller), but I'd think there are similar methods with other load balancers.

traefik:
  service:
    type: LoadBalancer # Use LoadBalancer if you want internet-facing ingress.
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: alb
      service.beta.kubernetes.io/aws-load-balancer-internal: <CIDR-block-for-local-VPC-traffic>

[edit: fixed indentation]
