
Running Jupyter Notebooks in a Docker container

Raman Gupta edited this page Oct 31, 2020 · 11 revisions

I like to put applications that I use regularly into Docker containers. I switch between client machines often, and this keeps the applications accessible from each of them so that I can continue from where I left off. Setting up Jupyter in a Docker container that also supports pyscript was a bit complicated, so the following reproduces the steps I took to get it working.

Configuration Steps

  1. Decide which Jupyter stack you want to use by reading the Jupyter Docker Stacks documentation. At a minimum, I would recommend you use jupyter/minimal-notebook, but feel free to use any of the other images if you are planning to use this installation for other purposes. I personally chose to use jupyter/datascience-notebook and that will be reflected in the docker run command further below, but all of the steps should remain the same regardless of which Stack image you choose to use.

  2. Set up folders on your host machine to persist data from the container. The docker run command further below assumes you use option i, so adjust it accordingly if you choose option ii.

    i. Create at least four folders, one for each of the following four target paths:

    | Path to Target in Container | Purpose |
    | --- | --- |
    | /home/jovyan | Your notebooks will be persisted to path/to/host/folder/work. Your settings will be persisted to path/to/host/folder/.jupyter |
    | /opt/conda/share/jupyter/kernels/pyscript | This is where your pyscript kernel files will be stored (/opt/conda/share/jupyter/kernels/ is the KERNEL_DIRECTORY referred to in the README) |
    | /usr/local/bin/start-notebook.d/ | See here for more info: https://jupyter-docker-stacks.readthedocs.io/en/latest/using/common.html#startup-hooks |
    | /usr/local/bin/before-notebook.d/ | See here for more info: https://jupyter-docker-stacks.readthedocs.io/en/latest/using/common.html#startup-hooks |

    ii. If you want to be more targeted about storing data from /home/jovyan, you can replace that line in the table with the following two:

    | Path to Target in Container | Purpose |
    | --- | --- |
    | /home/jovyan/work | Your notebooks will be persisted here. |
    | /home/jovyan/.jupyter | Your settings will be persisted here. |

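As a rough sketch of option i, the four host folders could be created in one go. BASE_DIR and the four folder names are hypothetical choices of mine, not something the container requires; any names work as long as the -v lines in the docker run command point at them.

```shell
# BASE_DIR is a hypothetical location; change it to wherever you keep container data.
BASE_DIR="${BASE_DIR:-$HOME/jupyter-docker}"

# One host folder per container target path (folder names are arbitrary):
#   home              -> /home/jovyan
#   pyscript-kernel   -> /opt/conda/share/jupyter/kernels/pyscript
#   start-notebook.d  -> /usr/local/bin/start-notebook.d/
#   before-notebook.d -> /usr/local/bin/before-notebook.d/
mkdir -p \
    "$BASE_DIR/home" \
    "$BASE_DIR/pyscript-kernel" \
    "$BASE_DIR/start-notebook.d" \
    "$BASE_DIR/before-notebook.d"

# List what was created.
ls -1 "$BASE_DIR"
```
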
  3. Follow the steps in the README to copy the pyscript jupyter kernel to the host folder that will be linked to /opt/conda/share/jupyter/kernels/pyscript. Remember as you go through the instructions that KERNEL_DIRECTORY is /opt/conda/share/jupyter/kernels/.

  4. (This step is only needed if the pyscript kernel version you are using includes a requirements.txt) Create an executable file in the host folder that will be linked to /usr/local/bin/start-notebook.d/ with the following content:

#!/usr/bin/bash
pip install -r /opt/conda/share/jupyter/kernels/pyscript/requirements.txt

Having this file there will ensure that the dependencies that the pyscript kernel needs are installed before the notebook server gets started, even if you destroy and rebuild your container.
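One way to create that file from the host shell is sketched below. HOOK_DIR and the file name install-pyscript-requirements are hypothetical; use whichever host folder you plan to mount at /usr/local/bin/start-notebook.d/ and any file name you like.

```shell
# HOOK_DIR is a hypothetical host folder; use the one you will mount at
# /usr/local/bin/start-notebook.d/.
HOOK_DIR="${HOOK_DIR:-$HOME/jupyter-docker/start-notebook.d}"
HOOK_FILE="$HOOK_DIR/install-pyscript-requirements"   # hypothetical file name

mkdir -p "$HOOK_DIR"

# The quoted 'EOF' stops the host shell from expanding anything in the body,
# so the file content is written exactly as shown.
cat > "$HOOK_FILE" <<'EOF'
#!/usr/bin/bash
pip install -r /opt/conda/share/jupyter/kernels/pyscript/requirements.txt
EOF

# The step above calls for an executable file, so set the executable bit.
chmod +x "$HOOK_FILE"
```
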

  5. Create and run a Docker container using either docker run or docker-compose, depending on your preference. I use docker run, so here is my run command:

sudo docker run -d --restart unless-stopped --name=jupyter \
    -p 8888:8888 \
    --env TZ="US/Eastern" \
    --env JUPYTER_ENABLE_LAB=yes \
    --env RESTARTABLE=yes \
    --user $UID --group-add users \
    -v </path/to/host/folder/1>:/home/jovyan \
    -v </path/to/host/folder/2>:/opt/conda/share/jupyter/kernels/pyscript \
    -v </path/to/host/folder/3>:/usr/local/bin/start-notebook.d/ \
    -v </path/to/host/folder/4>:/usr/local/bin/before-notebook.d/ \
    jupyter/datascience-notebook:latest

Notes:

  • You do not need to include JUPYTER_ENABLE_LAB=yes. This just tells the container to start JupyterLab instead of the classic Jupyter Notebook interface.

  • Replace the TZ value with your timezone using the TZ database name from here: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List

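  • If port 8888 is already taken on your host machine, replace the first 8888 with the port you want to use (for example, -p 8889:8888). The second 8888 is the port inside the container and should stay as it is.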

  • When I first tried running this container, docker logs showed that it could not start because of the ownership/permissions on the host folders I had created. I created a new user on my host machine, ran chown -R <username> <path/to/host/folder> for each host folder, and then added the --user $UID --group-add users line, with $UID being the uid of the user I created (on non-Windows systems you can get this by running id <username>). Note that in bash $UID expands to the uid of the user running the command, so substitute the numeric uid if that is a different user.
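The permissions fix in the last note might look roughly like the following. JUPYTER_USER and HOST_DIR are hypothetical names; when JUPYTER_USER is left empty the sketch falls back to the current user so that it runs without root. Handing ownership to a different user requires root, so prefix the chown with sudo in that case.

```shell
# Hypothetical values: set JUPYTER_USER to the user you created and HOST_DIR
# to the host folder you want that user to own.
JUPYTER_USER="${JUPYTER_USER:-}"
HOST_DIR="${HOST_DIR:-$HOME/jupyter-docker}"

mkdir -p "$HOST_DIR"

# Look up the numeric uid; fall back to the current user if no name was given.
if [ -n "$JUPYTER_USER" ]; then
    JUPYTER_UID="$(id -u "$JUPYTER_USER")"
else
    JUPYTER_UID="$(id -u)"
fi

# Give that uid ownership of the whole folder tree.
# (Needs sudo when the uid is not your own.)
chown -R "$JUPYTER_UID" "$HOST_DIR"

# This is the value to pass to docker run.
echo "--user $JUPYTER_UID --group-add users"
```
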

  6. Assuming permissions and folders are set up correctly, after starting the container, you should be able to navigate to <host IP/name>:<8888 or alternate port you picked>. You should see a screen asking you for a token.

  7. Access the docker logs for your container (you can run docker logs jupyter to see them). Look for the following log entry:

[C 15:33:01.160 NotebookApp]

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://localhost:8888/?token=112bb073331f1460b73768c76dffb2f87ac1d4ca7870d46a

DO NOT ENTER THE TOKEN YET!
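If you would rather not scan the logs by eye, a small helper along these lines can pull the token out. extract_token is a hypothetical function name, and the pattern assumes the token is a run of lowercase hexadecimal characters after token=, as in the log entry above.

```shell
# Read notebook server log text on stdin and print the most recent token.
# Assumes the token consists of lowercase hex characters following "token=".
extract_token() {
    grep -o 'token=[0-9a-f]*' | tail -n 1 | cut -d= -f2
}

# Usage against the container named "jupyter" from the docker run command.
# 2>&1 is included because docker logs replays the container's stderr on stderr:
#   docker logs jupyter 2>&1 | extract_token

# Demonstration with the sample log line shown above.
printf '%s\n' '        http://localhost:8888/?token=112bb073331f1460b73768c76dffb2f87ac1d4ca7870d46a' | extract_token
```
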

  8. If you scroll down on the page, you will see a section where you can enter the token and also set a password. I would recommend you take this approach because your password will persist even if you recreate the container, making it easier to log in the next time without having to check your logs. Alternatively, you can use the original token input form if you would rather use the locally generated token each time.

  9. If everything is set up correctly, you should see the Jupyter interface, where you can create and access Jupyter notebooks using the pyscript hass kernel.

Troubleshooting

  • As mentioned in the configuration steps, Jupyter is a bit finicky with user permissions on the host machine. If the notes in the configuration steps don't help you, you can refer to the Docker Options section of the Jupyter Docker Stacks documentation to try to troubleshoot.

  • I used to only have an External URL configured that I used for Home Assistant both from within and outside my home. When trying to run commands in my Jupyter notebook using the external URL for hass_host and hass_url, I was not able to connect to my instance. I was able to get around this by creating an Internal URL and using that for hass_host and hass_url in pyscript.conf for the kernel.
