diff --git a/README.md b/README.md
index a7dcbdb13..434ccae73 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@ DIGITS (the **D**eep Learning **G**PU **T**raining **S**ystem) is a webapp for t
 
 | Installation method | Supported platform[s] | Available versions | Instructions |
 | --- | --- | --- | --- |
-| Deb packages | Ubuntu 14.04 | [14.04 repo](http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1404/x86_64) | [docs/UbuntuInstall.md](docs/UbuntuInstall.md) |
+| Deb packages | Ubuntu 14.04, 16.04 | [14.04 repo](http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1404/x86_64), [16.04 repo](http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64) | [docs/UbuntuInstall.md](docs/UbuntuInstall.md) |
 | Docker | Linux | [DockerHub tags](https://hub.docker.com/r/nvidia/digits/tags/) | [nvidia-docker wiki](https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS) |
 | Source | Ubuntu 14.04, 16.04 | [GitHub tags](https://github.com/NVIDIA/DIGITS/releases) | [docs/BuildDigits.md](docs/BuildDigits.md) |
 
diff --git a/digits/config/store_option.py b/digits/config/store_option.py
index 91c9c08c7..474152e5d 100644
--- a/digits/config/store_option.py
+++ b/digits/config/store_option.py
@@ -29,7 +29,7 @@ def load_url_list():
     if 'DIGITS_MODEL_STORE_URL' in os.environ:
         url_list = os.environ['DIGITS_MODEL_STORE_URL']
     else:
-        url_list = ""
+        url_list = "http://developer.download.nvidia.com/compute/machine-learning/modelstore/5.0"
 
     return validate(url_list).split(',')
 
diff --git a/docs/Configuration.md b/docs/Configuration.md
index 80cab5571..c57573612 100644
--- a/docs/Configuration.md
+++ b/docs/Configuration.md
@@ -15,4 +15,4 @@ DIGITS uses environment variables for configuration.
 | `DIGITS_LOGFILE_FILENAME` | ~/digits.log | File for saving log messages. Default is `$DIGITS_ROOT/digits/digits.log`. |
 | `DIGITS_LOGFILE_LEVEL` | DEBUG | Minimum log message level to be saved (DEBUG/INFO/WARNING/ERROR/CRITICAL). Default is INFO. |
 | `DIGITS_SERVER_NAME` | The Big One | The name of the server (accessible in the UI under "Info"). Default is the system hostname. |
-| `DIGITS_MODEL_STORE_URL` | http://localhost/modelstore | A list of URL's, separated by comma. |
+| `DIGITS_MODEL_STORE_URL` | http://localhost/modelstore | A list of URL's, separated by comma. Default is the official NVIDIA store. |
diff --git a/docs/UbuntuInstall.md b/docs/UbuntuInstall.md
index 6d67d10da..bce1028a2 100644
--- a/docs/UbuntuInstall.md
+++ b/docs/UbuntuInstall.md
@@ -1,6 +1,6 @@
 # Ubuntu Installation
 
-Deb packages for major releases (i.e. v3.0 and v4.0 but not v3.1) are provided for easy installation on Ubuntu 14.04.
+Deb packages for major releases (i.e. v3.0 and v4.0 but not v3.1) are provided for easy installation on Ubuntu 14.04 and 16.04.
 If these packages don't meet your needs, then you can follow [these instructions](BuildDigits.md) to build DIGITS and its dependencies from source.
 
 ## Prerequisites
@@ -9,17 +9,17 @@ You need an NVIDIA driver ([details and instructions](InstallCuda.md#driver)).
 
 Run the following commands to get access to some package repositories:
 ```sh
-# Access to CUDA packages
-CUDA_REPO_PKG=cuda-repo-ubuntu1404_7.5-18_amd64.deb
-wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/${CUDA_REPO_PKG} -O /tmp/${CUDA_REPO_PKG}
-sudo dpkg -i /tmp/${CUDA_REPO_PKG}
-rm -f /tmp/${CUDA_REPO_PKG}
-
-# Access to Machine Learning packages
-ML_REPO_PKG=nvidia-machine-learning-repo-ubuntu1404_4.0-2_amd64.deb
-wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1404/x86_64/${ML_REPO_PKG} -O /tmp/${ML_REPO_PKG}
-sudo dpkg -i /tmp/${ML_REPO_PKG}
-rm -f /tmp/${ML_REPO_PKG}
+# For Ubuntu 14.04
+CUDA_REPO_PKG=http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_8.0.61-1_amd64.deb
+ML_REPO_PKG=http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1404/x86_64/nvidia-machine-learning-repo-ubuntu1404_4.0-2_amd64.deb
+
+# For Ubuntu 16.04
+CUDA_REPO_PKG=http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
+ML_REPO_PKG=http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
+
+# Install repo packages
+wget "$CUDA_REPO_PKG" -O /tmp/cuda-repo.deb && sudo dpkg -i /tmp/cuda-repo.deb && rm -f /tmp/cuda-repo.deb
+wget "$ML_REPO_PKG" -O /tmp/ml-repo.deb && sudo dpkg -i /tmp/ml-repo.deb && rm -f /tmp/ml-repo.deb
 
 # Download new list of packages
 sudo apt-get update
@@ -46,27 +46,38 @@ Now that you're up and running, check out the [Getting Started Guide](GettingSta
 
 If you have another server running on port 80 already, you may need to reconfigure DIGITS to use a different port.
 ```sh
-% sudo dpkg-reconfigure digits
+sudo dpkg-reconfigure digits
 ```
 
-To make other configuration changes, try this (you probably want to leave most options as "unset" or "default" by hitting `ENTER` repeatedly):
-```sh
-% cd /usr/share/digits
-# set new config
-% sudo python -m digits.config.edit -v
-# restart server
-% sudo stop nvidia-digits-server
-% sudo start nvidia-digits-server
-```
+All other configuration is done with environment variables.
+See [Configuration.md](Configuration.md) for detailed information about which variables you can change.
+
+* Ubuntu 14.04
+    * Edit `/etc/init/digits.conf`
+    * Add/remove/edit lines that start with `env`
+    * Restart with `sudo service digits restart`
+
+* Ubuntu 16.04
+    * Edit `/lib/systemd/system/digits.service`
+    * Add/remove/edit lines that start with `Environment=` in the `[Service]` section
+    * Restart with `sudo systemctl daemon-reload && sudo systemctl restart digits`
 
 #### Driver installations
 
 If you try to install a new driver while the DIGITS server is running, you'll get an error about CUDA being in use.
-Shut down the server before installing a driver, and then restart it afterwards:
+Shut down the server before installing a driver, and then restart it afterwards.
+
+Ubuntu 14.04:
+```sh
+sudo service digits stop
+# (install driver)
+sudo service digits start
+```
+Ubuntu 16.04:
 ```sh
-% sudo stop nvidia-digits-server
+sudo systemctl stop digits
 # (install driver)
-% sudo start nvidia-digits-server
+sudo systemctl start digits
 ```
 
 #### Permissions
@@ -83,8 +94,21 @@ If you see this error:
 ```
 The simplest fix is to manually install the missing library:
 ```sh
-% sudo apt-get install cuda-cusparse-7-5
-% sudo ldconfig
+sudo apt-get install cuda-cusparse-7-5
+sudo ldconfig
+```
+
+#### Torch and HDF5
+
+There is at least one Torch package which is missing a required dependency on libhdf5-dev.
+If you see this error:
+```
+ERROR: /usr/share/lua/5.1/trepl/init.lua:384: /usr/share/lua/5.1/trepl/init.lua:384: /usr/share/lua/5.1/hdf5/ffi.lua:29: libhdf5.so: cannot open shared object file: No such file or directory
+```
+The simplest fix is to manually install the missing library:
+```sh
+sudo apt-get install libhdf5-dev
+sudo ldconfig
 ```
 
 #### Other
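
Review note on the `digits/config/store_option.py` change: the sketch below illustrates the lookup order the patched `load_url_list()` appears to implement, namely use `DIGITS_MODEL_STORE_URL` if it is set, otherwise fall back to the new default URL, then split on commas. The helper name `resolve_model_store_urls` and its `environ` parameter are hypothetical and exist only for illustration; the call to `validate()` is deliberately left out because its body is not part of this diff.

```python
import os

# Default URL copied from the store_option.py hunk in this diff.
DEFAULT_MODEL_STORE_URL = (
    "http://developer.download.nvidia.com/compute/machine-learning/modelstore/5.0"
)


def resolve_model_store_urls(environ=None):
    """Hypothetical stand-in for the patched load_url_list(), without validate().

    `environ` is an assumption added so the lookup can be exercised without
    touching the real process environment.
    """
    environ = os.environ if environ is None else environ
    # Presence check, as in the patch: a variable that is set but empty
    # still overrides the default.
    if "DIGITS_MODEL_STORE_URL" in environ:
        url_list = environ["DIGITS_MODEL_STORE_URL"]
    else:
        url_list = DEFAULT_MODEL_STORE_URL
    return url_list.split(",")


if __name__ == "__main__":
    # No override: the sketch falls back to the default store.
    print(resolve_model_store_urls({}))
    # Comma-separated override: each entry becomes its own list item.
    print(resolve_model_store_urls(
        {"DIGITS_MODEL_STORE_URL": "http://localhost/modelstore,http://example.com/store"}
    ))
```

In this sketch, setting the variable to an empty string yields `['']` rather than the default, because the check is on presence and not on value. Whether the real `validate()` changes that outcome cannot be determined from this diff.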