forked from argonne-lcf/user-guides
Merge pull request argonne-lcf#317 from saforem2/main
Update `docs/polaris/data-science-workflows/python.md`
Showing 1 changed file with 79 additions and 56 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,88 +1,111 @@ | ||
# Python | ||
|
||
## Conda | ||
We provide prebuilt `conda` environments containing GPU-supported builds of `torch`, `tensorflow` (both with `horovod` support for multi-node calculations), `jax`, and many other commonly-used Python modules. | ||
We provide prebuilt `conda` environments containing GPU-supported builds of | ||
`torch`, `tensorflow` (both with `horovod` support for multi-node | ||
calculations), `jax`, and many other commonly-used Python modules. | ||
|
||
Users can activate this environment by first loading the `conda` module, and then activating the base environment. | ||
Users can activate this environment by first loading the `conda` module, and | ||
then activating the base environment. | ||
|
||
Explicitly (either from an interactive job, or inside a job script): | ||
|
||
```bash | ||
$ module load conda | ||
$ conda activate base | ||
(base) $ which python3 | ||
/soft/datascience/conda/2022-09-08/mconda3/bin/python3 | ||
module load conda ; conda activate base | ||
``` | ||
In one line, `module load conda; conda activate`. This can be performed on a compute node, as well as a login node. | ||
|
||
As of writing, the latest `conda` module on Polaris is built on Miniconda3 version 4.14.0 and contains Python 3.8.13. Future modules may contain entirely different major versions of Python, PyTorch, TensorFlow, etc.; however, the existing modules will be maintained as-is as long as feasible. | ||
This will load and activate the base environment. | ||
|
||
While the shared Anaconda environment encapsulated in the module contains many of the most commonly used Python libraries for our users, you may still encounter a scenario in which you need to extend the functionality of the environment (i.e. install additional packages) | ||
## Virtual environments via `venv` | ||
|
||
There are two different approaches that are currently recommended. | ||
To install additional packages that are missing from the `base` environment, | ||
we can build a `venv` on top of it. | ||
|
||
### Virtual environments via `venv` | ||
!!! success "Conda `base` environment + `venv`" | ||
|
||
Creating your own (empty) virtual Python environment in a directory that is writable to you is simple: | ||
```bash | ||
python3 -m venv /path/to/new/virtual/environment | ||
``` | ||
This creates a fairly lightweight folder (<20 MB) with its own Python interpreter where you can install whatever packages you'd like. First, you must activate the virtual environment to make this Python interpreter the default interpreter in your shell session. | |
If you need a package that is **not** already | ||
installed in the `base` environment, | ||
this is generally the recommended approach. | ||
|
||
You activate the new environment whenever you want to start using it via running the activate script in that folder: | ||
```bash | ||
source /path/to/new/virtual/environment/bin/activate | |
``` | ||
We can create a `venv` on top of the base | ||
Anaconda environment (with | ||
`#!bash --system-site-packages` to inherit | |
the `base` packages): | |
|
||
```bash | ||
module load conda; conda activate | ||
VENV_DIR="venvs/polaris" | ||
mkdir -p "${VENV_DIR}" | ||
python -m venv "${VENV_DIR}" --system-site-packages | ||
source "${VENV_DIR}/bin/activate" | ||
``` | ||
|
||
In many cases, you do not want an empty virtual environment, but instead want to start from the `conda` base environment's installed packages, only adding and/or changing a few modules. | ||
You can always retroactively change the `#!bash --system-site-packages` flag | ||
state for this virtual environment by editing `#!bash ${VENV_DIR}/pyvenv.cfg` and | ||
changing the value of the line `#!bash include-system-site-packages = false`. | |
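
As a minimal sketch of the `pyvenv.cfg` edit described above, the helper below rewrites the `include-system-site-packages` line in place. The function name `set_system_site_packages` is hypothetical (it is not part of `venv` or `conda`), and the sketch assumes the key appears once, in the `key = value` form that `python -m venv` writes.

```bash
# Hypothetical helper (not part of venv or conda): set the
# include-system-site-packages flag of an existing virtual environment.
#   $1: path to the venv directory, e.g. "${VENV_DIR}"
#   $2: new value, "true" or "false"
set_system_site_packages() {
    cfg="$1/pyvenv.cfg"
    value="$2"
    # Rewrite only the matching line; write to a temporary file first so the
    # original is left untouched if sed fails.
    sed -E "s/^(include-system-site-packages[[:space:]]*=[[:space:]]*).*/\1${value}/" \
        "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"
}
```

For example, `set_system_site_packages "${VENV_DIR}" true` would re-enable inheritance of the `base` packages. The setting is read when the environment's Python interpreter starts, so already-running sessions are not affected.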
|
||
To extend the base Anaconda environment with `venv` (e.g. `my_env` in the current directory) and inherit the base environment packages, one can use the `--system-site-packages` flag: | |
To install a different version of a package that is already installed in the | ||
base environment, you can use: | ||
|
||
```bash | ||
module load conda; conda activate | ||
python -m venv --system-site-packages my_env | ||
source my_env/bin/activate | ||
# Install additional packages here... | ||
python3 -m pip install --ignore-installed <package> # or -I | |
``` | ||
You can always retroactively change the `--system-site-packages` flag state for this virtual environment by editing `my_env/pyvenv.cfg` and changing the value of the line `include-system-site-packages = false`. | ||
|
||
To install a different version of a package that is already installed in the base | ||
environment, you can use: | ||
``` | ||
pip install --ignore-installed ... # or -I | ||
``` | ||
The shared base environment is not writable, so it is impossible to remove or uninstall | ||
packages from it. The packages installed with the above `pip` command should shadow those | ||
installed in the base environment. | ||
The shared base environment is not writable, so it is impossible to remove or | ||
uninstall packages from it. The packages installed with the above `pip` command | ||
should shadow those installed in the base environment. | ||
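
To check which copy of a package Python actually imports after such an install, one option is to print the module's file location. The sketch below is an illustration rather than an official tool: `module_origin` is a hypothetical helper name, and it relies only on the standard-library function `importlib.util.find_spec`. It is intended for top-level module names.

```bash
# Hypothetical helper: print where Python would import a top-level module from.
#   $1: importable module name, e.g. numpy
module_origin() {
    python3 -c '
import importlib.util
import sys

spec = importlib.util.find_spec(sys.argv[1])
print(spec.origin if spec is not None else "not found")
' "$1"
}
```

With the `venv` active, a path under `${VENV_DIR}` suggests that the newly installed copy is the one in use, while a path under the `conda` installation means the `base` copy is still being picked up.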
|
||
## Cloning the base Anaconda environment | ||
|
||
!!! warning | ||
|
||
### Cloning the base Anaconda environment | ||
This approach is generally not recommended as it can be quite slow and can | ||
use significant storage space. | ||
|
||
If you need more flexibility, you can clone the conda environment into a custom path, which would then allow for root-like installations via `conda install <module>` or `pip install <module>`. Unlike the `venv` approach, using a cloned Anaconda environment requires you to copy the entirety of the base environment, which can use significant storage space. | ||
If you need more flexibility, you can clone the conda environment into a custom | ||
path, which would then allow for root-like installations via `#!bash conda install | ||
<module>` or `#!bash pip install <module>`. | ||
|
||
This can be performed by: | ||
Unlike the `venv` approach, using a cloned Anaconda environment requires you to | ||
copy the entirety of the base environment, which can use significant storage | ||
space. | ||
|
||
To clone the `base` environment: | ||
|
||
```bash | ||
$ module load conda | ||
$ conda activate base | ||
(base) $ conda create --clone base --prefix /path/to/envs/base-clone | ||
(base) $ conda activate /path/to/envs/base-clone | ||
(base-clone) $ which python3 | ||
/path/to/envs/base-clone/bin/python3 | |
module load conda ; conda activate base | ||
conda create --clone base --prefix /path/to/envs/base-clone | ||
conda activate /path/to/envs/base-clone | ||
``` | ||
The cloning process can be quite slow. | ||
|
||
!!! warning | ||
where `#!bash /path/to/envs/base-clone` should be replaced by a suitably chosen | |
path. | |
|
||
In the above commands, `path/to/envs/base-clone` should be replaced by a | ||
suitably chosen path. | ||
**Note**: The cloning process can be _quite_ slow. | ||
|
||
### Using `pip install --user` (not recommended) | ||
With the conda environment set up, one can install common Python modules using `pip install --user <module-name>` which will install packages in `$PYTHONUSERBASE/lib/pythonX.Y/site-packages`. The `$PYTHONUSERBASE` environment variable is automatically set when you load the base conda module, and is equal to `/home/$USER/.local/polaris/conda/YYYY-MM-DD`. | |
## Using `pip install --user` (not recommended) | ||
|
||
Note, Python modules installed this way that contain command line binaries will not have those binaries automatically added to the shell's `$PATH`. To manually add the path: | ||
``` | ||
export PATH=$PYTHONUSERBASE/bin:$PATH | ||
!!! danger | ||
|
||
This is typically _not_ recommended. | ||
|
||
With the conda environment set up, one can install common Python modules using | |
`#!bash python3 -m pip install --user '<module-name>'` which will install | |
packages in `#!bash $PYTHONUSERBASE/lib/pythonX.Y/site-packages`. | |
|
||
The `#!bash $PYTHONUSERBASE` environment variable is automatically set when you | ||
load the base conda module, and is equal to `#!bash | ||
/home/$USER/.local/polaris/conda/YYYY-MM-DD`. | ||
|
||
Note, Python modules installed this way that contain command line binaries will | ||
not have those binaries automatically added to the shell's `#!bash $PATH`. To | ||
manually add the path: | ||
|
||
```bash | ||
export PATH="$PYTHONUSERBASE/bin:$PATH" | ||
``` | ||
Be sure to remove this location from `$PATH` if you deactivate the base Anaconda environment or unload the module. | ||
|
||
Cloning the Anaconda environment, or using `venv` are both more flexible and transparent when compared to `--user` installs. | ||
Be sure to remove this location from `#!bash $PATH` if you deactivate the base | ||
Anaconda environment or unload the module. | ||
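
One way to undo the `export` above is sketched here, under the assumption that the entry was added exactly as `$PYTHONUSERBASE/bin`. The helper name `path_without` is hypothetical; it only uses the standard `tr`, `grep`, and `paste` utilities.

```bash
# Hypothetical helper: print a PATH-style string with every exact occurrence
# of one directory removed.
#   $1: directory to drop
#   $2: colon-separated path string
path_without() {
    printf '%s' "$2" | tr ':' '\n' | grep -vxF -- "$1" | paste -s -d ':' -
}

# Only touch PATH when PYTHONUSERBASE is actually set; otherwise the
# directory to drop would expand to "/bin".
if [ -n "${PYTHONUSERBASE:-}" ]; then
    PATH="$(path_without "${PYTHONUSERBASE}/bin" "$PATH")"
    export PATH
fi
```

Because `$PYTHONUSERBASE` is set by the module, it is safest to run this before unloading the module, while the variable still holds the right value.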
|
||
Cloning the Anaconda environment and using `venv` are both more flexible and | |
transparent than `#!bash --user` installs. |