From f86ed1cbc5c6bf8671983403962d317eada2ec67 Mon Sep 17 00:00:00 2001
From: thalassemia
Date: Thu, 12 Dec 2024 01:51:33 -0800
Subject: [PATCH] Fix uv conflict with pyenv on Sherlock

---
 doc/gcloud.rst                          | 27 +++-----------------
 doc/workflows.rst                       | 34 +++++++++++--------------
 runscripts/container/build-runtime.sh   |  2 +-
 runscripts/jenkins/setup-environment.sh | 18 ++-----------
 runscripts/workflow.py                  |  1 +
 5 files changed, 23 insertions(+), 59 deletions(-)

diff --git a/doc/gcloud.rst b/doc/gcloud.rst
index 717b0d70a..2ca37d416 100644
--- a/doc/gcloud.rst
+++ b/doc/gcloud.rst
@@ -121,27 +121,8 @@ right service account and project. Next, install Git and clone the vEcoli reposi
   sudo apt update && sudo apt install git
   git clone https://github.com/CovertLab/vEcoli.git
 
-Try running ``python3 -m venv vEcoli-env`` and read the error message to find
-what version of ``venv`` you need to ``sudo apt install``. Once installed,
-run ``python3 -m venv vEcoli-env`` to create a virtual environment. Activate
-this virtual environment by running ``source vEcoli-env/bin/activate``.
-
-.. tip::
-  Instead of doing this manually every time you start your VM, you can append
-  ``source $HOME/vEcoli-env/bin/activate`` to your ``~/.bashrc``.
-
-With the virtual environment activated, navigate into the cloned vEcoli
-repository and install the required Python packages (check README.md and
-requirements.txt for correct versions)::
-
-  cd vEcoli
-  pip install --upgrade pip setuptools==73.0.1 wheel
-  pip install numpy==1.26.4
-  pip install -r requirements.txt
-  make clean compile
-
-Then, install Java (through SDKMAN) and Nextflow following
-`these instructions `_.
+Now follow the installation instructions from the README starting with
+installing ``uv`` and finishing with installing Nextflow.
 
 .. note::
   The only requirements to run :mod:`runscripts.workflow` on Google Cloud
@@ -185,7 +166,7 @@ current state of your repository, with the built images being automatically
 uploaded to the ``vecoli`` Artifact Registry repository of your project.
 
 - ``build-runtime.sh`` builds a base Docker image containing the Python packages
-necessary to run vEcoli as listed in ``requirements.txt``
+necessary to run vEcoli as listed in ``uv.lock``
 - ``build-wcm.sh`` builds on the base image created by ``build-runtime.sh`` by copying
 the files in the cloned vEcoli repository, honoring ``.gitignore``
 
@@ -201,7 +182,7 @@ keys in your configuration JSON::
     "gcloud": {
       # Name of image build-runtime.sh built/will build
       "runtime_image_name": ""
-      # Boolean, can put false if requirements.txt did not change since the last
+      # Boolean, can put false if uv.lock did not change since the last
       # time a workflow was run with this set to true
       "build_runtime_image": true,
       # Name of image build-wcm.sh built/will build
diff --git a/doc/workflows.rst b/doc/workflows.rst
index 4db8c1eb3..42c2df5f7 100644
--- a/doc/workflows.rst
+++ b/doc/workflows.rst
@@ -501,29 +501,24 @@ lines to your ``~/.bash_profile``, then close and reopen your SSH connection:
 
   # Load newer Git and Java for nextflow
   module load system git java/21.0.4
+  # Include shared nextflow installation on PATH
+  export PATH=$PATH:$GROUP_HOME/vEcoli_env
+  # Load virtual environment with PyArrow
+  source $GROUP_HOME/vEcoli_env/.venv/bin/activate
 
-  # Set PYTHONPATH to root of repo so imports work
-  export PYTHONPATH="$HOME/vEcoli"
-  # Use one thread for OpenBLAS (better performance and reproducibility)
-  export OMP_NUM_THREADS=1
-
-  # Initialize pyenv
-  export PYENV_ROOT="${GROUP_HOME}/pyenv"
-  if [ -d "${PYENV_ROOT}" ]; then
-      export PATH="${PYENV_ROOT}/bin:${PATH}"
-      eval "$(pyenv init -)"
-      eval "$(pyenv virtualenv-init -)"
-  fi
-
-Inside the cloned repository, run ``pyenv local vEcoli``. This loads a virtual
-environment with PyArrow, the only Python package required to start a workflow
-with :mod:`runscripts.workflow`. Once a workflow is started, vEcoli will build
+Once a workflow is started with :mod:`runscripts.workflow`, vEcoli will build
 an Apptainer image with all the other model dependencies using
 ``runscripts/container/build-runtime.sh``. This image will then be used to
 start containers to run the steps of the workflow. To run or interact with the
 model without using :mod:`runscripts.workflow`, start an interactive container
 by following the steps in :ref:`sherlock-interactive`.
 
+.. tip::
+  To update the version of PyArrow in the shared ``vEcoli_env`` virtual environment,
+  install ``uv`` on your Sherlock account
+  (`instructions `_),
+  navigate to ``$GROUP_HOME/vEcoli_env``, and run ``uv sync``.
+
 .. _sherlock-config:
 
 Configuration
@@ -534,7 +529,7 @@ options to your configuration JSON (note the top-level ``sherlock`` key)::
 
   {
     "sherlock": {
-      # Boolean, whether to build a fresh Apptainer runtime image. If requirements.txt
+      # Boolean, whether to build a fresh Apptainer runtime image. If uv.lock
       # did not change since your last build, you can set this to false
       "build_runtime_image": true,
       # Absolute path (including file name) of Apptainer runtime image to either
@@ -578,7 +573,8 @@ To run and develop the model on Sherlock outside a workflow, run::
   runscripts/container/interactive.sh -w runtime_image_path -a
 
 Replace ``runtime_image_path`` with the path of an Apptainer image built with
-the latest ``requirements.txt``. If you are not sure if ``requirements.txt``
+the latest ``uv.lock``, which contains the version of the Python packages that
+``uv`` will install. If you are not sure if ``uv.lock``
 changed since the last time you ran a workflow with ``build_runtime_image``
 set to true (or if you have never run a workflow), run the following to build
 a runtime image, picking any path::
@@ -613,7 +609,7 @@ to debug with interactive containers (see :ref:`sherlock-interactive`). This
 can be done using the ``-p`` argument for ``runscripts/container/interactive.sh``.
 
 If your HPC cluster does not have Apptainer installed, you can follow the
-local setup instructions in the README assuming your pyenv installation and
+local setup instructions in the README assuming your uv installation and
 virtual environments are accessible from all nodes. Then, delete the following
 lines from ``runscripts/nextflow/config.template`` and always set
 ``build_runtime_image`` to false in your config JSONs (see :ref:`sherlock-config`)::
diff --git a/runscripts/container/build-runtime.sh b/runscripts/container/build-runtime.sh
index 49f482bf3..796b667bc 100755
--- a/runscripts/container/build-runtime.sh
+++ b/runscripts/container/build-runtime.sh
@@ -1,6 +1,6 @@
 #!/bin/sh
 # Use Google Cloud Build, local Docker, or HPC cluster Apptainer to build
-# a personalized image with requirements.txt installed. If using Cloud Build,
+# a personalized image with uv.lock packages installed. If using Cloud Build,
 # store the built image in the "vecoli" repository in Artifact Registry.
 #
 # ASSUMES: The current working dir is the vEcoli/ project root.
diff --git a/runscripts/jenkins/setup-environment.sh b/runscripts/jenkins/setup-environment.sh
index 9400802fb..5402a77ad 100644
--- a/runscripts/jenkins/setup-environment.sh
+++ b/runscripts/jenkins/setup-environment.sh
@@ -3,19 +3,5 @@ set -e
 # Load newer Git and Java for nextflow
 module load system git java/21.0.4
 
-# Set PYTHONPATH to root of repo so imports work
-export PYTHONPATH=$PWD
-# Use one thread for OpenBLAS (better performance and reproducibility)
-export OMP_NUM_THREADS=1
-
-# Initialize pyenv
-export PYENV_ROOT="${GROUP_HOME}/pyenv"
-if [ -d "${PYENV_ROOT}" ]; then
-    export PATH="${PYENV_ROOT}/bin:${PATH}"
-    eval "$(pyenv init -)"
-    eval "$(pyenv virtualenv-init -)"
-fi
-
-### Edit this line to make this branch use another pyenv
-pyenv local vEcoli
-pyenv activate
+export PATH=$PATH:$GROUP_HOME/vEcoli_env
+source $GROUP_HOME/vEcoli_env/.venv/bin/activate
diff --git a/runscripts/workflow.py b/runscripts/workflow.py
index 30bb9e39d..91fb4140f 100644
--- a/runscripts/workflow.py
+++ b/runscripts/workflow.py
@@ -435,6 +435,7 @@ def main():
                 f"{repo_dir}:{repo_dir}",
                 "--cwd",
                 repo_dir,
+                "--writable-tmpfs",
                 "-e",
                 runtime_image_name,
                 "uv",