Repository containing scaffolding for a Python 3-based data science project based on the TensorFlow Federated ecosystem.
Simply follow the instructions to create a new project repository from this template.
Project organization is based on ideas from Good Enough Practices for Scientific Computing.
- Put each project in its own directory, which is named after the project.
- Put external scripts or compiled programs in the bin directory.
- Put raw data and metadata in a data directory.
- Put text documents associated with the project in the doc directory.
- Put all Docker-related files in the docker directory.
- Install the Conda environment into an env directory.
- Put all notebooks in the notebooks directory.
- Put files generated during cleanup and analysis in a results directory.
- Put project source code in the src directory.
- Name all files to reflect their content or function.
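The layout described in the list above can be sketched in a few lines of Python. This is only an illustration: the template already provides these directories, and the helper name create_project_skeleton is hypothetical. The env directory is left out on purpose because Conda creates it when the environment is installed.

```python
from pathlib import Path

# Directory names taken from the list above. "env" is omitted on purpose:
# Conda creates it when the environment is installed.
PROJECT_DIRECTORIES = ("bin", "data", "doc", "docker", "notebooks", "results", "src")


def create_project_skeleton(root):
    """Create the project directory and its standard sub-directories.

    Hypothetical helper for illustration; it is not part of the template.
    Returns the list of sub-directory paths.
    """
    root = Path(root)
    created = []
    for name in PROJECT_DIRECTORIES:
        directory = root / name
        # exist_ok=True makes the call safe to repeat on an existing project.
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created
```

For example, calling create_project_skeleton("my-project") creates the my-project directory and its sub-directories under the current working directory.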
After adding any necessary dependencies to the Conda environment.yml file, you can create the environment in a sub-directory of your project directory by running the following command.
$ conda env create --prefix ./env --file environment.yml
Once the new environment has been created you can activate the environment with the following command.
$ conda activate ./env
Note that the env directory is not under version control, as it can always be re-created from the environment.yml file when needed.
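One common way to keep the env directory out of version control, assuming the repository uses Git, is an entry in a .gitignore file. The sketch below is a hypothetical helper (ensure_ignored is not part of the template) that adds the entry only when it is missing.

```python
from pathlib import Path


def ensure_ignored(gitignore_path, entry="env/"):
    """Add `entry` to a .gitignore file unless it is already listed.

    Assumes Git is the version control system; hypothetical helper.
    Returns True if the file was modified, False otherwise.
    """
    path = Path(gitignore_path)
    text = path.read_text() if path.exists() else ""
    if entry in (line.strip() for line in text.splitlines()):
        return False
    # Start the new entry on its own line if the file lacks a trailing newline.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    path.write_text(text + separator + entry + "\n")
    return True
```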
If you wish to use any JupyterLab extensions included in the environment.yml file, then you need to activate the environment and rebuild the JupyterLab application by sourcing the postBuild script with the following commands.
$ conda activate $ENV_PREFIX # optional if environment already active
(/path/to/env) $ . postBuild
For convenience, the above steps have been encapsulated in the bin/create-conda-env.sh script, which can be run as follows.
$ ./bin/create-conda-env.sh
If you add dependencies to (or remove them from) the environment.yml file after the environment has already been created, then you can update the environment with the following command.
$ conda env create --prefix ./env --file environment.yml --force
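If you prefer to drive these steps from Python, the create and update commands above can be wrapped as in the following minimal sketch. The function names are hypothetical, and the flags are simply the ones shown in this README; run conda env create --help to confirm that they are supported by your version of Conda.

```python
import subprocess


def conda_env_create_command(prefix="./env", env_file="environment.yml", force=False):
    """Build the argument list for the `conda env create` commands shown above.

    Hypothetical helper; the defaults mirror the paths used in this README.
    """
    command = ["conda", "env", "create", "--prefix", prefix, "--file", env_file]
    if force:
        # Same flag as the update command above.
        command.append("--force")
    return command


def rebuild_environment(prefix="./env", env_file="environment.yml"):
    """Re-create the environment in place; requires conda on the PATH."""
    # check=True raises CalledProcessError if conda exits with a non-zero status.
    subprocess.run(conda_env_create_command(prefix, env_file, force=True), check=True)
```

Building the argument list separately from running it keeps the command easy to inspect or log before anything is executed.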
If you add any additional JupyterLab extensions, then the easiest way to update the environment is to re-run the bin/create-conda-env.sh script to ensure that JupyterLab is re-built with the new extensions.
To list all of the packages installed in the environment run the following command.
$ conda list --prefix ./env
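The same listing can be read from Python by asking Conda for JSON output. The sketch below assumes that conda list --json prints a JSON array of objects with "name" and "version" keys; the function names are hypothetical and not part of the template.

```python
import json
import subprocess


def parse_conda_list(json_text):
    """Map package name to version from `conda list --json` output."""
    return {package["name"]: package["version"] for package in json.loads(json_text)}


def installed_packages(prefix="./env"):
    """Run `conda list --prefix <prefix> --json`; requires conda on the PATH."""
    result = subprocess.run(
        ["conda", "list", "--prefix", prefix, "--json"],
        check=True,  # raise if conda exits with a non-zero status
        capture_output=True,
        text=True,
    )
    return parse_conda_list(result.stdout)
```

For example, "tensorflow-federated" in installed_packages() would check whether that package is present in the environment, assuming it is listed under that name.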
In order to build Docker images for your project and run containers you will need to install Docker and Docker Compose.
Detailed instructions for using Docker to build an image and launch containers can be found in the docker/README.md.