Merge pull request #32 from OpenFreeEnergy/docs
[WIP] Docs
dwhswenson authored Sep 28, 2023
2 parents c391a8e + f5862a2 commit 747be2e
Showing 9 changed files with 227 additions and 1 deletion.
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
9 changes: 9 additions & 0 deletions docs/api/index.rst
@@ -0,0 +1,9 @@
.. _api:

Exorcist API Reference
======================

.. toctree::
   :maxdepth: 2

   taskstatusdb
10 changes: 10 additions & 0 deletions docs/api/taskstatusdb.rst
@@ -0,0 +1,10 @@
Task Status Database
====================

.. module:: exorcist.taskdb

.. autosummary::
   :toctree: generated/

   AbstractTaskStatusDB
   TaskStatusDB
29 changes: 29 additions & 0 deletions docs/conf.py
@@ -0,0 +1,29 @@
# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

project = 'Exorcist'
copyright = '2023, Open Free Energy'
author = 'Open Free Energy'

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.autosummary",
    "sphinx.ext.napoleon",
]

templates_path = ['_templates']
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']

# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = 'alabaster'
html_static_path = ['_static']
17 changes: 17 additions & 0 deletions docs/guide/index.rst
@@ -0,0 +1,17 @@
User Guide
==========

This guide introduces developers to the ideas, architecture, and
terminology used by Exorcist. We use the term "client" to refer to
developers who use Exorcist in their own projects, to distinguish them
from the "users" of the client application.

Exorcist is designed to be directly used by client code developers. It is
not intended for direct usage by end users, although we provide several
conveniences that client developers can use in their code to improve the end
user experience.


.. toctree::

   intro
73 changes: 73 additions & 0 deletions docs/guide/intro.rst
@@ -0,0 +1,73 @@
Introduction to Exorcist
========================

Tasks form a directed acyclic graph (DAG)
-----------------------------------------

Any large-scale campaign can be described as a directed acyclic graph of
tasks, where edges represent the flow of information from an earlier stage
to a later stage. This graph is directed because the outputs of one task can
be the inputs of a future task, and it is acyclic because it is not possible
for a task to require its own outputs as an input.

Task-based frameworks, like Exorcist, implement efficient methods for
executing these task graphs.
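
As a rough illustration of the idea (this is not Exorcist's API), a small
campaign can be written as a mapping from each task to the tasks whose
outputs it needs. The task names and the helper functions below are
hypothetical; the ordering uses ``graphlib`` from the Python standard library.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical campaign: each key is a task, each value is the set of
# tasks whose outputs it requires as inputs.
dependencies = {
    "setup": set(),
    "simulate_a": {"setup"},
    "simulate_b": {"setup"},
    "analyze": {"simulate_a", "simulate_b"},
}


def execution_batches(deps):
    """Group tasks into batches; tasks in the same batch are independent."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the graph contains a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose inputs all exist
        batches.append(ready)
        sorter.done(*ready)  # mark as finished, unblocking later tasks
    return batches


def is_acyclic(deps):
    """Return True if the dependency mapping describes a valid DAG."""
    try:
        execution_batches(deps)
    except CycleError:
        return False
    return True


batches = execution_batches(dependencies)
```

Tasks within one batch have no dependencies on each other, so they are the
ones that independent workers could run at the same time.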

Three databases in Exorcist
---------------------------

The central idea of Exorcist is to separate three types of data storage:

* **Task Status Database**: The task status database is the core of
  Exorcist, and is fully implemented in Exorcist. This database contains the
  task identifiers, and information on the execution status of those tasks.
  It is intentionally small and simple, to enable better concurrency.
* **Task Details Database**: The task details database is a key-value store
  that describes the tasks to be performed; this is specific to the
  client application. The client must define how these tasks are serialized
  into and deserialized from the database.
* **Results Store**: The results store is a generic storage of result data
  from the specific application. The only thing Exorcist needs to know is
  whether the received result was a success or a failure (in which case,
  Exorcist can take responsibility for ensuring that it gets retried).

Some aspects that distinguish Exorcist from similar tools are the separation
of task status from task descriptions, and the ability for the client to
customize how task descriptions or task status are stored.
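
The separation described above can be sketched with three independent
stores. This is a minimal illustration under stated assumptions, not
Exorcist's implementation: the table layout, the status strings, and the
function names are all hypothetical, and plain dictionaries stand in for the
client-specific stores.

```python
import json
import sqlite3

# Task status database: identifiers and execution status only, so it
# stays small. (Hypothetical schema, for illustration.)
status_db = sqlite3.connect(":memory:")
status_db.execute(
    "CREATE TABLE tasks (taskid TEXT PRIMARY KEY, status TEXT NOT NULL)"
)

# Task details database: a key-value store. The client chooses the
# serialization; JSON is used here as an assumption.
task_details = {}

# Results store: generic storage for whatever the application produces.
results_store = {}


def add_task(taskid, details):
    """Register a task: details in one store, status in another."""
    task_details[taskid] = json.dumps(details)
    with status_db:  # commits the transaction on success
        status_db.execute(
            "INSERT INTO tasks VALUES (?, 'available')", (taskid,)
        )


def record_result(taskid, result, success):
    """Store a result; only success/failure reaches the status database."""
    results_store[taskid] = result
    new_status = "completed" if success else "available"  # failure: retry
    with status_db:
        status_db.execute(
            "UPDATE tasks SET status = ? WHERE taskid = ?",
            (new_status, taskid),
        )


def status_of(taskid):
    """Look up the current status of a task."""
    row = status_db.execute(
        "SELECT status FROM tasks WHERE taskid = ?", (taskid,)
    ).fetchone()
    return row[0]
```

In this sketch a failed task is simply made available again, which mirrors
the retry responsibility described above without limiting the number of
attempts; a real retry policy would need such a limit.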

Practical usage for users
-------------------------

When thinking about the end user's experience, Exorcist tends to assume that
this occurs in two stages: planning the initial campaign, and then running
the campaign. The general assumption is that these will be two different
software tools (typically two executables) with different user experiences.

Preparing the campaign: the planner
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For many use cases, planning the campaign happens on the user's workstation
or on the head node of a cluster. Although this stage can, in principle,
require a compute node, it is more frequently something that is done
interactively by the user.

The "campaign planner" tool must have write access to the *task status
database* and to the *task details database*.


Running a campaign: the worker
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The second tool that the user interacts with is the worker. Typically, the
user will launch many identical workers through jobs submitted to their
queueing system.

The worker is the workhorse of the campaign. It does the real computational
effort. It needs read/write access to the *task status database*, read access
to the *task details database*, and write access to the *results store*.

In most cases, it will also have read access to the results store, which
allows things like analysis as a separate task. If it also has write
access to the task details database, this allows on-the-fly creation of
new tasks, e.g., focusing simulation effort in a different direction based
on the results of the campaign so far.
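
The access pattern of the two tools can be sketched as follows. This is a
single-process illustration under stated assumptions, not Exorcist's API:
``plan_campaign``, ``run_worker``, the status strings, and the use of plain
dictionaries for the three stores are all hypothetical. Retries and the
atomic task checkout needed by concurrent workers are omitted for brevity.

```python
import json


def plan_campaign(status, details, tasks):
    """Planner: writes to the task status and task details stores."""
    for taskid, task in tasks.items():
        details[taskid] = json.dumps(task)  # write: task details
        status[taskid] = "available"        # write: task status


def run_worker(status, details, results, run_task):
    """Worker: process available tasks until none remain."""
    while True:
        # read: task status (find any task that is ready to run)
        taskid = next(
            (t for t, s in status.items() if s == "available"), None
        )
        if taskid is None:
            break
        status[taskid] = "in_progress"        # write: task status
        task = json.loads(details[taskid])    # read: task details
        try:
            results[taskid] = run_task(task)  # write: results store
            status[taskid] = "completed"
        except Exception as err:
            # Record the failure; a real system could schedule a retry.
            results[taskid] = {"error": str(err)}
            status[taskid] = "failed"


def square_root(task):
    """Stand-in for the client's real computation."""
    if task["x"] < 0:
        raise ValueError("negative input")
    return {"y": task["x"] ** 0.5}


status, details, results = {}, {}, {}
plan_campaign(status, details, {"a": {"x": 4}, "b": {"x": -1}})
run_worker(status, details, results, square_root)
```

Note that the planner never touches the results store, and the worker only
reads the task details, matching the access requirements listed above.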
33 changes: 33 additions & 0 deletions docs/index.rst
@@ -0,0 +1,33 @@
Exorcist
========

*Task execution/orchestration, without the daemons*

Exorcist is a tool for execution and orchestration of many-task
computing/high-throughput computing. It is specifically designed for a
common case in the world of simulation, where a large simulation campaign
may include many loosely coupled individual simulations, each of which may
require hours to days to run, and the results need to be gathered into a
large storage backend.

At small to moderate scale, Exorcist can run without setting up any
long-running daemon. At larger scales, Exorcist can interface with standard
database backends, e.g. PostgreSQL. In this way, Exorcist offers the easy
set-up procedure of a daemonless solution, while offering a smooth
transition to a highly scalable solution when needed.

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   guide/index
   api/index



Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
35 changes: 35 additions & 0 deletions docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=.
set BUILDDIR=_build

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.https://www.sphinx-doc.org/
exit /b 1
)

if "%1" == "" goto help

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
2 changes: 1 addition & 1 deletion exorcist/taskdb.py
@@ -136,7 +136,7 @@ class TaskStatusDB(AbstractTaskStatusDB):
"""Database for managing execution and orchestration of tasks.
This implementation is built on SQLAlchemy. For simple usage, the
- recommendation is to use the :method:`.from_filename` method to create
+ recommendation is to use the :meth:`.from_filename` method to create
this object, rather than its ``__init__``. The ``__init__`` method takes
a SQLAlchemy engine, which provides much more flexibility in choice of
backend.