cooperative-computing-lab · lucaspar · Dec 20, 2024
diff --git a/README.md b/README.md
@@ -5,6 +5,7 @@ Shepherd is designed to manage local workflows that involve both actions that ru
 Shepherd is particularly useful for applications that require persistent services as part of a traditional task-based workflow. For example, consider a scenario where a web server should start only after the database service has completed its initialization and is ready to handle queries. The database does not perform a single action that completes; instead, it reaches an internal state that should dynamically trigger the launch of the web server. Shepherd can monitor the database service's logs for a message indicating successful startup. Upon detecting this state, Shepherd triggers the initiation of the web server, ensuring efficient workflow execution. Moreover, Shepherd wraps the entire service workflow into a single task that terminates upon completion, making it easy to integrate into larger task-based workflows.
 
 ## Key Features
+
 - **Services as Tasks:** Shepherd treats persistent services as first-class tasks within a workflow, enabling them to be seamlessly integrated into traditional task-based workflow managers.
 
 - **Dependency Management:** Shepherd allows tasks (both actions and services) to start based on the internal states of other services or actions, enabling complex state-based dependencies. It supports `any` and `all` dependency modes, allowing for flexible dependency configurations.
@@ -17,37 +18,42 @@ Shepherd is particularly useful for applications that require persistent service
 
 - **Integration with Larger Workflows:** By encapsulating service workflows into single tasks, Shepherd enables easy integration with larger distributed workflow managers like Makeflow, enhancing workflow flexibility and reliability.
 
-
 ## Program State Transition Overview
 
 The Shepherd tool manages program execution through a series of defined states, ensuring dependencies are met and states are recorded. Every program has default states (`Initialized`, `Started`, and `Final`) and can have optional user-defined states. Programs transition from `Initialized` to `Started` once dependencies are satisfied, then move through user-defined states. Actions return `Action Success` on a zero return code and `Action Failure` otherwise, while services transition to `Service Failure` if they stop unexpectedly. Any program receiving a stop signal is marked as `Stopped`, and all programs ultimately transition to a `Final` state, reflecting their execution outcome.
 
 ![Test](diagram/dot/shepherd-state-machine.svg)
 
-## Installation
+## Quick start
 
-To install Shepherd, clone the repository and install using pip:
+### Using `uv` (easiest and fastest)
 
 ```bash
 git clone https://github.com/cooperative-computing-lab/shepherd.git
-cd shepherd
-pip install .
+uv sync --dev
+uv run examples/example3/run_test.sh
 ```
 
-Optionally, create a virtual environment before installing Shepherd to avoid conflicts with other Python packages:
+### Using pip
 
 ```bash
-python3 -m venv venv
+git clone https://github.com/cooperative-computing-lab/shepherd.git
+cd shepherd
+python --version # must be >=3.10
+# use uv or pyenv to install a more recent python version if needed
+
+python -m venv venv
 source venv/bin/activate
 pip install .
+examples/example3/run_test.sh
 ```
 
-
 ## Getting Started with Shepherd: A Hello World Example
-Shepherd simplifies complex application workflows. Here’s a simple example to demonstrate how to use Shepherd for 
+
+Shepherd simplifies complex application workflows. Here’s a simple example to demonstrate how to use Shepherd for
 scheduling dependent programs. In this example, we have two shell scripts: `program1.sh` and `program2.sh`.  `program2` should start only after `program1` has successfully completed its execution.
 
-#### 1. Create Sample Scripts
+### 1. Create Sample Scripts
 
 Create two shell scripts named `program1.sh` and `program2.sh` with the following content:
 
@@ -60,49 +66,56 @@ echo "Program completed"
 ```
 
 Make sure to make the scripts executable:
-
-  ```shell 
-  chmod +x program1.sh program2.sh
-  ```
 
-#### 2. Create a Shepherd Configuration File
-Create a Shepherd configuration file named `shepherd-config.yml` with the following content:
+```shell
+chmod +x program1.sh program2.sh
+```
+
+### 2. Create a Shepherd Configuration File
+
+Create a Shepherd configuration file named `shepherd-config.yaml` with the following content:
 
 ```yaml
-tasks:
-  program1:
-    command: "./program1.sh"
-  program2:
-    command: "./program2.sh"
-    dependency:
-      items:
-        program1: "action_success"  # Start program2 only after program1 succeeds
+services:
+    program1:
+        command: "./program1.sh"
+    program2:
+        command: "./program2.sh"
+        dependency:
+            items:
+                program1: "action_success"  # Start program2 only after program1 succeeds
 output:
-  state_times: "state_transition_times.json"
+    state_times: "state_transition_times.json"
 max_run_time: 60  # Optional: Limit total runtime to 60 seconds
 ```
 
-#### 3. Run Shepherd
+### 3. Run Shepherd
+
 Run Shepherd with the configuration file:
+
 ```shell
-shepherd -c shepherd-config.yml
+shepherd -c shepherd-config.yaml
 ```
 
 If you are running the python source, then run
+
 ```shell
-python3 shepherd.py -c shepherd-config.yml
+python3 run_shepherd.py -c shepherd-config.yaml
 ```
 
 If you are running the shepherd executable, then run
+
 ```shell
-shepherd -c shepherd-config.yml
+shepherd -c shepherd-config.yaml
 ```
 
-#### Understanding the workflow
+### Understanding the workflow
+
 With this simple configuration, Shepherd will:
+
 1. Execute `program1.sh`.
 2. Monitor the internal states of the program.
-3. Start `program2.sh` only after `program1.sh `succeeds.
+3. Start `program2.sh` only after `program1.sh` succeeds.
 4. Create `state_transition_times.json`, which will look similar to this:
 
 ```json
@@ -124,9 +137,11 @@ With this simple configuration, Shepherd will:
 ```
 
 ## Monitoring User-Defined States in Shepherd
+
 Shepherd can monitor standard output (stdout) or any other file to detect user-defined states. These states can then be used as dependencies for other programs. This feature allows you to define complex workflows based on custom application states.
 
-#### Example Scenario: Dynamic Dependencies
+### Example Scenario: Dynamic Dependencies
+
 Suppose you have a service that becomes 'ready' after some initialization, and other tasks depend on it being ready.
 
 Example Service script (`service.sh`):
@@ -141,6 +156,7 @@ tail -f /dev/null  # Keep the service running
 ```
 
 Action script (`action.sh`):
+
 ```bash
 #!/bin/bash
 
@@ -150,16 +166,17 @@ echo "Action completed"
 ```
 
 Make sure to make the scripts executable:
-  
-  ```shell 
+
+  ```shell
   chmod +x service.sh action.sh
   ```
 
-#### Shepherd Configuration with user-defined states
+### Shepherd Configuration with user-defined states
+
 Below is a Shepherd configuration file that monitors the standard output of the service script to detect the 'ready' state. The action script starts only after the service is ready.
 
 ```yaml
-tasks:
+services:
   my_service:
     type: "service"
     command: "./service.sh"
@@ -177,7 +194,8 @@ output:
 max_run_time: 60
 ```
 
-#### How This Configuration Works
+### How This Configuration Works
+
 1. Shepherd starts the service script `service.sh`.
 2. Shepherd monitors the standard output of the service script for the message "Service is ready".
 3. Once the service is ready, Shepherd starts the action script `action.sh`.
@@ -202,9 +220,11 @@ max_run_time: 60
 ```
 
 ## Configuration Options
+
 Shepherd uses a YAML configuration file to define the workflow. Here are some key configuration options:
 
 ### Defining Tasks
+
 Tasks are defined under the tasks section. Each task can be an action or a service:
 
 - **Action:** A task that runs to completion and exits.
@@ -213,7 +233,7 @@ Tasks are defined under the tasks section. Each task can be an action or a servi
 The default type is action. Here is an example configuration with an action and a service:
 
 ```yaml
-tasks:
+services:
   my_action:
     type: "action"
     command: "python process_data.py"
@@ -223,12 +243,13 @@ tasks:
 ```
 
 ### Dependencies
+
 Dependencies specify when a task should start, based on the states of other tasks.
 
 - **Mode:** Specifies whether all dependencies must be met (`all`, the default) or any one (`any`).
 
 ```yaml
-tasks:
+services:
   task2:
     type: "action"
     command: "./task2.sh"
@@ -240,11 +261,13 @@ tasks:
 ```
 
 ### Monitoring User-Defined States
+
 Shepherd can monitor standard output or files to detect user-defined states. This allows you to control the workflow based on custom application states.
 
 Example of monitoring standard output:
+
 ```yaml
-tasks:
+services:
   my_program:
     command: "./my_program.sh"
     state:
@@ -256,7 +279,7 @@ tasks:
 Example of monitoring a file:
 
 ```yaml
-tasks:
+services:
   my_task:
     type: "action"
     command: "./my_task.sh"
@@ -268,6 +291,7 @@ tasks:
 ```
 
 ### Output Options
+
 Shepherd can generate output files containing state transition times and other logs. You can specify the output file paths in the configuration:
 
 ```yaml
@@ -278,6 +302,7 @@ output:
 ```
 
 ### Shutdown Conditions
+
 Shepherd can be configured to stop all tasks based on specific conditions, such as a stop signal, maximum runtime, or success criteria:
 
 - **Stop Signal:** A file that, when created, triggers a controlled shutdown.

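The state transition overview in the README diff above can be read as a small decision rule. The sketch below is one hedged reading of that prose in Python, not Shepherd's actual code. The function name `final_outcome` and its parameters are hypothetical, and of the state strings only `action_success` appears in the example configs; the other snake_case names are assumed by analogy.

```python
from typing import Optional


def final_outcome(kind: str, return_code: Optional[int], stop_requested: bool) -> str:
    """Map a finished program to an outcome state, following the README prose.

    Hypothetical helper for illustration; state strings other than
    "action_success" are assumptions, not names confirmed by the source.
    """
    if stop_requested:
        # Any program that receives a stop signal is marked as stopped.
        return "stopped"
    if kind == "action":
        # Actions succeed on a zero return code and fail otherwise.
        return "action_success" if return_code == 0 else "action_failure"
    if kind == "service":
        # A service that exits without a stop signal stopped unexpectedly.
        return "service_failure"
    raise ValueError(f"unknown program type: {kind!r}")
```

Checking the stop request first reflects the README wording that any program receiving a stop signal is marked `Stopped`, regardless of its type.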
diff --git a/examples/.gitignore b/examples/.gitignore
@@ -0,0 +1,3 @@
+*.log
+**/outputs/**
+state_transition_times.json
diff --git a/examples/example1/program1.sh b/examples/example1/program1.sh
@@ -2,4 +2,4 @@
 
 echo "Starting program..."
 sleep 5
-echo "Program completed"
+echo "Program completed"
diff --git a/examples/example1/program2.sh b/examples/example1/program2.sh
@@ -2,4 +2,4 @@
 
 echo "Starting program..."
 sleep 5
-echo "Program completed"
+echo "Program completed"
diff --git a/examples/example1/shepherd-config.yaml b/examples/example1/shepherd-config.yaml
@@ -0,0 +1,11 @@
+services:
+    program1:
+        command: "./program1.sh"
+    program2:
+        command: "./program2.sh"
+        dependency:
+            items:
+                program1: "action_success" # Start program2 only after program1 succeeds
+output:
+    state_times: "state_transition_times.json"
+max_run_time: 60 # Optional: Limit total runtime to 60 seconds
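The `dependency.items` mapping in this config, together with the `all` and `any` modes described in the README, suggests a simple readiness check. The following is a minimal sketch under stated assumptions, not Shepherd's implementation: `dependencies_met` and the `reached` bookkeeping structure are hypothetical names, and the lowercase `started` state string is assumed.

```python
def dependencies_met(
    items: dict[str, str],
    reached: dict[str, set[str]],
    mode: str = "all",
) -> bool:
    """Return True when a program's dependencies allow it to start.

    items   maps a program name to the state it must have reached,
            mirroring `dependency.items` in the YAML config.
    reached maps a program name to the set of states observed so far
            (hypothetical bookkeeping, assumed for this sketch).
    """
    if not items:
        # A program without dependencies can start immediately.
        return True
    checks = [state in reached.get(name, set()) for name, state in items.items()]
    if mode == "all":
        return all(checks)
    if mode == "any":
        return any(checks)
    raise ValueError(f"unknown dependency mode: {mode!r}")


# Usage with the example1 config: program2 waits for program1 to succeed.
items = {"program1": "action_success"}
reached = {"program1": {"started"}}
ready_before = dependencies_met(items, reached)
reached["program1"].add("action_success")
ready_after = dependencies_met(items, reached)
```

In this usage, `ready_before` is `False` and `ready_after` is `True`, which matches the comment in the config that `program2` starts only after `program1` succeeds.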
diff --git a/examples/example1/shepherd-config.yml b/examples/example1/shepherd-config.yml
diff --git a/examples/example2/action.sh b/examples/example2/action.sh
@@ -2,4 +2,4 @@
 
 echo "Action is running..."
 sleep 5
-echo "Action completed"
+echo "Action completed"
diff --git a/examples/example2/service.sh b/examples/example2/service.sh
@@ -3,4 +3,4 @@
 echo "Service is starting..."
 sleep 5
 echo "Service is ready"
-tail -f /dev/null  # Keep the service running
+tail -f /dev/null # Keep the service running
diff --git a/examples/example2/shepherd-config.yaml b/examples/example2/shepherd-config.yaml
@@ -0,0 +1,16 @@
+services:
+    my_service:
+        type: "service"
+        command: "./service.sh"
+        state:
+            log:
+                ready: "Service is ready"
+    my_action:
+        type: "action"
+        command: "./action.sh"
+        dependency:
+            items:
+                my_service: "ready"
+output:
+    state_times: "state_transition_times.json"
+max_run_time: 60
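The `state.log` block in this config pairs a user-defined state (`ready`) with the text that signals it (`Service is ready`). A rough sketch of how such detection could work is shown below. It assumes plain substring matching on each output line, which may differ from what Shepherd actually does, and `detect_states` is a hypothetical name.

```python
from typing import Iterable, Iterator


def detect_states(lines: Iterable[str], patterns: dict[str, str]) -> Iterator[str]:
    """Yield each user-defined state the first time its text appears in a line.

    patterns mirrors the `state.log` mapping: state name -> text to look for.
    Substring matching is an assumption made for this sketch.
    """
    pending = dict(patterns)
    for line in lines:
        # Iterate over a copy so matched states can be removed safely.
        for state, text in list(pending.items()):
            if text in line:
                del pending[state]
                yield state
        if not pending:
            # Every state has been seen; stop reading further output.
            return


# Usage with the output that service.sh prints in example2.
output = ["Service is starting...", "Service is ready"]
seen = list(detect_states(output, {"ready": "Service is ready"}))
```

Here `seen` ends up as `["ready"]`, the state that `my_action` depends on in the config above.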
diff --git a/examples/example2/shepherd-config.yml b/examples/example2/shepherd-config.yml
diff --git a/examples/example3/cleanup.sh b/examples/example3/cleanup.sh
@@ -1 +1,13 @@
-rm file-created-by-program3.log
+#!/bin/bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" &>/dev/null && pwd)"
+cd "${SCRIPT_DIR}"
+
+rm -f ./*.log
+rm -f ./outputs/file-created-by-program3.log
+rm -f ./outputs/logs/shepherd.log
+rm -f ./state_transition_times.json
+
+rmdir ./outputs/logs/ || true
+rmdir ./outputs/ || true
diff --git a/examples/example3/program3.sh b/examples/example3/program3.sh
@@ -18,7 +18,7 @@ run_duration=30
 while true; do
     echo "$(date +%s) - program is running"
     sleep 0.5
-    echo "File created by program3" > ./file-created-by-program3.log
+    echo "File created by program3" >"./outputs/file-created-by-program3.log"
     if [[ $(date +%s) -gt $((READY_TIME + run_duration)) ]]; then
         echo "$(date +%s) - program is completed"
         break

diff --git a/examples/example3/run_test.sh b/examples/example3/run_test.sh
@@ -1,7 +1,12 @@
 #!/bin/bash
+set -euo pipefail
 
 echo "Starting test ..."
 
-shepherd -c shepherd-config.yml
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" &>/dev/null && pwd)"
+echo "Script directory: ${SCRIPT_DIR}"
+shepherd --config "${SCRIPT_DIR}/shepherd-config.yaml" \
+    --work-dir "${SCRIPT_DIR}" \
+    --run-dir "${SCRIPT_DIR}/outputs"
 
 echo "Completed test"
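The updated `run_test.sh` resolves its own directory and passes absolute paths to `shepherd`, so the test no longer depends on the caller's working directory. The same idea can be sketched in Python as follows. The `--config`, `--work-dir`, and `--run-dir` flags are taken from the script above; the helper name `shepherd_command` is hypothetical.

```python
from pathlib import Path


def shepherd_command(script_dir: Path) -> list[str]:
    """Build the argument list that run_test.sh passes to shepherd.

    Hypothetical helper; only the three flags come from the script.
    """
    return [
        "shepherd",
        "--config", str(script_dir / "shepherd-config.yaml"),
        "--work-dir", str(script_dir),
        "--run-dir", str(script_dir / "outputs"),
    ]
```

A caller could pass `Path(__file__).resolve().parent` as `script_dir` and hand the result to `subprocess.run(..., check=True)`, which would roughly mirror the effect of `set -e` in the shell version by raising on a non-zero exit code.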