ukaea-rse-training · JBello1610 · Jul 27, 2023 · Oct 7, 2024
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
@@ -0,0 +1,38 @@
+name: CI
+
+# We can specify which GitHub events will trigger a CI build
+on: push
+
+# Now define a single job, 'build' (more jobs could be defined here)
+jobs:
+
+  build:
+    strategy:
+      matrix:
+        os: [ubuntu-latest, macos-latest, windows-latest]
+        python-version: ["3.8", "3.9", "3.10"]
+
+    runs-on: ${{ matrix.os }}
+
+    # A job is a sequence of steps
+    steps:
+
+    # Next we need to check out the repository and set up Python
+    # A 'name' is just an optional label shown in the log; it helps clarify progress and can be anything
+    - name: Checkout repository
+      uses: actions/checkout@v2
+
+    - name: Set up Python
+      uses: actions/setup-python@v2
+      with:
+        python-version: ${{ matrix.python-version }}
+
+    - name: Install Python dependencies
+      run: |
+        python -m pip install --upgrade pip
+        python -m pip install -r requirements.txt
+
+    - name: Test with PyTest
+      run: |
+        python -m pytest --cov=inflammation.models tests/test_models.py
+...
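Review note: the `strategy.matrix` block above expands to one job per combination of `os` and `python-version`. The sketch below only illustrates that expansion in Python; `expand_matrix` is a hypothetical helper, not part of GitHub Actions or of this repository. The versions are quoted in the workflow because an unquoted `3.10` would be read by YAML as the number 3.1.

```python
from itertools import product

# Values copied from the workflow's strategy.matrix block.
OPERATING_SYSTEMS = ["ubuntu-latest", "macos-latest", "windows-latest"]
PYTHON_VERSIONS = ["3.8", "3.9", "3.10"]


def expand_matrix(operating_systems, python_versions):
    """Return one (os, python-version) pair per job the matrix would create."""
    return list(product(operating_systems, python_versions))


jobs = expand_matrix(OPERATING_SYSTEMS, PYTHON_VERSIONS)
for os_name, python_version in jobs:
    print(f"build ({os_name}, {python_version})")
```

With three operating systems and three Python versions, every push triggers nine `build` jobs.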
diff --git a/.gitignore b/.gitignore
@@ -12,3 +12,8 @@
 *.pyc
 *.egg-info
 .pytest_cache
+
+
+# Virtual environments
+venv/
+.venv/
diff --git a/README.md b/README.md
@@ -1,17 +1,62 @@
-# Introduction
+# Inflam
+![Continuous Integration build in GitHub Actions](https://github.com/mjjq/python-intermediate-inflammation/workflows/CI/badge.svg?branch=main)
 
-This is a template software project repository used by the [Intermediate Research Software Development Skills In Python](https://github.com/carpentries-incubator/python-intermediate-development).
+Inflam is a data management system written in Python that manages trial data used in clinical inflammation studies.
 
-## Purpose
+## Main features
+Here are some key features of Inflam:
 
-This repository is intended to be used as a code template which is copied by learners at [Intermediate Research Software Development Skills In Python](https://github.com/carpentries-incubator/python-intermediate-development) course.
-This can be done using the `Use this template` button towards the top right of this repo's GitHub page.
+- Provides basic statistical analyses of clinical trial data
+- Works with trial data in Comma-Separated Values (CSV) format
+- Generates plots of trial data
+- Supports easy extension of analytical functions and views through its Model-View-Controller architecture
 
-This software project is not finished, is currently failing to run and contains some code style issues. It is used as a starting point for the course - issues will be fixed and code will be added in a number of places during the course by learners in their own copies of the repository, as course topics are introduced.
+## Prerequisites
+Inflam requires the following Python packages:
 
-## Tests
+- [NumPy](https://www.numpy.org/) - makes use of NumPy's statistical functions
+- [Matplotlib](https://matplotlib.org/stable/index.html) - uses Matplotlib to generate statistical plots
 
-Several tests have been implemented already, some of which are currently failing.
-These failing tests set out the requirements for the additional code to be implemented during the workshop.
+The following optional packages are required to run Inflam's unit tests:
 
-The tests should be run using `pytest`, which will be introduced during the workshop.
+- [pytest](https://docs.pytest.org/en/stable/) - Inflam's unit tests are written using pytest
+- [pytest-cov](https://pypi.org/project/pytest-cov/) - Adds test coverage stats to unit testing
+
+## Installation
+Clone the repository.
+
+Install the requirements (preferably in a virtual environment):
+
+```
+python3 -m pip install -r requirements.txt
+```
+
+## Usage
+Open a terminal at the root of the repository on your local machine, then run the code with:
+
+```
+python3 inflammation-analysis.py <path-to-your-data>
+```
+
+## License
+MIT License
+
+Copyright (c) 2024 Marcus Quantrill
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
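Review note: the README's usage line passes one or more data files to the script. As a rough sketch of how those arguments are interpreted, the block below rebuilds the parser from `inflammation-analysis.py` (changed in this same diff) and parses a sample command line. The `build_parser` wrapper and the file names are hypothetical and exist only for illustration.

```python
import argparse


def build_parser():
    """Recreate the argument parser defined in inflammation-analysis.py."""
    parser = argparse.ArgumentParser(
        description="A basic patient inflammation data management system"
    )
    parser.add_argument(
        "infiles",
        nargs="+",
        help="Input CSV(s) containing inflammation series for each patient",
    )
    parser.add_argument(
        "--full-data-analysis", action="store_true", dest="full_data_analysis"
    )
    return parser


# Hypothetical file names, used only to show how the arguments are parsed.
sample_argv = ["data/inflammation-01.csv", "data/inflammation-02.csv"]
args = build_parser().parse_args(sample_argv)
print(args.infiles)
print(args.full_data_analysis)
```

Because `infiles` uses `nargs="+"`, at least one path is required and the parsed value is always a list.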
diff --git a/inflammation-analysis.py b/inflammation-analysis.py
@@ -1,5 +1,8 @@
 #!/usr/bin/env python3
-"""Software for managing and analysing patients' inflammation data in our imaginary hospital."""
+"""
+Software for managing and analysing patients'
+inflammation data in our imaginary hospital.
+"""
 
 import argparse
 import os
@@ -19,28 +22,37 @@ def main(args):
     if not isinstance(InFiles, list):
         InFiles = [args.infiles]
 
-
     if args.full_data_analysis:
         analyse_data(os.path.dirname(InFiles[0]))
         return
 
     for filename in InFiles:
         inflammation_data = models.load_csv(filename)
 
-        view_data = {'average': models.daily_mean(inflammation_data), 'max': models.daily_max(inflammation_data), 'min': models.daily_min(inflammation_data)}
+        view_data = {
+            "average": models.daily_mean(inflammation_data),
+            "max": models.daily_max(inflammation_data),
+            "min": models.daily_min(inflammation_data),
+            **models.standard_deviation(inflammation_data),
+        }
 
         views.visualize(view_data)
 
+
 if __name__ == "__main__":
     parser = argparse.ArgumentParser(
-        description='A basic patient inflammation data management system')
+        description="A basic patient inflammation data management system"
+    )
 
     parser.add_argument(
-        'infiles',
-        nargs='+',
-        help='Input CSV(s) containing inflammation series for each patient')
+        "infiles",
+        nargs="+",
+        help="Input CSV(s) containing inflammation series for each patient",
+    )
 
-    parser.add_argument('--full-data-analysis', action='store_true', dest='full_data_analysis')
+    parser.add_argument(
+        "--full-data-analysis", action="store_true", dest="full_data_analysis"
+    )
 
     args = parser.parse_args()
 

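Review note: `view_data` now picks up its fourth entry through `**` unpacking, which relies on `models.standard_deviation` returning a dictionary rather than a bare value. The sketch below shows that merge with plain Python lists. The `standard_deviation` function here is a simplified stand-in built on `statistics.pstdev`, not the NumPy implementation from `inflammation/models.py`, and the readings are made up.

```python
from statistics import mean, pstdev


def standard_deviation(column):
    """Stand-in for models.standard_deviation: returns a one-key dict."""
    return {"standard deviation": pstdev(column)}


# Made-up readings for a single day across three patients.
readings = [1.0, 3.0, 5.0]

view_data = {
    "average": mean(readings),
    "max": max(readings),
    "min": min(readings),
    # ** copies every key of the returned dict into view_data.
    **standard_deviation(readings),
}

print(list(view_data))
```

The merged key is the literal string `"standard deviation"`, space included, so any code that later looks the value up needs to use exactly that spelling.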
diff --git a/inflammation/models.py b/inflammation/models.py
@@ -2,21 +2,27 @@
 
 The Model layer is responsible for the 'business logic' part of the software.
 
-Patients' data is held in an inflammation table (2D array) where each row contains 
-inflammation data for a single patient taken over a number of days 
+Patients' data is held in an inflammation table (2D array) where each row contains
+inflammation data for a single patient taken over a number of days
 and each column represents a single day across all patients.
 """
 
 import json
 import numpy as np
 
 
+class Patient:
+    def __init__(self, name):
+        self.name = name
+
+
 def load_csv(filename):
     """Load a Numpy array from a CSV
 
     :param filename: Filename of CSV to load
     """
-    return np.loadtxt(fname=filename, delimiter=',')
+    return np.loadtxt(fname=filename, delimiter=",")
+
 
 def load_json(filename):
     """Load a numpy array from a JSON document.
@@ -34,23 +40,85 @@ def load_json(filename):
     :param filename: Filename of CSV to load
 
     """
-    with open(filename, 'r', encoding='utf-8') as file:
+    with open(filename, "r", encoding="utf-8") as file:
         data_as_json = json.load(file)
-        return [np.array(entry['observations']) for entry in data_as_json]
-
+        return [np.array(entry["observations"]) for entry in data_as_json]
 
 
 def daily_mean(data):
-    """Calculate the daily mean of a 2d inflammation data array."""
+    """Calculate the daily mean of a 2D inflammation data array.
+
+    :param data: 2D array of values to average
+    :returns: 1D array containing the mean over all patients for each day
+    """
     return np.mean(data, axis=0)
 
 
 def daily_max(data):
-    """Calculate the daily max of a 2d inflammation data array."""
+    """Calculate the daily max of a 2D inflammation data array.
+
+    :param data: 2D array of values to take the maximum of
+    :returns: 1D array containing the maximum over all patients for each day
+    """
     return np.max(data, axis=0)
 
 
 def daily_min(data):
-    """Calculate the daily min of a 2d inflammation data array."""
+    """Calculate the daily min of a 2D inflammation data array.
+
+    :param data: 2D array of values to take the minimum of
+    :returns: 1D array containing the minimum over all patients for each day
+    """
     return np.min(data, axis=0)
 
+
+def patient_normalise(data):
+    """
+    Normalise patient data from a 2D inflammation data array.
+
+    NaN values are ignored, and normalised to 0.
+
+    Negative values are rounded to 0.
+
+    :param data: 2D array to be normalised
+    :returns: Normalised 2D array
+    """
+    maxima = np.nanmax(data, axis=1)
+    with np.errstate(invalid="ignore", divide="ignore"):
+        normalised = data / maxima[:, np.newaxis]
+    normalised[np.isnan(normalised)] = 0
+    normalised[normalised < 0] = 0
+    return normalised
+
+
+def standard_deviation(data):
+    """Compute the population standard deviation along the first axis."""
+    if len(data) == 0:
+        return {"standard deviation": 0.0}
+
+    mean = np.mean(data, axis=0)
+    squared_deviations = []
+    for entry in data:
+        squared_deviations.append((entry - mean) ** 2)
+
+    variance = sum(squared_deviations) / len(data)
+    return {"standard deviation": np.sqrt(variance)}
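Review note: a quick way to sanity-check the two new model functions is to run them on a small hand-made array. The sketch below copies `patient_normalise` and `standard_deviation` from this diff so it runs on its own; the sample arrays are made up. It assumes the intended statistic is the population standard deviation, which is what dividing by `len(data)` produces.

```python
import numpy as np


def patient_normalise(data):
    """Copy of the function added in this diff."""
    maxima = np.nanmax(data, axis=1)
    with np.errstate(invalid="ignore", divide="ignore"):
        normalised = data / maxima[:, np.newaxis]
    normalised[np.isnan(normalised)] = 0
    normalised[normalised < 0] = 0
    return normalised


def standard_deviation(data):
    """Copy of the function added in this diff."""
    if len(data) == 0:
        return {"standard deviation": 0.0}
    mmm = np.mean(data, axis=0)
    devs = []
    for entry in data:
        devs.append((entry - mmm) * (entry - mmm))
    standard_dev = sum(devs) / len(data)
    return {"standard deviation": np.sqrt(standard_dev)}


# Made-up rows: ordinary, containing NaN, containing a negative, all zeros.
sample = np.array([
    [1.0, 2.0, 4.0],
    [np.nan, 1.0, 2.0],
    [-1.0, 0.0, 2.0],
    [0.0, 0.0, 0.0],
])
normalised = patient_normalise(sample)
print(normalised)

# The loop divides by len(data), so it should match NumPy's default (ddof=0).
clean = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]])
loop_result = standard_deviation(clean)["standard deviation"]
print(np.allclose(loop_result, np.std(clean, axis=0)))
```

Each row is divided by its own maximum, so the NaN entry, the negative entry, and the all-zero row (0/0) all end up as 0. If the population definition is indeed what is wanted, `np.std(data, axis=0)` would be a possible drop-in for the explicit loop.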
diff --git a/inflammation/views.py b/inflammation/views.py
@@ -1,7 +1,6 @@
 """Module containing code for plotting inflammation data."""
 
 from matplotlib import pyplot as plt
-import numpy as np
 
 
 def visualize(data_dict):