feat: Add initial version of anamnesis.ai with rago (#19)
xmnlab authored Dec 7, 2024
1 parent 0aebb6d commit 77f7e3c
Showing 294 changed files with 56,418 additions and 1,275 deletions.
16 changes: 14 additions & 2 deletions .github/workflows/main.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -36,9 +36,12 @@ jobs:
- "3.10"
- "3.11"
- "3.12"
# note: srsly cannot be installed
# - "3.13"
os:
- "ubuntu"
- "macos"
# note: Unable to find installation candidates for torch (2.5.1+cpu)
# - "macos"
runs-on: ${{ matrix.os }}-latest
concurrency:
group: ci-tests-${{ matrix.os }}-${{ matrix.python_version }}-${{ github.ref }}
Expand All @@ -48,6 +51,9 @@ jobs:
run:
shell: bash -l {0}

env:
OPENAI_API_KEY: "dummy-key"

steps:
- uses: actions/checkout@v4

Expand All @@ -61,6 +67,12 @@ jobs:
conda-solver: libmamba
python-version: "${{ matrix.python_version }}"

- name: Create environment variables for tests
run: |
pushd tests
envsubst < .env.tpl > .env
popd
- name: Check Poetry lock
run: poetry check

Expand All @@ -70,7 +82,7 @@ jobs:
poetry install
- name: Run unit tests
run: makim tests.unit
run: makim tests.unit --params '-m "not skip_on_ci" -vvv'

- name: Test jupyter notebooks
run: makim tests.notebooks
Expand Down
63 changes: 63 additions & 0 deletions .github/workflows/release.yaml
@@ -0,0 +1,63 @@
name: Release

on:
workflow_dispatch:
push:
branches: [main]
pull_request:
branches: [main]
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
PYPI_TOKEN: ${{ secrets.PYPI_TOKEN }}

jobs:
release:
runs-on: ubuntu-latest
timeout-minutes: 15

defaults:
run:
shell: bash -l {0}

steps:
- uses: actions/checkout@v4

- uses: conda-incubator/setup-miniconda@v3
with:
miniforge-version: latest
environment-file: conda/release.yaml
channels: conda-forge,nodefaults
activate-environment: anamnesisai
auto-update-conda: true
conda-solver: libmamba
python-version: "3.11"

- name: Install deps
run: |
poetry config virtualenvs.create false
poetry install
- name: Run semantic release (for tests)
if: ${{ github.event_name != 'workflow_dispatch' }}
run: makim --verbose release.dry

- name: Run semantic release
if: ${{ github.event_name == 'workflow_dispatch' }}
run: |
poetry config pypi-token.pypi ${PYPI_TOKEN}
makim --verbose release.ci
- name: Generate documentation with changes from semantic-release
if: ${{ github.event_name == 'workflow_dispatch' }}
run: makim --verbose docs.build

- name: GitHub Pages action
if: ${{ github.event_name == 'workflow_dispatch' }}
uses: peaceiris/[email protected]
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: ./build/

- name: Setup tmate session
if: "${{ failure() && (contains(github.event.pull_request.labels.*.name, 'ci:enable-debugging')) }}"
uses: mxschmitt/action-tmate@v3
2 changes: 2 additions & 0 deletions .gitignore
Expand Up @@ -158,3 +158,5 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/

tests/data/hospital-triage-and-patient-history.parquet
7 changes: 2 additions & 5 deletions .pre-commit-config.yaml
Expand Up @@ -16,10 +16,7 @@ repos:
- id: ruff-format
name: ruff-format
entry: ruff format
exclude: |
(?x)(
docs
)
exclude: "docs/|scripts/.*"
language: system
pass_filenames: true
types:
Expand All @@ -29,7 +26,7 @@ repos:
name: ruff-linter
entry: ruff check
language: system
exclude: "docs/"
exclude: "docs/|scripts/.*"
pass_filenames: true
types:
- python
Expand Down
67 changes: 19 additions & 48 deletions README.md
Expand Up @@ -4,12 +4,10 @@

#### Overview

This project aims to develop a Minimum Viable Product (MVP) for an AI-driven
anamnesis collection system in the healthcare domain. The system will leverage
the capabilities of FHIR (Fast Healthcare Interoperability Resources), the
ChatGPT API, Flask (a micro web framework written in Python), and SQLite (a
lightweight database) to facilitate an interactive, user-friendly platform for
collecting patient medical history (anamnesis) through conversational AI.
This project aims to develop an AI-driven anamnesis collection system for the
healthcare domain. The system leverages FHIR (Fast Healthcare Interoperability
Resources) and the ChatGPT API to collect patient medical history (anamnesis)
through conversational AI.

#### Technical Components

Expand All @@ -27,36 +25,6 @@ collecting patient medical history (anamnesis) through conversational AI.
- Intelligent and natural language processing capabilities enhance user
experience and data collection accuracy.

3. **Flask (Python Web Framework)**:

- Serves as the backend framework for the web application.
- Manages HTTP requests, routing, and web page rendering.
- Lightweight and easy to integrate with Python-based tools like SQLite and
the ChatGPT API.

4. **SQLite (Database)**:
- Stores user interactions and anamnesis data.
- Lightweight database, ideal for the MVP stage of the project.
- Easy integration with Flask, facilitating seamless data operations.

#### Business Logic

1. **User Interface**:

- A simple, intuitive web interface developed using Flask.
- No user login required for the MVP phase; designed for a single user
interaction.
- Provides a chat window where the user can interact with the ChatGPT-powered
bot.

2. **Conversational Data Collection**:

- The user initiates a conversation by describing symptoms or medical
concerns.
- The ChatGPT bot responds with follow-up questions to gather detailed
anamnesis information (e.g., symptom onset, severity, duration, associated
conditions).

3. **Anamnesis Data Handling**:

- Responses from the user are processed and structured into FHIR-compliant
Expand All @@ -72,17 +40,20 @@ collecting patient medical history (anamnesis) through conversational AI.
- These observations are linked to a mock `Patient` resource for the sake of
the MVP, facilitating a structured and standardized anamnesis record.

5. **Future Expansion Potential**:
- While initially designed for a single user, the system architecture allows
for scalability to handle multiple users with authentication and more
complex data management.
- The use of FHIR ensures that future developments could integrate with
broader healthcare systems and EHR (Electronic Health Records) platforms.
## Test Data

The test data was obtained from
https://springernature.figshare.com/collections/A_dataset_of_simulated_patient-physician_medical_interviews_with_a_focus_on_respiratory_cases/5545842/1

Source: Smith, Christopher William; Fareez, Faiha; Parikh, Tishya; Wavell,
Christopher; Shahab, Saba; Chevalier, Meghan; et al. (2022). A dataset of
simulated patient-physician medical interviews with a focus on respiratory
cases. figshare. Collection. https://doi.org/10.6084/m9.figshare.c.5545842.v1

#### Conclusion
## Conclusion

This MVP serves as a foundational step towards a more comprehensive AI-driven
healthcare data collection system. By combining the latest in AI conversational
technology with standardized healthcare data protocols, it aims to streamline
the anamnesis process, thereby enhancing patient care and healthcare data
management.
This project serves as a foundational step towards a more comprehensive
AI-driven healthcare data collection system. By combining the latest in AI
conversational technology with standardized healthcare data protocols, it aims
to streamline the anamnesis process, thereby enhancing patient care and
healthcare data management.
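The README's "Anamnesis Data Handling" section says user responses are structured into FHIR-compliant resources linked to a mock `Patient`. A minimal sketch of what one such record could look like as a plain Python dict, mirroring the shape of FHIR R4 JSON (the field choices and helper name are illustrative assumptions, not the project's actual schema):

```python
# Sketch: one anamnesis answer wrapped as a FHIR-style Observation dict,
# linked to a mock Patient resource. Hand-built for illustration; the
# project itself may build these differently or via a FHIR library.

def make_observation(patient_id: str, question: str, answer: str) -> dict:
    """Wrap a single question/answer pair as a FHIR-like Observation."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {"text": question},                         # what was asked
        "valueString": answer,                              # the patient's reply
        "subject": {"reference": f"Patient/{patient_id}"},  # link to mock Patient
    }

mock_patient = {"resourceType": "Patient", "id": "mvp-patient-1"}
obs = make_observation(mock_patient["id"], "Symptom onset", "Three days ago")
print(obs["subject"]["reference"])  # Patient/mvp-patient-1
```

Keeping each answer as a standalone `Observation` referencing the same `Patient` is what makes the record interoperable with EHR systems that already speak FHIR.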
1 change: 1 addition & 0 deletions conda/dev.yaml
Expand Up @@ -11,3 +11,4 @@ dependencies:
- pip:
# distlib issue
- paginate
# pip wheel --no-cache-dir --use-pep517 "srsly (==2.4.8)"
14 changes: 14 additions & 0 deletions conda/release.yaml
@@ -0,0 +1,14 @@
name: anamnesisai
channels:
- nodefaults
- conda-forge
dependencies:
- python
- pip
- poetry
- nodejs >=18.17 # used by semantic-release
- shellcheck
- pip:
# distlib issue
- paginate
# pip wheel --no-cache-dir --use-pep517 "srsly (==2.4.8)"
3 changes: 0 additions & 3 deletions docs/api/references.md

This file was deleted.

8 changes: 0 additions & 8 deletions docs/api/references.rst

This file was deleted.

33 changes: 0 additions & 33 deletions docs/background_info/history.md

This file was deleted.

57 changes: 34 additions & 23 deletions docs/contributing.md
Expand Up @@ -61,39 +61,50 @@ If you are proposing a feature:

## Get Started!

Ready to contribute? Here’s how to set up `anamnesisai` for local development.
Ready to contribute? Here’s how to set up `anamnesis.ai` for local development.

1. Fork the `anamnesisai` repo on GitHub.
1. Fork the `anamnesis.ai` repo on GitHub.

2. Clone your fork locally::

$ git clone [email protected]:your_name_here/anamnesisai.git

3. Install your local copy into a virtualenv. Assuming you have
virtualenvwrapper installed, this is how you set up your fork for local
development::

$ mkvirtualenv anamnesisai $ cd anamnesisai/ $ python setup.py develop

4. Create a branch for local development::

$ git clone [email protected]:your_name_here/anamnesis.ai.git

3. Install your local copy into a conda environment. Assuming you have conda or
mamba installed, this is how you set up your fork for local development
(ensure you are already in the anamnesis.ai folder):
```bash
$ mamba env create --file conda/dev.yaml
$ conda activate anamnesisai
```
4. Install the dependencies in the new environment anamnesisai:
```bash
$ poetry install
```
5. Create a branch for local development::

```bash
$ git checkout -b name-of-your-bugfix-or-feature
```

Now you can make your changes locally.

5. When you’re done making changes, check that your changes pass flake8 and the
6. When you’re done making changes, check that your changes pass the linter and
   the tests:

$ make lint $ make test

To get flake8 and tox, just pip install them into your virtualenv.
```bash
$ makim tests.linter
$ makim tests.unit
```

6. Commit your changes and push your branch to GitHub::
7. Commit your changes and push your branch to GitHub::

$ git add . $ git commit -m “Your detailed description of your changes.” $
git push origin name-of-your-bugfix-or-feature
```bash
$ git add .
$ git commit -m "Your detailed description of your changes."
$ git push origin name-of-your-bugfix-or-feature
```

7. Submit a pull request through the GitHub website.
8. Submit a pull request through the GitHub website.

## Pull Request Guidelines

Expand All @@ -103,13 +114,13 @@ Before you submit a pull request, check that it meets these guidelines:
2. If the pull request adds functionality, the docs should be updated. Put your
new functionality into a function with a docstring, and add the feature to
the list in README.rst.
3. The pull request should work for Python >= 3.8.
3. The pull request should work for Python >= `3.9`.

## Tips

To run a subset of tests::

```
```bash
$ pytest tests/test_anamnesisai.py
```

Expand Down Expand Up @@ -147,7 +158,7 @@ The table below shows which commit message gets you which release type when
| `fix(pencil): stop graphite breaking when pressure is applied` | Fix Release |
| `feat(pencil): add 'graphiteWidth' option` | Feature Release |
| `perf(pencil): remove graphiteWidth option` | Chore |
| `BREAKING CHANGE: The graphiteWidth option has been removed` | Breaking Release |
| `feat(pencil)!: The graphiteWidth option has been removed` | Breaking Release |

source:
<https://github.com/semantic-release/semantic-release/blob/master/README.md#commit-message-format>
Expand Down
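The table above maps conventional-commit message formats to semantic-release release types. A quick sketch in a throwaway repository, using the table's own example messages (release types shown in comments are what semantic-release would infer, per the table):

```shell
#!/bin/sh
# Sketch: conventional-commit messages and the release type
# semantic-release infers from each. A temporary repo is created
# so the commits actually run.
cd "$(mktemp -d)" && git init -q .
git config user.email "you@example.com" && git config user.name "You"

echo demo > file.txt && git add file.txt
git commit -q -m "fix(pencil): stop graphite breaking when pressure is applied"  # Fix Release

echo more >> file.txt && git add file.txt
git commit -q -m "feat(pencil): add 'graphiteWidth' option"                      # Feature Release

echo again >> file.txt && git add file.txt
git commit -q -m "feat(pencil)!: The graphiteWidth option has been removed"      # Breaking Release

git log --pretty=%s
```

The `!` after the scope is the conventional-commits marker for a breaking change, which is why the last row of the table triggers a major version bump.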
