enriching NG tube

SAFEHR-data · Apr 18, 2024 · 7379b98
1 parent f04cbd4
commit 7379b98
Showing 34 changed files with 9,781 additions and 76 deletions.
diff --git a/_projects/uclh_ngtube_s0/_01_detail.md b/_projects/uclh_ngtube_s0/_01_detail.md
diff --git a/_projects/uclh_ngtube_s0/_02_appendix.md b/_projects/uclh_ngtube_s0/_02_appendix.md
diff --git a/_projects/uclh_ngtube_s0/_03_data.html b/_projects/uclh_ngtube_s0/_03_data.html
diff --git a/_projects/uclh_ngtube_s0/_03_data.md b/_projects/uclh_ngtube_s0/_03_data.md
@@ -0,0 +1,56 @@
+# Example (synthetic) Electronic Health Record data
+
+These data are modelled using the OMOP Common Data Model v5.3.
+
+## CSV files
+
+Each file is named after the OMOP CDM table it contains.
+
+- [person](data/person.csv)
+- [procedure_occurrence](data/procedure_occurrence.csv)
+- [visit_occurrence](data/visit_occurrence.csv)
+
+## Correlated Data Source
+
+- NG tube vocabularies
+
+## Generation Rules
+
+- The patient's age is between 18 and 100 at the time of the visit.
+- Ethnicity follows the 2021 Census distribution for England and Wales ([Census in England and Wales 2021](https://www.ons.gov.uk/peoplepopulationandcommunity/culturalidentity/ethnicity/bulletins/ethnicgroupenglandandwales/census2021)).
+- Gender is split equally between Male and Female (50% each).
+- Every person has a linked procedure_occurrence record with the concept "Checking the position of nasogastric tube using X-ray".
+- 2% of persons have a linked procedure_occurrence record with the concept "Plain chest X-ray".
+- 60% of visit_occurrence records have the visit concept "Inpatient Visit"; the remaining 40% have "Emergency Room Visit".
+
+## Notes
+
+- Version 0
+- Generated by a hand-written rule/story generator
+- Structurally correct: all tables are linked through their relationships
+- National ethnicity data were used to generate a realistic distribution (see below)
+
+
+
+### 2021 Census ethnic group figures for England and Wales
+
+| Ethnic Group                                                                                   | Population (%) |
+|------------------------------------------------------------------------------------------------|----------------|
+| Asian or Asian British: Bangladeshi | 1.1 |
+| Asian or Asian British: Chinese | 0.7 |
+| Asian or Asian British: Indian | 3.1 |
+| Asian or Asian British: Pakistani | 2.7 |
+| Asian or Asian British: any other Asian background | 1.6 |
+| Black or African or Caribbean or Black British: African | 2.5 |
+| Black or African or Caribbean or Black British: Caribbean | 1.0 |
+| Black or African or Caribbean or Black British: other Black or African or Caribbean background | 0.5 |
+| Mixed multiple ethnic groups: White and Asian | 0.8 |
+| Mixed multiple ethnic groups: White and Black African | 0.4 |
+| Mixed multiple ethnic groups: White and Black Caribbean | 0.9 |
+| Mixed multiple ethnic groups: any other Mixed or multiple ethnic background | 0.8 |
+| White: English or Welsh or Scottish or Northern Irish or British | 74.4 |
+| White: Irish | 0.9 |
+| White: Gypsy or Irish Traveller | 0.1 |
+| White: any other White background | 6.4 |
+| Other ethnic group: any other ethnic group | 1.6 |
+| Other ethnic group: Arab | 0.6 |
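The Generation Rules in `_03_data.md` can be sketched as a small sampler. This is only an illustration under stated assumptions, not the generator used for the CSV files: the function names, the dictionary field names (which are not OMOP column names), the uniform age distribution, and the fixed seed are all hypothetical. Only the shares and concept names come from the rules and the ethnicity table above.

```python
import random

# Shares copied from the Generation Rules and the ethnicity table (percentages).
GENDER_SHARES = {"Male": 50, "Female": 50}
VISIT_SHARES = {"Inpatient Visit": 60, "Emergency Room Visit": 40}
ETHNICITY_SHARES = {
    "Asian or Asian British: Bangladeshi": 1.1,
    "Asian or Asian British: Chinese": 0.7,
    "Asian or Asian British: Indian": 3.1,
    "Asian or Asian British: Pakistani": 2.7,
    "Asian or Asian British: any other Asian background": 1.6,
    "Black or African or Caribbean or Black British: African": 2.5,
    "Black or African or Caribbean or Black British: Caribbean": 1.0,
    "Black or African or Caribbean or Black British: other Black or African or Caribbean background": 0.5,
    "Mixed multiple ethnic groups: White and Asian": 0.8,
    "Mixed multiple ethnic groups: White and Black African": 0.4,
    "Mixed multiple ethnic groups: White and Black Caribbean": 0.9,
    "Mixed multiple ethnic groups: any other Mixed or multiple ethnic background": 0.8,
    "White: English or Welsh or Scottish or Northern Irish or British": 74.4,
    "White: Irish": 0.9,
    "White: Gypsy or Irish Traveller": 0.1,
    "White: any other White background": 6.4,
    "Other ethnic group: any other ethnic group": 1.6,
    "Other ethnic group: Arab": 0.6,
}
NG_TUBE_XRAY = "Checking the position of nasogastric tube using X-ray"
CHEST_XRAY = "Plain chest X-ray"
CHEST_XRAY_SHARE = 0.02  # 2% of persons also get a plain chest X-ray


def pick(rng, shares):
    """Draw one key with probability proportional to its share."""
    labels = list(shares)
    return rng.choices(labels, weights=[shares[k] for k in labels], k=1)[0]


def generate_person(rng, person_id):
    """Build one illustrative record (hypothetical field names, not OMOP columns)."""
    procedures = [NG_TUBE_XRAY]  # every person has the NG tube position check
    if rng.random() < CHEST_XRAY_SHARE:
        procedures.append(CHEST_XRAY)
    return {
        "person_id": person_id,
        "age_at_visit": rng.randint(18, 100),  # inclusive bounds; uniform is an assumption
        "gender": pick(rng, GENDER_SHARES),
        "ethnicity": pick(rng, ETHNICITY_SHARES),
        "visit_concept": pick(rng, VISIT_SHARES),
        "procedures": procedures,
    }


rng = random.Random(0)  # fixed seed so the sketch is repeatable
people = [generate_person(rng, i) for i in range(1, 1001)]
inpatient = sum(p["visit_concept"] == "Inpatient Visit" for p in people)
print(f"Inpatient share in this sample: {inpatient / len(people):.1%}")
```

Because `random.choices` normalises its weights, the ethnicity shares do not need to sum to exactly 100, which matters here since the published percentages are rounded.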
diff --git a/_projects/uclh_ngtube_s0/_04_image.md b/_projects/uclh_ngtube_s0/_04_image.md
@@ -0,0 +1,29 @@
+# Example (synthetic) images
+
+## Model
+
+An unconditional image-generation diffusion model from Hugging Face was trained. [1] Unconditional image generation models are not conditioned on text or images during training; they only generate images that resemble the training data distribution. Generation usually starts from a seed that produces a random noise vector, which the model then turns into an output image similar to the images used to train it.
+The training script initializes a UNet2DModel and trains it. [2] The training loop adds noise to the images, predicts the noise residual, calculates the loss, saves checkpoints at specified steps, and saves the trained model.
+
+## Training Dataset
+
+The RANZCR CLiP dataset was used to train the model. [3] The dataset was created by the Royal Australian and New Zealand College of Radiologists (RANZCR), a not-for-profit professional organisation for clinical radiologists and radiation oncologists. It was labelled against a set of definitions to keep the labelling consistent. The normal category includes lines that were appropriately positioned and did not require repositioning. The borderline category includes lines that would ideally require some repositioning but would in most cases still function adequately in their current position. The abnormal category includes lines that required immediate repositioning. 30,000 images were used during training, all 512x512 pixels in size.
+## Computational Information
+Training was conducted on RTX 6000 cards with 24 GB of graphics memory. A checkpoint was saved after each epoch, with 220 checkpoints generated so far. Each checkpoint takes up 1 GB of space, and each epoch takes around 6 hours to train. Machine learning libraries such as TensorFlow, PyTorch, or scikit-learn are used to run the training, along with additional libraries for data preprocessing, visualization, or deployment.
+
+![Synthetic CXR Example 01](./images/1.png)
+![Synthetic CXR Example 02](./images/2_2.png)
+![Synthetic CXR Example 03](./images/10.png)
+![Synthetic CXR Example 04](./images/20.png)
+![Synthetic CXR Example 05](./images/36_2.png)
+![Synthetic CXR Example 06](./images/37.png)
+![Synthetic CXR Example 07](./images/39.png)
+![Synthetic CXR Example 08](./images/44_2.png)
+![Synthetic CXR Example 09](./images/44.png)
+![Synthetic CXR Example 10](./images/45.png)
+
+## References
+
+1. https://huggingface.co/docs/diffusers/en/training/unconditional_training#unconditional-image-generation
+2. https://github.com/huggingface/diffusers/blob/096f84b05f9514fae9f185cbec0a4d38fbad9919/examples/unconditional_image_generation/train_unconditional.py#L356
+3. https://www.kaggle.com/competitions/ranzcr-clip-catheter-line-classification/data
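The training loop summarised under Model in `_04_image.md` (add noise, predict the noise residual, compute the loss) can be sketched in plain Python. This is not the code from the referenced training script, which trains a UNet2DModel on image tensors. The linear beta schedule and its values, the function names, and the stand-in predictor that always returns zeros are assumptions made for illustration.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances per timestep (a linear schedule; values are assumed)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta): the fraction of signal kept at each timestep."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(clean, noise, alpha_bar_t):
    """Forward process: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise."""
    keep = math.sqrt(alpha_bar_t)
    mix = math.sqrt(1.0 - alpha_bar_t)
    return [keep * x + mix * e for x, e in zip(clean, noise)]


def mse(predicted, target):
    """Mean squared error between predicted and true noise."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def training_step(clean, predict_noise, alpha_bars, rng):
    """One step: sample a timestep and noise, corrupt the input, score the prediction."""
    t = rng.randrange(len(alpha_bars))
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    noisy = add_noise(clean, noise, alpha_bars[t])
    predicted = predict_noise(noisy, t)  # a real model would be a neural network here
    return mse(predicted, noise)


alpha_bars = cumulative_alphas(linear_betas(1000))
flat_image = [0.0] * 16  # stand-in for a flattened image


def zero_predictor(noisy, t):
    """Hypothetical untrained model: always predicts zero noise."""
    return [0.0] * len(noisy)


loss = training_step(flat_image, zero_predictor, alpha_bars, random.Random(0))
print(f"Loss for the untrained stand-in predictor: {loss:.3f}")
```

In an actual training run the loss would be backpropagated through the model and the step repeated over batches; the sketch stops at the loss to keep the three stages named in the text visible.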
diff --git a/_projects/uclh_ngtube_s0/images/1.png b/_projects/uclh_ngtube_s0/images/1.png
diff --git a/_projects/uclh_ngtube_s0/images/10.png b/_projects/uclh_ngtube_s0/images/10.png
diff --git a/_projects/uclh_ngtube_s0/images/20.png b/_projects/uclh_ngtube_s0/images/20.png
diff --git a/_projects/uclh_ngtube_s0/images/2_2.png b/_projects/uclh_ngtube_s0/images/2_2.png
diff --git a/_projects/uclh_ngtube_s0/images/36_2.png b/_projects/uclh_ngtube_s0/images/36_2.png
diff --git a/_projects/uclh_ngtube_s0/images/37.png b/_projects/uclh_ngtube_s0/images/37.png
diff --git a/_projects/uclh_ngtube_s0/images/39.png b/_projects/uclh_ngtube_s0/images/39.png
diff --git a/_projects/uclh_ngtube_s0/images/44.png b/_projects/uclh_ngtube_s0/images/44.png
diff --git a/_projects/uclh_ngtube_s0/images/44_2.png b/_projects/uclh_ngtube_s0/images/44_2.png
diff --git a/_projects/uclh_ngtube_s0/images/45.png b/_projects/uclh_ngtube_s0/images/45.png
diff --git a/_projects/uclh_ngtube_s0/index.md b/_projects/uclh_ngtube_s0/index.md
@@ -1,28 +1,29 @@
 ---
 layout: project
-title: UCLH Ngtube Synthetic Data
+title: Detecting misplaced NG tubes
 status: ongoing
+tags: imaging, safety
 authors:
-- NIHR CRIU SAFEHR Team
+- SAFEHR team
 
 tabs:
 - {
-    name: uclh-ngt-s0-detail,
-    type: md,
-    source: _01_detail.md,
-    label: Detail
+    name: uclh-ngt-s0-quarto,
+    type: html,
+    source: omop_concepts_frequency_table.html,
+    label: Data summary
   }
 - {
-    name: uclh-ngt-s0-appx,
-    type: md,
-    source: _02_appendix.md,
-    label: Appendix
+  name: uclh-ngt-s0-ehr-data,
+  type: md,
+  source: _03_data.md,
+  label:  Synthetic EHR
   }
 - {
-  name: uclh-ngt-s0-data,
-  type: html,
-  source: _03_data.html,
-  label:  Data
+  name: uclh-ngt-s0-image-data,
+  type: md,
+  source: _04_image.md,
+  label:  Synthetic CXR
   }
 ---