enriching NG tube
docsteveharris committed Apr 18, 2024
1 parent f04cbd4 commit 7379b98
Showing 34 changed files with 9,781 additions and 76 deletions.
27 changes: 0 additions & 27 deletions _projects/uclh_ngtube_s0/_01_detail.md

This file was deleted.

23 changes: 0 additions & 23 deletions _projects/uclh_ngtube_s0/_02_appendix.md

This file was deleted.

12 changes: 0 additions & 12 deletions _projects/uclh_ngtube_s0/_03_data.html

This file was deleted.

56 changes: 56 additions & 0 deletions _projects/uclh_ngtube_s0/_03_data.md
@@ -0,0 +1,56 @@
# Example (synthetic) Electronic Health Record data

These data are modelled using the OMOP Common Data Model v5.3.

## CSV files

Each file is named after the OMOP CDM table it contains.

- [person](data/person.csv)
- [procedure_occurrence](data/procedure_occurrence.csv)
- [visit_occurrence](data/visit_occurrence.csv)

## Correlated Data Source

- NG tube vocabularies

## Generation Rules

- Each patient is aged between 18 and 100 at the time of the visit.
- Ethnicity follows the 2021 Census distribution for England and Wales ([Census in England and Wales 2021](https://www.ons.gov.uk/peoplepopulationandcommunity/culturalidentity/ethnicity/bulletins/ethnicgroupenglandandwales/census2021)).
- Gender is split equally between Male and Female (50% each).
- Every person has a procedure_occurrence record with the concept "Checking the position of nasogastric tube using X-ray".
- 2% of persons also have a procedure_occurrence record with the concept "Plain chest X-ray".
- 60% of visit_occurrence records have the visit concept "Inpatient Visit"; the remaining 40% have "Emergency Room Visit".
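
The rules above can be sketched as a small generator. This is only an illustration under stated assumptions, not the project's actual generator: the row layout is a simplified stand-in for the OMOP tables, the 2023 visit window and the `generate` and `age_on` names are hypothetical, and ethnicity is left out to keep the sketch short.

```python
import random
from datetime import date, timedelta

# Concept names come from the rules above. The row layout is a simplified,
# hypothetical stand-in for the OMOP tables, not the project's generator.
NG_TUBE_XRAY = "Checking the position of nasogastric tube using X-ray"
CHEST_XRAY = "Plain chest X-ray"


def age_on(birth: date, when: date) -> int:
    """Age in whole years on a given date."""
    before_birthday = (when.month, when.day) < (birth.month, birth.day)
    return when.year - birth.year - before_birthday


def generate(n_people: int, seed: int = 0):
    """Return (persons, visits, procedures) as lists of dicts."""
    rng = random.Random(seed)
    persons, visits, procedures = [], [], []
    for person_id in range(1, n_people + 1):
        # Hypothetical visit window: any day in 2023 (a non-leap year, so
        # shifting the year below never lands on 29 February).
        visit_date = date(2023, 1, 1) + timedelta(days=rng.randrange(365))
        age = rng.randint(18, 100)  # inclusive bounds, age at the visit
        birth_date = visit_date.replace(year=visit_date.year - age)
        persons.append({
            "person_id": person_id,
            "birth_date": birth_date,
            "gender": rng.choice(["Male", "Female"]),  # 50% each
        })
        visits.append({
            "visit_occurrence_id": person_id,  # one visit per person here
            "person_id": person_id,
            "visit_date": visit_date,
            "visit_concept": "Inpatient Visit"
            if rng.random() < 0.6 else "Emergency Room Visit",
        })
        # Everyone gets the NG tube check; about 2% also get a chest X-ray.
        concepts = [NG_TUBE_XRAY]
        if rng.random() < 0.02:
            concepts.append(CHEST_XRAY)
        for concept in concepts:
            procedures.append({
                "procedure_occurrence_id": len(procedures) + 1,
                "person_id": person_id,
                "visit_occurrence_id": person_id,
                "procedure_concept": concept,
            })
    return persons, visits, procedures
```

Seeding the random generator keeps a run reproducible, which is convenient when the synthetic CSV files need to be regenerated.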

## Notes

- Version 0
- Generated by a hand-written rule/story generator
- Structurally correct: all tables are linked through their relationships
- Ethnicity is drawn from national census data to give a realistic distribution (see below)
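
One way to check the "structurally correct" note is to confirm that every foreign key in a child table resolves to a row in its parent table. The sketch below assumes the standard OMOP key columns (`person_id`, `visit_occurrence_id`) and uses tiny inline CSV samples as hypothetical stand-ins for the real files; the `read_rows` and `orphans` helpers are illustrative names.

```python
import csv
import io

# Hypothetical miniature versions of the three CSV files.
PERSON_CSV = "person_id\n1\n2\n"
VISIT_CSV = "visit_occurrence_id,person_id\n10,1\n11,2\n"
PROCEDURE_CSV = (
    "procedure_occurrence_id,person_id,visit_occurrence_id\n"
    "100,1,10\n"
    "101,2,11\n"
)


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def orphans(child_rows, parent_rows, key):
    """Return child rows whose `key` value has no match in the parent table."""
    parent_keys = {row[key] for row in parent_rows}
    return [row for row in child_rows if row[key] not in parent_keys]


persons = read_rows(PERSON_CSV)
visits = read_rows(VISIT_CSV)
procedures = read_rows(PROCEDURE_CSV)

# Each list should be empty when the tables are fully linked.
print(orphans(visits, persons, "person_id"))
print(orphans(procedures, persons, "person_id"))
print(orphans(procedures, visits, "visit_occurrence_id"))
```

To run the same check on the real data, the inline strings could be replaced with the contents of the CSV files listed above.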



### 2021 Census ethnicity figures for England and Wales

| Ethnic Group                                                                                   | Population (%) |
|------------------------------------------------------------------------------------------------|---------------|
| Asian or Asian British: Bangladeshi | 1.1 |
| Asian or Asian British: Chinese | 0.7 |
| Asian or Asian British: Indian | 3.1 |
| Asian or Asian British: Pakistani | 2.7 |
| Asian or Asian British: any other Asian background | 1.6 |
| Black or African or Caribbean or Black British: African | 2.5 |
| Black or African or Caribbean or Black British: Caribbean | 1 |
| Black or African or Caribbean or Black British: other Black or African or Caribbean background | 0.5 |
| Mixed multiple ethnic groups: White and Asian | 0.8 |
| Mixed multiple ethnic groups: White and Black African | 0.4 |
| Mixed multiple ethnic groups: White and Black Caribbean | 0.9 |
| Mixed multiple ethnic groups: any other Mixed or multiple ethnic background | 0.8 |
| White: English or Welsh or Scottish or Northern Irish or British | 74.4 |
| White: Irish | 0.9 |
| White: Gypsy or Irish Traveller | 0.1 |
| White: any other White background | 6.4 |
| Other ethnic group: any other ethnic group | 1.6 |
| Other ethnic group: Arab | 0.6 |
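
A minimal sketch of how the table could drive ethnicity assignment, assuming weighted random sampling. The percentages are copied from the table; the `sample_ethnicities` helper is hypothetical. The rounded figures sum to 100.1 rather than 100, which is harmless here because `random.choices` treats the weights as relative.

```python
import random
from collections import Counter

# Percentages copied from the table above.
ETHNICITY_PERCENT = {
    "Asian or Asian British: Bangladeshi": 1.1,
    "Asian or Asian British: Chinese": 0.7,
    "Asian or Asian British: Indian": 3.1,
    "Asian or Asian British: Pakistani": 2.7,
    "Asian or Asian British: any other Asian background": 1.6,
    "Black or African or Caribbean or Black British: African": 2.5,
    "Black or African or Caribbean or Black British: Caribbean": 1.0,
    "Black or African or Caribbean or Black British: other Black or African or Caribbean background": 0.5,
    "Mixed multiple ethnic groups: White and Asian": 0.8,
    "Mixed multiple ethnic groups: White and Black African": 0.4,
    "Mixed multiple ethnic groups: White and Black Caribbean": 0.9,
    "Mixed multiple ethnic groups: any other Mixed or multiple ethnic background": 0.8,
    "White: English or Welsh or Scottish or Northern Irish or British": 74.4,
    "White: Irish": 0.9,
    "White: Gypsy or Irish Traveller": 0.1,
    "White: any other White background": 6.4,
    "Other ethnic group: any other ethnic group": 1.6,
    "Other ethnic group: Arab": 0.6,
}


def sample_ethnicities(n, seed=0):
    """Draw n ethnic group labels in proportion to the census percentages."""
    rng = random.Random(seed)
    groups = list(ETHNICITY_PERCENT)
    weights = list(ETHNICITY_PERCENT.values())
    return rng.choices(groups, weights=weights, k=n)


counts = Counter(sample_ethnicities(10_000))
for group, count in counts.most_common(3):
    print(f"{group}: {count / 10_000:.1%}")
```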
29 changes: 29 additions & 0 deletions _projects/uclh_ngtube_s0/_04_image.md
@@ -0,0 +1,29 @@
# Example (synthetic) images

## Model

The model is an unconditional image generation diffusion model trained with the Hugging Face diffusers library. [1] Unconditional image generation models are not conditioned on text or images during training; they only generate images that resemble the training data distribution. Generation starts from a seed that produces a random noise vector, which the model then iteratively denoises into an output image similar to the images it was trained on.
The training script initializes a UNet2DModel as the network to be trained. [2] The training loop adds noise to the images, predicts the noise residual, calculates the loss, saves checkpoints at specified steps, and saves the trained model.
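
The core of that loop (add noise, predict the noise residual, compute the loss) can be illustrated in plain Python. This is a sketch of a DDPM-style step under assumptions, not the referenced script: it assumes a linear beta schedule from 1e-4 to 0.02 over 1000 timesteps and a noise-prediction objective with mean squared error, treats an image as a flat list of floats, and uses a hypothetical `predict_noise` callable in place of the UNet. The real script does this with PyTorch tensors and also runs the optimizer and checkpointing, which are omitted here.

```python
import math
import random

NUM_TIMESTEPS = 1000  # assumed schedule length

# Assumed linear beta schedule and its cumulative product of (1 - beta).
betas = [1e-4 + (0.02 - 1e-4) * t / (NUM_TIMESTEPS - 1) for t in range(NUM_TIMESTEPS)]
alphas_cumprod = []
running = 1.0
for beta in betas:
    running *= 1.0 - beta
    alphas_cumprod.append(running)


def add_noise(clean, noise, t):
    """Forward process: x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps."""
    signal_scale = math.sqrt(alphas_cumprod[t])
    noise_scale = math.sqrt(1.0 - alphas_cumprod[t])
    return [signal_scale * x + noise_scale * e for x, e in zip(clean, noise)]


def mse(predicted, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)


def training_step(clean, predict_noise, rng):
    """One loss evaluation: sample noise and a timestep, noise the image,
    ask the model for the noise residual, and compare it with the truth."""
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    t = rng.randrange(NUM_TIMESTEPS)
    noisy = add_noise(clean, noise, t)
    predicted = predict_noise(noisy, t)
    return mse(predicted, noise)


# Hypothetical untrained "model" that always predicts zero noise.
rng = random.Random(0)
image = [rng.uniform(-1.0, 1.0) for _ in range(64)]
loss = training_step(image, lambda noisy, t: [0.0] * len(noisy), rng)
print(f"loss with a zero predictor: {loss:.3f}")
```

A predictor that always returns zero gives a loss close to the variance of the sampled noise, so training amounts to pushing this value down by making the predicted residual match the noise that was actually added.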

## Training Dataset

The RANZCR CLiP dataset was used to train the model. [3] This dataset was created by the Royal Australian and New Zealand College of Radiologists (RANZCR), a not-for-profit professional organisation for clinical radiologists and radiation oncologists. The images were labelled against a set of definitions to keep labelling consistent. The normal category includes lines that were appropriately positioned and did not require repositioning. The borderline category includes lines that would ideally require some repositioning but would in most cases still function adequately in their current position. The abnormal category includes lines that required immediate repositioning. 30,000 images were used during training. All training images were 512x512 pixels in size.

## Computational Information

Training was conducted on RTX 6000 cards with 24GB of graphics memory. A checkpoint was saved after each epoch, with 220 checkpoints generated so far. Each checkpoint takes up 1GB of space. Each epoch takes around 6 hours to train. The referenced training script [2] runs on PyTorch through the diffusers library; additional libraries handle data preprocessing, visualization, or deployment.
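
Taking the figures above at face value, the cumulative cost so far can be estimated with a few lines of arithmetic. The sketch assumes exactly one checkpoint per epoch and that epochs ran back to back; neither the number of cards used in parallel nor any idle time is stated, so the result is only a rough indication.

```python
CHECKPOINTS = 220         # checkpoints generated so far, one per epoch (assumed)
GB_PER_CHECKPOINT = 1     # approximate size of each checkpoint
HOURS_PER_EPOCH = 6       # approximate training time per epoch

total_storage_gb = CHECKPOINTS * GB_PER_CHECKPOINT
total_hours = CHECKPOINTS * HOURS_PER_EPOCH
total_days = total_hours / 24

print(f"checkpoint storage: {total_storage_gb} GB")
print(f"training time: {total_hours} hours (about {total_days:.0f} days)")
```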

![Synthetic CXR Example 01](./images/1.png)
![Synthetic CXR Example 02](./images/2_2.png)
![Synthetic CXR Example 03](./images/10.png)
![Synthetic CXR Example 04](./images/20.png)
![Synthetic CXR Example 05](./images/36_2.png)
![Synthetic CXR Example 06](./images/37.png)
![Synthetic CXR Example 07](./images/39.png)
![Synthetic CXR Example 08](./images/44_2.png)
![Synthetic CXR Example 09](./images/44.png)
![Synthetic CXR Example 10](./images/45.png)

## References

1. https://huggingface.co/docs/diffusers/en/training/unconditional_training#unconditional-image-generation
2. https://github.com/huggingface/diffusers/blob/096f84b05f9514fae9f185cbec0a4d38fbad9919/examples/unconditional_image_generation/train_unconditional.py#L356
3. https://www.kaggle.com/competitions/ranzcr-clip-catheter-line-classification/data
Binary file added _projects/uclh_ngtube_s0/images/1.png
Binary file added _projects/uclh_ngtube_s0/images/10.png
Binary file added _projects/uclh_ngtube_s0/images/20.png
Binary file added _projects/uclh_ngtube_s0/images/2_2.png
Binary file added _projects/uclh_ngtube_s0/images/36_2.png
Binary file added _projects/uclh_ngtube_s0/images/37.png
Binary file added _projects/uclh_ngtube_s0/images/39.png
Binary file added _projects/uclh_ngtube_s0/images/44.png
Binary file added _projects/uclh_ngtube_s0/images/44_2.png
Binary file added _projects/uclh_ngtube_s0/images/45.png
29 changes: 15 additions & 14 deletions _projects/uclh_ngtube_s0/index.md
@@ -1,28 +1,29 @@
  ---
  layout: project
- title: UCLH Ngtube Synthetic Data
+ title: Detecting misplaced NG tubes
  status: ongoing
  tags: imaging, safety
  authors:
- - NIHR CRIU SAFEHR Team
+ - SAFEHR team

  tabs:
  - {
- name: uclh-ngt-s0-detail,
- type: md,
- source: _01_detail.md,
- label: Detail
+ name: uclh-ngt-s0-quarto,
+ type: html,
+ source: omop_concepts_frequency_table.html,
+ label: Data summary
  }
  - {
- name: uclh-ngt-s0-appx,
- type: md,
- source: _02_appendix.md,
- label: Appendix
+ name: uclh-ngt-s0-ehr-data,
+ type: md,
+ source: _03_data.md,
+ label: Synthetic EHR
  }
  - {
- name: uclh-ngt-s0-data,
- type: html,
- source: _03_data.html,
- label: Data
+ name: uclh-ngt-s0-image-data,
+ type: md,
+ source: _04_image.md,
+ label: Synthetic CXR
  }
  ---
