This repository has been archived by the owner on Dec 6, 2024. It is now read-only.

Sample type by GLIMS project name / owner #144

Merged 10 commits on Oct 18, 2023
13 changes: 13 additions & 0 deletions Changelog.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,19 @@ Changes in this log refer only to changes that make it to the 'main' branch. and

For changes in deployment, please see the [deployment changelog](deploy/cttso-ica-to-pieriandx-cdk/Changelog.md)

## 2023-10-18

> Author: Alexis Lucattini
> Email: [[email protected]](mailto:[email protected])

### Enhancement

* Add portal run id to sequencer run attribute of PierianDx Case Accession (https://github.com/umccr/cttso-ica-to-pieriandx/pull/142)
  * Resolves https://github.com/umccr/cttso-ica-to-pieriandx/issues/130

* Allow both sub_panel and subpanel as valid panel types (https://github.com/umccr/cttso-ica-to-pieriandx/pull/143)
  * Resolves https://github.com/umccr/cttso-ica-to-pieriandx/issues/139

## 2023-07-10

> Author: Alexis Lucattini
21 changes: 21 additions & 0 deletions deploy/cttso-ica-to-pieriandx-cdk/Changelog.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,27 @@
Changes in this log refer only to changes that make it to the 'main' branch and
are nested under deploy/cttso-ica-to-pieriandx-cdk.

## 2023-10-18

> Author: Alexis Lucattini
> Email: [[email protected]](mailto:[email protected])

### Enhancements

* Move to project owner / project name mapping logic (https://github.com/umccr/cttso-ica-to-pieriandx/pull/141)
  * And restructure LIMS sheet
  * Diagram also updated
  * Resolves:
    * https://github.com/umccr/cttso-ica-to-pieriandx/issues/131
    * https://github.com/umccr/cttso-ica-to-pieriandx/issues/132
    * https://github.com/umccr/cttso-ica-to-pieriandx/issues/134
    * https://github.com/umccr/cttso-ica-to-pieriandx/issues/135

* Add deleted sheet (https://github.com/umccr/cttso-ica-to-pieriandx/pull/140)
  * All cases assigned to user ToBe Deleted are moved to a separate sheet



## 2023-08-13

> Author: Alexis Lucattini
141 changes: 113 additions & 28 deletions deploy/cttso-ica-to-pieriandx-cdk/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -44,8 +44,13 @@ new_headers = [
"in_portal",
"in_redcap",
"in_pieriandx",
"glims_is_validation",
"glims_is_research",
"glims_project_owner",
"glims_project_name",
"glims_panel",
"glims_sample_type",
"glims_is_identified",
"glims_default_snomed_term",
"glims_needs_redcap",
"redcap_sample_type",
"redcap_is_complete",
"portal_wfr_id",
Expand All @@ -58,11 +63,12 @@ new_headers = [
"pieriandx_case_accession_number",
"pieriandx_case_creation_date",
"pieriandx_case_identified",
"pieriandx_assignee",
"pieriandx_panel_type",
"pieriandx_sample_type",
"pieriandx_workflow_id",
"pieriandx_workflow_status",
"pieriandx_report_status",
"pieriandx_report_status"
]

headers_df = pd.DataFrame(columns=new_headers)
Expand All @@ -87,35 +93,14 @@ print(new_spread.url)

## ctTSO LIMS Decision Tree

* Sample Types are determined by Google LIMS and RedCap
* If the ProjectName column in Google LIMS is set to _Validation_ or _Control_.
* Validation Sample goes through Validation Lambda
* If the Workflow column in Google LIMS is set to _Research_ AND Sample does not exist in RedCap
* Validation Sample goes through Validation Lambda
* If the Sample Type is Validation in RedCap
* Validation Sample goes through RedCap Lambda (but with SampleType set to Validation)

* If Sample Type is PatientCare in RedCap
* Patient Care Sample goes through RedCap Lambda (with SampleType set to PatientCare)

* If none of the above is true
* We assume this is a patient sample that is not in RedCap yet and hold off on running sample.

* Panel Types are coupled to the Sample Type
* If the Sample is a Validation Sample
* Panel Type will always be _MAIN_
* If the Sample is a Clinical Sample
* Panel Type will always be _SUBPANEL_

The following diagram(s) may be of assistance
Please see [#validation-or-clinical-script](#validation-or-clinical-script) for more information.

### Overview

![images/overview.drawio.png](images/overview.drawio.png)
> The following diagram may be of assistance

### Choose Launch Pathway
![images/overview.drawio.png](images/overview.drawio.png)

![images/choose-launch-pathway.drawio.png](images/choose-launch-pathway.drawio.png)

## Helpful scripts

Expand Down Expand Up @@ -253,7 +238,7 @@ Now change to the deployment directory (the directory this readme is in)
cd deploy/cttso-ica-to-pieriandx-cdk
```

### Wake up lamdas!
### Wake up lambdas!

Before we launch any payloads, let's ensure that the lambda (and any downstream lambdas)
are active.
Expand Down Expand Up @@ -282,6 +267,106 @@ Find the workflow with the subject id and library id of interest in the workflow
Use the Google LIMS page to check if your sample is a validation sample (ProjectName field is either _control_ or _validation_).
Validation samples do not go through the subpanel pipeline; clinical samples go through the subpanel pipeline.

We use the following JSON logic to determine the pathway for each PierianDx sample based on its project owner and project name.

This file can be found in `project-name-to-pieriandx-mapping.json`.

The mapping can be updated with the script `update_project_name_mapping.sh`.

This ssm parameter is NOT part of the cdk stack and MUST be updated using the script above.

```json
[
  {
    "project_owner": "VCCC",
    "project_name": "PO",
    "panel": "subpanel",
    "sample_type": "patient_care_sample",
    "is_identified": "identified",
    "default_snomed_term": null
  },
  {
    "project_owner": "Grimmond",
    "project_name": "COUMN",
    "panel": "subpanel",
    "sample_type": "patient_care_sample",
    "is_identified": "identified",
    "default_snomed_term": null
  },
  {
    "project_owner": "Tothill",
    "project_name": "CUP",
    "panel": "main",
    "sample_type": "patient_care_sample",
    "is_identified": "identified",
    "default_snomed_term": "Disseminated malignancy of unknown primary"
  },
  {
    "project_owner": "Tothill",
    "project_name": "PPGL",
    "panel": "main",
    "sample_type": "patient_care_sample",
    "is_identified": "identified",
    "default_snomed_term": null
  },
  {
    "project_owner": "TJohn",
    "project_name": "MESO",
    "panel": "subpanel",
    "sample_type": "patient_care_sample",
    "is_identified": "identified",
    "default_snomed_term": null
  },
  {
    "project_owner": "TJohn",
    "project_name": "OCEANiC",
    "panel": "subpanel",
    "sample_type": "patient_care_sample",
    "is_identified": "deidentified",
    "default_snomed_term": null
  },
  {
    "project_owner": "*",
    "project_name": "SOLACE2",
    "panel": "main",
    "sample_type": "patient_care_sample",
    "is_identified": "deidentified",
    "default_snomed_term": "Neoplastic disease"
  },
  {
    "project_owner": "SLuen",
    "project_name": "IMPARP",
    "panel": "main",
    "sample_type": "patient_care_sample",
    "is_identified": "deidentified",
    "default_snomed_term": "Neoplastic disease"
  },
  {
    "project_owner": "UMCCR",
    "project_name": "Control",
    "panel": "main",
    "sample_type": "validation",
    "is_identified": "deidentified",
    "default_snomed_term": "Neoplastic disease"
  },
  {
    "project_owner": "UMCCR",
    "project_name": "QAP",
    "panel": "subpanel",
    "sample_type": "patient_care_sample",
    "is_identified": "identified",
    "default_snomed_term": null
  },
  {
    "project_owner": "*",
    "project_name": "*",
    "panel": "main",
    "sample_type": "patient_care_sample",
    "is_identified": "deidentified",
    "default_snomed_term": "Neoplastic disease"
  }
]
```
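
As a rough illustration of how a mapping like the one above could be consulted, the sketch below picks the first entry whose `project_owner` and `project_name` both match, treating `*` as a wildcard. The function name `find_mapping`, the first-match-wins ordering, and the exact (case-sensitive) string comparison are assumptions made for this example; the lookup code in the repository is not shown in this diff and may behave differently. Only three of the entries above are reproduced.

```python
from typing import Dict, List, Optional

WILDCARD = "*"

# A subset of the entries from project-name-to-pieriandx-mapping.json
EXAMPLE_MAPPINGS: List[Dict[str, Optional[str]]] = [
    {
        "project_owner": "Tothill",
        "project_name": "CUP",
        "panel": "main",
        "sample_type": "patient_care_sample",
        "is_identified": "identified",
        "default_snomed_term": "Disseminated malignancy of unknown primary",
    },
    {
        "project_owner": "*",
        "project_name": "SOLACE2",
        "panel": "main",
        "sample_type": "patient_care_sample",
        "is_identified": "deidentified",
        "default_snomed_term": "Neoplastic disease",
    },
    {
        "project_owner": "*",
        "project_name": "*",
        "panel": "main",
        "sample_type": "patient_care_sample",
        "is_identified": "deidentified",
        "default_snomed_term": "Neoplastic disease",
    },
]


def _matches(pattern: Optional[str], value: str) -> bool:
    # Assumption: "*" matches any value, anything else must match exactly
    return pattern == WILDCARD or pattern == value


def find_mapping(
    mappings: List[Dict[str, Optional[str]]],
    project_owner: str,
    project_name: str,
) -> Optional[Dict[str, Optional[str]]]:
    """Return the first entry matching both owner and name, or None.

    Assumption: entries are evaluated in file order, so the catch-all
    ("*", "*") entry only applies when nothing earlier matched.
    """
    for entry in mappings:
        if _matches(entry["project_owner"], project_owner) and _matches(
            entry["project_name"], project_name
        ):
            return entry
    return None


if __name__ == "__main__":
    print(find_mapping(EXAMPLE_MAPPINGS, "Tothill", "CUP"))
    print(find_mapping(EXAMPLE_MAPPINGS, "SomeOwner", "SOLACE2"))
    print(find_mapping(EXAMPLE_MAPPINGS, "SomeOwner", "SomeProject"))
```

Under these assumptions, keeping the `*` / `*` entry last in the file is what makes it act as a fallback rather than shadowing the more specific entries.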

### Creating the input payloads file

3 changes: 3 additions & 0 deletions deploy/cttso-ica-to-pieriandx-cdk/constants.ts
Original file line number Diff line number Diff line change
Expand Up @@ -44,3 +44,6 @@ export const SSM_TOKEN_REFRESH_LAMBDA_FUNCTION_ARN_VALUE: string = "token-refres

// Output things
export const SSM_LAMBDA_FUNCTION_ARN_VALUE: string = "cttso-ica-to-pieriandx-lambda-function"

// Project Owner mapping path
export const SSM_PROJECT_NAME_TO_PIERIANDX_CONFIG_SSM_PATH: string = "cttso-lims-project-name-to-pieriandx-mapping"
Binary file not shown.
Binary file modified deploy/cttso-ica-to-pieriandx-cdk/images/overview.drawio.png
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,9 @@ def lambda_handler(event, context):
"library_id": "L1234567",
"case_accession_number": "SBJID_LIBID_123",
"ica_workflow_run_id": "wfr.123abc",
"panel_type": "main"
"panel_type": "main",
"sample_type": "validation",
"is_identified": False
}
"""

Expand Down Expand Up @@ -76,17 +78,30 @@ def lambda_handler(event, context):
f"for subject id '{subject_id}' / library id '{library_id}'")
raise ValueError

# Get panel type (or fall back to the default if none is given)
panel_type: str
if (panel_type := event.get("panel_type", None)) is None:
    panel_type = VALIDATION_DEFAULTS["panel_type"]

# Get sample type (or fall back to the default if none is given)
sample_type: str
if (sample_type := event.get("sample_type", None)) is None:
    sample_type = VALIDATION_DEFAULTS["sample_type"]

# Get is identified (or fall back to the default if none is given)
is_identified: str
if (is_identified := event.get("is_identified", None)) is None:
    is_identified = VALIDATION_DEFAULTS["is_identified"]

# Update sample_df with validation defaults
sample_df["sample_type"] = VALIDATION_DEFAULTS["sample_type"]
sample_df["sample_type"] = sample_type
sample_df["panel_type"] = panel_type
sample_df["is_identified"] = is_identified
sample_df["indication"] = VALIDATION_DEFAULTS["indication"]
sample_df["disease_id"] = VALIDATION_DEFAULTS["disease_id"]
sample_df["disease_name"] = VALIDATION_DEFAULTS["disease_name"]
sample_df["is_identified"] = VALIDATION_DEFAULTS["is_identified"]
sample_df["requesting_physicians_first_name"] = VALIDATION_DEFAULTS["requesting_physicians_first_name"]
sample_df["requesting_physicians_last_name"] = VALIDATION_DEFAULTS["requesting_physicians_last_name"]
sample_df["first_name"] = VALIDATION_DEFAULTS["first_name"]
sample_df["last_name"] = VALIDATION_DEFAULTS["last_name"]
sample_df["date_of_birth"] = VALIDATION_DEFAULTS["date_of_birth"]
sample_df["specimen_type"] = VALIDATION_DEFAULTS["specimen_type"]
sample_df["date_accessioned"] = VALIDATION_DEFAULTS["date_accessioned"]
sample_df["date_collected"] = VALIDATION_DEFAULTS["date_collected"]
Expand Down Expand Up @@ -124,18 +139,11 @@ def lambda_handler(event, context):
sample_df["accession_number"] = case_accession_number
sample_df["date_accessioned"] = datetime_obj_to_utc_isoformat(CURRENT_TIME)

# Convert times to utc time and strings
for date_column in ["date_received", "date_collected", "date_of_birth"]:
    sample_df[date_column] = sample_df[date_column].apply(
        lambda x: datetime_obj_to_utc_isoformat(handle_date(x))
    )

# Rename columns
logger.info("Rename external subject and external sample columns")
sample_df = sample_df.rename(
columns={
"external_sample_id": "external_specimen_id",
"external_subject_id": "mrn"
}
)

Expand All @@ -148,6 +156,31 @@ def lambda_handler(event, context):
axis="columns"
)

# For identified - we rename external subject id as the medical record number
if all(sample_df["is_identified"]):
    sample_df["first_name"] = VALIDATION_DEFAULTS["first_name"]
    sample_df["last_name"] = VALIDATION_DEFAULTS["last_name"]
    sample_df["date_of_birth"] = VALIDATION_DEFAULTS["date_of_birth"]
    sample_df = sample_df.rename(
        columns={
            "external_subject_id": "mrn"
        }
    )
# For deidentified - we rename the external subject id as the study subject identifier
else:
    sample_df["study_identifier"] = sample_df["project_name"]
    sample_df = sample_df.rename(
        columns={
            "external_subject_id": "study_subject_identifier"
        }
    )

# Convert times to utc time and strings
for date_column in ["date_received", "date_collected", "date_of_birth"]:
    sample_df[date_column] = sample_df[date_column].apply(
        lambda x: datetime_obj_to_utc_isoformat(handle_date(x))
    )

# Assert expected values exist
logger.info("Check we have all of the expected information")
for expected_column in EXPECTED_ATTRIBUTES:
Expand All @@ -158,10 +191,6 @@ def lambda_handler(event, context):
)
raise ValueError

if (panel_type := event.get("panel_type", None)) is None:
    panel_type = VALIDATION_DEFAULTS["panel_type"].name.lower()
sample_df["panel_type"] = panel_type

# Launch batch lambda function
accession_json: Dict = sample_df.to_dict(orient="records")[0]
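
The payload handling in this lambda can be summarised in a small stand-alone sketch: read optional keys from the event with a fallback to defaults, then choose which column the external subject id is renamed to. Everything named here is hypothetical: `EXAMPLE_DEFAULTS` stands in for `VALIDATION_DEFAULTS` (whose contents are not shown in this diff) and borrows its values from the example payload in the docstring, and the helpers `get_with_default`, `as_identified_flag` and `subject_id_column` do not exist in the repository. The string normalisation is also an assumption: the mapping file uses `"identified"` / `"deidentified"` while the example payload uses a boolean, and because non-empty strings are truthy in Python, a check such as `all(sample_df["is_identified"])` would treat `"deidentified"` as true unless the value is converted first. Whether that conversion happens elsewhere in the repository is not visible here.

```python
from typing import Any, Dict

# Hypothetical stand-in for VALIDATION_DEFAULTS, values taken from the
# example payload in the lambda docstring
EXAMPLE_DEFAULTS: Dict[str, Any] = {
    "panel_type": "main",
    "sample_type": "validation",
    "is_identified": False,
}


def get_with_default(event: Dict[str, Any], key: str, defaults: Dict[str, Any]) -> Any:
    """Return event[key], or defaults[key] when the key is missing or None."""
    value = event.get(key)
    return defaults[key] if value is None else value


def as_identified_flag(value: Any) -> bool:
    """Normalise an is_identified value to a bool.

    Assumption: accepts booleans as well as the strings "identified" and
    "deidentified" used in the project mapping file.
    """
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        normalised = value.strip().lower()
        if normalised == "identified":
            return True
        if normalised == "deidentified":
            return False
    raise ValueError(f"Cannot interpret is_identified value: {value!r}")


def subject_id_column(is_identified: bool) -> str:
    """Column that external_subject_id is renamed to, as in the diff above."""
    return "mrn" if is_identified else "study_subject_identifier"


if __name__ == "__main__":
    event = {"panel_type": "subpanel", "is_identified": "deidentified"}
    panel_type = get_with_default(event, "panel_type", EXAMPLE_DEFAULTS)
    sample_type = get_with_default(event, "sample_type", EXAMPLE_DEFAULTS)
    identified = as_identified_flag(
        get_with_default(event, "is_identified", EXAMPLE_DEFAULTS)
    )
    print(panel_type, sample_type, subject_id_column(identified))
```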
