Preserve unmapped values #163

ALightNHS · 2023-02-03T08:13:47Z

Is it possible to preserve the unmapped and missing/invalid source values when converting from source tables to the CDM tables?

PhilAppleby · 2023-02-03T15:55:17Z

I would need to know more about what you mean by "preserve".

Rejected data does not generate CDM output, as that would be meaningless.

ALightNHS · 2023-02-06T13:32:58Z

Thank you for your response.
I would like to see unmapped (potentially error-prone) source values in the CDM output, as this would help to identify data quality issues and inconsistencies where "similar" fields are captured in different systems.

For example, if patient height is stored in two different datasets, it would be useful to map these data sources, and then identify inconsistent source values per patient in the CDM.
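To make the height example concrete, here is a minimal sketch of the kind of check I have in mind. It is not part of CaRROT: the function name, the `(person_id, height_cm)` row shape, the tolerance, and the sample values are all hypothetical, and it assumes both datasets share a person identifier.

```python
from collections import defaultdict


def find_inconsistent_heights(rows_a, rows_b, tolerance_cm=1.0):
    """Return {person_id: (heights_a, heights_b)} for persons whose
    recorded heights differ by more than tolerance_cm across datasets.

    rows_a / rows_b are iterables of (person_id, height_cm) tuples
    (a hypothetical shape, chosen only for illustration).
    """
    by_person_a = defaultdict(list)
    by_person_b = defaultdict(list)
    for person_id, height in rows_a:
        by_person_a[person_id].append(height)
    for person_id, height in rows_b:
        by_person_b[person_id].append(height)

    inconsistent = {}
    # Only persons present in both datasets can be compared.
    for person_id in by_person_a.keys() & by_person_b.keys():
        heights_a = by_person_a[person_id]
        heights_b = by_person_b[person_id]
        combined = heights_a + heights_b
        # Flag the person if the spread of all recorded values is too large.
        if max(combined) - min(combined) > tolerance_cm:
            inconsistent[person_id] = (heights_a, heights_b)
    return inconsistent


# Made-up sample rows: person 2 looks like a digit transposition.
dataset_a = [(1, 172.0), (2, 160.5), (3, 181.0)]
dataset_b = [(1, 172.4), (2, 106.5), (4, 150.0)]

result = find_inconsistent_heights(dataset_a, dataset_b)
print(result)
```

With these sample rows only person 2 is flagged; persons 3 and 4 appear in just one dataset and are skipped.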

Another side to this question is: how would we map source fields containing unstructured text data to the CDM? It wouldn't be feasible to apply the same mapping logic in a JSON config to clinical notes.

PhilAppleby · 2023-02-09T11:46:39Z

Hello again, could you let me have more information on the particular use-case you have in mind?

The software was designed, in collaboration with data partners, to map from input values to output OMOP concepts. Placing an input value in the OMOP output, unless explicitly mapped as a "source_value", would be a violation of this principle.
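As a rough illustration of that principle (not the tool's actual code or rules format), the sketch below maps a source value to a concept, rejects anything without a mapping, and copies the raw value only when a "source_value" mapping is explicitly requested. The rule table, function name, input column names, and sample rows are hypothetical; 8507 and 8532 are the standard OMOP gender concept IDs, and `gender_concept_id` and `gender_source_value` are fields of the OMOP `person` table.

```python
# Hypothetical rule table: source value -> OMOP concept ID.
GENDER_CONCEPTS = {"M": 8507, "F": 8532}


def map_person(row, keep_source_value=False):
    """Map one source row to a person record, or return None if rejected.

    row is a dict with hypothetical keys "id" and "sex".
    """
    concept_id = GENDER_CONCEPTS.get(row["sex"])
    if concept_id is None:
        # No mapping exists: the row is rejected and produces no output.
        return None
    record = {"person_id": row["id"], "gender_concept_id": concept_id}
    if keep_source_value:
        # The raw input appears only when explicitly mapped as a source value.
        record["gender_source_value"] = row["sex"]
    return record


# Made-up input rows; "X" and "" have no mapping.
source_rows = [
    {"id": 1, "sex": "M"},
    {"id": 2, "sex": "X"},
    {"id": 3, "sex": "F"},
    {"id": 4, "sex": ""},
]

mapped = [map_person(r, keep_source_value=True) for r in source_rows]
accepted = [m for m in mapped if m is not None]
rejected_pct = 100.0 * (len(mapped) - len(accepted)) / len(mapped)

print(accepted)
print(rejected_pct)
```

In this sketch the unmapped values "X" and "" never reach the output; they only contribute to the rejected percentage.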

Also, with reference to your height example: if a person's information is captured in two different datasets, this tool will not be aware of it, because it processes each dataset in complete isolation. This is by design, as we could not use information from one dataset for the other unless we had explicit approvals to do so.

Additionally, a file "summary.tsv" is produced which contains no detailed data but gives an indication of rejected input numbers as percentages.

Finally, we do have manual methods for mapping clinical notes to OMOP concepts; you would need to contact our data team for guidance on that.

ALightNHS · 2023-02-09T13:36:28Z

Hi Phil,
Firstly, I would like to say thank you for your responses and patience. I am very excited about CaRROT and believe that it will have a significant impact.

With your permission, I would appreciate the opportunity to exchange emails to discuss this further.

Otherwise, I will try to explain further. I realise that my particular use-case for CaRROT (and the CDM in general) goes against their intended designs. I am trying to take advantage of the CDM's relational schema to integrate multiple data sources, identify inconsistencies, and then to diagnose and resolve these at source.

PhilAppleby · 2023-02-13T10:50:24Z

Hi there, as the development of CaRROT-CDM is funded by health data research projects we can discuss further if you contact me using my University of Dundee email account - [email protected]. Could you also identify yourself so I know with whom I'm talking?

ALightNHS · 2023-02-13T13:18:08Z

Hi Phil, my name is Anthony Lighterness - I'm a data scientist at The Christie NHS FT. I'll send you an email, thank you for that!

ALightNHS closed this as completed Feb 3, 2023

ALightNHS reopened this Feb 3, 2023
