QuantitativeSmokingStatusExtractionNLPTool

A rule-based NLP system to extract quantitative smoking information (e.g., pack-years) from clinical notes

Aim

The package extracts quantitative smoking information from clinical notes.

Smoking information type

pack per day
smoking year
quit year (e.g., quit for 10 years)
year at quit (e.g., quit in 2008)
pack-year
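
For reference, a pack-year is conventionally defined as the number of packs smoked per day multiplied by the number of years smoked. The sketch below illustrates that arithmetic only; it is not taken from the tool's rules, and the function name `pack_years` is hypothetical.

```python
def pack_years(packs_per_day: float, years_smoked: float) -> float:
    """Conventional pack-year definition: packs per day times years smoked."""
    if packs_per_day < 0 or years_smoked < 0:
        raise ValueError("inputs must be non-negative")
    return packs_per_day * years_smoked


# Example: 1.5 packs per day for 20 years
print(pack_years(1.5, 20))  # 30.0
```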

Requirement

Java 1.8

Input format

The input is a UTF-8 encoded CSV file. The table must have no header row (data rows only) and 5 columns, in this order:

1. note ID
2. patient ID
3. note Date
4. note Type
5. note text

Columns 1-4 may be filled with dummy text.
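
A minimal sketch of preparing such an input file with Python's standard `csv` module follows. The function name `write_notes_csv`, the file name `my_notes.csv`, the date format, and the example rows are all assumptions for illustration. Whether the tool accepts quoted fields is not stated here, so compare the result against sample.csv before relying on it.

```python
import csv


def write_notes_csv(path, notes):
    """Write 5-column note rows as a headerless UTF-8 CSV file."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)  # default quoting wraps fields that contain commas
        for row in notes:
            if len(row) != 5:
                raise ValueError(f"expected 5 columns, got {len(row)}")
            writer.writerow(row)


# Made-up example rows (note ID, patient ID, note date, note type, note text)
notes = [
    ("n1", "p1", "2020-01-15", "progress note",
     "Patient smoked 1 ppd for 20 years, quit in 2008."),
    ("n2", "p2", "2020-02-03", "discharge summary",
     "30 pack-year smoking history."),
]
write_notes_csv("my_notes.csv", notes)
```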

Output format

The output is a UTF-8 encoded TSV file with the following columns:

1. note ID
2. patient ID
3. note Date
4. note Type
5. extracted data type
6. extracted data value
7. a snippet showing where the extracted value is located in the text (50 characters before and after the value)

Columns 1-4 are copied from the input data.
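
The output can be loaded for downstream analysis along the following lines. This is a sketch under assumptions: the column names in `COLUMNS` are hypothetical labels for the seven documented columns, the output is assumed to have no header row, and the example line is made up rather than produced by the tool.

```python
import csv
import io

# Hypothetical names for the 7 output columns, in the documented order
COLUMNS = ["note_id", "patient_id", "note_date", "note_type",
           "data_type", "data_value", "snippet"]


def read_output_tsv(f):
    """Parse tab-separated output rows into dictionaries keyed by COLUMNS."""
    reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [dict(zip(COLUMNS, row)) for row in reader if row]


# Made-up output line for illustration; real labels and values may differ
example = io.StringIO(
    "n1\tp1\t2020-01-15\tprogress note\tpack-year\t30\t"
    "...30 pack-year smoking history...\n"
)
for record in read_output_tsv(example):
    print(record["data_type"], record["data_value"])
```

To read a real output file, pass a handle opened with `encoding="utf-8"` and `newline=""` instead of the in-memory example.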

How to run

Change to the project directory.
Run java -jar RuleBaseSmokingInfoExtraction.jar, or use run.sh (modify the arguments as necessary).

Example

We provide sample.csv for testing; see run.sh.

Note

This is a rule-based system.
We keep updating the rules to cover special cases.

Release

We have released RuleBaseSmokingInfoExtraction.jar.
We will release the source code.

Citation

Yang X, Yang H, Lyu T, Yang S, Guo Y, Bian J, Xu H, Wu Y. 
A Natural Language Processing Tool to Extract Quantitative Smoking Status from Clinical Narratives.
2020 IEEE International Conference on Healthcare Informatics (ICHI), 2020, pp. 1-2. 
doi: 10.1109/ICHI48887.2020.9374369.
PMID: 33173920; PMCID: PMC7654916.

https://ieeexplore.ieee.org/abstract/document/9374369
