wranglEHR

Overview

wranglEHR is a data wrangling and cleaning tool for CC-HIC. It is designed to run against the CC-HIC EAV (entity-attribute-value) table structure, which at present exists in PostgreSQL and SQLite flavours. We are about to begin a major rewrite targeting OHDSI CDM version 6, so expect this package to be in flux. Please see the R vignettes for further details on how to use the package to perform the most common tasks:

extract_demographics() produces a table of time-invariant data items.
extract_timevarying() produces a table of longitudinal data items.
clean() cleans the above tables according to pre-defined standards.

This package is designed to work in concert with inspectEHR which provides data quality evaluation for the CC-HIC.
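
To make the EAV idea concrete, here is a minimal base-R sketch of turning long entity-attribute-value rows into one row per episode with renamed columns. The column names (`episode_id`, `code_name`, `value`) and the values are illustrative assumptions, not the CC-HIC schema, and this is not how extract_demographics() is implemented internally.

```r
# Toy EAV rows: one row per (episode, data item). Values are made up.
eav <- data.frame(
  episode_id = c(1L, 1L, 2L, 2L),
  code_name  = c("NIHR_HIC_ICU_0017", "NIHR_HIC_ICU_0019",
                 "NIHR_HIC_ICU_0017", "NIHR_HIC_ICU_0019"),
  value      = c(1.70, 72, 1.82, 90),
  stringsAsFactors = FALSE
)

# Map data item codes to readable names (same pairing as the Usage example).
lookup <- c(NIHR_HIC_ICU_0017 = "height", NIHR_HIC_ICU_0019 = "weight")
eav$name <- unname(lookup[eav$code_name])

# Pivot to wide: one row per episode, one column per data item.
wide <- reshape(
  eav[, c("episode_id", "name", "value")],
  idvar = "episode_id", timevar = "name", direction = "wide"
)

# reshape() prefixes the new columns with "value."; strip that prefix.
names(wide) <- sub("^value\\.", "", names(wide))
print(wide)
```

The result has the columns `episode_id`, `height` and `weight`, which is the general shape of table the extraction functions return.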

Installation

# Install directly from GitHub (requires the remotes package)
remotes::install_github("DocEd/wranglEHR")
library(wranglEHR)

Usage

# Connect to the database (will use the internal test db)
ctn <- setup_dummy_db()

# Extract static variables. Rename on the fly.
dtb <- extract_demographics(
  connection = ctn,
  episode_ids = 1:10, # restrict extraction to these episodes
  code_names = c("NIHR_HIC_ICU_0017", "NIHR_HIC_ICU_0019"),
  rename = c("height", "weight")
)

head(dtb)
#> # A tibble: 6 × 2
#>   episode_id height
#>        <int>  <dbl>
#> 1          1  2.34 
#> 2          2  2.01 
#> 3          3  4.00 
#> 4          4 -0.318
#> 5          5  2.44 
#> # … with 1 more row

# Extract time varying variables. Rename on the fly.
ltb <- extract_timevarying(
  ctn,
  episode_ids = 1:10,
  code_names = "NIHR_HIC_ICU_0108",
  rename = "hr")
#> 3e-04 hours to process
#> WEE! How sublime was that?!

head(ltb)
#> # A tibble: 6 × 3
#>   episode_id  time    hr
#>        <int> <dbl> <int>
#> 1          1     0    91
#> 2          1     1    78
#> 3          1     2   102
#> 4          1     3    94
#> 5          1     4    69
#> # … with 1 more row

# Extract at any arbitrary temporal resolution, and define how values recorded
# at a higher resolution than the sampling cadence are combined.
# Here we extract only the first 24 hours of data.

ltb_2 <- extract_timevarying(
  ctn,
  episode_ids = 1:10,
  code_names = "NIHR_HIC_ICU_0108",
  rename = "hr",
  cadence = 2, # 1 row every 2 hours
  coalesce_rows = mean, # use mean to downsample to our 2 hour cadence
  time_boundaries = c(0, 24)
  )
#> 0.00026 hours to process
#> HUZZAH! How cat's meow was that?!

head(ltb_2)
#> # A tibble: 6 × 3
#>   episode_id  time    hr
#>        <int> <dbl> <dbl>
#> 1          1     0  84.5
#> 2          1     2 102  
#> 3          1     4  81.3
#> 4          1     6  80  
#> 5          1     8  80.3
#> # … with 1 more row

## Don't forget to turn the lights out as you leave.
DBI::dbDisconnect(ctn)
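
The idea behind `cadence` and `coalesce_rows` can be pictured with a small base-R sketch: rows are assigned to time windows of a fixed width, and each window is collapsed with a summary function. The `downsample()` helper below is hypothetical and the readings are made up. It assumes windows are formed by flooring the time to a multiple of the cadence, and the package may assign rows to windows differently.

```r
# Toy hourly heart-rate readings for one episode (made-up values).
hourly <- data.frame(
  episode_id = 1L,
  time = 0:5,
  hr = c(90, 80, 100, 104, 70, 90)
)

# Hypothetical helper (not part of wranglEHR): bin `time` into windows of
# `cadence` hours, then collapse each window with `coalesce_rows`.
downsample <- function(df, cadence, coalesce_rows = mean) {
  # Assumption: a reading at time t belongs to the window starting at
  # floor(t / cadence) * cadence.
  df$time <- floor(df$time / cadence) * cadence
  aggregate(hr ~ episode_id + time, data = df, FUN = coalesce_rows)
}

two_hourly <- downsample(hourly, cadence = 2)
print(two_hourly)
```

Passing a different function, for example `coalesce_rows = max`, changes how the readings within a window are combined without changing the windowing itself.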

Getting help

If you find a bug, please file an issue with a minimal reproducible example on GitHub.

https://www.ohdsi.org/analytic-tools/achilles-for-data-characterization/
Kahn, Michael G.; Callahan, Tiffany J.; Barnard, Juliana; Bauck, Alan E.; Brown, Jeff; Davidson, Bruce N.; Estiri, Hossein; Goerg, Carsten; Holve, Erin; Johnson, Steven G.; Liaw, Siaw-Teng; Hamilton-Lopez, Marianne; Meeker, Daniella; Ong, Toan C.; Ryan, Patrick; Shang, Ning; Weiskopf, Nicole G.; Weng, Chunhua; Zozus, Meredith N.; and Schilling, Lisa (2016) “A Harmonized Data Quality Assessment Terminology and Framework for the Secondary Use of Electronic Health Record Data,” eGEMs (Generating Evidence & Methods to improve patient outcomes): Vol. 4: Iss. 1, Article 18.
