-
Notifications
You must be signed in to change notification settings - Fork 22
Pipeline #148
Conversation
Codecov Report
@@ Coverage Diff @@
## master #148 +/- ##
==========================================
- Coverage 75.65% 72.56% -3.09%
==========================================
Files 14 15 +1
Lines 1269 1323 +54
==========================================
Hits 960 960
- Misses 309 363 +54
Continue to review full report at Codecov.
|
Is the AppVeyor failure specific to this PR, or a long-standing problem? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good (though I don't really know R!). My main comment would be that it needs a little more user-oriented documentation & commenting if others are to use it. But I agree the focus should be getting it to work first :)
names | ||
|
||
if(length(types) == 1) { | ||
if(cname == 'NIHR_HIC_ICU_0005'){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth adding a comment explaining this special case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a comment about this
* data.table, | ||
* yaml, | ||
* dbplyr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't mentioned earlier? Typo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dbplyr must be installed on the system. But you never explicitly use it. It is a bit strange that way
* pander, | ||
* Rcpp, | ||
* methods | ||
* RPostgreSQL, | ||
* XML, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably once the DB connector is working we could strip out the old XML handling code?
But not in this PR!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know about that. Based on Sinan's messages about the package being general purpose, keeping the XML handling seems like it could be sensible since it may not be used just for the database we have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll ask this morning :)
R/dbConnection.R
Outdated
#' codes that are available with their meanings, and | ||
#' what the different columns mean for them. | ||
#' | ||
#' This functions returns a loaded into memory R table, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/functions/function/
|
||
#' The metadata table. This is all of the different | ||
#' codes that are available with their meanings, and | ||
#' what the different columns mean for them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth adding to this what the structure of the table is, so callers know how to use it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have given that a go
It is really great to see you guys starting to implement the database approach. Although I don't know exactly what you are planning to do, cleanEHR now has been branded as a more generic data cleaning package, at least this is what we said in the JOSS paper, which is still under review. Moving from ccdata to database fits the purpose well. As you named the branch as "pipeline", I wonder if this is the data processing pipeline only for CCHIC, e.g. import XML files from hospitals to the database, you might want to consider to move it to a separate package, at least until the JOSS paper is published. |
@sinanshi - the database importer is done in a different package, this is only to read from a database and generate a ccdata object. |
@dpshelio that's great. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
R/dbConnection.R
Outdated
#' | ||
#' This functions returns a loaded into memory R table, | ||
#' * Code_name is the XML code of this type of data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that actually a capital C? Would be nice to be consistent with the other column names if it is.
R/dbConnection.R
Outdated
#' This functions returns a loaded into memory R table, | ||
#' * Code_name is the XML code of this type of data | ||
#' * long_name is an english description of what it is | ||
#' * primary_column is what columns of the events table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/columns/column/
(I know, I'm nitpicking!)
This basically allows you to create ccd objects not from the XML or the RData, but rather than the database using the following:
con <- connect(username="something", database="something")
ccd <- table.to.ccdata(exportData(con) %>% collect, metadata(connection = con))
The only real changes are in R/dbConnection.R [and the README but those are trivial]