Skip to content
This repository has been archived by the owner on Aug 23, 2022. It is now read-only.

Pipeline #148

Closed
wants to merge 10 commits into from
Closed

Pipeline #148

wants to merge 10 commits into from

Conversation

klapaukh
Copy link
Collaborator

This basically allows you to create ccd objects not from the XML or the RData, but rather than the database using the following:
con <- connect(username="something", database="something")
ccd <- table.to.ccdata(exportData(con) %>% collect, metadata(connection = con))

The only real changes are in R/dbConnection.R [and the README but those are trivial]

@klapaukh klapaukh requested review from dpshelio and jonc125 November 21, 2017 03:16
@codecov-io
Copy link

codecov-io commented Nov 21, 2017

Codecov Report

Merging #148 into master will decrease coverage by 3.08%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #148      +/-   ##
==========================================
- Coverage   75.65%   72.56%   -3.09%     
==========================================
  Files          14       15       +1     
  Lines        1269     1323      +54     
==========================================
  Hits          960      960              
- Misses        309      363      +54
Impacted Files Coverage Δ
R/dbConnection.R 0% <0%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a21af68...a5995eb. Read the comment docs.

@jonc125
Copy link

jonc125 commented Nov 21, 2017

Is the AppVeyor failure specific to this PR, or a long-standing problem?

@jonc125
Copy link

jonc125 commented Nov 21, 2017

Seems to have been failing for a while, including all of #145. From #135 it may be caused by importing dplyr?

Copy link

@jonc125 jonc125 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good (though I don't really know R!). My main comment would be that it needs a little more user-oriented documentation & commenting if others are to use it. But I agree the focus should be getting it to work first :)

names

if(length(types) == 1) {
if(cname == 'NIHR_HIC_ICU_0005'){
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth adding a comment explaining this special case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment about this

* data.table,
* yaml,
* dbplyr,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't mentioned earlier? Typo?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dbplyr must be installed on the system. But you never explicitly use it. It is a bit strange that way

* pander,
* Rcpp,
* methods
* RPostgreSQL,
* XML,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably once the DB connector is working we could strip out the old XML handling code?

But not in this PR!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know about that. Based on Sinan's messages about the package being general purpose, keeping the XML handling seems like it could be sensible since it may not be used just for the database we have?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll ask this morning :)

R/dbConnection.R Outdated
#' codes that are available with their meanings, and
#' what the different columns mean for them.
#'
#' This functions returns a loaded into memory R table,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/functions/function/


#' The metadata table. This is all of the different
#' codes that are available with their meanings, and
#' what the different columns mean for them.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth adding to this what the structure of the table is, so callers know how to use it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have given that a go

@sinanshi
Copy link
Collaborator

It is really great to see you guys starting to implement the database approach. Although I don't know exactly what you are planning to do, cleanEHR now has been branded as a more generic data cleaning package, at least this is what we said in the JOSS paper, which is still under review. Moving from ccdata to database fits the purpose well. As you named the branch as "pipeline", I wonder if this is the data processing pipeline only for CCHIC, e.g. import XML files from hospitals to the database, you might want to consider to move it to a separate package, at least until the JOSS paper is published.

@dpshelio
Copy link
Collaborator

@sinanshi - the database importer is done in a different package, this is only to read from a database and generate a ccdata object.

@sinanshi
Copy link
Collaborator

@dpshelio that's great.

Copy link

@jonc125 jonc125 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

R/dbConnection.R Outdated
#'
#' This functions returns a loaded into memory R table,
#' * Code_name is the XML code of this type of data
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that actually a capital C? Would be nice to be consistent with the other column names if it is.

R/dbConnection.R Outdated
#' This functions returns a loaded into memory R table,
#' * Code_name is the XML code of this type of data
#' * long_name is an english description of what it is
#' * primary_column is what columns of the events table
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/columns/column/

(I know, I'm nitpicking!)

@maelle maelle closed this Aug 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants