-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add context (res id and name) when downloading data #24
Conversation
This is great! I`m probably leaning toward having it optional (but perhaps default to TRUE for get_dataset) and I think it might be worth having the option for get_resource as well (but would perhaps default to FALSE for get_resource)? |
Also, improve the clarity of the error message (likely only developers will see this error)
This is currently used throughout the package.
Also make tests skip if it looks like the OpenData platform isn't available.
From discussion
|
Tests should run faster
@csillasch and @ross-hull I think this is done now. The modified date still returns as NA if nothing is returned from the API (we discussed possibly setting it to the created date but don't remember the final decision? We talked about adding some additional tests, I've added some but please let me know if you want anything else specifically testing - or importantly if you find a resource / dataset this isn't working for in which case I'll definitely add that and fix whatever the issue is! |
This looks good to me. And using it as a user was simple. One thing that I wondered is if it is worth having the same naming convention between the new context columns, as is already used in the open data? The context columns are snake case while the others are upper camel. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
I spotted unexpected behaviour when testing on the General Practitioner Contact Details dataset: modified date is returned correctly using get_dataset but returns NULL with get_resource("647b256e-4a03-4963-8402-bf559c9e2fff")
Suggested a change to sort this which works for the resource but might be worth testing against some of the others - especially where it worked fine previously.
Fair enough, do these names look good?
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get_dataset seems to handle it fine but get_resource is returning the dates as lists - all your changes look OK to me so not sure how/why this is happening
This was the advice I got back from the `{vctrs}` team.
The problem was highlighted in this teams chat, when using
get_dataset
if the dataset doesn't include a variable likedate
it becomes impossible to tell which rows are from which resource asget_dataset
just joins them into a single tibble.This PR introduces an
include_context
parameter which isFALSE
by default to bothget_resource
andget_dataset
ifTRUE
they will now append a few variables:"res_id", "res_name", "res_created_date" and "res_modified_date"
to the front of the data.For
get_dataset
this is achieved with no additional API calls as all the needed info is included when we hitpackage_show
for the list of IDs. Forget_resource
this involves an additional API call toresource_show
(we haven't used this before).I chose to have them both be
FALSE
by default so it won't break any existing code.