Check that dataset being pulled is the most recent #14

ciarag01 · 2023-02-08T10:25:45Z

As discussed with @rmccreath:

I use the GP Practice Contact Details and List Sizes dataset. I currently pull it with get_resource and the dataset ID, but the ID has to be updated whenever the dataset is updated, which is easy to miss. Looking at get_dataset instead, setting max_resources to 1 does seem to give the most recent file every time, but this isn't obvious from the documentation or from the data returned. Possible fixes:

Update the documentation to explain how the resources are ordered.
Add a new function that pulls back only the most recent data. See: Get latest resource #36
Change get_dataset so that it adds a date (maybe the date when the dataset was updated?) to the data when it is pulled from open data. Not sure how easy this is, just an idea! See: Add context (res id and name) when downloading data #24
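A rough sketch of how the second and third ideas could fit together, written in Python purely to illustrate the logic (the package itself is R). Everything named here is an assumption for illustration: the `latest_resource` and `add_context` helpers, the `id`, `name` and `last_modified` metadata fields, the `PracticeCode` column, and the made-up IDs and names. None of it is confirmed to match the package or the open data API.

```python
from datetime import datetime


def latest_resource(resources):
    """Return the resource with the newest 'last_modified' timestamp.

    Comparing dates explicitly avoids relying on the order the API returns.
    """
    if not resources:
        raise ValueError("dataset has no resources")
    return max(resources, key=lambda r: datetime.fromisoformat(r["last_modified"]))


def add_context(rows, resource):
    """Tag every data row with the ID, name and date of its source resource."""
    context = {
        "res_id": resource["id"],
        "res_name": resource["name"],
        "res_modified": resource["last_modified"],
    }
    return [{**context, **row} for row in rows]


# Hypothetical metadata, deliberately not in date order.
resources = [
    {"id": "aaa-111", "name": "Practice Details October 2022", "last_modified": "2022-10-11T09:00:00"},
    {"id": "ccc-333", "name": "Practice Details January 2023", "last_modified": "2023-01-10T09:00:00"},
    {"id": "bbb-222", "name": "Practice Details July 2022", "last_modified": "2022-07-12T09:00:00"},
]

newest = latest_resource(resources)
rows = add_context([{"PracticeCode": 10002}], newest)  # hypothetical data row
print(newest["name"], rows[0]["res_modified"])
```

With the context columns attached, anyone reading the output can see which file the data came from without checking IDs by hand.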

I've mentioned the GP Practice Contact Details and List Sizes dataset because that's how this came up for me, but this would likely be useful for any dataset that gets updated regularly.

Moohan · 2023-02-08T10:46:39Z

I had exactly this use case in mind when I first built get_dataset(). I don't think the package does any explicit sorting of the IDs / datasets, so we're just relying on the API returning them in age order (which I'm pretty certain it does).

I think we should confirm that the API will always list datasets in age order (someone who knows the backend will have to look this up).

There's probably no scope to sort by the dataset IDs themselves as, to me, they just seem like random strings.

Pending the checks on the API, my preference would be to update the documentation to make it clear that it returns them in age order. We could also modify the param to take "all" or "latest" (default) as well as a specific number, which might help with clarity.
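To make the parameter idea concrete, here is a minimal sketch in Python (illustration only, the package itself is R). The `select_resources` helper is hypothetical, and it assumes the list of resource IDs is already ordered newest first, which is exactly the point that still needs confirming against the API.

```python
def select_resources(resource_ids, max_resources="latest"):
    """Pick resource IDs given "latest", "all", or a positive whole number.

    Assumes resource_ids is already ordered newest first.
    """
    if max_resources == "latest":
        return resource_ids[:1]
    if max_resources == "all":
        return list(resource_ids)
    # bool is a subclass of int in Python, so reject it explicitly.
    is_whole_number = isinstance(max_resources, int) and not isinstance(max_resources, bool)
    if is_whole_number and max_resources > 0:
        return resource_ids[:max_resources]
    raise ValueError('max_resources must be "latest", "all", or a positive whole number')


# Hypothetical IDs, newest first.
ids = ["newest-id", "middle-id", "oldest-id"]

default_pick = select_resources(ids)        # same as "latest"
everything = select_resources(ids, "all")
first_two = select_resources(ids, 2)
```

Making "latest" the default keeps the common case (regularly updated datasets) short, while "all" and a number cover the existing behaviour.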

rmccreath added the enhancement New feature or request label Feb 8, 2023
