Use DNA/Culture-specific views for Chicago Beach Lab Data #332
Comments
Hey @levyj, is there something on our end we need to do to get these data sets straightened out? If you have a login for the system, you can register them. If not, we can work with you to get them set up and get the data ingested.
@vforgione - Sorry for the delay in replying. How about if we did the following?
I certainly can register Number 2 but we probably should make sure Number 1 seems acceptable first. Thanks.
I've updated the existing data set and added a new one. The DNA data set is coming back just fine via the API, but the culture data is still funky. I looked at the exported CSV and there are a lot of null values for the location field, and many of the timestamps are being parsed incorrectly in the database.
Thanks. There is no way to filter out bad records on your end, is there? If not, how about if I set up a separate, hidden filtered view that excludes problematic records? It would be papering over problems instead of really solving them but that may be acceptable in this case. Data ending in 2016 will not be of huge interest, anyway.
As it stands, there really isn't a good way. We've tried several tweaks to the ETL process and it inevitably breaks something else (hence the work on a new platform). A view that filters those out would be the better option right now.
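For anyone following along, the kind of filtering discussed above (dropping rows with a null location or a timestamp that will not parse) could look roughly like the sketch below. This is only an illustration, not what the portal's filtered view or the Plenario ETL actually does. The field names `location` and `timestamp`, the ISO 8601 timestamp format, and the helper names are all assumptions made for the example.

```python
from datetime import datetime
from typing import Dict, Iterable, List, Optional


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Return a datetime for an ISO 8601 string, or None if it cannot be parsed."""
    if not value:
        return None
    try:
        return datetime.fromisoformat(value)
    except (ValueError, TypeError):
        return None


def filter_clean_records(rows: Iterable[Dict]) -> List[Dict]:
    """Keep only rows with a non-empty location and a parseable timestamp.

    'location' and 'timestamp' are hypothetical column names chosen for
    this sketch; the real export may name them differently.
    """
    clean = []
    for row in rows:
        if not row.get("location"):
            continue  # null or empty location
        if parse_timestamp(row.get("timestamp")) is None:
            continue  # missing or unparseable timestamp
        clean.append(row)
    return clean
```

Like the hidden view, this hides the problematic records rather than repairing them, so the row count after filtering is the number to compare against what the database reports.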
Let's see how this works: https://data.cityofchicago.org/Parks-Recreation/Beach-Lab-Data-Culture-Tests-For-Plenario/84eh-sf3p
That worked better. I have a match on the number of rows, and can directly query it in the db. The API is still reporting no data and I'm getting nothing in my error report. I'm gonna have to dig deeper on this.
@HeyZoos, I added a new table for the culture tests (dataset name is
Hey guys, sorry it took me so long to get on the issue. I was moving and so was out of action for a good while. Issuing these two queries seems to yield data:
grabs 2935 records of beach lab dna (including 2017 data) and:
grabs 629 records of beach lab cultures.
It could be that if you guys had issued queries before the ETL completed, an empty result was cached. If I remember correctly, the cache takes half an hour to refresh.
Are you able to see the dna and culture data @levyj?
I just removed the old data set. I'm not sure about the information reported in the explorer app, but going through the API and checking the database we definitely have and are producing all the data ingested from the view you provided us.
As noted in Chicago/opengrid#309, http://plenar.io/explore/event/beach_lab_data has not updated properly since 2016 due to our reconfiguration of the underlying dataset to include both Culture and DNA tests, meaning that no one date field is consistently populated.
It probably would make sense to split the Plenario dataset into one dataset per test type, although DNA seems like the higher priority since there are no culture tests this year. We have created filtered views that can be used as source "datasets":
https://data.cityofchicago.org/Parks-Recreation/Beach-Lab-Data-DNA-Tests/hmqm-anjq
https://data.cityofchicago.org/Parks-Recreation/Beach-Lab-Data-Culture-Tests/hh4t-tnq8
CC: @tomschenkjr @nicklucius