-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
"soil data aggregation engine" vs. wrangling SQL statements or ad hoc R code #178
Comments
Here is a small InterpretationEngine proof of concept involving WCS and custom data aggregation I made last week while testing the new R package. https://github.com/ncss-tech/interpretation-engine/blob/master/misc/demo-SDA-WCS-IE-integration.R EDIT: Images of output and 3D rendering of "Dunn1" SVI interp output |
#179 adds the ssurgoOnDemand queries in I think we could probably get some more stub queries that target some other data categories from here |
Is there ever a valid reason to convert NULL values → 0? Current SOD:
Changes made in #185 remove this step. Related examples: library(aqp)
library(soilDB)
library(sf)
library(rasterVis)
library(viridisLite)
# SJQ | Amador Co.
# Mokelumne | Sed. Rock Land
# https://casoilresource.lawr.ucdavis.edu/gmap/?loc=38.46643,-121.02565,z14
bb <- '-121.1099 38.4288,-121.1099 38.4942,-120.9728 38.4942,-120.9728 38.4288,-121.1099 38.4288'
wkt <- sprintf('POLYGON((%s))', bb)
x <- st_as_sfc(wkt)
st_crs(x) <- 4326
(mu <- mukey.wcs(aoi = x, db = 'gnatsgo'))
levelplot(mu, att = 'ID', margin = FALSE, colorkey = FALSE, col.regions = viridis)
# get unique mukeys from grid
ll <- levels(mu)[[1]]
s <- get_SDA_property(property = 'pH 1:1 water - Rep Value', method = 'Weighted Average', mukeys = ll$ID, top_depth = 0, bottom_depth = 25)
## with ISNULL(x, 0)
## !! what? 0s?
s[which(s$ph1to1h2o_r == 0), ]
z <- fetchSDA("mukey = '461915'")
z$ph1to1h2o_r
## dilution of weighted mean due to weights associated with NULL values
## document this
# another issue: weighted averages
z <- fetchSDA("mukey = '461984'")
z$ph1to1h2o_r |
I think this issue does not have any remaining action items to address and suggest we close it. In my opinion UDFs are not a replacement for basic queries that operate on custom data sources using standard syntax. While UDFs may provide an easy route to the "right" answers, they require documentation, maintenance, and testing of their own. I think the functionality offered would have to be very significant to warrant overhead of maintaining a separate library of these functions. Much of my thought behind the work porting SOD/SDA queries, mostly avoiding pure R implementations in soilDB, was to provide a mapping of query string templates to specific categories of tabular results. Where a query relies on specific versions of UDFs, and the logic is not otherwise embedded in the query, it would be harder to port that query to other systems or for others to replicate. The current approach in soilDB is a mix of the two "short-term" plans involving string templates and/or native R functions. I don't think that we need an issue in soilDB to discuss the merits of whether we should continue to develop functions like these. Since this issue was first raised, we have been addressing issues with WSS parity, bugs, etc in specific issues as they arise and I think we can now consider these to be in the "long term" plan for soilDB. soilDB is a great place to prototype potential upgrades for SDA--for example the "MLRAPOLYGON" functionality recently added to In the future, converting to {dbplyr} for programmatic building of SQL could further enhance our ability to do aggregation and generalize to different database interfaces. While this is a nice idea it would of course be a significant change in terms of dependencies, would require refactoring most of the existing code, and would not work easily with the SDA web service (which is not a DBI source per se). |
Good points. I agree that this isn't really a soilDB issue and can be closed. I plan to purse the aggregation engine outside of SDA for now, as a long-term solution to this problem. SDA works well and does not need a bunch of new things bolted-on. |
Great. I would love to see some of the work you have done with soil data aggregation in Postgres for SoilWeb/ISSR800 and the like made available in some reusable form. |
Long-term plan: new macros or UDF to support aggregation on the fly in SQL and parity with WSS output. Contractors are working towards this, but no ETA.
Short-term plans:
slab
,data.table
, etc.) should match SOD / SDDT output when possibleSee
misc/soil-data-aggregation
for stubs that can be built-upon for both styles. The (eventual) SDA-based solution will solve most problems, but R-based solutions will be required for data that hasn't been or cannot be pushed to SDA.The text was updated successfully, but these errors were encountered: