diff --git a/NEWS.md b/NEWS.md
index 63798282..abba73d2 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,3 +1,9 @@
+# soilDB 2.7.9.9000 (2023-09-18)
+ - `fetchSDA_spatial()` gains `geom.src="mlrapolygon"` for obtaining Major Land Resource Area (MLRA) polygon boundaries. When using this geometry source `x` is a vector of `MLRARSYM` (MLRA Symbols).
+
+ - The geometry source is the MLRA Geographic Database v5.2 (2022) which is not (yet) part of Soil Data Access. Instead of SDA, GDAL utilities are used to read a zipped ESRI Shapefile from a remote URL: <https://www.nrcs.usda.gov/sites/default/files/2022-10/MLRA_52_2022.zip>. Therefore, most additional `fetchSDA_spatial()` arguments are _not_ currently supported for the MLRA geometry source. In the future a `mlrapolygon` table may be added to SDA (analogous to `mupolygon` and `sapolygon`), and the function will be updated accordingly at that time.
+
+
 # soilDB 2.7.9 (2023-09-01)
  - Added new `method` options for `fetchSDA_spatial()`. Aggregation grouping is controlled by the `by.col` argument. This works for mapunit and survey area polygon geometries, aggregating all polygons in the group for each `mukey`, `nationalmusym`, `lkey`, or `areasymbol` extent.
 
diff --git a/R/fetchSDA_spatial.R b/R/fetchSDA_spatial.R
index 8a02a762..aff67cf3 100644
--- a/R/fetchSDA_spatial.R
+++ b/R/fetchSDA_spatial.R
@@ -4,10 +4,10 @@
 #'
 #' A Soil Data Access query returns geometry and key identifying information about the map unit or area of interest. Additional columns from the map unit or legend table can be included; see `add.fields` argument.
 #'
-#' @param x A vector of map unit keys (`mukey`) or national map unit symbols (`nmusym`) for `mupolygon` geometry OR legend keys (`lkey`) or soil survey area symbols (`areasymbol`) for `sapolygon` geometry
+#' @param x A vector of map unit keys (`mukey`) or national map unit symbols (`nmusym`) for `mupolygon` geometry OR legend keys (`lkey`) or soil survey area symbols (`areasymbol`) for `sapolygon` geometry. If `geom.src="mlrapolygon"` then `x` refers to `MLRARSYM` (major land resource area symbols).
 #' @param by.col Column name containing map unit identifier `"mukey"`, `"nmusym"`/`"nationalmusym"` for `geom.src` `mupolygon` OR `"areasymbol"`, `"areaname"`, `"mlraoffice"`, `"mouagncyresp"` for `geom.src` `sapolygon`; default is determined by `is.numeric(x)` `TRUE` for `mukey` or `lkey` and `nationalmusym` or `areasymbol` otherwise.
 #' @param method geometry result type: `"feature"` returns polygons, `"bbox"` returns the bounding box of each polygon (via `STEnvelope()`), `"point"` returns a single point (via `STPointOnSurface()`) within each polygon, `"extent"` returns an aggregate bounding box (the extent of all polygons, `geometry::EnvelopeAggregate()`) ), `"convexhull"` (`geometry::ConvexHullAggregate()`) returns the aggregate convex hull around all polygons, `"union"` (`geometry::UnionAggregate()`) and `"collection"` (`geometry::CollectionAggregate()`) return a `MULTIPOLYGON` or a `GEOMETRYCOLLECTION`, respectively, for each `mukey`, `nationalmusym`, or `areasymbol `. In the case of the latter four aggregation methods, the groups for aggregation depend on `by.col` (default by `"mukey"`).
-#' @param geom.src Either `mupolygon` (map unit polygons) or `sapolygon` (soil survey area boundary polygons)
+#' @param geom.src Either `mupolygon` (map unit polygons), `sapolygon` (soil survey area boundary polygons), or `mlrapolygon` (major land resource area boundary polygons)
 #' @param db Default: `"SSURGO"`. When `geom.src` is `mupolygon`, use STATSGO polygon geometry instead of SSURGO by setting `db = "STATSGO"`
 #' @param add.fields Column names from `mapunit` or `legend` table to add to result. Must specify parent table name as the prefix before column name e.g. `mapunit.muname`.
 #' @param chunk.size Number of values of `x` to process per query. Necessary for large results. Default: `10`
@@ -21,10 +21,17 @@
 #' Querying regions with complex mapping may require smaller `chunk.size`. Numerically adjacent IDs in the input vector may share common qualities (say, all from same soil survey area or region) which could cause specific chunks to perform "poorly" (slow or error) no matter what the chunk size is. Shuffling the order of the inputs using `sample()` may help to eliminate problems related to this, depending on how you obtained your set of MUKEY/nationalmusym to query. One could feasibly use `muacres` as a heuristic to adjust for total acreage within chunks.
 #'
 #' Note that STATSGO data are fetched where `CLIPAREASYMBOL = 'US'` to avoid duplicating state and national subsets of the geometry.
-#' 
+#'
+#' A prototype interface, `geom.src="mlrapolygon"`, is provided for obtaining Major Land Resource Area (MLRA) polygon
+#' boundaries. When using this geometry source `x` is a vector of `MLRARSYM` (MLRA Symbols). The geometry source is
+#' the MLRA Geographic Database v5.2 (2022) which is not (yet) part of Soil Data Access. Instead of SDA, GDAL utilities
+#' are used to read a zipped ESRI Shapefile from a remote URL: <https://www.nrcs.usda.gov/sites/default/files/2022-10/MLRA_52_2022.zip>.
+#' Therefore, most additional `fetchSDA_spatial()` arguments are _not_ currently supported for the MLRA geometry source.
+#' In the future a `mlrapolygon` table may be added to SDA (analogous to `mupolygon` and `sapolygon`),
+#' and the function will be updated accordingly at that time.
+#'
 #' @author Andrew G. Brown, Dylan E. Beaudette
 #' @examplesIf curl::has_internet()
-#' @examples
 #' \donttest{
 #'
 #' # get spatial data for a single mukey
@@ -60,12 +67,14 @@ fetchSDA_spatial <- function(x,
                              chunk.size = 10,
                              verbose = TRUE,
                              as_Spatial = getOption('soilDB.return_Spatial', default = FALSE)) {
+  geom.src <- match.arg(tolower(geom.src), choices = c("mupolygon", "sapolygon", "mlrapolygon"))
   db <- match.arg(toupper(db), choices = c("SSURGO", "STATSGO"))
 
   # survey area polygons only available in SSURGO
   if (geom.src == 'sapolygon') {
     db <- 'SSURGO'
   }
+
   # statsgo flag
   use_statsgo <- (db == "STATSGO")
 
@@ -78,8 +87,11 @@ fetchSDA_spatial <- function(x,
   # be in different chunks
   x <- unique(x)
 
-  # lkey and areasymbol are the option for sapolygon
-  if (geom.src == 'sapolygon' && (by.col %in% c("mukey", "nmusym", "nationalmusym"))) {
+  if (geom.src == "mlrapolygon") {
+    # mlra polygons are not part of SSURGO or STATSGO
+    by.col <- "MLRARSYM"
+  } else if (geom.src == 'sapolygon' && (by.col %in% c("mukey", "nmusym", "nationalmusym"))) {
+    # lkey and areasymbol are the option for sapolygon
     if (is.numeric(x)) {
       by.col <- "lkey"
     } else {
@@ -129,6 +141,19 @@ fetchSDA_spatial <- function(x,
       }
 
       mukey.list <- unique(res$lkey)
+
+    } else if (by.col == "MLRARSYM") {
+      if (!requireNamespace("sf")) {
+        stop("package 'sf' is required to read MLRA boundaries from ZIP file source", call. = FALSE)
+      }
+      res <- sf::read_sf("/vsizip//vsicurl/https://www.nrcs.usda.gov/sites/default/files/2022-10/MLRA_52_2022.zip/MLRA_52_2022", query = sprintf("SELECT * FROM MLRA_52 WHERE MLRARSYM IN %s", format_SQL_in_statement(x)),
+                         as_tibble = FALSE,
+                         stringsAsFactors = FALSE)
+      # use "geom" for consistency with other spatial outputs from SDA; requires sf >= 1.0-6
+      sf::st_geometry(res) <- "geom"
+      # TODO: could provide custom MLRA aggregation methods here: centroid, bbox, convex hull?
+      # in the future a T-SQL implementation would allow for any of the defined method options
+      return(res)
     } else {
       return(try(stop(paste0("Unknown mapunit identifier (", by.col, ")"), call. = FALSE)))
     }
@@ -182,7 +207,6 @@ fetchSDA_spatial <- function(x,
         sub.res <- .fetchSDA_spatial(mukeys, geom.type, geom.src, use_statsgo, add.fields, verbose, paste0(i, "_", xx), by.col)
-
         if (inherits(sub.res$result, 'try-error')) {
           # explicit handling for a hypothetical unqueryable single mukey
           message("Symbol ", xx, " dropped from result due to error! May exceed the JSON serialization limit or have other topologic problems.")
diff --git a/man/fetchSDA_spatial.Rd b/man/fetchSDA_spatial.Rd
index 30f25fe0..c0fda5be 100644
--- a/man/fetchSDA_spatial.Rd
+++ b/man/fetchSDA_spatial.Rd
@@ -17,15 +17,15 @@ fetchSDA_spatial(
 )
 }
 \arguments{
-\item{x}{A vector of map unit keys (\code{mukey}) or national map unit symbols (\code{nmusym}) for \code{mupolygon} geometry OR legend keys (\code{lkey}) or soil survey area symbols (\code{areasymbol}) for \code{sapolygon} geometry}
+\item{x}{A vector of map unit keys (\code{mukey}) or national map unit symbols (\code{nmusym}) for \code{mupolygon} geometry OR legend keys (\code{lkey}) or soil survey area symbols (\code{areasymbol}) for \code{sapolygon} geometry. If \code{geom.src="mlrapolygon"} then \code{x} refers to \code{MLRARSYM} (major land resource area symbols).}
 
-\item{by.col}{Column name containing map unit identifier \code{"mukey"}, \code{"nmusym"}/\code{"nationalmusym"} for \code{geom.src} \code{mupolygon} OR \code{"areasymbol"}, \code{"areaname"}, \code{"mlraoffice"}, \code{"mouagncyresp"} for \code{geom.src} \code{sapolygon}; default is determined by \code{is.numeric(x)} \code{TRUE} for \code{mukey} or \code{lkey} and \code{nationalmusym} or \code{areasymbol} otherwise.}
+\item{by.col}{Column name containing map unit identifier \code{"mukey"}, \code{"nmusym"}/\code{"nationalmusym"} for \code{geom.src="mupolygon"} OR \code{"areasymbol"}, \code{"areaname"}, \code{"mlraoffice"}, \code{"mouagncyresp"} for \code{geom.src="sapolygon"}; default is determined by \code{is.numeric(x)} \code{TRUE} for \code{mukey} or \code{lkey} and \code{nationalmusym} or \code{areasymbol} otherwise.}
 
 \item{method}{geometry result type: \code{"feature"} returns polygons, \code{"bbox"} returns the bounding box of each polygon (via \code{STEnvelope()}), \code{"point"} returns a single point (via \code{STPointOnSurface()}) within each polygon, \code{"extent"} returns an aggregate bounding box (the extent of all polygons, \code{geometry::EnvelopeAggregate()}) ), \code{"convexhull"} (\code{geometry::ConvexHullAggregate()}) returns the aggregate convex hull around all polygons, \code{"union"} (\code{geometry::UnionAggregate()}) and \code{"collection"} (\code{geometry::CollectionAggregate()}) return a \code{MULTIPOLYGON} or a \code{GEOMETRYCOLLECTION}, respectively, for each \code{mukey}, \code{nationalmusym}, or \code{areasymbol }. In the case of the latter four aggregation methods, the groups for aggregation depend on \code{by.col} (default by \code{"mukey"}).}
 
-\item{geom.src}{Either \code{mupolygon} (map unit polygons) or \code{sapolygon} (soil survey area boundary polygons)}
+\item{geom.src}{Either \code{mupolygon} (map unit polygons), \code{sapolygon} (soil survey area boundary polygons), or \code{mlrapolygon} (major land resource area boundary polygons)}
 
-\item{db}{Default: \code{"SSURGO"}. When \code{geom.src} is \code{mupolygon}, use STATSGO polygon geometry instead of SSURGO by setting \code{db = "STATSGO"}}
+\item{db}{Default: \code{"SSURGO"}. When \code{geom.src} is \code{mupolygon}, use STATSGO polygon geometry instead of SSURGO by setting \code{db = "STATSGO"}.}
 
 \item{add.fields}{Column names from \code{mapunit} or \code{legend} table to add to result. Must specify parent table name as the prefix before column name e.g. \code{mapunit.muname}.}
 
@@ -49,10 +49,17 @@ This function automatically "chunks" the input vector (using \code{makeChunks()}
 Querying regions with complex mapping may require smaller \code{chunk.size}. Numerically adjacent IDs in the input vector may share common qualities (say, all from same soil survey area or region) which could cause specific chunks to perform "poorly" (slow or error) no matter what the chunk size is. Shuffling the order of the inputs using \code{sample()} may help to eliminate problems related to this, depending on how you obtained your set of MUKEY/nationalmusym to query. One could feasibly use \code{muacres} as a heuristic to adjust for total acreage within chunks.
 
 Note that STATSGO data are fetched where \code{CLIPAREASYMBOL = 'US'} to avoid duplicating state and national subsets of the geometry.
+
+A prototype interface, \code{geom.src="mlrapolygon"}, is provided for obtaining Major Land Resource Area (MLRA) polygon
+boundaries. When using this geometry source \code{x} is a vector of \code{MLRARSYM} (MLRA Symbols). The geometry source is
+the MLRA Geographic Database v5.2 (2022) which is not (yet) part of Soil Data Access. Instead of SDA, GDAL utilities
+are used to read a zipped ESRI Shapefile from a remote URL: \url{https://www.nrcs.usda.gov/sites/default/files/2022-10/MLRA_52_2022.zip}.
+Therefore, most additional \code{fetchSDA_spatial()} arguments are \emph{not} currently supported for the MLRA geometry source.
+In the future a \code{mlrapolygon} table may be added to SDA (analogous to \code{mupolygon} and \code{sapolygon}),
+and the function will be updated accordingly at that time.
 }
 \examples{
 \dontshow{if (curl::has_internet()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}
-\dontshow{\}) # examplesIf}
 \donttest{
 
 # get spatial data for a single mukey
@@ -77,6 +84,7 @@ Note that STATSGO data are fetched where \code{CLIPAREASYMBOL = 'US'} to avoid d
 # demo adding a field (`muname`) to attribute table of result
 head(try(fetchSDA_spatial(x = "2x8l5", by="nmusym", add.fields="muname")))
 }
+\dontshow{\}) # examplesIf}
 }
 \author{
 Andrew G. Brown, Dylan E. Beaudette