Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fetchSDA_spatial: add new T-SQL aggregate geometry methods #299

Merged
merged 6 commits into from
Oct 2, 2023

Conversation

brownag
Copy link
Member

@brownag brownag commented Sep 1, 2023

Adds new options ("extent", "convexhull", "union", and "collection") for method argument to fetchSDA_spatial(). These options perform various aggregations on the mupolygon or sapolygon geometry on the server side, returning a simplified result.

Using the T-SQL static aggregate geometry methods:

Demonstration:

suppressPackageStartupMessages(library(soilDB))

# default behavior: 2 bbox for extent, one for each mukey (in this case these have same nationalmusym)
x <- fetchSDA_spatial(
  x = c(2440240, 2924995),
  method = "extent"
) 
#> Using 1 chunks...
#> Chunk #1 completed (n = 2; 1 secs)
#> Done in 1 secs; mean/chunk: 1 secs; mean/symbol: 0.5 secs.
x
#> Simple feature collection with 2 features and 3 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -120.8787 ymin: 37.47041 xmax: -119.9521 ymax: 38.30401
#> Geodetic CRS:  WGS 84
#>     mukey areasymbol nationalmusym                           geom
#> 1 2440240      CA630         2mx8f POLYGON ((-120.8787 37.7260...
#> 2 2924995      CA649         2mx8f POLYGON ((-120.3299 37.4704...
plot(x$geom)

# if by.col="nationalmusym" only one result is returned. in this case using method="convexhull"
y <- fetchSDA_spatial(
  x = "2mx8f",
  by.col = "nationalmusym",
  method = "convexhull"
)
#> Using 1 chunks...
#> Chunk #1 completed (n = 2; 0.2 secs)
#> Done in 0.4 secs; mean/chunk: 0.2 secs; mean/symbol: 0.19 secs.
y
#> Simple feature collection with 1 feature and 3 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -120.8787 ymin: 37.47041 xmax: -119.9521 ymax: 38.30401
#> Geodetic CRS:  WGS 84
#>              mukey   areasymbol nationalmusym                           geom
#> 1 2440240, 2924995 CA630, CA649         2mx8f POLYGON ((-119.9521 37.4998...


# note that the convex hull includes areas that are part of both constituent mukey bboxes
plot(y$geom, add=TRUE, col="RED")

# aggregation methods apply equivalently to the sapolygon geometries
z <- fetchSDA_spatial(
  x = c("CA740", "CA732", "CA802"),
  by.col = "areasymbol",
  geom = "sapolygon",
  method = "convexhull"
)
#> Using 1 chunks...
#> Chunk #1 completed (n = 3; 0.3 secs)
#> Done in 0.4 secs; mean/chunk: 0.3 secs; mean/symbol: 0.13 secs.
plot(z)

# using method="union" you can get 3 multipart survey areas returned as 3 MULTIPOLYGON
z <- fetchSDA_spatial(
  x = c("CA740", "CA732", "CA802"),
  by.col = "areasymbol",
  geom = "sapolygon",
  method = "union"
)
#> Using 1 chunks...
#> Chunk #1 completed (n = 3; 3.1 secs)
#> Done in 3.2 secs; mean/chunk: 3.1 secs; mean/symbol: 1.07 secs.
z
#> Simple feature collection with 3 features and 2 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -119.3903 ymin: 36.03901 xmax: -117.7945 ymax: 38.2094
#> Geodetic CRS:  WGS 84
#>    lkey areasymbol                           geom
#> 1 14164      CA732 MULTIPOLYGON (((-118.8043 3...
#> 2 14165      CA740 MULTIPOLYGON (((-119.248 37...
#> 3 14244      CA802 MULTIPOLYGON (((-119.1054 3...

@dylanbeaudette
Copy link
Member

The following map unit keys are associated with very complex polygons: ('1598653', '1598670', '162786', '1413444', '403308', '2605765', '161428', '808535').

Here are some stats / point-on-surface coordinates for some delineations tied to these map unit keys. Anything with a fractal dimension ("fd") > 1.3 is either very complex, or some kind of geometric error (e.g. sliver polygon).

 areasymbol |  mukey  | ogc_fid  |        fd         |       st_x        |       st_y
------------+---------+----------+-------------------+-------------------+------------------
 in069      | 161428  |  9937163 |  1.48328711710149 |  -85.416953776054 | 40.7499389755057
 in069      | 161428  |  9948437 |  1.46487512704564 |  -85.482530692046 | 40.7068578216985
 la075      | 1598653 | 12399549 |  1.46163071003991 | -89.9114833563087 | 29.4975338125364
 la075      | 1598653 | 12390850 |  1.45988530180611 | -89.4456791098703 | 29.1909038539091
 in069      | 161428  |  9947035 |  1.45958506507375 | -85.4776080088492 | 40.9512008459263
 la057      | 808535  | 12358142 |  1.44351990633671 | -90.9052395795618 | 29.7535805073645
 la075      | 1598653 | 12399094 |   1.4377956041352 | -89.4571619816923 | 29.1988396856297
 ia021      | 403308  |  4957979 |  1.43676407697544 | -95.0402802568294 | 42.7394000845724
 la057      | 808535  | 12357522 |  1.43548782699027 | -90.5702108833809 | 29.7445771122274
 in069      | 161428  |  9937364 |  1.42887746607053 |  -85.600613115432 | 40.9457825632656
 in069      | 161428  |  9936537 |  1.42302224162814 |  -85.640465571175 | 40.9778259842066
 la057      | 808535  | 12358490 |  1.42189171043039 | -90.5196409691703 | 29.6089347799457
 la057      | 808535  | 12358986 |  1.41932662856896 | -90.2435096155968 | 29.4243124877804
 la057      | 808535  | 12356857 |  1.40927152230042 | -90.5504866246769 | 29.7146190633603
 la075      | 1598653 | 12403712 |  1.40641834934647 |  -89.617586937944 | 29.3005355683554
 la057      | 808535  | 12357703 |  1.40373102348149 | -90.1078195510741 | 29.4591174179254
 la057      | 808535  | 12357687 |  1.40208424133049 | -90.5174479199878 | 29.7147878087805
 in069      | 161428  |  9938261 |  1.40206214445821 | -85.4262303400103 |   40.86742435906
 la057      | 808535  | 12355116 |  1.40001693367425 |  -90.596932159237 | 29.6824435958016
 la057      | 808535  | 12355102 |  1.39822321741683 | -90.6061649657523 | 29.6832350107081
 in069      | 161428  |  9943436 |  1.38400867153309 | -85.5537482314007 | 40.9908107561344

 - one liner is easy to incorporate into query but probably not the most efficient; consider refactor
@brownag
Copy link
Member Author

brownag commented Sep 23, 2023

Thanks for pointing out those mukeys, @dylanbeaudette!

I think the fractal dimension was not the issue here (for the new aggregation queries) but rather the sheer number of polygons and the clumsy way I was aggregating multiple mukey/areasymbol per nationalmusym. This works now, and will generalize for full add.fields support, but I may opt to do this aggregation in a more efficient way.

I've pushed a fix (368c727), and now I am able to do e.g. convex hull of your problem mukeys almost 100x faster than the full geometries. e.g.

suppressPackageStartupMessages(library(soilDB))

x <- c('1598653', '1598670', '162786', '1413444', '403308', '2605765', '161428', '808535')

y <- fetchSDA_spatial(x, method = "convexhull")
#> Using 1 chunks...
#> Chunk #1 completed (n = 8; 4.1 secs)
#> Done in 4.1 secs; mean/chunk: 4.1 secs; mean/symbol: 0.51 secs.

plot(y$geom)

z <- fetchSDA_spatial(x)
#> Using 1 chunks...
#> Chunk #1 completed (n = 8; 3 secs)
#> Done in 2.96 mins; mean/chunk: 177.9 secs; mean/symbol: 22.24 secs.

@dylanbeaudette
Copy link
Member

Nice. having these generalization tools easily available is a great addition.

@brownag brownag marked this pull request as ready for review September 25, 2023 23:29
@brownag brownag changed the title fetchSDA_spatial: add new aggregation methods fetchSDA_spatial: add new T-SQL aggregate geometry methods Oct 2, 2023
@brownag brownag merged commit 89cd245 into master Oct 2, 2023
4 of 5 checks passed
@brownag brownag deleted the fetchSDA_spatial-agg branch December 13, 2023 23:26
brownag added a commit that referenced this pull request Feb 2, 2024
…ry results

 - introduced in PR #299 when "extent", "convexhull", "union", and "collection" methods were added
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants