-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Testing for sf object #62
Comments
If I remember correctly, this is a hack to deal with the fact that, when running a workflow, datasets that are generated from a |
Oh, I see. Yep - I knew it would melt my head. May be a simple solution might be a simple heads-up to the user to name the geometry list column "geometry" ala |
@btupper Can you have a look at the branch |
Woooohooo! It works without issue after assigning a random name to the geometry list column. The example below borrows from your tutorial but substitutes in some local data. Thank you! library(tidysdm)
#> Loading required package: tidymodels
#> Loading required package: spatialsample
library(sf)
#> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
library(here)
#> here() starts at /Users/ben/Library/CloudStorage/Dropbox/code/projects/mola-mola
model_input = sf::read_sf("model_input.gpkg") |>
sf::st_set_geometry("happy_halloween")
model_input
#> Simple feature collection with 1089 features and 5 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -76 ymin: 35 xmax: -59.5 ymax: 46
#> Geodetic CRS: WGS 84
#> # A tibble: 1,089 × 6
#> class sst windspeed u_wind v_wind happy_halloween
#> <chr> <dbl> <dbl> <dbl> <dbl> <POINT [°]>
#> 1 presence 12.5 4.91 -0.270 -0.543 (-66.60798 42.2943)
#> 2 presence 20.2 4.86 1.59 0.756 (-68.00023 42.60569)
#> 3 presence 14.9 8.41 2.05 -0.739 (-72.66639 40.46282)
#> 4 presence 18.1 4.31 1.28 2.79 (-69.75882 42.53246)
#> 5 presence 16.7 8.38 0.864 -1.11 (-75.3774 36.2458)
#> 6 presence 27.8 7.50 -3.31 -1.33 (-75.08756 35.54985)
#> 7 presence 12.6 4.61 -0.155 -0.274 (-67.15288 42.30458)
#> 8 presence 16.6 4.08 0.846 0.505 (-71.46238 41.07934)
#> 9 presence 17.4 4.48 0.906 0.429 (-65.18623 42.52456)
#> 10 presence 17.5 4.15 -0.0541 0.253 (-69.61 41.55)
#> # ℹ 1,079 more rows
rec <- recipe(model_input, formula = class ~ .)
models <-
# create the workflow_set
workflow_set(
preproc = list(default = rec),
models = list(
# the standard glm specs
glm = tidysdm::sdm_spec_glm(),
# rf specs with tuning
rf = tidysdm::sdm_spec_rf(),
# boosted tree model (gbm) specs with tuning
gbm = tidysdm::sdm_spec_boost_tree(),
# maxent specs with tuning
maxent = tidysdm::sdm_spec_maxent()
),
# make all combinations of preproc and models,
cross = TRUE ) |>
# tweak controls to store information needed later to create the ensemble
option_add(control = tidysdm::control_ensemble_grid())
input_cv <- spatial_block_cv(data = model_input, v = 3, n = 5)
set.seed(1234567)
models <- models |>
workflow_map("tune_grid",
resamples = input_cv,
grid = 3,
metrics = tidysdm::sdm_metric_set(),
verbose = TRUE )
#> i No tuning parameters. `fit_resamples()` will be attempted
#> i 1 of 4 resampling: default_glm
#> ✔ 1 of 4 resampling: default_glm (264ms)
#> i 2 of 4 tuning: default_rf
#> i Creating pre-processing data to finalize unknown parameter: mtry
#> ✔ 2 of 4 tuning: default_rf (2.3s)
#> i 3 of 4 tuning: default_gbm
#> i Creating pre-processing data to finalize unknown parameter: mtry
#> ✔ 3 of 4 tuning: default_gbm (6s)
#> i 4 of 4 tuning: default_maxent
#> ✔ 4 of 4 tuning: default_maxent (1.9s) Created on 2024-10-31 with reprex v2.1.1 |
I'm trying to understand the context of the prep.spatial_recipe function. In particular, the
training
argument is being tested for a "geometry" column. This may lead to problems if the geometry list column is not named "geometry" - there is no requirement that the the geometry list column have any name in particular. In fact, when reading from the ubiquitous geopackage (".gpkg") formatted files it seems that the column is named "geom".The same test occurs in
bake.spatial_recipe
here but with anewdata
argument.Is the point to test that the
training
(ornewdata
) object is ansf
object? If so, wouldinherits(training, 'sf')
suffice? Maybe like this...On the other hand, it may be that
training
data is something I misunderstand. Perhaps it has been modified upstream to explicitly contain a "geometry" column? That would make my head melt, but would explain my confusion, too.The text was updated successfully, but these errors were encountered: