-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spotty NCCPI data #303
Comments
Hi Kevin, Thanks for the detailed description of the situation. I believe that you are correct in assuming that the NCCPI submodel estimates are not "exported" for most states. I've asked our Leader for Soil Interpretations for additional information on the plans for NCCPI submodels on into the future. I could be that many of the regional interpretations will be adopted by additional states. I doubt that you will ever be able to create a national map of "NCCPI corn" since the models aren't generalized to the climates outside of the Midwest. I'll report back with my findings. Otherwise, I think that you are attempting to access the data correctly. |
Sorry to hear you are struggling, but I am glad to hear you are interested in our interpretations
Yes, that is true. Only a handful of states (IN, OH, SC, CT, NJ, ND, RI) have the corn/soybean submodels exported at this time (September 2023). I am not sure if more states have opted to export the submodels for the upcoming October 1 refresh. NCCPI v3 (the "main model") is required national interpretation to be exported for all mapunits (see sample code below)
There is no status database other than what is in SSURGO. You can write query against
Running the interpretations on "SSURGO" would be difficult, or perhaps impossible, as the interpretation engine that creates the SSURGO export is using the NASIS data model which is slightly different. There have been some efforts to generalize the ability / data sources we use as input for interpretations, but so far nobody (to my knowledge) has built a working version of NCCPI outside of NASIS. It is theoretically possible but NCCPI is a very complex interpretation compared to some others we have in SSURGO. It would be perhaps possible to create a custom NASIS export of the submodels for all mapunits in the US, but this would require the interpretations be run on the NASIS server, not the SSURGO product. Bob Dobos, National Leader for Interpretations, and main mind behind NCCPI would be the one to talk to about how to go about this. Here is some sample code comparing NCCPI v3 v.s. submodels library(soilDB)
RULES <- c("NCCPI - National Commodity Crop Productivity Index (Ver 3.0)",
"NCCPI - NCCPI Corn Submodel (I)")
res <- soilDB::get_SDA_interpretation(rulename = RULES, method = "Dominant Component",
mukeys = c(242963, 164556))
# all nationalmusyms, excluding STATSGO
x0 <- SDA_query("SELECT COUNT(DISTINCT nationalmusym) FROM mapunit
INNER JOIN legend ON legend.lkey = mapunit.lkey AND legend.areasymbol != 'US'
INNER JOIN component ON mapunit.mukey = component.mukey
AND majcompflag = 'Yes'")
#> single result set, returning a data.frame
x0
#> V1
#> 1 239750
# all nationalmusyms, excluding STATSGO, have NCCPI v3.0
x <- SDA_query("SELECT COUNT(DISTINCT nationalmusym) FROM mapunit
INNER JOIN component ON mapunit.mukey = component.mukey AND majcompflag = 'Yes'
INNER JOIN cointerp ON component.cokey = cointerp.cokey AND
mrulename = 'NCCPI - National Commodity Crop Productivity Index (Ver 3.0)'")
#> single result set, returning a data.frame
x
#> V1
#> 1 239750
# nationalmusyms with NCCPI Corn Submodel (for example)
y <- SDA_query("SELECT COUNT(DISTINCT nationalmusym) FROM mapunit
INNER JOIN component ON mapunit.mukey = component.mukey AND majcompflag = 'Yes'
INNER JOIN cointerp ON component.cokey = cointerp.cokey AND
mrulename = 'NCCPI - NCCPI Corn Submodel (I)'")
#> single result set, returning a data.frame
y
#> V1
#> 1 20490
# 100% of nationalmusyms have NCCPI
x$V1 / x0$V1
#> [1] 1
# and only 9% of nationalmusym have NCCPI corn submodel exported
y$V1 / x0$V1
#> [1] 0.08546403
# general function to see what states export a particular rule
get_states_using_cointerp <- function(rulename) {
do.call('rbind', lapply(rulename, function(r){
res <- SDA_query(sprintf("SELECT DISTINCT SUBSTRING(areasymbol, 0, 3) AS state FROM mapunit
INNER JOIN legend ON legend.lkey = mapunit.lkey AND areasymbol != 'US'
INNER JOIN component ON mapunit.mukey = component.mukey AND majcompflag = 'Yes'
INNER JOIN cointerp ON component.cokey = cointerp.cokey AND mrulename = '%s'", r))
res$mrulename <- r
res
}))
}
get_states_using_cointerp('NCCPI - NCCPI Corn Submodel (I)')
#> single result set, returning a data.frame
#> state mrulename
#> 1 IN NCCPI - NCCPI Corn Submodel (I)
#> 2 OH NCCPI - NCCPI Corn Submodel (I)
#> 3 SC NCCPI - NCCPI Corn Submodel (I)
#> 4 CT NCCPI - NCCPI Corn Submodel (I)
#> 5 NJ NCCPI - NCCPI Corn Submodel (I)
#> 6 ND NCCPI - NCCPI Corn Submodel (I)
#> 7 RI NCCPI - NCCPI Corn Submodel (I)
get_states_using_cointerp('NCCPI - NCCPI Soybeans Submodel (I)')
#> single result set, returning a data.frame
#> state mrulename
#> 1 IN NCCPI - NCCPI Soybeans Submodel (I)
#> 2 OH NCCPI - NCCPI Soybeans Submodel (I)
#> 3 SC NCCPI - NCCPI Soybeans Submodel (I)
#> 4 CT NCCPI - NCCPI Soybeans Submodel (I)
#> 5 NJ NCCPI - NCCPI Soybeans Submodel (I)
#> 6 ND NCCPI - NCCPI Soybeans Submodel (I)
#> 7 RI NCCPI - NCCPI Soybeans Submodel (I) |
Thank you all for the thorough responses! I'm disappointed that this awesome sub-models are not widely available, but I now understand what's going on.
I will reach out to Bob to see how hard this would be. |
Update: Bob says "Actually, the data for all the sub-models is already available of the soil data access facility. I will dust off the query and see if it still works." Great news! |
From my email:
I looked into this. The latter point becomes a fairly difficult problem to fold in I realized an alternate approach--which is something that I should have done a long time ago--and that is is to put the subrule ratings into the "reason" field (c3219c9). These values can then be re-extracted from the result that is 1:1 with mukey and mrulename. This is still a bit messy but it does the job in terms of providing the component ratings alongside the main rule rating: library(soilDB)
x <- get_SDA_interpretation(rulename = "NCCPI - National Commodity Crop Productivity Index (Ver 3.0)",
method = "Dominant Component",
mukeys = c("242963","242964","242965"))
# some post-processing of the reason field to extract subrule ratings
x2 <- cbind(x, do.call('rbind', lapply(strsplit(
x$reason_NCCPINationalCommodityCropProductivityIndexVer30,
"; "), function(x) {
x3 <- do.call('rbind', strsplit(gsub("([A-Za-z ]+) \\((.*)\\)", "\\1;\\2", x), ";"))
x4 <- as.data.frame(as.list(x3[,2]))
colnames(x4) <- paste0("rating_NCCPINationalCommodityCropProductivityIndexVer30_", soilDB:::.cleanRuleColumnName(x3[,1]))
x4
})))
x2
#> mukey cokey areasymbol musym
#> 1 242963 23671045 IL019 152A
#> 2 242964 23670915 IL019 134A
#> 3 242965 23671016 IL019 154A
#> muname compname compkind comppct_r
#> 1 Drummer silty clay loam, 0 to 2 percent slopes Drummer Series 94
#> 2 Camden silt loam, 0 to 2 percent slopes Camden Series 92
#> 3 Flanagan silt loam, 0 to 2 percent slopes Flanagan Series 95
#> majcompflag rating_NCCPINationalCommodityCropProductivityIndexVer30
#> 1 Yes 0.826
#> 2 Yes 0.917
#> 3 Yes 0.899
#> class_NCCPINationalCommodityCropProductivityIndexVer30
#> 1 High inherent productivity
#> 2 High inherent productivity
#> 3 High inherent productivity
#> reason_NCCPINationalCommodityCropProductivityIndexVer30
#> 1 No limitation (0); Cotton (0.001); Small grains (0.687); Soybeans (0.752); Corn (0.826)
#> 2 Cotton (0.001); Soybeans (0.777); Small grains (0.791); Corn (0.917); No limitation (0)
#> 3 No limitation (0); Cotton (0.001); Small grains (0.734); Soybeans (0.76); Corn (0.899)
#> rating_NCCPINationalCommodityCropProductivityIndexVer30_Nolimitation
#> 1 0
#> 2 0
#> 3 0
#> rating_NCCPINationalCommodityCropProductivityIndexVer30_Cotton
#> 1 0.001
#> 2 0.001
#> 3 0.001
#> rating_NCCPINationalCommodityCropProductivityIndexVer30_Smallgrains
#> 1 0.687
#> 2 0.791
#> 3 0.734
#> rating_NCCPINationalCommodityCropProductivityIndexVer30_Soybeans
#> 1 0.752
#> 2 0.777
#> 3 0.76
#> rating_NCCPINationalCommodityCropProductivityIndexVer30_Corn
#> 1 0.826
#> 2 0.917
#> 3 0.899 |
@brownag have you had any success with this? |
@kevinwolz yes, and I refined the initial solution proposed above in #308 so it should be fairly painless to access any subrule ratings for interps. Using the new argument library(soilDB)
x <- get_SDA_interpretation(rulename = "NCCPI - National Commodity Crop Productivity Index (Ver 3.0)",
method = "Dominant Component",
mukeys = c("242963","242964","242965"),
wide_reason = TRUE)
x
#> mukey cokey areasymbol musym
#> 1 242963 23671045 IL019 152A
#> 2 242964 23670915 IL019 134A
#> 3 242965 23671016 IL019 154A
#> muname compname compkind comppct_r
#> 1 Drummer silty clay loam, 0 to 2 percent slopes Drummer Series 94
#> 2 Camden silt loam, 0 to 2 percent slopes Camden Series 92
#> 3 Flanagan silt loam, 0 to 2 percent slopes Flanagan Series 95
#> majcompflag rating_NCCPINationalCommodityCropProductivityIndexVer30
#> 1 Yes 0.826
#> 2 Yes 0.917
#> 3 Yes 0.899
#> class_NCCPINationalCommodityCropProductivityIndexVer30
#> 1 High inherent productivity
#> 2 High inherent productivity
#> 3 High inherent productivity
#> reason_NCCPINationalCommodityCropProductivityIndexVer30
#> 1 Impacted soil "No limitation" (0); NCCPI - NCCPI Cotton Submodel (II) "Cotton" (0.001); NCCPI - NCCPI Small Grains Submodel (II) "Small grains" (0.687); NCCPI - NCCPI Soybeans Submodel (I) "Soybeans" (0.752); NCCPI - NCCPI Corn Submodel (I) "Corn" (0.826)
#> 2 NCCPI - NCCPI Cotton Submodel (II) "Cotton" (0.001); NCCPI - NCCPI Soybeans Submodel (I) "Soybeans" (0.777); NCCPI - NCCPI Small Grains Submodel (II) "Small grains" (0.791); NCCPI - NCCPI Corn Submodel (I) "Corn" (0.917); Impacted soil "No limitation" (0)
#> 3 Impacted soil "No limitation" (0); NCCPI - NCCPI Cotton Submodel (II) "Cotton" (0.001); NCCPI - NCCPI Small Grains Submodel (II) "Small grains" (0.734); NCCPI - NCCPI Soybeans Submodel (I) "Soybeans" (0.76); NCCPI - NCCPI Corn Submodel (I) "Corn" (0.899)
#> rating_reason_NCCPINationalCommodityCropProductivityIndexVer30_Impactedsoil
#> 1 0
#> 2 0
#> 3 0
#> rating_reason_NCCPINationalCommodityCropProductivityIndexVer30_NCCPINCCPICottonSubmodelII
#> 1 0.001
#> 2 0.001
#> 3 0.001
#> rating_reason_NCCPINationalCommodityCropProductivityIndexVer30_NCCPINCCPISmallGrainsSubmodelII
#> 1 0.687
#> 2 0.791
#> 3 0.734
#> rating_reason_NCCPINationalCommodityCropProductivityIndexVer30_NCCPINCCPISoybeansSubmodelI
#> 1 0.752
#> 2 0.777
#> 3 0.76
#> rating_reason_NCCPINationalCommodityCropProductivityIndexVer30_NCCPINCCPICornSubmodelI
#> 1 0.826
#> 2 0.917
#> 3 0.899 |
I am struggling with the fact that the NCCPI data is very spotty in the database. For example:
Here I look up the most common mukey (Drummer) in Champaign County, IL, which is dominated by corn & soybeans:
The results provide the generic NCCPI rating, but all of the submodels come back as NA. I can understand why the cotton submodel might not be evaluated on this mukey, but the absence of the corn and soybean submodel doata surprised me.
If I generalize this call to instead search for the nationalmusym that includes that mukey, I get the same results but for many counties in Illinois that have this nationalmusym. However, I DO get one mukey in Indiana, and this one does have values for all submodels except cotton.
I've been told in the past that the presence/absence of these interpretation data is dependent on whether a particular state office "decided to run/export that interpretation". Is that true, or is there something else going on here?
Is there some status database somewhere that can show me which states have run/exported which interpretations?
I am trying to create a national map of NCCPI corn or soybean submodel values, is there any way that
soilDB::get_SDA_interpretation
can help me get around this?Or perhaps my only choice would be build out these models from scratch and run them on raw SSURGO data myself? Maybe that's easier than trying to run the circuit and convince all states to export the data. Is there any
soilDB
support for something like this?The text was updated successfully, but these errors were encountered: