D.E. Beaudette, J.M. Skovlin
2024-03-05
This document is based on aqp version 2.0.3 and soilDB version 2.8.1.
Consider this situation: you have a collection of pedons that have been correlated to a named soil series (or component) and would like to objectively compute a range in characteristics (“low-rv-high” values) and horizon depths. As with most collections of pedon data, there may be considerable variation in description style and horizons used, horizon depths, and number of horizons described:
In this scenario, there are several obvious “micro-correlation” decisions that need to be made before horizons can be grouped for aggregation. For example, what horizonation prototype scheme (e.g., A-Bt1-Bt2-Bt3-Cr-R) best conveys the concept of this soil series or soil component? Does it make sense to group [Bt3, Bt4, BCt, CBt] horizons for aggregation? Or what about grouping [Bt3, 2Bt3] horizons? Do [BA, AB] horizons occur frequently enough to be included in the horizonation prototype?
Based on your knowledge of the area, pedon 2 might be a good “typical” pedon to use in developing a horizonation prototype. After careful review of the data and consultation with your crew, a new set of labels is assigned to each horizon (red labels in figure above) that define groups over which soil properties will be aggregated. These new labels define functionally similar groups that may span multiple genetic horizons.
Once you have assigned generalized horizon labels (GHL) to your pedons, it is possible to generate a set of “most-likely” horizon depths for each GHL. Evaluating GHL probability along 1-cm depth slices results in a set of probability depth functions; the points at which these curves cross are good estimates of “most-likely” horizon depths. For example, in the figure below approximate horizon depths of 8, 28, 50, and 90 cm can be readily estimated from the probability depth functions. It is also possible to visually determine when less common horizons (such as the BA horizon in the figure below) may not be common enough to include in a representative profile.
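The crossing points can also be located programmatically rather than read off a plot: at each 1-cm slice the most-likely GHL is the one with the highest probability, and horizon breaks fall where that label changes. Here is a minimal base R sketch using made-up logistic probability curves (not fitted values) for three hypothetical classes:

```r
# made-up slice-wise probabilities for three classes along 0-100 cm;
# in practice these would come from slab() or a fitted model
depth <- 0:100
p <- cbind(
  A  = plogis((20 - depth) / 5),   # shallow class, fades with depth
  Cr = plogis((depth - 60) / 5)    # deep class, grows with depth
)
p <- cbind(p, Bt = pmax(0, 1 - rowSums(p)))  # middle class takes the remainder

# most-likely class at each 1-cm slice
ml <- colnames(p)[max.col(p)]

# run-length encoding gives the depth-ordered sequence of most-likely
# classes; the last slice of each run approximates a horizon break
r <- rle(ml)
r$values
depth[head(cumsum(r$lengths), -1)]
```

With fitted probabilities the same maximum-per-slice logic drives the “most-likely” horizon depth estimates.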
If you have never used the aqp or soilDB packages before, you will likely need to install them. This only needs to be done once.
# stable version from CRAN + dependencies
install.packages('aqp', dep = TRUE)
install.packages('soilDB', dep = TRUE)

# latest versions from R-Forge:
install.packages('aqp', repos = "http://R-Forge.R-project.org")
install.packages('soilDB', repos = "http://R-Forge.R-project.org")
Once you have all of the R packages on which this document depends, it is a good idea to load them. R packages must be installed anytime you change versions of R (e.g., after an upgrade) and loaded anytime you want to access functions from within those packages.
library(aqp)
library(soilDB)
library(latticeExtra)
library(rms)
While the methods outlined in this document can be applied to any collection of pedons, it is convenient to work with a standardized set of data. You can follow along with the analysis by copying code from the following blocks and running it in your R session. The sample data used in this document is based on 15 soil profiles that have been correlated to the Loafercreek soil series from the Sierra Nevada Foothill Region of California. Note that the internal structure of the loafercreek data is identical to the structure returned by fetchNASIS() from the soilDB package. All horizon-level values are pulled from the pedon horizon table of the pedons being analyzed.
# load sample data from the soilDB package
data(loafercreek, package = 'soilDB')
pedons <- loafercreek

# plot the first 15 profiles
par(mar = c(0, 0, 0, 0))
plot(pedons[1:15, ], name = 'hzname', print.id = FALSE, cex.names = 0.8, depth.axis = list(line = -4))
The following code block demonstrates how to pull data in using the fetchNASIS() function from the soilDB package.
# first load the desired data set within NASIS into your NASIS selected set
# then load data from the NASIS selected set into R
pedons <- fetchNASIS()
# optionally subset the data by taxon name - enter your taxon name
pedons <- pedons[grep(pattern = 'ENTER_YOUR_TAXON_NAME', pedons$taxonname, ignore.case = TRUE), ]
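The subset step above is ordinary R pattern matching: grep() returns the indices of profiles whose taxonname matches, and those indices subset the SoilProfileCollection. Its behavior with a toy character vector (the taxon names here are illustrative):

```r
# hypothetical taxon names as they might appear in a pedon table
taxonname <- c("Loafercreek", "loafercreek", "Argonaut", "Loafercreek taxadjunct")

# case-insensitive match returns row indices suitable for subsetting
idx <- grep(pattern = 'loafercreek', taxonname, ignore.case = TRUE)
idx  # -> 1 2 4
```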
# proportional-odds logistic regression: fits well; ignore standard errors when using sliced data
# TODO: abstract most of this into a convenience function
# 100 cm... should we use penalized PO-LR? see pentrace()
dd <- datadist(horizons(s))
options(datadist = "dd")
(l.genhz <- orm(genhz ~ rcs(hzdept), data = horizons(s), x = TRUE, y = TRUE))
## Frequencies of Missing Values Due to Each Variable
## genhz hzdept
##  4222      0
##
## Logistic (Proportional Odds) Ordinal Regression Model
##
## orm(formula = genhz ~ rcs(hzdept), data = horizons(s), x = TRUE,
##     y = TRUE)
##
##
## Frequencies of Responses
##
##    A   BA  Bt1  Bt2  Bt3   Cr    R
##  668  599 1879 1812 1986 1568 3272
##
##
##                     Model Likelihood      Discrimination        Rank Discrim.
##                     Ratio Test            Indexes               Indexes
## Obs         11784   LR chi2   19864.93    R2             0.837  rho  0.912
## Distinct Y      7   d.f.             4    R2(4,11784)    0.815
## Median Y        5   Pr(> chi2) <0.0001    R2(4,11353.3)  0.826
## max |deriv| 5e-05   Score chi2 18397.03   |Pr(Y>=median)-0.5|  0.405
##                     Pr(> chi2) <0.0001
##
##            Coef     S.E.   Wald Z Pr(>|Z|)
## y>=BA      -0.8805  0.0674 -13.07 <0.0001
## y>=Bt1     -2.2527  0.0741 -30.41 <0.0001
## y>=Bt2     -5.2425  0.1102 -47.57 <0.0001
## y>=Bt3     -7.6052  0.1269 -59.93 <0.0001
## y>=Cr     -10.1042  0.1358 -74.42 <0.0001
## y>=R      -11.9939  0.1418 -84.59 <0.0001
## hzdept      0.1762  0.0040  43.80 <0.0001
## hzdept'    -0.1991  0.0208  -9.59 <0.0001
## hzdept''    0.4051  0.0634   6.39 <0.0001
## hzdept'''  -0.4686  0.0911  -5.15 <0.0001
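For readers unfamiliar with the model form: the `y>=` terms above are cumulative intercepts, the log-odds of being at that class or one deeper when the predictors are zero, and individual class probabilities are differences of adjacent cumulative probabilities. A small base R illustration of that bookkeeping, using made-up intercepts and a single linear depth slope in place of the fitted restricted cubic spline:

```r
# hypothetical proportional-odds model with two cut points (three classes)
# NOTE: coefficients are illustrative, not the fitted values above
alpha <- c(-2, -6)   # log-odds of y>=Bt and y>=Cr at hzdept = 0
beta  <- 0.1         # common slope for depth (hzdept)

po.probs <- function(x) {
  cum <- plogis(alpha + beta * x)   # P(Y >= Bt), P(Y >= Cr)
  c(A = 1 - cum[1], Bt = cum[1] - cum[2], Cr = cum[2])
}

# class probabilities at two depths; each set sums to 1 by construction
round(po.probs(10), 3)
round(po.probs(70), 3)
```

The proportional-odds assumption is visible here: a single slope shifts all cumulative curves together, which is why only one set of hzdept coefficients appears in the fitted output.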
# predict along the same depth slices; type = 'fitted.ind' returns the
# fitted probability for every class (columns are class-wise probabilities)
p <- data.frame(predict(l.genhz, data.frame(hzdept = slice.vect), type = "fitted.ind"))
# add depths
p$top <- slice.vect
p.ml <- get.ml.hz(p, o.names = hz.names)
p.ml
##    hz top bottom confidence pseudo.brier    mean.H
## 1   A   0     10         47    0.3651021 1.4962620
## 2 Bt1  10     32         53    0.3228127 1.6427389
## 3 Bt2  32     47         49    0.3791655 1.6904646
## 4 Bt3  47     71         50    0.3619597 1.7297704
## 5  Cr  71     85         42    0.4887502 1.7022853
## 6   R  85    151         81    0.0854379 0.7510338
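The pseudo.brier column is an error measure: lower values mean the fitted probabilities track the observed GHL closely within that most-likely horizon. The underlying idea, a Brier score, is simply the mean squared difference between predicted probabilities and 0/1 outcomes; a minimal sketch with made-up numbers (not necessarily the exact formulation used internally by `get.ml.hz()`):

```r
# made-up probabilities that 5 depth slices belong to class 'Bt1',
# and whether each slice was actually described as Bt1 (1) or not (0)
p.bt1 <- c(0.9, 0.8, 0.6, 0.3, 0.2)
obs   <- c(1,   1,   1,   0,   0)

# Brier score: 0 = perfect agreement, larger = worse
brier <- mean((p.bt1 - obs)^2)
brier  # 0.068
```

This explains why the R horizon, with the lowest pseudo.brier above, is the most confidently predicted group.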