What to do with a species with 0% or 100% encounters in any year
I'm getting an error during `Data_Fn` about "Some years and/or categories have either all or no encounters": what should I do?
As the message states, this error indicates that you have either 0% or 100% encounters in one or more years for at least one species. This interferes with default model settings, so I have added an error with an informative message. Specifically, default settings for `SpatialDeltaGLMM` and `VAST` include an intercept for each year in the encounter-probability component (when using a delta-model for continuous-valued data) or the zero-inflation probability (for count-valued data). With 0% or 100% encounters in a year, the estimate of that intercept goes to +/-Inf, as explained below. In either case, the intercept going to +/-Inf results in a Hessian matrix that is at best positive-semi-definite, such that standard-error computations using the delta-method will fail. Conceptually, it makes sense that a model with a fixed-effect intercept for each year will fail for any year with 0% encounters: in this case, the best estimate for that year individually is that the species has zero density everywhere! The easiest solution is to exclude any species for which any year has 0% or 100% encounters.
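To see which species and years trigger the error, it can help to tabulate encounter rates before calling `Data_Fn`. The following is a minimal base-R sketch, not part of `VAST` or `SpatialDeltaGLMM`; the data frame `catch_data` and its column names are hypothetical and should be replaced with your own data.

```r
# Toy survey data (hypothetical): one row per sample
catch_data <- data.frame(
  species = rep(c("A", "B"), each = 6),
  year    = rep(rep(2001:2003, each = 2), times = 2),
  catch   = c(0, 2.5, 1.1, 0, 3.0, 0,    # species A: some encounters in every year
              0, 0, 0.7, 1.9, 0, 4.2)    # species B: 0% in 2001, 100% in 2002
)

# Proportion of samples with a positive catch, by species (rows) and year (columns)
encounter_rate <- tapply(catch_data$catch > 0, catch_data[c("species", "year")], mean)

# Flag species-year combinations with no encounters or all encounters
problem <- encounter_rate == 0 | encounter_rate == 1

# Species with at least one problem year, and the data with those species excluded
problem_species <- rownames(problem)[apply(problem, 1, any)]
catch_subset <- catch_data[!(catch_data$species %in% problem_species), ]
```

Here `problem_species` lists the species that would need to be dropped under the "easiest solution", and `catch_subset` is the remaining data.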
If you don't want to do this, there are several alternative solutions, and all are experimental.
For the delta-model, this intercept will go to +Inf or -Inf for species-year combinations with 100% or 0% encounters, respectively. Experimental solutions include:
1. If using `VAST` and some species-year combinations have 100% encounter rates, then you can use `ObsModel[2]=3`, e.g., `ObsModel=c(1,3)`. This indicates that VAST should check for species-year combinations with 100% encounter rates and, for any such combination, fix the intercept for encounter probability to an extremely high value such that predicted encounter rates are essentially 100% for that year. This generally eliminates any leverage for data in that year on the random effect `epsilon2_stp`, although you may also want to add some temporal structure to this random effect.
2. If using `VAST` and specifying that each spatio-temporal term is independent among species and categories (using `FieldConfig = c("Omega1"="IID", "Epsilon1"="IID", "Omega2"="IID", "Epsilon2"="IID")`), you can identify any year-category combination that has 0% encounters and change all data from that year-category combination to `NA`. `VAST` will then turn off intercepts for those year-category combinations, and `SpatialDeltaGLMM::PlotIndex_Fn` is designed to predict zero total abundance for those year-category combinations when calculating and plotting the abundance indices. This is useful during compositional expansion, when the other solutions are not feasible.
3. You could make the intercept for encounter-probability/zero-inflation constant over time via the `Data_Fn` input `RhoConfig=c("Beta1"=3,"Beta2"=0,"Epsilon1"=0,"Epsilon2"=0)`. You can then override the error message via `Data_Fn( ..., "CheckForErrors"=FALSE)`.
4. You could make the intercept for encounter-probability/zero-inflation a random effect that is independent among years, follows a random walk, or follows a first-order autoregressive process using `RhoConfig=c("Beta1"=1,"Beta2"=0,"Epsilon1"=0,"Epsilon2"=0)`, `RhoConfig=c("Beta1"=2,"Beta2"=0,"Epsilon1"=0,"Epsilon2"=0)`, or `RhoConfig=c("Beta1"=4,"Beta2"=0,"Epsilon1"=0,"Epsilon2"=0)`, respectively.
5. You could use an alternative Poisson-link delta-model that ties together the encounter-probability and positive-catch-rate components, using `ObsModel[2]=1` (instead of `ObsModel[2]=0`, as used by default). This may eliminate the issue if the problem is some years with 100% encounter probability and you restrict structure on the 2nd ("average-weight") component using `RhoConfig=c("Beta1"=0,"Beta2"=3,"Epsilon1"=0,"Epsilon2"=0)` and `FieldConfig=c("Omega1"=1, "Epsilon1"=1, "Omega2"=0, "Epsilon2"=0)`. This will not help if the problem is some years with 0% encounter probability.
6. In a multispecies model using `VAST`, you can implement one of these solutions for a single species (instead of for all species, as the options above do) by custom-modifying the `Map` input. This involves building your model:
```r
# Make data, skipping the encounter check
TmbData = Data_Fn(..., "CheckForErrors"=FALSE) # where ... is your existing inputs to Data_Fn
# Build the model once to obtain the default `Map`
TmbList = Build_TMB_Fn( ... ) # where ... is your existing inputs to Build_TMB_Fn
# Extract pre-made `Map`
Map_customized = TmbList[["Map"]]
# Add custom edits for `Map`; uncomment and supply your own structure
# Map_customized$beta1_ct <- ...
# Rebuild the model with the customized `Map`
TmbList = VAST::Build_TMB_Fn("Map"=Map_customized, ...) # where ... is your previous inputs to Build_TMB_Fn
```
This will build a TMB model with customized restrictions on parameters, but likely requires understanding the structure of the model in detail, as well as how to use the `map` input to `TMB::MakeADFun`.
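For readers unfamiliar with the `map` input, TMB expects a factor with one element per parameter: elements sharing a level are estimated as a single value, and `NA` elements are held fixed at their starting value. The sketch below builds such a factor in base R. It assumes, purely for illustration, that `beta1_ct` is a categories-by-years matrix (as its suffix suggests) with 3 categories and 4 years; the dimensions and the particular restrictions are hypothetical and must match your own model.

```r
# Assumed dimensions (hypothetical): 3 categories (rows) by 4 years (columns)
n_c <- 3
n_t <- 4

# Start with every category-year intercept estimated separately (all levels distinct)
map_levels <- matrix(seq_len(n_c * n_t), nrow = n_c, ncol = n_t)

# Category 2: tie all years to one shared level (intercept constant over time)
map_levels[2, ] <- map_levels[2, 1]

# Category 3, year 1: NA holds that parameter fixed at its starting value
map_levels[3, 1] <- NA

# Flatten in column-major order to a factor with one element per parameter
beta1_ct_map <- factor(as.vector(map_levels))
```

A factor like `beta1_ct_map` is the kind of object that would be assigned to `Map_customized$beta1_ct`. Note that the pre-made `Map` may already contain an entry for this parameter with its own `NA` pattern, so inspect it before overwriting.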
Please note that none of these solutions are "conventional" (because the conventional delta-GLMM involves an intercept for each year) but they each could overcome the issue of having 0% or 100% encounters in any year.
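The option of replacing data with `NA` for year-category combinations that have 0% encounters can be prepared in base R before building the model. This is a minimal sketch under stated assumptions: the data frame `catch_data` and its columns are hypothetical, and how the resulting vector is passed to `Data_Fn` depends on your existing inputs.

```r
# Toy data (hypothetical): category, year, and catch for each sample
catch_data <- data.frame(
  category = rep(c("age1", "age2"), each = 4),
  year     = rep(rep(2001:2002, each = 2), times = 2),
  catch    = c(1.2, 0, 0.4, 0,    # age1: encountered in both years
               0,   0, 0.9, 0)    # age2: 0% encounters in 2001
)

# Encounter rate of the year-category combination that each row belongs to
row_rate <- ave(as.numeric(catch_data$catch > 0),
                catch_data$category, catch_data$year, FUN = mean)

# Replace all data from 0%-encounter combinations with NA
catch_data$catch[row_rate == 0] <- NA
```

After this step, both `age2` samples from 2001 are `NA` and all other observations are unchanged.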
For the zero-inflation model, the zero-inflation intercept will go to +Inf for any species-year combination with 0% encounters. Experimental solutions include:
1. If including a species that has 100% encounter rate in one or a few years (but not all years), you can impose some temporal structure on intercepts, using methods #3-4 above.
2. You can customize the `Map` input following instructions in method #6 above.
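The temporal-structure settings for the first intercept described earlier (independent among years, random walk, constant over time, first-order autoregressive) differ only in the `"Beta1"` code of `RhoConfig`. As a convenience, they can be collected in one place; the helper `make_rho_config` and the labels in `beta1_codes` below are hypothetical and not part of `VAST`, while the numeric codes are those given above.

```r
# "Beta1" codes as described above (the labels are informal, not VAST names)
beta1_codes <- c(independent_among_years = 1, random_walk = 2,
                 constant_over_time = 3, first_order_autoregressive = 4)

# Hypothetical helper: RhoConfig with structure on the first intercept only
make_rho_config <- function(beta1_structure) {
  c("Beta1" = unname(beta1_codes[beta1_structure]),
    "Beta2" = 0, "Epsilon1" = 0, "Epsilon2" = 0)
}

# Example: random-walk structure on the first intercept
RhoConfig <- make_rho_config("random_walk")
```

The resulting named vector has the same form as the `RhoConfig` inputs shown above and would be passed to `Data_Fn` in the same way.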