[Feature]: incorporate mapping into the run function for models with region specific covariates #44

Open
chantelwetzel-noaa opened this issue Dec 13, 2024 · 0 comments

Describe the problem your feature request is related to.

The configuration_yellowtail code uses a region-specific interaction term to include data south of Cape Mendocino, which improves boundary estimates. However, there are no positive tows south of Cape Mendocino in 2007, so the configuration_yellowtail code modifies the year*split_mendocino term for 2007 because it is not estimable. Since these checks and fixes are not part of the current run code, the general configuration file for yellowtail does not include the interaction term.

Describe the solution you'd like

For models that include a region-specific interaction term, adding checks and general parameter fixes within the run code would support including these terms in the general configuration file.

Describe alternatives you have considered

The alternative would be to leave these region-specific interaction terms out of the configuration file. We would then need to run these model structures separately for the appropriate species, setting the needed parameter mapping and starting values in the code case by case.

Additional context

The code that configuration_yellowtail uses to check for and fix non-estimable parameters is:

# Fit to all tows; this defines the coefficient names and can be used to find
# variables that aren't identifiable for the presence-absence model
lm_all <- lm(formula = as.formula(configuration$formula),
             data = data$data_filtered[[1]])
#not_identifiable <- names(which(is.na(coef(lm_all))))
# Find variables that aren't identifiable for positive model
# (lm() returns NA for coefficients it cannot estimate)
lm_pos <- lm(formula = as.formula(configuration$formula),
             data = dplyr::filter(data$data_filtered[[1]], catch_weight > 0))
pos_not_identifiable <- names(which(is.na(coef(lm_pos))))

# Create variables to be not estimated/ mapped off
coef_names <- names(coef(lm_all))
.map_pos <- coef_names
.map_pos[coef_names %in% pos_not_identifiable] <- NA
.map_pos <- factor(.map_pos)
# Start at 0, except for mapped-off parameters, which are held at -20
.start_pos <- rep(0, length(coef_names))
.start_pos[coef_names %in% pos_not_identifiable] <- -20

The run code is then modified as:

best <- data |>
  dplyr::mutate(
    # Evaluate the call in family
    family = purrr::map(family, .f = ~ eval(parse(text = .x))),
    # Run the model on each row in data
    results = purrr::pmap(
      .l = list(
        data = data_filtered,
        formula = formula,
        family = family,
        anisotropy = anisotropy,
        n_knots = knots,
        share_range = share_range,
        spatiotemporal = purrr::map2(spatiotemporal1, spatiotemporal2, list),
        sdmtmb_control = list(
          sdmTMB::sdmTMBcontrol(
            # Pass the mapping information for parameters that can't be estimated
            map = list(b_j = .map_pos, b_j2 = .map_pos),
            # Pass the starting values for parameters that can't be estimated
            start = list(b_j = .start_pos, b_j2 = .start_pos),
            newton_loops = 3
          )
        )
      ),
      .f = indexwc::run_sdmtmb
    )
  )
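
One way the checks above might be generalized inside the run code is a small helper that returns the map and start vectors for any formula and data set. The following is only a minimal sketch in base R: `get_positive_map()`, its arguments, and the toy data are hypothetical and not part of indexwc. It assumes the response column is named `catch_weight`, as in the yellowtail code.

```r
# Hypothetical helper (not part of indexwc): find coefficients that cannot be
# estimated from positive tows and build the map/start vectors for them.
get_positive_map <- function(formula, data,
                             response = "catch_weight",
                             fixed_value = -20) {
  formula <- stats::as.formula(formula)
  # Coefficient names from the fit to all tows define the parameter order
  fit_all <- stats::lm(formula, data = data)
  coef_names <- names(stats::coef(fit_all))
  # lm() returns NA for coefficients it cannot estimate from positive tows
  positive <- data[data[[response]] > 0, , drop = FALSE]
  fit_pos <- stats::lm(formula, data = positive)
  not_estimable <- names(which(is.na(stats::coef(fit_pos))))
  fixed <- coef_names %in% not_estimable
  # NA in the map marks a parameter that should not be estimated
  map <- coef_names
  map[fixed] <- NA
  # Start at 0, except for mapped-off parameters, which are held at fixed_value
  start <- rep(0, length(coef_names))
  start[fixed] <- fixed_value
  list(map = factor(map), start = start, not_estimable = not_estimable)
}

# Toy data (made up for illustration): no positive tows in the south in 2007
set.seed(1)
toy <- expand.grid(
  fyear = factor(2006:2008),
  split_mendocino = factor(c("N", "S")),
  tow = 1:4
)
toy$catch_weight <- rexp(nrow(toy))
toy$catch_weight[toy$fyear == "2007" & toy$split_mendocino == "S"] <- 0

out <- get_positive_map(catch_weight ~ 0 + fyear * split_mendocino, toy)
print(out$not_estimable)
```

The returned `map` and `start` could then be passed to `sdmTMB::sdmTMBcontrol()` in the same way as `.map_pos` and `.start_pos` above. Species without an empty year-by-region cell would get a map with no `NA` entries and all-zero starting values.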