Simplified Model Relative Influences #72

nffarabaugh · 2022-03-23T18:10:58Z

Is there a way to access the relative influence information for the parameters that are included in the simplified model? It does not appear in the report CSV. Similarly, is there a way to autogenerate plots for these?

Cheers!

SimonDedman · 2022-03-29T04:22:42Z

Mate, please could you send me your run script and your data (or a representative chunk so it'll run)? Thanks.

SimonDedman · 2022-03-29T04:27:58Z

gbm.auto: the report around L1036 is populated by Bin_Bars$var,
which comes from L858: a summary() call on get(Bin_Best_Model).
From L686, Bin_Best_Model can be the bin best simp model if it is worthy.
So L858 should populate L1036 with simp, and thus make simplified bars and simplified best vars / rel.inf Report entries.

nffarabaugh · 2022-03-29T19:43:40Z

For sure! I have popped my script and the CSV you will need below. Obviously don't share it around. Thanks for the help!

Cheers,
N. Frances Farabaugh
Biology PhD Candidate
Marine Community & Behavioral Ecology Lab
Florida International ***@***.***

SimonDedman · 2022-03-30T00:19:56Z

I just tried the first run with only tc 1, lr 0.01 and bf 0.5. The best combo was the unsimplified version. Report.csv lists the simp predictors dropped and kept, but if the best simp run doesn't lower the deviance it won't outcompete the existing best unsimplified BRT run, so the relative influence values for the simp model aren't included because they're not relevant.

You can tell which model was chosen as best under "Best Gaussian BRT"; if this doesn't end in "_simp" then there's no issue. Please let me know tc lr bf combos for examples where this isn't the case, and the simp run wins but its best variables & their relative influence scores aren't produced correctly.
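
A minimal sketch of that check in R, assuming the best-model name has already been read out of the Report as a character string. The helper and the example names below are made up for illustration and are not real gbm.auto output; only the "_simp" suffix comes from the comment above.

```r
# Hypothetical helper: TRUE when a best-model name marks a simplified run.
is_simp_model <- function(model_name) {
  endsWith(model_name, "_simp")
}

# Made-up model names, for illustration only.
best_names <- c("example_model_a", "example_model_a_simp")
simp_flags <- is_simp_model(best_names)
print(simp_flags)
```

If the flag is FALSE, the relative influences in the Report belong to the unsimplified model, which is the expected behaviour described here.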

One confusing element is that simp_dops_gaus.jpeg shows a negative change in predictive deviance for the removal of 1, 2, 6, 7, & 8 variables, with 8 being the greatest reduction. Intuitively this would mean the model with 8 dropped variables was better than the 'parent' combo with all variables retained, but simp is actually only selected if its self.statistics$correlation score (i.e. training data correlation) is better.
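
To make the selection rule concrete, here is a rough sketch using mock list objects in place of fitted models. Only the self.statistics$correlation field is taken from the comment above; the pick_best() helper, the object names and the numbers are invented for illustration, and the real gbm.auto logic may differ in detail.

```r
# Mock stand-ins for a parent model and its simplified refit.
parent_model <- list(self.statistics = list(correlation = 0.62))
simp_model   <- list(self.statistics = list(correlation = 0.58))

# Hypothetical helper: keep the simplified model only if its training-data
# correlation beats the parent's.
pick_best <- function(parent, simp) {
  if (simp$self.statistics$correlation > parent$self.statistics$correlation) {
    "simp"
  } else {
    "parent"
  }
}

chosen <- pick_best(parent_model, simp_model)
print(chosen)
```

In this mock case the simplified model is rejected, whatever the simp plot shows for the change in predictive deviance.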

LMK how you get on, and please close this if this answers everything. Cheers!

nffarabaugh · 2022-03-30T14:35:15Z

Thanks, I think this was an error in my understanding. So far none of my models have simp as the "best" model. I was confused because of the negative change in the predictive deviance (simp_dops_gaus.jpeg). Thanks for the help!

nffarabaugh · 2022-10-11T16:57:56Z

Hello, it seems this is an issue even when the best model is a simplified model. I have attached the generated report and code below.
Report_carangidae.csv
Self_CV_Statistics.csv
gbm.auto(
grids = NULL,
samples = wide.df1 %>% filter(site != "Nuka Hiva"),
expvar = c("temp", "ave_npp", "depth", "visibility", "topo", "pop.dens", "bait", "time.no.bait", "isl_grp", "Season", "lagoon.size"), # fix to final variables
resvar = "carangidae_maxN_a",
tc = c(5), # add the combos you want to see for initial runs and it will try each; doesn't run the whole gamut like the loops do
lr = c(0.0005),
bf = c(0.55),
n.trees = 50,
ZI = "CHECK",
fam1 = c("bernoulli", "binomial", "poisson", "laplace", "gaussian"),
fam2 = c("poisson"), #
simp = TRUE, # Change to true
gridslat = 2,
gridslon = 1,
multiplot = TRUE,
cols = grey.colors(1, 1, 1),
linesfiles = TRUE, # change to true for final run
smooth = TRUE,
savedir = "~/Documents/My Documents/FinPrint French Poly/Analysis/DataExploration_03_2022/Teleosts",
savegbm = TRUE, # change to true for final runs
loadgbm = NULL,
varint = TRUE,
map = TRUE,
shape = NULL,
RSB = TRUE,
BnW = TRUE,
alerts = TRUE, # this is the noise alerts
pngtype = c("quartz"), # "quartz" for Mac; use "cairo-png" for Windows
gaus = TRUE,
MLEvaluate = TRUE,
brv = NULL,
grv = NULL,
Bin_Preds = NULL,
Gaus_Preds = NULL)

nffarabaugh · 2022-10-11T17:04:50Z

Gaussian BRT Variable contributions.csv

SimonDedman · 2022-10-11T17:26:40Z

gaus:
L1143 & L1144:

Report[1:(length(Gaus_Bars[,1])),(reportcolno - 2)] <- as.character(Gaus_Bars$var)
Report[1:(length(Gaus_Bars[,2])),(reportcolno - 1)] <- as.character(round(Gaus_Bars$rel.inf), 2)
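
One thing worth checking in the second line as quoted: the 2 sits outside round(), so it is passed to as.character() (which ignores it) and round() falls back to its default of 0 decimal places. A small base-R illustration with made-up values; whether whole-number rounding was intended in gbm.auto is not stated in this thread.

```r
rel_inf <- c(41.2345, 6.789)  # made-up relative influence values

# As quoted: round() gets no digits argument, so values become whole numbers.
as_quoted <- as.character(round(rel_inf), 2)

# With the parenthesis moved, round() keeps two decimal places.
two_dp <- as.character(round(rel_inf, 2))

print(as_quoted)
print(two_dp)
```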

Bin is L1067:75

L887:
if (gaus) {Gaus_Bars <- summary(get(Gaus_Best_Model),
so bin/gaus_bars are already simp if simp was better...
So why are they printing all of the rel.inf's if most of the vars got dropped?

Gaus_Bars <- summary(get(Gaus_Best_Model),
                                      cBars = length(get(Gaus_Best_Model)$var.names),
                                      n.trees = get(Gaus_Best_Model)$n.trees,
                                      plotit = FALSE, order = TRUE, normalize = TRUE, las = 1, main = NULL)
      write.csv(Gaus_Bars, file = paste0("./", names(samples[i]), "/Gaussian BRT Variable contributions.csv"), row.names = FALSE)

Output csv colnames: var, rel.inf. I.e. not cBars nor n.trees. Odd.
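
The column names are what you would expect if Gaus_Bars comes from gbm's summary method: cBars and n.trees are inputs to summary(), while the returned data frame holds var and rel.inf. A small sketch on simulated data, assuming the gbm package is installed (the code is skipped otherwise); the data and model settings are invented for illustration.

```r
if (requireNamespace("gbm", quietly = TRUE)) {
  set.seed(1)
  # Simulated data: y depends on x1 and x2, x3 is noise.
  toy <- data.frame(x1 = runif(200), x2 = runif(200), x3 = runif(200))
  toy$y <- 2 * toy$x1 + toy$x2 + rnorm(200, sd = 0.1)

  fit <- gbm::gbm(y ~ x1 + x2 + x3, data = toy, distribution = "gaussian",
                  n.trees = 100, interaction.depth = 2, shrinkage = 0.05,
                  bag.fraction = 0.5)

  # cBars and n.trees control what is summarised; neither is an output column.
  bars <- summary(fit, cBars = length(fit$var.names), n.trees = fit$n.trees,
                  plotit = FALSE)
  print(names(bars))
}
```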

L668: simplification.
L671: Gaus_Best_Simp assigned gbm object AFTER simplification, so should have extra variables dropped?
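
One quick way to test this, sketched with mock objects: compare the variable names stored on the model with the variables listed in the bars table. The var.names field and the var column are taken from the code quoted above; the mock model, the table values and the choice of retained variables are invented here.

```r
# Mock simplified model that is supposed to have kept only three predictors.
mock_simp_model <- list(var.names = c("temp", "depth", "bait"))

# Mock bars table that still lists a dropped predictor.
mock_bars <- data.frame(
  var = c("temp", "depth", "bait", "topo"),
  rel.inf = c(40, 30, 20, 10),
  stringsAsFactors = FALSE
)

# Variables reported in the bars but absent from the model's predictors.
unexpected_vars <- setdiff(mock_bars$var, mock_simp_model$var.names)
print(unexpected_vars)
```

An empty result would suggest the table reflects the simplified predictor set; anything listed points at variables that should have been dropped.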

See notes from Bonnie having the same issue, L674:681
L680 & 645 replacements testing now.

SimonDedman · 2022-10-12T15:21:32Z

Pushed the change. The model re-run by Frances didn't need simplifying, so the change is not tested: danger zone.

SimonDedman · 2022-11-21T22:10:03Z

NFF any update on this, did the change solve the issue? If so please mark as closed. Cheers!

nffarabaugh closed this as completed Mar 30, 2022

nffarabaugh reopened this Oct 11, 2022

SimonDedman added a commit that referenced this issue Oct 11, 2022

auto: bin gaus best simp changed gbm.x to try to use only the simplified expvars, based on #72. DESCRIPTION version to 1.5.9

c319167
