Commit
Merge pull request #6 from tpoisot/tpoisot/issue3
Shapley values maps
tpoisot authored Oct 10, 2023
2 parents c36f40e + 4de7432 commit f17ec95
Showing 8 changed files with 44 additions and 31 deletions.
4 changes: 2 additions & 2 deletions _freeze/slides/execute-results/html.json

Large diffs are not rendered by default.

Binary file modified _freeze/slides/figure-revealjs/cell-36-output-1.png
Binary file modified _freeze/slides/figure-revealjs/cell-37-output-1.png
43 changes: 21 additions & 22 deletions docs/index.html
@@ -7,7 +7,7 @@
<link href="slides_files/libs/quarto-html/light-border.css" rel="stylesheet">
<link href="slides_files/libs/quarto-html/quarto-html.min.css" rel="stylesheet" data-mode="light">
<link href="slides_files/libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles"><meta charset="utf-8">
<meta name="generator" content="quarto-1.4.386">
<meta name="generator" content="quarto-1.4.358">

<meta name="author" content="Timothée Poisot">
<title>Building an interpretable SDM from scratch</title>
@@ -97,7 +97,6 @@
div.csl-bib-body { }
div.csl-entry {
clear: both;
margin-bottom: 0em;
}
.hanging-indent div.csl-entry {
margin-left:2em;
@@ -543,7 +542,7 @@ <h2>Species occurrence filtering</h2>
<section id="where-are-we-so-far" class="slide level2">
<h2>Where are we so far?</h2>

<img data-src="slides_files/figure-revealjs/cell-8-output-1.png" id="fig-539a35d47e664c97a50115a146a7f1bd-1" class="r-stretch quarto-figure-center"></section>
<img data-src="slides_files/figure-revealjs/cell-8-output-1.png" class="r-stretch"></section>
<section id="wait" class="slide level2">
<h2>WAIT!</h2>
<p>It’s not serious ecology unless we use PhyloPic:</p>
@@ -562,7 +561,7 @@ <h2>WAIT!</h2>
<section id="where-are-we-so-far-1" class="slide level2">
<h2>Where are we so far?</h2>

<img data-src="slides_files/figure-revealjs/cell-10-output-1.png" id="fig-539a35d47e664c97a50115a146a7f1bd-2" class="r-stretch quarto-figure-center"></section>
<img data-src="slides_files/figure-revealjs/cell-10-output-1.png" class="r-stretch"></section>
<section id="spatial-thinning" class="slide level2">
<h2>Spatial thinning</h2>
<p>We limit the occurrences to one per grid cell, assigned to the center of the grid cell</p>
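The thinning rule on this slide can be sketched in a few lines. This is a minimal illustration, not the code used in the slides: the function name `thin`, the `(longitude, latitude)` tuple layout, and the square grid of resolution `res` are all assumptions.

```julia
# Hypothetical helper: snap each (lon, lat) occurrence to the center of the
# grid cell it falls in, and keep a single point per cell.
function thin(points, res)
    centers = Set{Tuple{Float64,Float64}}()
    for (lon, lat) in points
        # floor(...) finds the cell index; adding 0.5 moves to the cell center
        cx = (floor(lon / res) + 0.5) * res
        cy = (floor(lat / res) + 0.5) * res
        push!(centers, (cx, cy))
    end
    return sort(collect(centers))
end

# The first two points share a cell, so only two centers remain
thinned = thin([(0.1, 0.1), (0.4, 0.3), (1.2, 0.1)], 0.5)
```

Using a `Set` makes the "one per cell" constraint automatic, since duplicate cell centers are stored only once.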
@@ -601,7 +600,7 @@ <h2>Background points cleaning</h2>
<section id="data-overview" class="slide level2">
<h2>Data overview</h2>

<img data-src="slides_files/figure-revealjs/cell-15-output-1.png" id="fig-539a35d47e664c97a50115a146a7f1bd-3" class="r-stretch quarto-figure-center"></section>
<img data-src="slides_files/figure-revealjs/cell-15-output-1.png" class="r-stretch"></section>
<section id="preparing-the-responses-and-variables" class="slide level2">
<h2>Preparing the responses and variables</h2>
<div id="assemble-y-and-x" class="cell" data-execution_count="16">
@@ -670,7 +669,7 @@ <h2>A note on cross-validation</h2>
<section id="baseline-performance" class="slide level2">
<h2>Baseline performance</h2>
<p>We need to get a sense of how difficult the classification problem is:</p>
<div id="7545d4eb" class="cell" data-execution_count="20">
<div id="1a34b6d5" class="cell" data-execution_count="20">
<div class="sourceCode cell-code" id="cb14"><pre class="sourceCode julia"><code class="sourceCode julia"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a>C0 <span class="op">=</span> <span class="fu">zeros</span>(ConfusionMatrix, <span class="fu">length</span>(folds))</span>
<span id="cb14-2"><a href="#cb14-2" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> (i,f) <span class="kw">in</span> <span class="fu">enumerate</span>(folds)</span>
<span id="cb14-3"><a href="#cb14-3" aria-hidden="true" tabindex="-1"></a> trn, val <span class="op">=</span> f</span>
@@ -720,7 +719,7 @@ <h2>Measures on the confusion matrix</h2>
<section id="variable-selection" class="slide level2">
<h2>Variable selection</h2>
<p>We add variables one at a time, until the Matthews correlation coefficient (MCC) stops increasing; we keep annual temperature, isothermality, mean diurnal range, and annual precipitation</p>
<div id="cedf0299" class="cell" data-execution_count="21">
<div id="62855f23" class="cell" data-execution_count="21">
<div class="sourceCode cell-code" id="cb15"><pre class="sourceCode julia"><code class="sourceCode julia"><span id="cb15-1"><a href="#cb15-1" aria-hidden="true" tabindex="-1"></a>available_variables <span class="op">=</span> <span class="fu">constrainedselection</span>(ty, tX, folds, naivebayes, mcc, [<span class="fl">1</span>, <span class="fl">2</span>, <span class="fl">3</span>, <span class="fl">12</span>])</span></code></pre></div>
</div>
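For reference, the score being maximized during selection can be computed from the four counts of a confusion matrix. The sketch below is standalone and does not reproduce the `mcc` function passed to `constrainedselection` in the slides; the name `mcc_from_counts` and its argument order are illustrative.

```julia
# Matthews correlation coefficient from the counts of true positives (tp),
# true negatives (tn), false positives (fp), and false negatives (fn).
function mcc_from_counts(tp, tn, fp, fn)
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when a row or column of the matrix is empty
    denominator == 0 && return 0.0
    return (tp * tn - fp * fn) / denominator
end

# Example with made-up counts
mcc_from_counts(40, 45, 5, 10)
```

The value ranges from -1 (every prediction is wrong) to 1 (every prediction is right), with 0 corresponding to a classifier that does no better than chance.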
<p>This method identifies 8 variables, some of which are:</p>
@@ -740,7 +739,7 @@ <h2>Discuss - can we force variable selection?</h2>
</section>
<section id="model-with-variable-selection" class="slide level2">
<h2>Model with variable selection</h2>
<div id="8cf371ec" class="cell" data-execution_count="22">
<div id="fe0f9169" class="cell" data-execution_count="22">
<div class="sourceCode cell-code" id="cb16"><pre class="sourceCode julia"><code class="sourceCode julia"><span id="cb16-1"><a href="#cb16-1" aria-hidden="true" tabindex="-1"></a>C1 <span class="op">=</span> <span class="fu">zeros</span>(ConfusionMatrix, <span class="fu">length</span>(folds))</span>
<span id="cb16-2"><a href="#cb16-2" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> (i,f) <span class="kw">in</span> <span class="fu">enumerate</span>(folds)</span>
<span id="cb16-3"><a href="#cb16-3" aria-hidden="true" tabindex="-1"></a> trn, val <span class="op">=</span> f</span>
@@ -802,7 +801,7 @@ <h2>How do we make the model better?</h2>
</section>
<section id="thresholding-the-model" class="slide level2">
<h2>Thresholding the model</h2>
<div id="2e807b81" class="cell" data-execution_count="23">
<div id="f16be36e" class="cell" data-execution_count="23">
<div class="sourceCode cell-code" id="cb17"><pre class="sourceCode julia"><code class="sourceCode julia"><span id="cb17-1"><a href="#cb17-1" aria-hidden="true" tabindex="-1"></a>ty, tX <span class="op">=</span> y[idx], X[idx,available_variables]</span>
<span id="cb17-2"><a href="#cb17-2" aria-hidden="true" tabindex="-1"></a>thr <span class="op">=</span> <span class="fu">LinRange</span>(<span class="fl">0.0</span>, <span class="fl">1.0</span>, <span class="fl">350</span>)</span>
<span id="cb17-3"><a href="#cb17-3" aria-hidden="true" tabindex="-1"></a>C <span class="op">=</span> <span class="fu">zeros</span>(ConfusionMatrix, (k, <span class="fu">length</span>(thr)))</span>
@@ -819,10 +818,10 @@ <h2>Thresholding the model</h2>
<section id="but-how-do-we-pick-the-threshold" class="slide level2">
<h2>But how do we pick the threshold?</h2>

<img data-src="slides_files/figure-revealjs/cell-25-output-1.svg" id="fig-539a35d47e664c97a50115a146a7f1bd-4" class="r-stretch quarto-figure-center"></section>
<img data-src="slides_files/figure-revealjs/cell-25-output-1.svg" class="r-stretch"></section>
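One common way to pick the threshold is to scan a range of candidates and keep the one with the highest MCC. The sketch below assumes that criterion, which the diff does not show explicitly, and uses invented toy scores and labels; `mcc_at` is a hypothetical helper, not a function from the slides.

```julia
# Toy scores and labels, invented for illustration
scores = [0.05, 0.10, 0.28, 0.37, 0.60, 0.80]
labels = [false, false, false, true, true, true]

# MCC of the binary prediction obtained by thresholding the scores at τ
function mcc_at(scores, labels, τ)
    pred = scores .>= τ
    tp = count(pred .& labels)
    tn = count(.!pred .& .!labels)
    fp = count(pred .& .!labels)
    fn = count(.!pred .& labels)
    d = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return d == 0 ? 0.0 : (tp * tn - fp * fn) / d
end

# Scan evenly spaced thresholds and keep the index of the best one
thr = LinRange(0.0, 1.0, 101)
m = argmax([mcc_at(scores, labels, τ) for τ in thr])
best = thr[m]
```

When several thresholds tie, `argmax` returns the first one, so this picks the lowest threshold among the best candidates.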
<section id="tuned-model-with-selected-variables" class="slide level2">
<h2>Tuned model with selected variables</h2>
<div id="18931752" class="cell" data-execution_count="25">
<div id="1ed15e57" class="cell" data-execution_count="25">
<div class="sourceCode cell-code" id="cb18"><pre class="sourceCode julia"><code class="sourceCode julia"><span id="cb18-1"><a href="#cb18-1" aria-hidden="true" tabindex="-1"></a>C2 <span class="op">=</span> <span class="fu">zeros</span>(ConfusionMatrix, <span class="fu">length</span>(folds))</span>
<span id="cb18-2"><a href="#cb18-2" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> (i,f) <span class="kw">in</span> <span class="fu">enumerate</span>(folds)</span>
<span id="cb18-3"><a href="#cb18-3" aria-hidden="true" tabindex="-1"></a> trn, val <span class="op">=</span> f</span>
@@ -886,7 +885,7 @@ <h2>Measures on the confusion matrix</h2>
<section id="tuned-model-performance" class="slide level2">
<h2>Tuned model performance</h2>
<p>We can retrain over <em>all</em> the training data</p>
<div id="1dcb73f9" class="cell" data-execution_count="26">
<div id="4afac4da" class="cell" data-execution_count="26">
<div class="sourceCode cell-code" id="cb19"><pre class="sourceCode julia"><code class="sourceCode julia"><span id="cb19-1"><a href="#cb19-1" aria-hidden="true" tabindex="-1"></a>finalmodel <span class="op">=</span> <span class="fu">naivebayes</span>(ty, tX)</span>
<span id="cb19-2"><a href="#cb19-2" aria-hidden="true" tabindex="-1"></a>prediction <span class="op">=</span> <span class="fu">vec</span>(<span class="fu">mapslices</span>(finalmodel, X[tidx,available_variables]; dims<span class="op">=</span><span class="fl">2</span>))</span>
<span id="cb19-3"><a href="#cb19-3" aria-hidden="true" tabindex="-1"></a>Cf <span class="op">=</span> <span class="fu">ConfusionMatrix</span>(prediction, y[tidx], thr[m])</span></code></pre></div>
@@ -934,7 +933,7 @@ <h2>Acceptable bias</h2>
</section>
<section id="prediction-for-each-pixel" class="slide level2">
<h2>Prediction for each pixel</h2>
<div id="56a2d274" class="cell" data-execution_count="28">
<div id="30790a1c" class="cell" data-execution_count="28">
<div class="sourceCode cell-code" id="cb20"><pre class="sourceCode julia"><code class="sourceCode julia"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a>prediction <span class="op">=</span> <span class="fu">similar</span>(<span class="fu">first</span>(predictors))</span>
<span id="cb20-2"><a href="#cb20-2" aria-hidden="true" tabindex="-1"></a><span class="bu">Threads</span>.<span class="pp">@threads</span> <span class="cf">for</span> k <span class="kw">in</span> <span class="fu">keys</span>(prediction)</span>
<span id="cb20-3"><a href="#cb20-3" aria-hidden="true" tabindex="-1"></a> prediction[k] <span class="op">=</span> <span class="fu">finalmodel</span>([p[k] for p <span class="kw">in</span> predictors[available_variables]])</span>
@@ -947,33 +946,33 @@ <h2>Prediction for each pixel</h2>
<section id="tuned-model---prediction" class="slide level2">
<h2>Tuned model - prediction</h2>

<img data-src="slides_files/figure-revealjs/cell-30-output-1.png" id="fig-539a35d47e664c97a50115a146a7f1bd-5" class="r-stretch quarto-figure-center"></section>
<img data-src="slides_files/figure-revealjs/cell-30-output-1.png" class="r-stretch"></section>
<section id="tuned-model---uncertainty" class="slide level2">
<h2>Tuned model - uncertainty</h2>

<img data-src="slides_files/figure-revealjs/cell-31-output-1.png" id="fig-539a35d47e664c97a50115a146a7f1bd-6" class="r-stretch quarto-figure-center"><div class="footer">
<img data-src="slides_files/figure-revealjs/cell-31-output-1.png" class="r-stretch"><div class="footer">
<p>IQR for the models trained on each fold</p>
</div>
</section>
<section id="tuned-model---entropy" class="slide level2">
<h2>Tuned model - entropy</h2>

<img data-src="slides_files/figure-revealjs/cell-32-output-1.png" id="fig-539a35d47e664c97a50115a146a7f1bd-7" class="r-stretch quarto-figure-center"><div class="footer">
<img data-src="slides_files/figure-revealjs/cell-32-output-1.png" class="r-stretch"><div class="footer">
<p>Entropy (in bits) of the NBC probability</p>
</div>
</section>
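Assuming the entropy map treats each pixel's prediction as the probability of a binary outcome (presence or absence), the per-pixel value would be the Shannon entropy of a Bernoulli variable. The function name `entropy_bits` is illustrative.

```julia
# Shannon entropy, in bits, of a binary outcome with probability p.
# Certain outcomes (p = 0 or p = 1) carry no uncertainty.
function entropy_bits(p)
    (p <= 0 || p >= 1) && return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))
end

# Maximal uncertainty is reached at p = 0.5
entropy_bits(0.5)
```

High values flag pixels where the model is close to undecided, regardless of which side of the threshold the prediction falls on.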
<section id="tuned-model---range" class="slide level2">
<h2>Tuned model - range</h2>

<img data-src="slides_files/figure-revealjs/cell-33-output-1.png" id="fig-539a35d47e664c97a50115a146a7f1bd-8" class="r-stretch quarto-figure-center"><div class="footer">
<img data-src="slides_files/figure-revealjs/cell-33-output-1.png" class="r-stretch"><div class="footer">
<p>Probability &gt; 0.209</p>
</div>
</section>
<section id="predicting-the-predictions" class="slide level2">
<h2>Predicting the predictions?</h2>
<p>Shapley values (Monte-Carlo approximation): if we mix the variables across two observations, how important is the <span class="math inline">\(i\)</span>-th variable?</p>
<p>Expresses “importance” as an additive factor on top of the <em>average</em> prediction (here: average prob. of occurrence)</p>
<div id="a0c32999" class="cell" data-execution_count="33">
<div id="79449adc" class="cell" data-execution_count="33">
<div class="sourceCode cell-code" id="cb21"><pre class="sourceCode julia"><code class="sourceCode julia"><span id="cb21-1"><a href="#cb21-1" aria-hidden="true" tabindex="-1"></a>shapval <span class="op">=</span> [<span class="fu">similar</span>(<span class="fu">first</span>(predictors)) for i <span class="kw">in</span> <span class="fu">eachindex</span>(available_variables)]</span>
<span id="cb21-2"><a href="#cb21-2" aria-hidden="true" tabindex="-1"></a><span class="bu">Threads</span>.<span class="pp">@threads</span> <span class="cf">for</span> k <span class="kw">in</span> <span class="fu">keys</span>(shapval[<span class="fl">1</span>])</span>
<span id="cb21-3"><a href="#cb21-3" aria-hidden="true" tabindex="-1"></a> x <span class="op">=</span> [p[k] for p <span class="kw">in</span> predictors[available_variables]]</span>
@@ -988,7 +987,7 @@ <h2>Predicting the predictions?</h2>
</section>
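The Monte-Carlo approximation described on this slide can be sketched as follows. This is a generic version of the sampling scheme, written under the assumption that the model is a function `f` of a feature vector and that `X` holds one observation per row; `shapley_mc` and its keyword arguments are hypothetical names, and the truncated code above may differ in its details.

```julia
using Random

# Monte-Carlo approximation of the Shapley value of feature i for instance x.
function shapley_mc(f, X::AbstractMatrix, x::AbstractVector, i::Integer;
                    M=200, rng=Random.default_rng())
    n, p = size(X)
    ϕ = 0.0
    for _ in 1:M
        z = X[rand(rng, 1:n), :]      # random observation to mix with x
        order = randperm(rng, p)      # random ordering of the features
        pos = findfirst(==(i), order)
        # Features up to and including i (in the random order) come from x,
        # the remaining ones come from z
        with_i = copy(z)
        with_i[order[1:pos]] .= x[order[1:pos]]
        # Same vector, except that feature i is taken from z
        without_i = copy(with_i)
        without_i[i] = z[i]
        ϕ += f(with_i) - f(without_i)
    end
    return ϕ / M
end

# Toy linear model and a constant background, chosen so the result is exact
f(v) = 2v[1] + 3v[2]
X = zeros(5, 2)
x = [1.0, 2.0]
contributions = [shapley_mc(f, X, x, i) for i in 1:2]
```

In this toy case the contributions add up to `f(x)` minus the average prediction over `X`, which is the additive property the slide refers to.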
<section id="importance-of-variables" class="slide level2">
<h2>Importance of variables</h2>
<div id="d0f15efa" class="cell" data-execution_count="34">
<div id="0a2d808d" class="cell" data-execution_count="34">
<div class="sourceCode cell-code" id="cb22"><pre class="sourceCode julia"><code class="sourceCode julia"><span id="cb22-1"><a href="#cb22-1" aria-hidden="true" tabindex="-1"></a>varimp <span class="op">=</span> <span class="fu">sum</span>.(<span class="fu">map</span>(abs, shapval))</span>
<span id="cb22-2"><a href="#cb22-2" aria-hidden="true" tabindex="-1"></a>varimp <span class="op">./=</span> <span class="fu">sum</span>(varimp)</span>
<span id="cb22-3"><a href="#cb22-3" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> v <span class="kw">in</span> <span class="fu">sortperm</span>(varimp, rev<span class="op">=</span><span class="cn">true</span>)</span>
@@ -1012,11 +1011,11 @@ <h2>Importance of variables</h2>
<section id="top-three-variables" class="slide level2">
<h2>Top three variables</h2>

<img data-src="slides_files/figure-revealjs/cell-36-output-1.png" id="fig-539a35d47e664c97a50115a146a7f1bd-9" class="r-stretch quarto-figure-center"></section>
<img data-src="slides_files/figure-revealjs/cell-36-output-1.png" class="r-stretch"></section>
<section id="most-determinant-predictor" class="slide level2">
<h2>Most determinant predictor</h2>

<img data-src="slides_files/figure-revealjs/cell-37-output-1.png" id="fig-539a35d47e664c97a50115a146a7f1bd-10" class="r-stretch quarto-figure-center"></section>
<img data-src="slides_files/figure-revealjs/cell-37-output-1.png" class="r-stretch"></section>
<section id="take-home" class="slide level2">
<h2>Take-home</h2>
<ul>
@@ -1032,7 +1031,7 @@ <h2>References</h2>
<img src="assets/poisotlab.png" class="slide-logo r-stretch"><div class="footer footer-default">

</div>
<div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0" role="list">
<div id="refs" class="references csl-bib-body hanging-indent" role="list">
<div id="ref-barbet-massin2012" class="csl-entry" role="listitem">
Barbet-Massin, M., Jiguet, F., Albert, C.H. &amp; Thuiller, W. (2012). <a href="https://doi.org/10.1111/j.2041-210x.2011.00172.x">Selecting pseudo-absences for species distribution models: how, where and how many?</a> <em>Methods in Ecology and Evolution</em>, 3, 327–338.
</div>
Binary file modified docs/slides_files/figure-revealjs/cell-36-output-1.png
Binary file modified docs/slides_files/figure-revealjs/cell-37-output-1.png