<!--
Google IO 2012 HTML5 Slide Template
Authors: Eric Bidelman <[email protected]>
Luke Mahe <[email protected]>
URL: https://code.google.com/p/io-2012-slides
-->
<!DOCTYPE html>
<html>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
</script>
<script type="text/javascript" src="path-to-mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
<script type="text/javascript"
src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
</script>
<head>
<title>Why Folding is Easy: MSMs for Protein Conformation Change</title>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="chrome=1">
<link rel="stylesheet" media="all" href="theme/css/default.css">
<base target="_blank"> <!-- This amazingness opens all links in a new tab. -->
<script data-main="js/slides" src="js/require-1.0.8.min.js"></script>
</head>
<body style="opacity: 0">
<slides class="layout-widescreen">
<slide class="title-slide segue nobackground" >
<hgroup>
<h1>Why Folding is Easy: MSMs for Protein Conformation Change</h1>
<h2></h2>
<p>Robert T. McGibbon</p>
<p>April 22, 2013</p>
</hgroup>
</slide>
<slide >
<hgroup>
<h2>Background</h2>
<h3></h3>
</hgroup>
<article ><ul class="build">
<li>Markov state models are great.</li>
<li>You can model long-timescale protein dynamics from ensembles of short simulations.</li>
</ul></article>
</slide>
<slide >
<hgroup>
<h2>Plan of Attack</h2>
<h3></h3>
</hgroup>
<article ><ul class="build">
<li>Resolving slow conformational changes within a folding data set.</li>
<li>Manual parameter selection</li>
<li>Statistical overfitting</li>
<li>Adaptive sampling</li>
</ul></article>
</slide>
<slide >
<hgroup>
<h2>MSM Parameter Selection</h2>
<h3></h3>
</hgroup>
<article ><ul class="build">
<li>MSM construction is a mix of supervised and unsupervised learning problems.</li>
<li>Unsupervised learning is the problem of finding "hidden" structure in unlabeled data, where there's no <em>right</em> answer.</li>
<li>How many conformational states does a protein adopt? The answer is in the eye of the beholder.</li>
<li>Parameterizing a transition matrix is supervised (a counting sketch follows).</li>
</ul></article>
</slide>
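<slide >
<hgroup>
<h2>Sketch: Parameterizing the Transition Matrix</h2>
<h3></h3>
</hgroup>
<article ><p>A minimal sketch of the counting step, added for illustration (not code from the talk); it assumes NumPy, and the names are mine.</p>
<pre class="prettyprint" data-lang="python">
import numpy as np

def transition_matrix(labels, n_states, lag=1):
    # Count observed transitions i -> j at the chosen lag time
    counts = np.zeros((n_states, n_states))
    for i, j in zip(labels[:-lag], labels[lag:]):
        counts[i, j] += 1
    # Row-normalize so each row is a conditional distribution
    # over destination states (guarding against empty rows)
    rows = counts.sum(axis=1, keepdims=True)
    rows[rows == 0] = 1.0
    return counts / rows
</pre>
<p>Plain maximum-likelihood counting like this is the simplest estimator; detailed-balance-constrained estimators are a common refinement.</p></article>
</slide>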
<slide >
<hgroup>
<h2>MSM State Decomposition</h2>
<h3></h3>
</hgroup>
<article ><p>The MSM state decomposition, <em>clustering</em>, is characterized by a bias-variance trade-off.</p>
<ul class="build">
<li><strong>Bias:</strong> As you lower the number of states, you introduce systematic error in modeling the dynamics.</li>
<li>Hamiltonian dynamics are completely Markovian in $\mathbb{R}^{6N}$</li>
<li><strong>Variance:</strong> As you raise the number of states, you're increasingly subject to statistical noise in the transition matrix estimation.</li>
<li>How do we balance this trade-off? One standard diagnostic follows.</li>
</ul></article>
</slide>
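<slide >
<hgroup>
<h2>Sketch: Implied Timescales as a Diagnostic</h2>
<h3></h3>
</hgroup>
<article ><p>One standard diagnostic, sketched here as my own illustration (presumably the quantity behind the later timescales figure): implied timescales $t_i = -\tau / \ln \lambda_i$ from the transition-matrix eigenvalues, checked for convergence as the state decomposition varies.</p>
<pre class="prettyprint" data-lang="python">
import numpy as np

def implied_timescales(T, lag_time, n_modes=5):
    # Eigenvalues of the transition matrix; the leading
    # non-stationary ones set the slowest relaxation rates
    vals = np.sort(np.linalg.eigvals(T).real)[::-1]
    # Clip to (0, 1) so the log is defined for noisy estimates
    lams = np.clip(vals[1:n_modes + 1], 1e-10, 1 - 1e-10)
    # t_i = -lag / ln(lambda_i)
    return -lag_time / np.log(lams)
</pre></article>
</slide>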
<slide >
<hgroup>
<h2>Choosing the states' shape</h2>
<h3></h3>
</hgroup>
<article class="flexbox vcenter"><p><img height=300 src=figures/gpcr_activation.png /></p>
<ul class="build">
<li>Conformational change is characterized by slow <em>conformationally subtle</em> transitions.</li>
<li>To resolve these transitions in our models, our states need to be "smaller" than the spatial extent of the process.</li>
<li>Increasing the number of states is not the <em>only</em> way to lower the bias -- we can also pick the <strong>shape</strong> of our states more intelligently.</li>
</ul></article>
</slide>
<slide >
<hgroup>
<h2>Protein motions aren't isotropic</h2>
<h3></h3>
</hgroup>
<article class="flexbox vcenter"><p>Our MSM states shouldn't be either</p>
<p><img height=300 src=figures/gpcr_activation.png /></p>
<ul class="build">
<li>Goal: to learn a distance metric for clustering which maximally separates kinetically close from kinetically distant conformations.</li>
<li>Different structural degrees of freedom should be weighted according to their discriminatory power (equilibration rate). A sketch of harvesting the training signal follows.</li>
</ul></article>
</slide>
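<slide >
<hgroup>
<h2>Sketch: Harvesting the Training Signal</h2>
<h3></h3>
</hgroup>
<article ><p>One way the kinetically close/distant pairs could be collected, sketched under my own assumptions (the exact sampling scheme isn't specified on these slides): frames a short lag apart in one trajectory count as close, random frames as distant.</p>
<pre class="prettyprint" data-lang="python">
import numpy as np

def harvest_triplets(traj, n_triplets, close_lag=1, rng=None):
    # traj: (n_frames, n_features) array of structural features.
    # Each triplet (a, b, c): b follows a after a short lag
    # (kinetically close); c is a random frame (presumed distant).
    rng = rng or np.random.default_rng()
    n = len(traj)
    starts = rng.integers(0, n - close_lag, size=n_triplets)
    others = rng.integers(0, n, size=n_triplets)
    return traj[starts], traj[starts + close_lag], traj[others]
</pre></article>
</slide>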
<slide >
<hgroup>
<h2>Large-Margin Learning</h2>
<h3></h3>
</hgroup>
<article class="flexbox vcenter"><p><img height=250 src="figures/classifiers.png"><span style="padding:1em;"></span><img height=250 src="figures/classifiers2.png"></p>
<ul class="build">
<li>A common goal in supervised learning is to construct binary classifiers.</li>
<li>Examples: actives vs. inactives, cats vs. others.</li>
<li>The "margin" is the distance of the object's score from the the decision threshold.</li>
<li>Large margin approaches attempt to find a classifier via optimization methods that maximize the (avg) margins.</li>
</ul></article>
</slide>
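<slide >
<hgroup>
<h2>Sketch: Computing Margins</h2>
<h3></h3>
</hgroup>
<article ><p>A toy margin computation for a linear classifier, added for illustration (generic supervised-learning material, not from the talk):</p>
<pre class="prettyprint" data-lang="python">
import numpy as np

def margins(w, b, points, labels):
    # Score each point against the decision threshold b;
    # labels are +1 / -1, so a positive margin means a
    # correct classification, and its size is the quantity
    # that large-margin methods push up on average
    scores = points @ w - b
    return labels * scores / np.linalg.norm(w)
</pre></article>
</slide>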
<slide >
<hgroup>
<h2>Large-Margin Distance Metric</h2>
<h3></h3>
</hgroup>
<article ><ul>
<li>Consider a set of $N$ triplets of structures, $(a, b, c)$, such that $a$ and $b$ appear close together in a single trajectory, while $a$ and $c$ do not.</li>
<li>We restrict our attention to the set of squared Mahalanobis metrics.</li>
<li>And maximize the margin of separation between the close and far pairs (evaluated numerically on the next slide):
$$ d^{\mathbf{X}}(\vec{a}, \vec{b}) = (\vec{a} - \vec{b})^{T} \mathbf{X} (\vec{a} - \vec{b}) $$
$$ \max_{\mathbf{X},\rho} \left[ \alpha \rho - \frac{1}{N} \sum_i^N \lambda \left(d^\mathbf{X}(\vec{a}_i,\vec{c}_i) - d^\mathbf{X}(\vec{a}_i, \vec{b}_i) - \rho \right) \right] $$</li>
</ul></article>
</slide>
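<slide >
<hgroup>
<h2>Sketch: Evaluating the Objective</h2>
<h3></h3>
</hgroup>
<article ><p>The objective from the previous slide is cheap to evaluate on a batch of triplets. A sketch of mine, taking $\lambda(z) = \max(0, -z)$, a hinge penalty on margin violations; that choice is an assumption, since the slides leave $\lambda$ abstract.</p>
<pre class="prettyprint" data-lang="python">
import numpy as np

def mahalanobis_sq(X, a, b):
    # Batched squared Mahalanobis distance d^X(a, b)
    d = a - b
    return np.einsum('ij,jk,ik->i', d, X, d)

def objective(X, rho, a, b, c, alpha=1.0):
    # Margin by which each far pair (a, c) exceeds its
    # close pair (a, b) under the current metric X
    margins = mahalanobis_sq(X, a, c) - mahalanobis_sq(X, a, b)
    # Hinge penalty when the margin falls short of rho
    # (an assumed form for the loss lambda on the slide)
    loss = np.maximum(0.0, rho - margins)
    return alpha * rho - loss.mean()
</pre></article>
</slide>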
<slide >
<hgroup>
<h2>Optimization and Constraints</h2>
<h3></h3>
</hgroup>
<article ><p>$$ \max_{\mathbf{X},\rho} \left[ \alpha \rho - \frac{1}{N} \sum_i^N \lambda \left(d^\mathbf{X}(\vec{a}_i,\vec{c}_i) - d^\mathbf{X}(\vec{a}_i, \vec{b}_i) - \rho \right) \right] $$</p>
<ul>
<li>The matrix $\mathbf{X}$ is constrained to be positive semidefinite.</li>
<li>Optimized by a gradient-descent algorithm with rank-1 updates; a generic projection step is sketched on the next slide.</li>
<li>Shen, C.; Kim, J.; Wang, L. Scalable large-margin Mahalanobis distance metric learning. <em>IEEE Trans. Neural Networks</em> <strong>2010</strong>, 21, 1524–1530.</li>
</ul></article>
</slide>
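<slide >
<hgroup>
<h2>Sketch: Projected Gradient Step</h2>
<h3></h3>
</hgroup>
<article ><p>The generic way to honor the positive-semidefinite constraint is to project back onto the PSD cone after each step: symmetrize, eigendecompose, clip negative eigenvalues. This is my illustration of that generic step, not the rank-1 update scheme of Shen et al.</p>
<pre class="prettyprint" data-lang="python">
import numpy as np

def project_psd(X):
    # Nearest PSD matrix in Frobenius norm: symmetrize,
    # then zero out the negative eigenvalues
    X = 0.5 * (X + X.T)
    vals, vecs = np.linalg.eigh(X)
    vals = np.maximum(vals, 0.0)
    return (vecs * vals) @ vecs.T

def ascent_step(X, grad, step=0.01):
    # Move uphill on the objective, then restore feasibility
    return project_psd(X + step * grad)
</pre></article>
</slide>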
<slide >
<hgroup>
<h2>Model System</h2>
<h3></h3>
</hgroup>
<article class="flexbox vcenter"><p><img height=250 src="paper_figures/microstates_unweighted.png"><img height=250 src="paper_figures/microstates_kdml.png"></p>
<ul>
<li> Two-dimensional Brownian dynamics, where the diffusion constant in the $y$ direction is 10 times greater than the diffusion constant in the $x$ direction (a simulation sketch follows).
</li>
<li> <div style="margin-left:150px;"> $\mathbf{X} = \begin{pmatrix} 0.9915 & 0.0 \\ 0.0 & 0.0085 \end{pmatrix}$ </div>
</li>
</ul></article>
</slide>
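<slide >
<hgroup>
<h2>Sketch: Anisotropic Brownian Dynamics</h2>
<h3></h3>
</hgroup>
<article ><p>The model system is easy to reproduce. A sketch under my own conventions (the potential and units aren't given on the slide; only the 10:1 diffusion anisotropy is):</p>
<pre class="prettyprint" data-lang="python">
import numpy as np

def brownian_dynamics(grad_potential, n_steps, dt=1e-3,
                      D=(0.1, 1.0), kT=1.0, rng=None):
    # Overdamped (Brownian) dynamics in 2D with a diagonal,
    # anisotropic diffusion tensor: D_y = 10 * D_x as on the slide
    rng = rng or np.random.default_rng()
    D = np.asarray(D)
    x = np.zeros(2)
    traj = np.empty((n_steps, 2))
    for t in range(n_steps):
        drift = -D * grad_potential(x) * dt / kT
        noise = np.sqrt(2.0 * D * dt) * rng.standard_normal(2)
        x = x + drift + noise
        traj[t] = x
    return traj
</pre></article>
</slide>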
<slide >
<hgroup>
<h2>Model System</h2>
<h3></h3>
</hgroup>
<article class="flexbox vcenter"><p><img height=500 src="paper_figures/timescales.png"></p></article>
</slide>
<slide >
<hgroup>
<h2>MSMAccelerator</h2>
<h3></h3>
</hgroup>
<article class="flexbox vcenter"><video height=350 controls> <source src="figures/movie.ogv" type="video/ogg"> </video></article>
</slide>
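<slide >
<hgroup>
<h2>Sketch: An Adaptive-Sampling Round</h2>
<h3></h3>
</hgroup>
<article ><p>A schematic of the adaptive-sampling loop the movie shows, written under my own assumptions: counts-based seeding is one common criterion, and run_simulation and cluster are hypothetical callbacks, not MSMAccelerator's actual API.</p>
<pre class="prettyprint" data-lang="python">
import numpy as np

def adaptive_round(run_simulation, cluster, trajs, n_new=10):
    # One round: cluster everything seen so far, count state
    # visits, then seed new trajectories from the states with
    # the fewest observations (one common heuristic)
    labels, centers = cluster(trajs)                # hypothetical callback
    counts = np.bincount(np.concatenate(labels))
    starved = np.argsort(counts)[:n_new]
    new = [run_simulation(start=centers[s]) for s in starved]
    return trajs + new
</pre></article>
</slide>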
<slide class="backdrop"></slide>
</slides>
</body>
</html>