generated from jhudsl/OTTR_Template
-
Notifications
You must be signed in to change notification settings - Fork 2
/
05-pvacview_tour.Rmd
301 lines (231 loc) · 16.8 KB
/
05-pvacview_tour.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
# Review Your Results in pVACview
```{r, include = FALSE}
ottrpal::set_knitr_image_path()
```
## Learning Objectives
This chapter will cover:
- Introduction to the pVACview module
- How to start pVACview
- How to load your pVACseq data into pVACview
- The pVACview user interface
- How to re-tier results in pVACview
- How to add comments to the variants in the report
- How to export the reviewed results
## Introduction to the pVACview module
pVACview is a R shiny based tool designed to aid specifically in the prioritization and selection of
neoantigen candidates for personalized cancer vaccines or other applications. It takes as inputs a
pVACseq output aggregate report file (tsv format) and a corresponding pVACseq output metrics file (json).
pVACview allows the user to launch an R shiny application to load and visualize the given neoantigen
candidates with detailed information including that of the genomic variant, transcripts covering the
variant, and strong-binding peptides predicted from the respective transcripts. It also incorporates
anchor prediction data for a range of class I HLA alleles and peptides ranging from 8- to 11-mers. By
taking all these types of information into account for the neoantigen candidates, researchers will be
able to make more informed decisions when deciding final peptide candidates for experiments, personalized
cancer vaccines, or T cell therapies designed to target neoantigens.
## Starting pVACview
The pVACview R source code is distributed with every pVACseq run the in
MHC_Class_I and/or MHC_Class_II subdirectories, depending on which prediction
algorithms were run. For the HCC1395 pVACseq example run you did earlier, it
can be started by running the following command in your Terminal:
```{r, engine = 'bash', eval = FALSE}
pvacview run ${PWD}/pVACtools_outputs/MHC_Class_I
```
This starts a process for the pVACview R shiny application. Do not close the
terminal window or end this process. You can now open pVACview in your browser
by navigating to [http://127.0.0.1:3333/](http://127.0.0.1:3333/).
## Uploading pVACseq result data into pVACview
The starting screen shows an upload form. You will want to upload the Class I
aggregated.tsv report under "1. Neoantigen Candidate Aggregate Report (tsv
required)". We generally recommend using Class I as the main report, as these
are more robust than Class II. However, either can be uploaded here as long as
the correct data type is selected under "Does this aggregate report file
correspond to Class I or Class II prediction data?"
Upload the matching Class I metrics.json file under "2. Neoantigen Candidate
Metrics file (json required)". The Class II aggregated.tsv can now be uploaded
as a supplementary file under "3. Additional Neoantigen Candidate Aggregate
Report (tsv required)".
This interface also optionally allows users to upload a gene-of-interest TSV
file to, e.g. highlight variants on known cancer genes. We have provided such
a TSV with your original HCC1395 downloaded data, called
"cancer_census_hotspot_gene_list.tsv".
After uploading all files, you can visualize the result by clicking on the
"Visualize" button at the bottom of the page, or the "Visualize and Explore"
tab in the sidebar.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "pVACview data upload interface."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_2")
```
## The pVACview User Interface
### Aggregate Report of Best Candidates by Variant
The main table in the Aggregate Report of Best Candidates by Variant panel
shows the best neoantigen candidate for each variant.
It lists the gene and amino acid change of the variant as well as additional
information about the best peptide and the best transcript coding for it. These
include, from left to right, the transcript support level, the best-binding HLA
allele, the mutated positions of the best peptide, any positions in the peptide
where the amino acid might be problematic for manufacturing, and the total number
of neoantigen candidates passing the binding affinity threshold set by the user.
If a gene of interest list was uploaded, variants on those genes have their gene
highlighted with a green border.
Next, this table lists the IC50 peptide MHC binding affinity for both the mutant
and the wild type. It also shows the percentile scores of the binding affinity values.
For the mutant values, a heatmap coloring is applied to make it easier to visually
identify well-binding peptides.
The next few columns show the coverage and expression of the best transcript with
a bar plot background to represent where specific values fall across the entire
patient sample.
The Tier column represents the tier assigned to the best peptide. The neoantigen
candidates in this view were all sorted into the Pass tier but tiers such as Low
Expression or Subclonal are applied to easily identify why a neoantigen candidate
might be unsuitable for vaccine selection.
The Ref Match column reflects whether or not the best peptide was found in the
reference proteome which is undesired since such peptides are not novel and
including them in a vaccine might lead to an auto immune response.
Users are able to set a status for each candidate in the Evaluation column to mark
them as Accept, Reject, or requiring further review.
The Investigate button can be clicked to see more detail for a variant. This
will update the lower panels with details for the selected variant.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "Upon successfully uploading the relevant data files, you can explore the different aspects of your neoantigen candidates."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g2491f283519_0_8")
```
For candidates not sorted into the Pass tier, red borders visually highlight the
attributes failed by the candidate.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "Neoantigen candidates are binned into tiers depending on their suitability for vaccine creation."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_8")
```
### Variant Information
The Variant Information panel shows more variant-level details of the selected
neoantigen candidate.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Variant Information tab shows more details for the selected variant."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_14")
```
On the left is a three-tab section. The first tab, "Transcript Sets of Selected
Variant", shows a list of transcript sets. pVACtools bins transcripts that code for
the same set of neoantigen candidates into a set
because the neoantigen-level information for all transcripts in a set will be
identical. The transcript set containing the Best Peptide is highlighted in
green. This table shows the number of transcripts in each set, the number of
well-binding peptides the transcripts in the set code for, and the total
expression of all the transcripts in the set.
The second tab, titled Reference Matches, shows the details for any reference matches of the neoantigen
candidate. It repeats information about the Best Peptide sequence, the amino acid
change, the mutated position, and the gene for easy reference. The mutated
positions are notated in red in the Best Peptide sequence and the Query
Sequence. The Query Sequence is a longer sequence around the somatic mutation.
The Best Peptide subsequence is highlighted in yellow in the Query Sequence. For
any 8mer subsequence of the query sequence, pVACtools looks for matches in the
reference proteome. Any matches found are reported in the Hits table
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Reference Matches tab shows details of reference matches of the neoantigen candidate."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_20")
```
The "Additional Data" tab shows the additional data for the variant if a Additional Neoantigen Candidate Aggregate
Report was uploaded. It shows the Best Peptide and its information for this variant from the
additional report. This can be used, e.g., when a Class I neoantigen candidate
is a bad binder but all other metrics look good. Oftentimes this variant can
be rescued by considering the best Class II neoantigen candidate instead.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Additional Data shows data from the Additional Aggregate Report for the variant."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_26")
```
The next section, "Variant & Gene Info", shows coverage and expression information as well as
the genomic coordinates for the variant.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Variant & Gene INfo section show coverage and expression information for the variant."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_75")
```
The last section, "Peptide Evaluation Overview", shows counts for how many peptides have been accepted,
rejected, or marked for review. For most vaccines a certain minimum number of
neoantigen candidates is desired so this panel makes it easy to review how
many neoantigen candidates are still needed.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Peptide Evaluation Overview section shows how many neoantigen candidates have been accepted, rejected, marked for review, or are pending."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_80")
```
### Transcript Set Detailed Data
When selecting a transcript set in the Variant Info panel, this panel will
show details about the neoantigen candidates the transcripts in the set code
for and as well as details on the transcripts themselves.
The Peptide Candidates from Selected Transcript Set tab shows a list of
mutant and matched wildtype peptides and their IC50 binding
affinity to the patient HLA alleles. Only neoantigen candidates where at least
one peptide-MHC binding prediction falls within the `--aggregate-inclusion-threshold`
will be shown in this table. For HLA alleles where the peptide is not
well-binding the prediction details will show `X`. This table also shows the
mutant position, whether or not the neoantigen candidate has any problematic
positions, and whether or not it failed the anchor criteria. This helps in
determining whether a neoantigen candidate was deprioritized when selecting the
Best Peptide. The Best Peptide is highlighted in green.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Transcript Set Detailed Data panel shows binding prediction details for the neoantigens the transcripts in the set code for."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_32")
```
The Transcripts in Set tabs shows details of the transcripts in the selected
set such as the transcript Ensembl ID, the transcript expression, the
transcript support level, the biotype, and the transcript length. This
reflects the criteria used in determining the Best Transcript. The Best
Transcript is highlighted in green.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Transcripts in Set tab shows details about the transcripts in the set."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_38")
```
### Additional Peptide Information
In the Additional Peptide Information panel, users can see more information
for the neoantigen candidate selected in the Transcript Set Detailed Data
panel.
The first tab, "IC50 Plot", shows violin plots of the predicted IC50 binding
affinity for each prediction algorithm for both the neoantigen candidate and
its matched wildtype peptide. This can be used to check concordance
of predictions between the different algorithms. It also allows for a
detailed comparison between the mutant and wildtype predictions in addition to
the median or lowest IC50 binding affinity used elsewhere. A solid line is used
to represent the median score.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Additional Peptide Information panel shows more information for the peptide selected in the Transcript Set Detailed Data panel."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_44")
```
The %ile Plot tab shows a similar violin plot but for the predicted percentile
scores as opposed to the IC50 binding affinity. A solid line is also used here
to represent the median score.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The %ile Plot tab shows violin plots of the percentile score predicted by each algorithm."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_50")
```
The next tab, "Binding Data", shows the IC50 binding affinity and
percentile score but in table format.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Binding Data tab shows a table of the IC50 binding affinity and percentile predicted by each algorithm."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_56")
```
The Elution Table tab shows the predicted elution scores and percentiles, if
the appropriate prediction algorithm(s) were chosen.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Elution Table tab shows elution prediction scores and precentiles for the selected peptide."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_62")
```
Lastly, the Anchor Heatmap tab shows a heatmap overlayed over each neoantigen
candidate from the selected transcript set. A darker color represents a higher
probability that a position in the peptide is an anchor. Mutated positions are
represented by red letters. More information
about how to interpret the heatmap can be found in the graph on the right of this
panel.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "The Anchor heatmaps show which positions in a peptide are likely to be anchors."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_68")
```
## Regenerate Tiers with Custom Parameters
During review of your data it might become apparent that different tiering
thresholds would've been more approriate. pVACview allows you to re-tier your
data with custom parameters by adjusting the sliders and inputs in the
"Advanced Options: Regenerate Tiering with different parameters" panel and
pressing the "Recalculate Tiering with new parameters" button.
The parameters that were used in the original pVACseq run can still be viewed
in the "Original Parameters for Tiering" panel and the tiers can be reset to
those parameters by pressing the "Reset to original parameters" button.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "Users can re-tier the neoantigen candidates by adjusting the tiering thresholds."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_87")
```
## Adding Comments to Variants
When reviewing neoantigen candidates in pVACseq, users are able to add
comments on each variant, for example, describing what additional review is
necessary. Please note that comments are not saved until the "Update Comment
Section" button is pressed.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "Users can leave comments on each variant."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_93")
```
## Exporting the Aggregated Table
Once review of the neoantigen candidates has been completed, the results can
be exported by switching to the Export interface via the sidebar. This will
export the Aggregated Report with the updated Evaluation column and comments
added. The report can be exported in either TSV or Excel format.
```{r, fig.align='center', out.width="100%", echo = FALSE, fig.alt= "Users can leave comments on each variant."}
ottrpal::include_slide("https://docs.google.com/presentation/d/1uz39zaObDGKhEVCGzO0JO35CTbC0oRAM0mxgLcMAA9Y/edit#slide=id.g25ad9ce8c9b_0_99")
```