---
title: "Introductory exercise to ASReview LAB"
author: "The ASReview Academy Team"
---
**This exercise was made using ASReview LAB version 1.6. If you are using a different version, you may notice differences between your setup and this exercise.**
## Introduction to the software ASReview LAB v1.6
The **goal** of this exercise is to get familiar with AI-aided screening
by making use of ASReview LAB v1.6.
You will learn how to install and set up the software, upload data,
screen records, and export and interpret the results. The exercise will
guide you through all the steps of AI-aided screening as described in
the [workflow on
Read the Docs](https://asreview.readthedocs.io/en/latest/about.html#labeling-workflow-with-asreview){target="_blank"}.
Enjoy!
## Getting familiar
Before you start, you might want to read a bit more on:
- [What is ASReview
LAB](https://asreview.readthedocs.io/en/latest/about.html#what-is-asreview-lab){target="_blank"}?
- [The terminology
used](https://asreview.readthedocs.io/en/latest/about.html#asreview-lab-terminology){target="_blank"}
- The paper that was published in [Nature Machine
Intelligence](https://www.nature.com/articles/s42256-020-00287-7){target="_blank"}
## Prerequisites
### Installing ASReview LAB
Before you start the tutorial, ASReview LAB needs to be installed; see the [ASReview website](https://asreview.nl/download/){target="_blank"} for instructions.
*Have you installed ASReview LAB? You can proceed to the exercise!*
## Exercise
### Step 1: Starting ASReview LAB
Open ASReview LAB in your browser. Note that if you start it via the Command Prompt (Windows) or Terminal (macOS), you have to keep your command-line interpreter running while using ASReview LAB, even though the interface is in your browser!
*Have you opened ASReview LAB in your browser? If so, you can proceed to
step 2!*
### Step 2: Creating a project
Now that you have opened ASReview LAB, you can create a
new project. Below you will find a step-by-step guide.
1. *New project;*
Hover your mouse over the ‘`create`’ button with the plus sign in the
bottom right corner.
![](images/ASReviewLAB/step_2a_V1_6.png){width=85% fig-align="center"}
2. *Project name;*
Select Validation Mode, fill out a project name and press ‘`NEXT`’. Note
that you can fill out your name and a description as well.
![](images/ASReviewLAB/step_2b_V1_6.png){width=85% fig-align="center"}
For this exercise we are screening in the so-called ‘`Validation Mode`’
of ASReview. By screening in the [Validation
Mode](https://asreview.readthedocs.io/en/latest/project_create.html#project-modes){target="_blank"},
we are going to make use of a [benchmark
dataset](https://asreview.readthedocs.io/en/latest/data_labeled.html#fully-labeled-data){target="_blank"}.
This means that all records in the dataset have already been labeled as
relevant or irrelevant. This is indicated to the user through a banner
above each article. Note that in ‘`Oracle Mode`’, when screening your own
dataset, the relevant papers will not be marked; you, the oracle, have
to make the decisions.
More detailed information about setting up a project can be found on
[ReadTheDocs](https://asreview.readthedocs.io/en/latest/project_create.html#project-information){target="_blank"}.
*Have you started creating a new project? If so, you can proceed to step
3!*
## Project setup
### Step 3: The dataset
Now that you have created your ASReview project (woohoo!), you need to set it
up. Without data, we have nothing to screen. So, you need to tell ELAS
which dataset you want to screen for relevant articles.
Click on the ‘`ADD`’ button next to ‘`Add dataset`’. A menu appears
where you can choose how to load the dataset. You can add your dataset
by selecting a file or providing a URL. For this exercise, we will use
a benchmark dataset.
Go to the ‘`Benchmark datasets`’ button, open the first dataset (i.e. the
Van de Schoot (2017) dataset about PTSD trajectories) and click on
‘`ADD`’. After you select the dataset, click on ‘`SAVE`’.
![](images/ASReviewLAB/step_3_V1_6.png){width=85% fig-align="center"}
*Have you successfully selected/uploaded the dataset? If so, you can
proceed to step 4!*
### Step 4: Prior knowledge
Before you can start screening the records, you need to tell ELAS what
kind of records you <u>are</u> and what kind of records you <u>are
not</u> looking for (i.e., relevant and irrelevant records,
respectively). We call this *prior knowledge*. Based on the prior
knowledge you provide, ELAS will reorder the stack of papers and provide
you with the record that is most likely to be relevant (default
settings).
When performing a systematic review with your own data, you need to
provide the prior knowledge yourself (at least one relevant and one
irrelevant record). However, because you are using the Validation Mode
of ASReview, the relevant records are known; the original authors have
already read ALL records.
To select the prior knowledge you first need to click on the ‘`ADD`’
button next to ‘`Add prior knowledge`’; see also the documentation
about the selection of [prior knowledge](https://asreview.readthedocs.io/en/latest/project_create.html#select-prior-knowledge){target="_blank"}.
Now you will see a menu about selecting prior knowledge.
The following five papers are known to be relevant:
- Latent Trajectories of Trauma Symptoms and Resilience
(DOI: [10.4088/JCP.13m08914](https://doi.org/10.4088/jcp.13m08914){target="_blank"})
- A Latent Growth Mixture Modeling Approach to PTSD Symptoms in Rape
Victims (DOI: [10.1177/1534765610395627](https://doi.org/10.1177/1534765610395627){target="_blank"})
- Peace and War: Trajectories of Posttraumatic Stress Disorder
Symptoms Before, During, and After Military Deployment in
Afghanistan (DOI: [10.1177/0956797612457389](https://doi.org/10.1177/0956797612457389){target="_blank"})
- The relationship between course of PTSD symptoms in deployed U.S.
Marines and degree of combat exposure (DOI: [10.1002/jts.21988](https://doi.org/10.1002/jts.21988){target="_blank"})
- Trajectories of trauma symptoms and resilience in deployed US
military service members: Prospective cohort study (DOI: [10.1192/bjp.bp.111.096552](https://doi.org/10.1192/bjp.bp.111.096552){target="_blank"})
To add the relevant records, click on ‘`Search`’, then copy and paste the titles
of these relevant records one by one into the search bar and add them as
relevant.
![](images/ASReviewLAB/step_4_V1_6.png){width=85% fig-align="center"}
After adding all five relevant records, you can add some irrelevant ones
by clicking the ‘`Random`’ button (use the arrow in the upper-left corner
to reach this button) and changing ‘`relevant`’ to
‘`irrelevant`’. Select five irrelevant records and click on ‘`CLOSE`’.
*Have you selected five relevant and five irrelevant records? If so, you
can proceed to step 5!*
### Step 5: Active learning model
The last step to complete the setup is to select the active learning
model you want to use. The default settings (i.e. Naïve Bayes, Max and
tf-idf) will suffice for this exercise. If you want to change the model,
read which options are
[built-in](https://asreview.readthedocs.io/en/latest/project_create.html#model){target="_blank"}
or add your own model via a
[template](https://github.com/asreview/template-extension-new-model){target="_blank"}.
You can click on ‘`NEXT`’. A menu with the defaults will appear. Since we
are using the defaults, you can click on ‘`NEXT`’ again. In the last step
of the setup, ASReview LAB runs the feature extractor, trains a model,
and ranks the records in your dataset. Depending on the model and the
size of your dataset, this can take a couple of minutes (meanwhile, you
can enjoy the animation video or read the blog post on [What is Active
Learning?](https://asreview.nl/blog/active-learning-explained/){target="_blank"}).
After the project is successfully initialized, you can start reviewing.
![](images/ASReviewLAB/step_5a_V1_6.png){width=85% fig-align="center"}
*Have you finished the setup? If so, you can proceed to step 6!*
![](images/ASReviewLAB/step_5b_V1_6.png){width=85% fig-align="center"}
## Screening phase
### Step 6: Screening the records
Everything is set up and ready to screen, well done!
Since we are in the Validation Mode of ASReview, you can pretend to be an
expert on the topic of PTSD with all the knowledge of the
original screeners. All records in the dataset have been labeled as
relevant/irrelevant, which is indicated through a banner above each article.
Click on the heart-shaped Relevant button if the record is marked as relevant.
If not, press the Irrelevant button.
Now, all we need is a Stopping Rule to determine when you are confident
that you have identified (almost) all relevant records in your dataset.
For this exercise, continue screening records until you have marked *50
consecutive records as irrelevant*. You can check up on your progress in
the [Analytics
page](https://asreview.readthedocs.io/en/latest/progress.html){target="_blank"}.
When you have reached your Stopping Rule and you are done screening, go
back to the Analytics page. Here you can see the summary statistics of
your project such as the number of records you have labeled relevant or
irrelevant. It also shows how many records are in your dataset and how
many records you labeled irrelevant since you have screened the last
relevant record. For more information about how to read these summary
statistics and interpret the corresponding charts, check out the
[documentation](https://asreview.readthedocs.io/en/latest/progress.html#analytics){target="_blank"}
on the Analytics page.
The Van de Schoot (2017) dataset contained 38 relevant records in this
particular example. Did you get to label all of them as relevant before
you reached your Stopping Rule? If you did, great!
What percentage of the total papers did you need to screen to find the
relevant records you identified? Is it \<100%? Then you were
quicker than the original screeners of the dataset!
You probably had to screen only about 2-3% of the data. Amazing, right?!
Chances are though that you did not get to see a couple of relevant
records before you stopped screening. Do you think this is acceptable?
There is a trade-off between the time spent screening and the error
rate: the more records you screen, the lower the risk of missing a
relevant record. However, screening all records in your dataset is still
no guarantee for an error rate of zero, since even traditional screening
by humans - which is the gold standard to which we compare AI-assisted
screening - is not perfect [^1].
Your willingness to accept the risk that you may exclude some relevant
records is something to take into account when deciding on a Stopping
Rule. Read more about Stopping Rules and how to decide on a good
strategy for your data on the [discussion
platform](https://github.com/asreview/asreview/discussions){target="_blank"}.
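A Stopping Rule like the one used in this exercise can be stated precisely: stop once you have labeled N records in a row as irrelevant. As an illustration (a hypothetical helper, not part of ASReview), the check looks like this for decisions in screening order, where 1 = relevant and 0 = irrelevant:

```python
# Hypothetical sketch: check a "N consecutive irrelevant" stopping rule
# against a list of screening decisions in the order they were made
# (1 = relevant, 0 = irrelevant). Not an ASReview function.
def stopping_rule_reached(labels, n_consecutive=50):
    streak = 0
    for label in labels:
        # A relevant record resets the streak of irrelevant decisions.
        streak = 0 if label == 1 else streak + 1
        if streak >= n_consecutive:
            return True
    return False

# Toy example with a rule of 3 consecutive irrelevant decisions:
print(stopping_rule_reached([1, 0, 0, 1, 0, 0, 0], n_consecutive=3))  # True
```

The same trade-off from the text shows up in the parameter: a larger `n_consecutive` lowers the risk of missing relevant records but costs more screening time.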
### Step 7: Extracting and inspecting the data
Now that you found all or most relevant records, you can export your
data using [these
instructions](https://asreview.readthedocs.io/en/latest/progress.html#export-results){target="_blank"}.
If you choose to inspect your data in Excel, download the data in
‘`Excel`’ format. If you prefer to inspect your data in R, download the
‘`CSV (UTF-8)`’ format and open it in R.
You can find all the data that was originally imported to ASReview in
the exported data file, in a new order and with two new columns added at
the end.
Using the information on the [Read the Docs
page](https://asreview.readthedocs.io/en/latest/progress.html#export-results){target="_blank"},
can you reorder your data to appear in the order in which you loaded
them into ASReview? And back to the order provided by ASReview?
Check if the number of records coded `included = 1` corresponds to the number
of relevant records on your Analytics page. From which row numbers, based on
the original ordering, do the included articles come?
For the last exercise, it is important to change the order back to the order
provided by ASReview. Lastly, check out the first few records with no number
in the `included` column. Are those articles labeled as ‘`relevant`’ in the
original dataset? (Whether or not a record was pre-labeled as relevant is
shown in the column `label_included` in the original dataset.)
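If you inspect the export programmatically, the checks above can be sketched with pandas. This is a hypothetical sketch: the inline data stands in for your exported CSV (in practice you would use `pd.read_csv` on the downloaded file), and it assumes only the two columns the exercise names, `included` (your decisions) and `label_included` (the original labels); other columns and names may differ per ASReview version.

```python
import io

import pandas as pd

# Stand-in for pd.read_csv("your_asreview_export.csv") so the sketch
# runs as-is; replace with your actual export file.
csv = io.StringIO(
    "title,included,label_included\n"
    "Paper A,1,1\n"
    "Paper B,0,0\n"
    "Paper C,,1\n"
    "Paper D,,0\n"
)
df = pd.read_csv(csv)

# Compare your count of relevant records with the Analytics page.
print("labeled relevant:", int((df["included"] == 1).sum()))

# Records with no value in `included` were never screened by you;
# check their labels in the original, fully labeled dataset.
unscreened = df[df["included"].isna()]
print(unscreened[["title", "label_included"]])
```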
## Goal
At the beginning of this exercise, the following goal was specified: “The
**goal** of this exercise is to get familiar with AI-aided screening by
making use of ASReview LAB.”
Did you achieve this goal?
If so: **congratulations!** You now know all the steps to set up a
project and screen for a systematic review. ELAS wishes you a lot of fun
screening with ASReview!
Do you like the software? Leave a star on
[GitHub](https://github.com/asreview/asreview){target="_blank"}; this will help to
increase the visibility of the open-source project.
## What’s next?
Some suggestions:
- Read a blog post about:
- [Five ways to get involved in
ASReview](https://asreview.nl/blog/open-source-and-research/){target="_blank"},
- [Seven ways to integrate ASReview in your systematic review
workflow](https://asreview.nl/blog/seven-ways-to-integrate-asreview/){target="_blank"},
- [Active learning
explained](https://asreview.nl/blog/active-learning-explained/){target="_blank"}.
- Ready to start your own project? Upload [your own
data](https://asreview.readthedocs.io/en/latest/data.html){target="_blank"} and start
screening in the Oracle mode!
- Try to find the hidden memory game in ASReview (some people found it
by going through the [source
code](https://github.com/asreview/asreview/tree/master/asreview){target="_blank"} on
GitHub… +1 for open science!)
![](images/ASReviewLAB/game_V1_6.png){width=85% fig-align="center"}
[^1]: Wang Z, Nayfeh T, Tetzlaff J, O’Blenis P, Murad MH (2020) Error
rates of human reviewers during abstract screening in systematic
reviews. PLOS ONE 15(1): e0227742.
[https://doi.org/10.1371/journal.pone.0227742](https://doi.org/10.1371/journal.pone.0227742){target="_blank"}