Biopipe FAQ
-----------
v. 1.0
This FAQ maintained by:
* Shawn Hoon <[email protected]>
---------------------------------------------------------------------------
Contents
---------------------------------------------------------------------------
0. About this FAQ
Q0.1: What is this FAQ?
Q0.2: How is it maintained?
1. Biopipe questions
Q1.1: How do I start?
Q1.2: What are the methods that I have to insert in the
datahandler table?
Q1.3: What should be inserted in the field 'name' in the argument table?
Q1.4: When I create the database for the pipeline should I manually
create the tables?
Q1.5: In the input table, there is a field called 'name'. What's that?
Q1.6: There are 2 dbnames, one in the analysis table and
the other in the dbadaptor table. Are they the same?
Q1.7: The pipeline ran well the first time but didn't the
second time. What could have gone wrong?
Q1.8: The pipeline is not running. All the tables are
populated correctly. What went wrong?
Q1.9: There are many tables that I left empty. Don't we need to put
something in them?
Q1.10: What about the output of the pipeline?
---------------------------------------------------------------------------
0. About this FAQ
---------------------------------------------------------------------------
Q0.1: What is this FAQ?
A: It is the list of Frequently Asked Questions about Biopipe.
Q0.2: How is it maintained?
A: This FAQ was generated using a Perl script and an XML file. All
the files are in the Bioperl distribution directory doc/faq. So do
not edit this file! Edit file faq.xml and run:
% faq.pl -text faq.xml
The XML structure was originally used by the Perl XML project,
whose website seems to have vanished. The XML file and
processing scripts were copied from Michel Rodriguez's web site
http://www.xmltwig.com/xmltwig/XML-Twig-FAQ.html and modified to
our needs.
---------------------------------------------------------------------------
1. Biopipe questions
---------------------------------------------------------------------------
Q1.1: How do I start?
A: First install the components, as described in
http://www.biopipe.org/bioperl-pipeline-install.html. Next steps:
- Create a database for the pipeline, e.g. genscan_pipe (following
the schema, of course)
- Update the database settings in PipeConf.pm
- Populate the database tables (see 'Populating the Tables' in
INSTALL for more info)
Q1.2: What are the methods that I have to insert in the
datahandler table?
A: The methods in the datahandler table are used to fetch
the input for the runnable and to store the runnable's
output. If the input object lives in a database, a new
dbadaptor is instantiated and its methods are used to
retrieve the data, which becomes the input to the runnable.
The output of the runnable (an object) has to be stored in a
database, so the methods that do this have to be specified
as well. Example:
+----------------+--------------+---------------------------+------+
| datahandler_id | iohandler_id | method                    | rank |
+----------------+--------------+---------------------------+------+
|              1 |            1 | get_Contig_by_internal_id |    1 |
|              2 |            1 | perl_primary_seq          |    2 |
|              3 |            2 | get_ScoreAdaptor          |    1 |
|              4 |            2 | store_by_PID              |    2 |
+----------------+--------------+---------------------------+------+
The method get_Contig_by_internal_id fetches the contig from
the database by its internal id. Then perl_primary_seq is
called on the contig to obtain a Seq object. The output of
the runnable (Genscan), a SeqFeature object, has to be
stored back to the same (or another) database. Hence the
adaptor for that database obtains a ScoreAdaptor via the
get_ScoreAdaptor method, and the ScoreAdaptor stores the
output in the database via the store_by_PID method.
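For reference, rows like those above could be inserted with plain
SQL. This is only an illustrative sketch: the exact column set of
the datahandler table is defined by biopipelinedb-mysql.sql, so
check the schema before adapting it.

```sql
-- Hypothetical sketch: assumes the datahandler table has exactly the
-- four columns shown above (check biopipelinedb-mysql.sql first).
INSERT INTO datahandler (datahandler_id, iohandler_id, method, rank)
VALUES
  (1, 1, 'get_Contig_by_internal_id', 1),  -- fetch the contig (input side)
  (2, 1, 'perl_primary_seq',          2),  -- turn the contig into a Seq object
  (3, 2, 'get_ScoreAdaptor',          1),  -- fetch the adaptor (output side)
  (4, 2, 'store_by_PID',              2);  -- store the runnable's output
```

The rank column orders the method calls within one iohandler:
rank 1 runs first, and the rank 2 method is called on its result.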
Q1.3: What should be inserted in the field 'name' in the argument table?
A: The argument table is for the arguments required by the methods in
the datahandler table. Example:
+-------------+----------------+--------+------+--------+
| argument_id | datahandler_id | name | rank | type |
+-------------+----------------+--------+------+--------+
| 1 | 1 | INPUT | 1 | SCALAR |
| 2 | 4 | OUTPUT | 1 | SCALAR |
+-------------+----------------+--------+------+--------+
Since only two methods take arguments (in this case), there
are only two entries in the argument table. The 'name' field
holds the entries 'INPUT' and 'OUTPUT'. The methods with
datahandler_ids 1 and 4 are the ones that retrieve the input
and store the output, respectively.
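The two rows above could be created with SQL along these lines.
Again, this is a sketch under the assumption that the argument
table has exactly the five columns shown; verify against
biopipelinedb-mysql.sql before using it.

```sql
-- Hypothetical sketch: column set assumed from the table shown above.
INSERT INTO argument (argument_id, datahandler_id, name, rank, type)
VALUES
  (1, 1, 'INPUT',  1, 'SCALAR'),  -- argument for the input-fetching method
  (2, 4, 'OUTPUT', 1, 'SCALAR');  -- argument for the output-storing method
```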
Q1.4: When I create the database for the pipeline should I manually
create the tables?
A: No. Users can load the biopipelinedb-mysql.sql schema when
they create a new database, so the tables do not have to be
created manually. Go to the directory containing the schema
file and type these commands (assuming the dbname is testdb):
mysql> use testdb;
Database changed
mysql> source biopipelinedb-mysql.sql;
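Equivalently, the database can be created and loaded in one step
from the shell. This sketch assumes a local MySQL server you can
administer and that biopipelinedb-mysql.sql sits in the current
directory:

```shell
# Create the empty database, then load the Biopipe schema into it.
mysqladmin create testdb
mysql testdb < biopipelinedb-mysql.sql
# Sanity check: list the freshly created tables.
mysql testdb -e 'SHOW TABLES;'
```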
Q1.5: In the input table, there is a field called 'name'. What's that?
A: When an input is fetched from the database by some field,
that field's value is what goes into the 'name' column of the
input table. If the method get_Contig_by_internal_id is used,
the contig's internal_id is the name. The name of every
single input entry has to be populated into the input table
before running the pipeline.
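For example, if contigs are fetched by internal id, the input
table might carry one row per contig with the internal_id as its
name. The column names here are purely an assumption for
illustration; check the input table definition in
biopipelinedb-mysql.sql for the real ones.

```sql
-- Hypothetical sketch: column names are assumptions based on the FAQ
-- text, not taken from the actual schema.
INSERT INTO input (name, iohandler_id)
VALUES ('1', 1),   -- contig with internal_id 1
       ('2', 1);   -- contig with internal_id 2
```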
Q1.6: There are 2 dbnames, one in the analysis table and
the other in the dbadaptor table. Are they the same?
A: No. The db in the analysis table is for analyses such as
BLAST that require a database to run against. The dbname in
the dbadaptor table names the database(s) from which input
is retrieved or to which output is stored.
Q1.7: The pipeline ran well the first time but didn't the
second time. What could have gone wrong?
A: There could be many reasons for this. The most likely one
is that you failed to empty the output and job tables after
running the pipeline the first time.
You might get a message like this:
Tests Completed. Starting Pipeline
Fetched 0 jobs
Waking up and run again!
These two tables have to be reset every time the pipeline is run.
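A minimal way to reset them before a rerun (assuming the table
names job and output as used throughout this FAQ):

```sql
-- Clear job bookkeeping and results left over from the previous run.
DELETE FROM job;
DELETE FROM output;
```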
Q1.8: The pipeline is not running. All the tables are
populated correctly. What went wrong?
A: As in Q1.7, there could be more than one reason for
this. One possibility is that you are getting messages
like this:
Tests Completed. Starting Pipeline
Fetched 2 jobs
opening bsub command line:
bsub -o
/data0/tmp//2/genscan_pipe.job_1.Genscan.1025852231.749.out -e
/data0/tmp//2/genscan_pipe.job_1.Genscan.1025852231.749.err -q
/usr/users/savikalpa/src/bioperl-pipeline//Bio/Pipeline/runner.pl
1 2
couldn't submit jobs 1 2 to LSF.
Waking up and run again!
This could be due to a problem submitting the job to LSF.
Try running the pipeline locally (i.e. perl PipelineManager.pl -l);
if it then runs, that confirms the LSF submission is at fault.
Q1.9: There are many tables that I left empty. Don't we need to put
something in them?
A: Yes and no. Some of the tables, such as the job and output
tables, are populated automatically as the pipeline runs.
Q1.10: What about the output of the pipeline?
A: The output of the pipeline (e.g. gene objects) can be stored
in any database of choice (given a dbadaptor for it, of course).
---------------------------------------------------------------------------
Copyright (c) 2002-2003 Open Bioinformatics Foundation. You may distribute
this FAQ under the same terms as Perl itself.