Biopipe FAQ
-----------
v. 1.0
This FAQ maintained by:
* Shawn Hoon <[email protected]>
---------------------------------------------------------------------------
Contents
---------------------------------------------------------------------------
0. About this FAQ
Q0.1: What is this FAQ?
Q0.2: How is it maintained?
1. Biopipe questions
Q1.1: How do I start?
Q1.2: What are the methods that I have to insert in the
datahandler table?
Q1.3: What should be inserted in the field 'name' in the argument table?
Q1.4: When I create the database for the pipeline should I manually
create the tables?
Q1.5: In the input table, there is a field called 'name'. What's that?
Q1.6: There are 2 dbnames, one in the analysis table and
the other in the dbadaptor table. Are they the same?
Q1.7: The pipeline ran well the first time but didn't the
second time. What could have gone wrong?
Q1.8: The pipeline is not running. All the tables are
populated correctly. What went wrong?
Q1.9: There are many tables that I left empty. Don't we need to put
something in them?
Q1.10: What about the output of the pipeline?
---------------------------------------------------------------------------
0. About this FAQ
---------------------------------------------------------------------------
Q0.1: What is this FAQ?
A: It is the list of Frequently Asked Questions about Biopipe.
Q0.2: How is it maintained?
A: This FAQ was generated using a Perl script and an XML file. All
the files are in the Bioperl distribution directory doc/faq. So do
not edit this file! Edit file faq.xml and run:
% faq.pl -text faq.xml
The XML structure was originally used by the Perl XML project,
whose website seems to have vanished. The XML file and
processing scripts were copied from Michel Rodriguez's web site
http://www.xmltwig.com/xmltwig/XML-Twig-FAQ.html and modified to
our needs.
---------------------------------------------------------------------------
1. Biopipe questions
---------------------------------------------------------------------------
Q1.1: How do I start?
A: First install the components, as described in
http://www.biopipe.org/bioperl-pipeline-install.html. Next steps:
- Create a database for the pipeline, e.g. genscan_pipe (following
the schema, of course)
- Update the database settings in PipeConf.pm
- Populate the database tables (see 'Populating the Tables' in
INSTALL for more info)
Q1.2: What are the methods that I have to insert in the
datahandler table?
A: The methods in the datahandler table are used to fetch
the input for the runnable and to store the runnable's
output. If the input object lives in a database, a new
dbadaptor is instantiated and its methods are used to
retrieve the data, which becomes the input to the runnable.
The output of the runnable (an object) has to be stored in a
database, so the methods that do this have to be specified
as well. Example:
+----------------+--------------+---------------------------+------+
| datahandler_id | iohandler_id | method                    | rank |
+----------------+--------------+---------------------------+------+
|              1 |            1 | get_Contig_by_internal_id |    1 |
|              2 |            1 | perl_primary_seq          |    2 |
|              3 |            2 | get_ScoreAdaptor          |    1 |
|              4 |            2 | store_by_PID              |    2 |
+----------------+--------------+---------------------------+------+
The method get_Contig_by_internal_id fetches the contig from
the database by its internal id. Then perl_primary_seq is
called on the contig to obtain a Seq object. The output of
the runnable (Genscan), a SeqFeature object, has to be
stored back to the same (or another) database. Hence the
adaptor for that database obtains a ScoreAdaptor via the
get_ScoreAdaptor method, and the ScoreAdaptor stores the
output in the database via the store_by_PID method.
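For reference, rows like those above could be inserted with plain
SQL. This is only an illustrative sketch: the exact column set of
the datahandler table is defined by biopipelinedb-mysql.sql, so
check the schema before adapting it.

```sql
-- Hypothetical sketch: assumes the datahandler table has exactly the
-- four columns shown above (check biopipelinedb-mysql.sql first).
INSERT INTO datahandler (datahandler_id, iohandler_id, method, rank)
VALUES
  (1, 1, 'get_Contig_by_internal_id', 1),  -- fetch the contig (input side)
  (2, 1, 'perl_primary_seq',          2),  -- turn the contig into a Seq object
  (3, 2, 'get_ScoreAdaptor',          1),  -- fetch the adaptor (output side)
  (4, 2, 'store_by_PID',              2);  -- store the runnable's output
```

The rank column orders the method calls within one iohandler:
rank 1 runs first, and the rank 2 method is called on its result.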
Q1.3: What should be inserted in the field 'name' in the argument table?
A: The argument table is for the arguments required by the methods in
the datahandler table. Example:
+-------------+----------------+--------+------+--------+
| argument_id | datahandler_id | name | rank | type |
+-------------+----------------+--------+------+--------+
| 1 | 1 | INPUT | 1 | SCALAR |
| 2 | 4 | OUTPUT | 1 | SCALAR |
+-------------+----------------+--------+------+--------+
Since only two methods take arguments (in this case), there
are only two entries in the argument table. The 'name' field
holds the entries 'INPUT' and 'OUTPUT'. The methods with
datahandler_ids 1 and 4 are the ones that retrieve the input
and store the output, respectively.
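The two rows above could be created with SQL along these lines.
Again, this is a sketch under the assumption that the argument
table has exactly the five columns shown; verify against
biopipelinedb-mysql.sql before using it.

```sql
-- Hypothetical sketch: column set assumed from the table shown above.
INSERT INTO argument (argument_id, datahandler_id, name, rank, type)
VALUES
  (1, 1, 'INPUT',  1, 'SCALAR'),  -- argument for the input-fetching method
  (2, 4, 'OUTPUT', 1, 'SCALAR');  -- argument for the output-storing method
```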
Q1.4: When I create the database for the pipeline should I manually
create the tables?
A: No. Users can load the biopipelinedb-mysql.sql schema when
they create a new database, so the tables do not have to be
created manually. Go to the directory containing the schema
file and type these commands (assuming the dbname is testdb):
mysql> use testdb;
Database changed
mysql> source biopipelinedb-mysql.sql;
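Equivalently, the database can be created and loaded in one step
from the shell. This sketch assumes a local MySQL server you can
administer and that biopipelinedb-mysql.sql sits in the current
directory:

```shell
# Create the empty database, then load the Biopipe schema into it.
mysqladmin create testdb
mysql testdb < biopipelinedb-mysql.sql
# Sanity check: list the freshly created tables.
mysql testdb -e 'SHOW TABLES;'
```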
Q1.5: In the input table, there is a field called 'name'. What's that?
A: When an input is fetched from the database by some field,
that field's value is what goes into the 'name' column of the
input table. If the method get_Contig_by_internal_id is used,
the contig's internal_id is the name. The name of every
single input entry has to be populated into the input table
before running the pipeline.
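For example, if contigs are fetched by internal id, the input
table might carry one row per contig with the internal_id as its
name. The column names here are purely an assumption for
illustration; check the input table definition in
biopipelinedb-mysql.sql for the real ones.

```sql
-- Hypothetical sketch: column names are assumptions based on the FAQ
-- text, not taken from the actual schema.
INSERT INTO input (name, iohandler_id)
VALUES ('1', 1),   -- contig with internal_id 1
       ('2', 1);   -- contig with internal_id 2
```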
Q1.6: There are 2 dbnames, one in the analysis table and
the other in the dbadaptor table. Are they the same?
A: No. The db in the analysis table is for analyses such as
BLAST that require a database to run against. The dbname in
the dbadaptor table names the database(s) from which input
is retrieved or to which output is stored.
Q1.7: The pipeline ran well the first time but didn't the
second time. What could have gone wrong?
A: There could be many reasons for this. The most likely one
is that you failed to empty the output and job tables after
running the pipeline the first time.
You might get a message like this:
Tests Completed. Starting Pipeline
Fetched 0 jobs
Waking up and run again!
These two tables have to be reset every time the pipeline is run.
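A minimal way to reset them before a rerun (assuming the table
names job and output as used throughout this FAQ):

```sql
-- Clear job bookkeeping and results left over from the previous run.
DELETE FROM job;
DELETE FROM output;
```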
Q1.8: The pipeline is not running. All the tables are
populated correctly. What went wrong?
A: As in Q1.7, there could be more than one reason for
this. One possibility is that you are getting messages
like this:
Tests Completed. Starting Pipeline
Fetched 2 jobs
opening bsub command line:
bsub -o
/data0/tmp//2/genscan_pipe.job_1.Genscan.1025852231.749.out -e
/data0/tmp//2/genscan_pipe.job_1.Genscan.1025852231.749.err -q
/usr/users/savikalpa/src/bioperl-pipeline//Bio/Pipeline/runner.pl
1 2
couldn't submit jobs 1 2 to LSF.
Waking up and run again!
This could be due to a problem submitting the job to LSF.
Try running the pipeline locally (i.e. perl PipelineManager.pl -l);
if it then runs, that confirms the LSF submission is at fault.
Q1.9: There are many tables that I left empty. Don't we need to put
something in them?
A: Yes and no. Some of the tables, such as the job and output
tables, are populated automatically as the pipeline runs.
Q1.10: What about the output of the pipeline?
A: The output of the pipeline (e.g. gene objects) can be stored
in any database of choice (given a dbadaptor for it, of course).
---------------------------------------------------------------------------
Copyright (c) 2002-2003 Open Bioinformatics Foundation. You may distribute
this FAQ under the same terms as Perl itself.