Requirements
Migun Shakya edited this page Feb 15, 2017 · 3 revisions
##Important files and folders in EDGE for integration
Before adding pipelines or tools to the EDGE framework, it is important to understand its basic code structure. Here, I will describe only the parts of the EDGE source code that are important for adding a pipeline. Four files or folders matter for this purpose.
- thirdParty/ This folder contains all the third-party tools that are used in several of the EDGE pipelines. The source code and binaries of the pipeline will reside within this folder.
- runPipeline This is a Perl script that runs the pipeline based on the input it receives from a file in the format of `config_template.txt`. Essentially, the script reads a `config_template.txt`-type file and determines which tools or pipelines to run.
- config_template.txt This text file contains information on the available pipelines and all the options that are required to run each pipeline.
- testData/ This folder contains test data and a test script associated with each of the available pipelines.
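To illustrate the idea of a script that reads a configuration file and decides which pipelines to run, here is a minimal Python sketch. It is not the actual `runPipeline` logic (which is written in Perl), and it does not reproduce the real layout of `config_template.txt`. It assumes an INI-like file; the section names, the `enabled` flag, and the function name `enabled_pipelines` are all hypothetical.

```python
import configparser

# Hypothetical configuration text. The section and option names are
# illustrative assumptions, not the real contents of config_template.txt.
EXAMPLE_CONFIG = """
[qc]
enabled = 1
min_quality = 20

[assembly]
enabled = 0
"""


def enabled_pipelines(config_text, flag="enabled"):
    """Return the names of the sections whose `flag` option is truthy."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return [
        name
        for name in parser.sections()
        # A section without the flag is treated as switched off.
        if parser.getboolean(name, flag, fallback=False)
    ]


if __name__ == "__main__":
    for name in enabled_pipelines(EXAMPLE_CONFIG):
        print("would run:", name)
```

Keeping the selection step separate from the code that launches each tool makes it easier to check that a new pipeline's options are being picked up before anything is executed.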
##Pipeline requirements
Bioinformatics pipelines come in many shapes and sizes, depending on user requirements. The complexity of a pipeline grows with the needs of its user, and a biologist as a user has many needs. As a result, pipelines are often rather complex, even for a simple task. For integration into EDGE, however, a pipeline needs only three basic components.
- A wrapper script This script can be written in any language (Perl/Python/shell) and must be able to run the pipeline from start to end.
- An install script This script is important if your pipeline requires its own third-party tools. For example, an RNA-seq pipeline has multiple dependencies. So, to install that pipeline, one must have all the necessary tools installed and in the `PATH`.
- Test data Every good pipeline needs a test data set that will test the validity of the pipeline.
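The wrapper and dependency requirements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not EDGE's own wrapper or install script: the function names `missing_tools`, `run_steps`, and `main` are hypothetical, and the tool names and commands are supplied by the caller rather than taken from any real pipeline.

```python
import shutil
import subprocess
import sys


def missing_tools(tools):
    """Return the entries of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def run_steps(steps):
    """Run each command (a list of arguments) in order.

    subprocess.run(..., check=True) raises CalledProcessError as soon as
    a step exits with a non-zero status, so later steps are not started.
    Returns the number of steps that were run.
    """
    for command in steps:
        subprocess.run(command, check=True)
    return len(steps)


def main(required_tools, steps):
    """Check dependencies first, then run the pipeline from start to end."""
    missing = missing_tools(required_tools)
    if missing:
        print("Missing from PATH: " + ", ".join(missing), file=sys.stderr)
        return 1
    run_steps(steps)
    return 0
```

A caller would pass the list of required executables and the ordered commands, for example `main(["some_tool"], [["some_tool", "--version"]])`, where `some_tool` stands in for whatever the pipeline actually depends on. Checking the `PATH` before the first step means a missing dependency is reported immediately rather than partway through a run.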