TransIntegrator : a pipeline for integrative transcript library construction

BADDxmu/InTrans

Integrative transcript library for Branchiostoma floridae (www.bio-add.org/InTrans/)

Copyright (C) 2017 ZhiLiang Ji ([email protected])

Requirements

This software runs on any Unix-like system with Python (version 2.7.7) installed.
One Python module is required before use: configparser 3.5.0.

In addition, three previously published programs must be correctly installed in advance and added to your system environment variables (typically PATH). The three programs are:
(1) IDBA (version 1.1.1) https://github.com/loneknightpy/idba
(2) CD-HIT (version 4.5.4) http://www.bioinformatics.org/cd-hit/
(3) CAP3 (version 12/21/07) http://seq.cs.iastate.edu/cap3.html
Other versions of these programs are allowed, but the pipeline is known to run stably with the recommended versions.
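
Before running the pipeline, it may help to confirm that the three programs can be found through your environment variables. The following is a minimal sketch, not part of InTrans itself. The executable names in ASSUMED_TOOLS are assumptions made for illustration; check which programs your IDBA, CD-HIT and CAP3 installations actually provide and which ones your configuration uses.

```python
import os

# Executable names are assumptions for illustration only; adjust them to
# match the programs installed by your copies of IDBA, CD-HIT and CAP3.
ASSUMED_TOOLS = ["idba_tran", "cd-hit-est", "cap3"]


def find_on_path(name, path=None):
    """Return the full path of an executable found on PATH, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        # Accept only regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(names, path=None):
    """Return the subset of names that cannot be found on PATH."""
    return [name for name in names if find_on_path(name, path) is None]
```

Calling missing_tools(ASSUMED_TOOLS) returns an empty list when every listed program is reachable; any name it returns should be installed or added to PATH first. The helper avoids shutil.which so that it also works on Python 2.7.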

Installation Guide

To install, simply extract the software package.

Usage

The extracted package folder contains three files and one directory: "InTrans.py", "run.cfg", "__init__.py" and "test_data".
(1) "InTrans.py" is the executable script of the software.
(2) "run.cfg" is the configuration file, which contains a series of important parameters. To run the pipeline correctly on your data, set the appropriate parameter values in "run.cfg". The parameters are described in detail inside "run.cfg"; if anything is unclear, please consult the manual of the corresponding software.
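
To review the parameter values before a run, the configuration file can be listed with the configparser module. This is a rough sketch that assumes "run.cfg" is an INI-style file (sections in square brackets, "name = value" lines), which the configparser requirement suggests but this README does not state. The function name load_parameters is hypothetical, and no section or parameter names are assumed.

```python
import configparser  # on Python 2.7 this name comes from the configparser backport


def load_parameters(cfg_path):
    """Read an INI-style file and return {section: {option: value}}."""
    parser = configparser.ConfigParser()
    # parser.read() returns the list of files it managed to parse.
    if not parser.read(cfg_path):
        raise IOError("could not read configuration file: %s" % cfg_path)
    # raw=True returns values exactly as written, without '%' interpolation.
    return {
        section: dict(parser.items(section, raw=True))
        for section in parser.sections()
    }
```

For example, load_parameters("run.cfg") returns a nested dictionary that can be printed to check each value. Note that configparser lowercases option names by default.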

Warning

(1) The default maximum read length of IDBA is 128 bp. If your reads are longer than that, change the value of 128 to a larger one (e.g. 250) in "xx/idba-xxx/src/sequence/short_sequence.h":
"static const uint32_t kMaxShortSequence = 128;"
->
"static const uint32_t kMaxShortSequence = 250;"
(2) Correspondingly, you should also change the default k-mer unit to a larger one (e.g. 8) in "xx/idba-xxx/src/basic/kmer.h":
"static const uint32_t kNumUint64 = 4;"
->
"static const uint32_t kNumUint64 = 8;"
(3) Recompile IDBA after these modifications so that the new read length and k-mer settings take effect.
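
The two header changes above can also be applied with a small script instead of by hand. The following is a minimal sketch; the function name set_uint32_constant is hypothetical, and it assumes each constant is defined exactly once in its header, in the form shown above.

```python
import re


def set_uint32_constant(source_text, name, new_value):
    """Replace the number in 'static const uint32_t <name> = <number>;'."""
    pattern = re.compile(
        r"(static\s+const\s+uint32_t\s+" + re.escape(name) + r"\s*=\s*)\d+(\s*;)"
    )
    patched, count = pattern.subn(
        lambda m: m.group(1) + str(new_value) + m.group(2), source_text
    )
    # Refuse to continue if the definition is missing or ambiguous.
    if count != 1:
        raise ValueError(
            "expected exactly one definition of %s, found %d" % (name, count)
        )
    return patched
```

To use it, read the header into a string, call for example set_uint32_constant(text, "kMaxShortSequence", 250), write the result back to the same file (keeping a backup copy of the original), and then recompile IDBA.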

Running

Once the parameter values have been set in the "run.cfg" file, run the pipeline with:
$ python InTrans.py run.cfg
For example, you can do a test run with the data in "test_data":
(1) To run without heterogeneous data, use the configuration file run_fq.cfg:
$ cd ./test_data/
$ python ../InTrans.py run_fq.cfg
(2) To run with heterogeneous data, use the configuration file run_fq_heterogen.cfg:
$ cd ./test_data/
$ python ../InTrans.py run_fq_heterogen.cfg

Output

Two folders and one log file are generated when the program finishes:
(1) "output" folder
contains the final transcript file in FASTA format.
(2) "temp_output" folder
contains the temporary files produced during the run, including the output of IDBA, CD-HIT, and CAP3.
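
As a quick sanity check after a run, the number of transcripts in the "output" folder can be counted. This is a minimal sketch under two assumptions: that every regular file in "output" is a plain-text FASTA file, and that each sequence starts with a header line beginning with ">". The name of the final transcript file is not given here, so the sketch simply scans the whole folder. The function names are hypothetical.

```python
import glob
import os


def count_fasta_records(fasta_path):
    """Count sequences by counting header lines that start with '>'."""
    with open(fasta_path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def summarize_output(output_dir="output"):
    """Return {file name: record count} for every regular file in output_dir."""
    summary = {}
    for path in sorted(glob.glob(os.path.join(output_dir, "*"))):
        if os.path.isfile(path):
            summary[os.path.basename(path)] = count_fasta_records(path)
    return summary
```

Calling summarize_output() from the directory where the pipeline was run returns a dictionary mapping each file name to its sequence count; an empty dictionary or a count of zero suggests checking the log file.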
