Predicting biosynthetic gene clusters in genomes. Authors: Peter Cimermancic & Michael Fischbach
Requirements:
- python (2.X)
- numpy
Instructions:
- an example of an input file: example_input.txt. COLUMN DESCRIPTION:
- GeneID
- Sequencing status
- Organism name
- Scaffold OID
- Organism OID
- Locus Tag
- Gene Start
- Gene End
- Strand
- Pfam Template Start
- Pfam Template End
- Pfam Start
- Pfam End
- PfamID
- Pfam E-score
- Enzyme ID
- if your input file format differs from the one above, please modify either the file or lines 51-55 of the ClusterFinder.py script
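
To check whether a file matches the column layout listed above before running ClusterFinder, something like the following sketch may help. It is not part of ClusterFinder.py; the function name parse_line is hypothetical, and the tab delimiter is an assumption (the delimiter is not stated here, so check example_input.txt and adjust sep if needed).

```python
# Column names copied, in order, from the COLUMN DESCRIPTION list above.
COLUMNS = [
    "GeneID", "Sequencing status", "Organism name", "Scaffold OID",
    "Organism OID", "Locus Tag", "Gene Start", "Gene End", "Strand",
    "Pfam Template Start", "Pfam Template End", "Pfam Start", "Pfam End",
    "PfamID", "Pfam E-score", "Enzyme ID",
]


def parse_line(line, sep="\t"):
    """Split one input line into a dict keyed by column name.

    Assumption: fields are separated by `sep` (tab by default).
    Raises ValueError if the number of fields does not match COLUMNS.
    """
    # Strip only the line ending so that an empty last field is preserved.
    fields = line.rstrip("\r\n").split(sep)
    if len(fields) != len(COLUMNS):
        raise ValueError(
            "expected %d columns, got %d" % (len(COLUMNS), len(fields))
        )
    return dict(zip(COLUMNS, fields))
```

Calling parse_line on each line of an input file and catching ValueError is one way to find lines that do not follow the expected format. All values are returned as strings; convert the coordinate and score fields yourself if you need numbers.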
- an example of running ClusterFinder is shown in the ClusterFinder.py script. DESCRIPTION:
- modify the paths on lines 7, 19 & 20 (only needed if not running from the ClusterFinder directory)
- set the organism name and the input file on lines 15 & 16
- run: python ClusterFinder.py
- test run: run python ClusterFinder.py without making any changes to the files
- OUTPUT1 [organism_name.out]: same as the input + a column with probability values
- OUTPUT2 [organism_name.clusters.out]: same as OUTPUT1, but only for the domains from gene clusters that have passed the filtering steps.
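
For a quick look at OUTPUT1, the rows can be selected by their probability value with a small sketch like the one below. This is not the filtering that ClusterFinder uses to produce OUTPUT2. The function name rows_above is hypothetical, and the code assumes that the file is tab-separated and that the probability is the last column on each line; neither is stated above, so verify both against your own output.

```python
def rows_above(lines, threshold, sep="\t"):
    """Return the rows (as lists of strings) whose last field is a
    number greater than or equal to `threshold`.

    Assumptions: fields are separated by `sep`, and the probability
    column is the last one on each line.
    """
    kept = []
    for line in lines:
        fields = line.rstrip("\r\n").split(sep)
        try:
            prob = float(fields[-1])
        except ValueError:
            # Skip blank lines and lines whose last field is not numeric.
            continue
        if prob >= threshold:
            kept.append(fields)
    return kept
```

A file object can be passed directly as lines, for example rows_above(open("organism_name.out"), 0.5), where 0.5 is only an illustrative threshold and not a value recommended by the authors.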