-
Notifications
You must be signed in to change notification settings - Fork 2
amarallab/NullSeq
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
############################# # # # Nullseq # # # ############################# This is the development version of NullSeq, please note that we are currently working on a few updates here as well as in a separate version 2.0 repository that is located at: https://github.com/amarallab/NullSeq2.0 If you're looking for the most stable version, I recommend checking out the "releases" section where you'll find the code that was used in our published manuscript. Parameters to run nullseq: $ python nullseq.py [-h] [-m] [-n number] [-l length] [--TT TransTable] [--seq SEQ] [--AA AA] [--GC GC] [-o O] -m Use exact primary AA sequence from input file -n number Number of random sequences generated < Default: 1 > -l length Length of random sequence (# of AAs) < Optional > * Must be specified if running script with --AA csv input. * Otherwise uses length of input seqeunce --TT TransTable NCBI translation table < Default: 11 > --seq SEQ Path to FASTA file with nulceotide sequences < Optional > * either SEQ or AA must be specified * Sequences must include Stop and Start codons --AA AA Path to FASTA file with AA sequence or < Optional > csv with AA usage probabilities * either SEQ or AA must be specified --GC GC GC content of random sequence < Optional > * Must be specified if running script using --AA input. * Otherwise uses GC of input nucleotide seqeunce * [0,100] -o O Output file Path < Default: NullSeq_Output.fasta > Example: 1. Generating Random Sequences From a Known Nuclotide Sequence: $ python nullseq.py -n 10 --seq test_Nseq.fasta --GC and -l can be used to specify GC content and length of random sequence The GC content will be targeted at 50% The random sequences will be 500 amino acids in length $ python nullseq.py -n 10 --seq test_Nseq.fasta --GC 50 -l 500 -m indicates the use of the exact primary amino acid sequence of the input sequence $ python nullseq.py -m -n 10 --seq test_Nseq.fasta --GC 50 2. Generating Random Sequences from AA Usage Probabilities: * --GC -l must be specified $ python nullseq.py -n 10 --AA AAUsage.csv --GC 50 -l 500 Generates random sequence according to primary amino acid usage probabilities in AAUsage.csv 3. Generating Random Sequences from Primary AA Sequences: * --GC must be specified $ python nullseq.py -n 10 --AA test_AAseq.fasta --GC 50 -l can be used to specify length of random sequence $ python nullseq.py -n 10 --AA test_AAseq.fasta --GC 50 -l 500 -m indicates the use of the exact primary amino acid sequence of the input sequence $ python nullseq.py -m -n 10 --AA test_AAseq.fasta --GC 50 Please Cite: Liu SS, Hockenberry AJ, Lancichinetti A, Jewett MC, Amaral LAN (2016) NullSeq: A Tool for Generating Random Coding Sequences with Desired Amino Acid and GC Contents. PLoS Comput Biol 12(11): e1005184. doi:10.1371/journal.pcbi.1005184
About
Creates Random Coding Sequences with specified GC content and Amino Acid usage
Resources
Stars
Watchers
Forks
Packages 0
No packages published