# FuncRed

Quantify expressed redundancy of multiple functions between microbial communities

## Input file format

A tab-delimited text file with 10 columns: the first 7 hold the taxonomy from species up to domain (species, genus, family, order, class, phylum, domain), followed by the function (KEGG KO), the sample/replicate ID, and the sample group. For example:

Marinomonas ushuaiensis	Marinomonas	Oceanospirillaceae	Oceanospirillales	Gammaproteobacteria	Proteobacteria	Bacteria	K02954	Bb3 	 Bb
Tenacibaculum mesophilum	Tenacibaculum	Flavobacteriaceae	Flavobacteriales	Flavobacteriia	Bacteroidetes	Bacteria	K04043	Fb2 	 Fb
Polaribacter dokdonensis	Polaribacter	Flavobacteriaceae	Flavobacteriales	Flavobacteriia	Bacteroidetes	Bacteria	K03648	Mb1 	 Mb
Marinomonas sp. MWYL1	Marinomonas	Oceanospirillaceae	Oceanospirillales	Gammaproteobacteria	Proteobacteria	Bacteria	K01869	Fb3 	 Fb
Marinomonas sp. MWYL1	Marinomonas	Oceanospirillaceae	Oceanospirillales	Gammaproteobacteria	Proteobacteria	Bacteria	K03583	Bb2 	 Bb
Psychroserpens mesophilus	Psychroserpens	Flavobacteriaceae	Flavobacteriales	Flavobacteriia	Bacteroidetes	Bacteria	K03043	Mb3 	 Mb
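
A quick way to check a file against this layout before running the scripts is sketched below. It is not part of FuncRed: the column names in `COLUMNS` are hypothetical labels for the 10 columns described above, and stripping whitespace around each field is an assumption (the example rows show stray spaces around the last tab; whether the Perl scripts tolerate them is not stated here).

```python
# Hypothetical labels; the README only fixes the column order, not the names.
COLUMNS = [
    "species", "genus", "family", "order", "class", "phylum", "domain",
    "function", "sample", "group",
]


def parse_rows(lines):
    """Yield one dict per non-blank line; raise ValueError on a wrong column count."""
    for line_no, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if not text.strip():
            continue  # skip blank lines
        # Assumption: surrounding whitespace in a field is not meaningful.
        fields = [field.strip() for field in text.split("\t")]
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"line {line_no}: expected {len(COLUMNS)} columns, got {len(fields)}"
            )
        yield dict(zip(COLUMNS, fields))
```

Any iterable of lines works as input, so an open file handle can be passed directly, for example `rows = list(parse_rows(open("original_input_file.txt")))`.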

## Subsampling

To randomly draw an equal number of reads/entries/"individuals" for each sample/replicate, first shuffle the input file:

shuf original_input_file.txt > shuffled_input_file.txt

(On Mac OS X, install coreutils via "brew install coreutils" or "sudo port install coreutils" and run gshuf instead of shuf.)

and then subsample the shuffled file:

subsample.pl shuffled_input_file.txt subsampled_shuffled_input_file.txt
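
The idea behind shuffling first and subsampling second can be sketched as follows. This is only an illustration of equal-depth subsampling, not a description of what subsample.pl actually does: it assumes that the target depth is the size of the smallest sample and that the first entries per sample are kept, which is random only because the file was shuffled beforehand. The function name and the `sample_col` parameter are hypothetical.

```python
from collections import Counter


def subsample_equal(lines, sample_col=8):
    """Keep the first n lines of each sample, where n is the smallest sample size.

    sample_col is the 0-based index of the Sample/Replicate-ID column (9th column).
    """
    rows = [line for line in lines if line.strip()]
    if not rows:
        return []

    def sample_of(line):
        return line.rstrip("\r\n").split("\t")[sample_col].strip()

    # Assumption: the common depth is the size of the smallest sample.
    depth = min(Counter(sample_of(line) for line in rows).values())

    kept = Counter()
    result = []
    for line in rows:
        sample = sample_of(line)
        if kept[sample] < depth:
            kept[sample] += 1
            result.append(line)
    return result
```

Because the input order is preserved, running this on an unshuffled file would bias the selection toward whatever comes first in each sample.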

## Usage

FunctionalRedundancy.pl input_file output_file tax_levels_of_interest

Example:

FunctionalRedundancy.pl subsampled_shuffled_input_file.txt Myoutput 1 2 3 4

This reads subsampled_shuffled_input_file.txt, calculates functional redundancy (FR) at the genus, family, order and class levels (1, 2, 3, 4), and writes the result to Myoutput.
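
The level numbers in the example match the 0-based position of each rank in the taxonomy columns (genus = 1, family = 2, order = 3, class = 4). A small helper for building the argument list from rank names might look like the sketch below. Only levels 1 to 4 are confirmed by the example; treating species as 0, phylum as 5 and domain as 6 is an extrapolation, and `RANKS` and `build_command` are hypothetical names.

```python
# Rank order as in the input columns. Only indices 1-4 are confirmed by the
# usage example; 0, 5 and 6 are extrapolated from the column order.
RANKS = ["species", "genus", "family", "order", "class", "phylum", "domain"]


def build_command(input_file, output_file, ranks):
    """Return the argument list for FunctionalRedundancy.pl given rank names."""
    levels = [str(RANKS.index(rank)) for rank in ranks]  # ValueError on unknown rank
    return ["FunctionalRedundancy.pl", input_file, output_file, *levels]
```

The returned list can be passed to `subprocess.run` if the script is executable and on the `PATH`.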