-
Notifications
You must be signed in to change notification settings - Fork 8
/
qiime.process.txt
36 lines (20 loc) · 1.95 KB
/
qiime.process.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
##QIIME sequence processing starting from samples that have been demultiplexed on the MiSeq. Fastq.gz can be downloaded from BaseSpace, each sample gets it's own folder. These commands are assuming that you have downloaded the files into your shared folder, then within the qiime vb moved the files to the desktop. QIIME virtual box doesn't like working in the shared folder which is why you need to move your files onto the desktop.
##MARS is naming the samples by user name and the sequences by our unique barcode for the sample (this prevents us giving you the wrong sample but still lets you easily glance through your files to find samples). QIIME will use the beginning of the fastq.gz as the sample name so we need to append your name to the fastq. QIIME does not allow any characters other than "."
#cd into project folder which contains the folders of data that you downloaded from basespace
mkdir fastq
####### Sept 2016 Illumina changed their folder structure
for i in */*.gz; do cp $i "fastq""/"${i%%-*}"."`basename $i`; done
####### Older file stucture
for i in */Data/Intensities/BaseCalls/*.gz; do mv $i "fastq""/"${i%%-*}"."`basename $i`; done
mkdir qiime
multiple_join_paired_ends.py -i . -o qiime/
#remove all fastq that didn't join
mkdir nonjoin
find qiime/ -name "fastqjoin.un*" -print -exec mv {} nonjoin/ \;
find qiime/ -size "0" -print -exec mv {} nonjoin/ \;
multiple_split_libraries_fastq.py -i qiime/ -o qiime/ --demultiplexing_method sampleid_by_file --include_input_dir_path
#remove extra BaseSpace info from sequence names in seqs.fna
cp qiime/seqs.fna qiime/full.fna
sed 's/_S.*_L001.*join.fastq//g' qiime/full.fna > qiime/seqs.fna
#need to make mapfile but it doesn't need the barcode or linker info http://qiime.org/documentation/file_formats.html#handling-already-demultiplexed-samples
validate_mapping_file.py -m AHPLD_mapfile.txt -o . -b -p #as long as your errors are just about missing barcodes you can proceed to the rest of the processing