commit 92c7be5 (1 parent: 002fc75)
Showing 116 changed files with 666,716 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
|
||
License: https://opensource.org/licenses/MIT | ||
|
||
Copyright 2021, Pablo Cingolani | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. | ||
|
||
|
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
00:00:39.673 Reading cancer samples pedigree from file 'examples/samples_cancer_one.txt'. | ||
##SnpEffVersion="4.1 (build 2015-01-07), by Pablo Cingolani" | ||
##SnpEffCmd="SnpEff -cancer -cancerSamples examples/samples_cancer_one.txt GRCh37.75 examples/cancer.vcf " | ||
##INFO=<ID=ANN,Number=.,Type=String,Description="Functional annotations: 'Allele | Annotation | Annotation_Impact | Gene_Name | Gene_ID | Feature_Type | Feature_ID | Transcript_BioType | Rank | HGVS.c | HGVS.p | cDNA.pos / cDNA.length | CDS.pos / CDS.length | AA.pos / AA.length | Distance | ERRORS / WARNINGS / INFO' "> | ||
##INFO=<ID=LOF,Number=.,Type=String,Description="Predicted loss of function effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected' "> | ||
##INFO=<ID=NMD,Number=.,Type=String,Description="Predicted nonsense mediated decay effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected' "> | ||
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Patient_01_Germline Patient_01_Somatic | ||
1 69091 . A C,G . PASS AC=1;ANN=G|start_lost|HIGH|OR4F5|ENSG00000186092|transcript|ENST00000335137|protein_coding|1/1|c.1A>G|p.Met1?|1/918|1/918|1/305||,G-C|start_lost|HIGH|OR4F5|ENSG00000186092|transcript|ENST00000335137|protein_coding|1/1|c.1A>G|p.Leu1?|1/918|1/918|1/305||WARNING_REF_DOES_NOT_MATCH_GENOME,C|initiator_codon_variant|LOW|OR4F5|ENSG00000186092|transcript|ENST00000335137|protein_coding|1/1|c.1A>C|p.Met1?|1/918|1/918|1/305||;LOF=(OR4F5|ENSG00000186092|1|1.00) GT 1/0 2/1 |
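Review note: the `ANN` value in the record above is a comma-separated list of annotations, each with 16 pipe-separated subfields named in the `##INFO=<ID=ANN...>` header. A minimal Python sketch of reading it follows. The helper names `parse_info` and `parse_ann` are hypothetical and not part of SnpEff, and the `info` string is the record's INFO column cut down to its first two annotations.

```python
# Subfield names copied from the ##INFO=<ID=ANN...> header line of this file.
ANN_FIELDS = [
    "Allele", "Annotation", "Annotation_Impact", "Gene_Name", "Gene_ID",
    "Feature_Type", "Feature_ID", "Transcript_BioType", "Rank", "HGVS.c",
    "HGVS.p", "cDNA.pos / cDNA.length", "CDS.pos / CDS.length",
    "AA.pos / AA.length", "Distance", "ERRORS / WARNINGS / INFO",
]


def parse_info(info):
    """Split a VCF INFO column into a dict; keys without '=' map to True."""
    out = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def parse_ann(ann_value):
    """Return one dict per annotation, keyed by the header's subfield names."""
    return [dict(zip(ANN_FIELDS, entry.split("|")))
            for entry in ann_value.split(",")]


# INFO column of the record above, shortened to its first two annotations.
info = (
    "AC=1;ANN="
    "G|start_lost|HIGH|OR4F5|ENSG00000186092|transcript|ENST00000335137|"
    "protein_coding|1/1|c.1A>G|p.Met1?|1/918|1/918|1/305||,"
    "G-C|start_lost|HIGH|OR4F5|ENSG00000186092|transcript|ENST00000335137|"
    "protein_coding|1/1|c.1A>G|p.Leu1?|1/918|1/918|1/305||"
    "WARNING_REF_DOES_NOT_MATCH_GENOME"
)

annotations = parse_ann(parse_info(info)["ANN"])
for ann in annotations:
    print(ann["Allele"], ann["Annotation"], ann["HGVS.p"],
          ann["ERRORS / WARNINGS / INFO"])
```

The `G-C` allele is the cancer-mode entry: it describes the somatic `G` relative to the germline `C` rather than relative to the reference `A`.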
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
##SnpEffVersion="4.1 (build 2015-01-07), by Pablo Cingolani" | ||
##SnpEffCmd="SnpEff -classic -cancer -cancerSamples examples/samples_cancer_one.txt testHg3775Chr1 examples/cancer.vcf " | ||
##INFO=<ID=EFF,Number=.,Type=String,Description="Predicted effects for this variant.Format: 'Effect ( Effect_Impact | Functional_Class | Codon_Change | Amino_Acid_Change| Amino_Acid_length | Gene_Name | Transcript_BioType | Gene_Coding | Transcript_ID | Exon_Rank | Genotype_Number [ | ERRORS | WARNINGS ] )' "> | ||
##INFO=<ID=LOF,Number=.,Type=String,Description="Predicted loss of function effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected' "> | ||
##INFO=<ID=NMD,Number=.,Type=String,Description="Predicted nonsense mediated decay effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected' "> | ||
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Patient_01_Germline Patient_01_Somatic | ||
1 69091 . A C,G . PASS AC=1;EFF=START_LOST(HIGH|MISSENSE|Atg/Gtg|M1V|305|OR4F5|protein_coding|CODING|ENST00000335137|1|G),START_LOST(HIGH|MISSENSE|Ctg/Gtg|L1V|305|OR4F5|protein_coding|CODING|ENST00000335137|1|G-C|WARNING_REF_DOES_NOT_MATCH_GENOME),NON_SYNONYMOUS_START(LOW|MISSENSE|Atg/Ctg|M1L|305|OR4F5|protein_coding|CODING|ENST00000335137|1|C);LOF=(OR4F5|ENSG00000186092|1|1.00) GT 1/0 2/1 |
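Review note: with `-classic`, the same record carries an `EFF` field instead of `ANN`. Each entry has the form `Effect(subfield|subfield|...)`, with optional error and warning subfields at the end. A minimal Python sketch, assuming only the layout in the `##INFO=<ID=EFF...>` header above. `parse_eff` and the `Messages` key are hypothetical names used for illustration.

```python
import re

# Subfield names copied from the ##INFO=<ID=EFF...> header line of this file.
EFF_FIELDS = [
    "Effect_Impact", "Functional_Class", "Codon_Change", "Amino_Acid_Change",
    "Amino_Acid_length", "Gene_Name", "Transcript_BioType", "Gene_Coding",
    "Transcript_ID", "Exon_Rank", "Genotype_Number",
]

# One entry looks like EFFECT_NAME(field|field|...).
EFF_ENTRY = re.compile(r"([^(,]+)\(([^)]*)\)")


def parse_eff(eff_value):
    """Return one dict per EFF entry; trailing extras are kept as 'Messages'."""
    entries = []
    for effect, body in EFF_ENTRY.findall(eff_value):
        parts = body.split("|")
        entry = {"Effect": effect}
        entry.update(zip(EFF_FIELDS, parts))
        # Anything after Genotype_Number is an optional error or warning.
        entry["Messages"] = parts[len(EFF_FIELDS):]
        entries.append(entry)
    return entries


# EFF value of the record above.
eff = (
    "START_LOST(HIGH|MISSENSE|Atg/Gtg|M1V|305|OR4F5|protein_coding|CODING|"
    "ENST00000335137|1|G),"
    "START_LOST(HIGH|MISSENSE|Ctg/Gtg|L1V|305|OR4F5|protein_coding|CODING|"
    "ENST00000335137|1|G-C|WARNING_REF_DOES_NOT_MATCH_GENOME),"
    "NON_SYNONYMOUS_START(LOW|MISSENSE|Atg/Ctg|M1L|305|OR4F5|protein_coding|"
    "CODING|ENST00000335137|1|C)"
)

effects = parse_eff(eff)
for e in effects:
    print(e["Effect"], e["Amino_Acid_Change"], e["Genotype_Number"], e["Messages"])
```

In this record, `Genotype_Number` holds the same allele labels (`G`, `G-C`, `C`) that the `ANN` output puts in its `Allele` subfield.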
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Patient_01_Germline Patient_01_Somatic | ||
1 69091 . A C,G . PASS AC=1 GT 1/0 2/1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
##PEDIGREE=<Derived=Patient_01_Somatic,Original=Patient_01_Germline> | ||
##SnpEffVersion="4.1 (build 2015-01-07), by Pablo Cingolani" | ||
##SnpEffCmd="SnpEff -cancer testHg3775Chr1 examples/cancer_pedigree.vcf " | ||
##INFO=<ID=ANN,Number=.,Type=String,Description="Functional annotations: 'Allele | Annotation | Annotation_Impact | Gene_Name | Gene_ID | Feature_Type | Feature_ID | Transcript_BioType | Rank | HGVS.c | HGVS.p | cDNA.pos / cDNA.length | CDS.pos / CDS.length | AA.pos / AA.length | Distance | ERRORS / WARNINGS / INFO' "> | ||
##INFO=<ID=LOF,Number=.,Type=String,Description="Predicted loss of function effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected' "> | ||
##INFO=<ID=NMD,Number=.,Type=String,Description="Predicted nonsense mediated decay effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected' "> | ||
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Patient_01_Germline Patient_01_Somatic | ||
1 69091 . A C,G . PASS AF=0.1122;ANN=G|start_lost|HIGH|OR4F5|ENSG00000186092|transcript|ENST00000335137|protein_coding|1/1|c.1A>G|p.Met1?|1/918|1/918|1/305||,G-C|start_lost|HIGH|OR4F5|ENSG00000186092|transcript|ENST00000335137|protein_coding|1/1|c.1A>G|p.Leu1?|1/918|1/918|1/305||WARNING_REF_DOES_NOT_MATCH_GENOME,C|initiator_codon_variant|LOW|OR4F5|ENSG00000186092|transcript|ENST00000335137|protein_coding|1/1|c.1A>C|p.Met1?|1/918|1/918|1/305||;LOF=(OR4F5|ENSG00000186092|1|1.00) GT 1/0 2/1 | ||
1 69849 . G A,C . PASS AF=0.1122;ANN=A|stop_gained|HIGH|OR4F5|ENSG00000186092|transcript|ENST00000335137|protein_coding|1/1|c.759G>A|p.Trp253*|759/918|759/918|253/305||,C-A|stop_lost|HIGH|OR4F5|ENSG00000186092|transcript|ENST00000335137|protein_coding|1/1|c.759G>C|p.Ter253Cysext*?|759/918|759/918|253/305||WARNING_REF_DOES_NOT_MATCH_GENOME,C|missense_variant|MODERATE|OR4F5|ENSG00000186092|transcript|ENST00000335137|protein_coding|1/1|c.759G>C|p.Trp253Cys|759/918|759/918|253/305|| GT 1/0 2/1 | ||
1 69511 . A C,G . PASS AF=0.3580;ANN=C|missense_variant|MODERATE|OR4F5|ENSG00000186092|transcript|ENST00000335137|protein_coding|1/1|c.421A>C|p.Thr141Pro|421/918|421/918|141/305||,G|missense_variant|MODERATE|OR4F5|ENSG00000186092|transcript|ENST00000335137|protein_coding|1/1|c.421A>G|p.Thr141Ala|421/918|421/918|141/305||,G-C|missense_variant|MODERATE|OR4F5|ENSG00000186092|transcript|ENST00000335137|protein_coding|1/1|c.421A>G|p.Pro141Ala|421/918|421/918|141/305||WARNING_REF_DOES_NOT_MATCH_GENOME GT 1/1 2/2 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
##PEDIGREE=<Derived=Patient_01_Somatic,Original=Patient_01_Germline> | ||
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Patient_01_Germline Patient_01_Somatic | ||
1 69091 . A C,G . PASS AF=0.1122 GT 1/0 2/1 | ||
1 69849 . G A,C . PASS AF=0.1122 GT 1/0 2/1 | ||
1 69511 . A C,G . PASS AF=0.3580 GT 1/1 2/2 |
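Review note: in this input the germline/somatic pairing comes from the `##PEDIGREE` header rather than from a `-cancerSamples` file. The Python sketch below reads that header and lists the alleles called in the somatic sample but not in the germline sample. It is only one way to read the genotype columns, not SnpEff's actual algorithm, and all function names are hypothetical.

```python
import re


def parse_pedigree(header_line):
    """Return (original, derived) sample names from a ##PEDIGREE header line."""
    body = re.fullmatch(r"##PEDIGREE=<(.*)>", header_line.strip()).group(1)
    pairs = dict(item.split("=", 1) for item in body.split(","))
    return pairs["Original"], pairs["Derived"]


def alleles(ref, alt, gt):
    """Translate a GT string such as '2/1' into the set of bases it names."""
    bases = [ref] + alt.split(",")
    return {bases[int(i)] for i in re.split(r"[/|]", gt) if i != "."}


def somatic_only(ref, alt, germline_gt, somatic_gt):
    """Alleles called in the somatic sample but absent from the germline one."""
    return alleles(ref, alt, somatic_gt) - alleles(ref, alt, germline_gt)


original, derived = parse_pedigree(
    "##PEDIGREE=<Derived=Patient_01_Somatic,Original=Patient_01_Germline>"
)

# The three records above: (REF, ALT, germline GT, somatic GT).
records = [
    ("A", "C,G", "1/0", "2/1"),
    ("G", "A,C", "1/0", "2/1"),
    ("A", "C,G", "1/1", "2/2"),
]
for ref, alt, germline_gt, somatic_gt in records:
    print(ref, alt, sorted(somatic_only(ref, alt, germline_gt, somatic_gt)))
```

For these records the somatic-only allele is `G`, `C` and `G` respectively, which lines up with the `G-C`, `C-A` and `G-C` allele labels in the annotated output.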
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
1 1060235 . G A . PASS AC=16 | ||
1 1250957 . G A . PASS AC=9 | ||
1 1310924 . T C . PASS AC=948 | ||
1 1368599 . A C . PASS AC=920 | ||
1 2182470 . G A . PASS AC=743 | ||
1 2466633 . A G . PASS AC=770 | ||
1 2480337 . G A . PASS AC=16 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,86 @@ | ||
#!/bin/sh | ||
|
||
#------------------------------------------------------------------------------- | ||
# | ||
# Command lines for SnpEff's manual (examples) | ||
# | ||
# | ||
# Pablo Cingolani | ||
#------------------------------------------------------------------------------- | ||
|
||
genome="GRCh37.75" | ||
genome="testHg3775Chr1" # Note: Sometimes we can use testHg3775Chr1 instead of GRCh37.75 ('testHg3775Chr1' only loads chr1 so it's faster) | ||
genome22="testHg3775Chr22" # Note: Sometimes we can use testHg3775Chr22 instead of GRCh37.75 ('testHg3775Chr22' only loads chr22 so it's faster) | ||
|
||
#--- | ||
# Multiple annotations per variant examples | ||
#--- | ||
|
||
# java -Xmx4g -jar snpEff.jar $genome examples/variants_1.vcf > examples/variants_1.ann.vcf | ||
# | ||
# java -Xmx4g -jar snpEff.jar $genome examples/variants_2.vcf > examples/variants_2.ann.vcf | ||
|
||
#--- | ||
# Cancer examples | ||
#--- | ||
|
||
# java -Xmx4g -jar snpEff.jar -v -cancer -cancerSamples examples/samples_cancer_one.txt $genome examples/cancer.vcf > examples/cancer.ann.vcf | ||
# | ||
# java -Xmx4g -jar snpEff.jar -v -classic -cancer -cancerSamples examples/samples_cancer_one.txt $genome examples/cancer.vcf > examples/cancer.eff.vcf | ||
# | ||
# java -Xmx4g -jar snpEff.jar -v -cancer $genome examples/cancer_pedigree.vcf > examples/cancer_pedigree.ann.vcf | ||
|
||
#--- | ||
# Regulatory variants | ||
#--- | ||
|
||
# java -Xmx4g -jar snpEff.jar -v -reg HeLa-S3 -reg NHEK $genome examples/test.1KG.vcf > examples/test.1KG.ann_reg.vcf | ||
|
||
#--- | ||
# Encode example | ||
#--- | ||
|
||
# # Create a directory for ENCODE files | ||
# mkdir -p db/encode | ||
# | ||
# # Download ENCODE experimental results (BigBed file) | ||
# cd db/encode | ||
# wget "http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/byDataType/openchrom/jan2011/fdrPeaks/wgEncodeDukeDnase8988T.fdr01peaks.hg19.bb" | ||
# cd - | ||
# | ||
# # Annotate using ENCODE's data: | ||
# java -Xmx4g -jar snpEff.jar -v -interval db/encode/wgEncodeDukeDnase8988T.fdr01peaks.hg19.bb $genome examples/test.1KG.vcf > examples/test.1KG.ann_encode.vcf | ||
|
||
#--- | ||
# Annotation example | ||
#--- | ||
|
||
# java -Xmx4g -jar snpEff.jar -v $genome22 examples/test.chr22.vcf > examples/test.chr22.ann.vcf | ||
|
||
#--- | ||
# SnpSift Filter examples | ||
#--- | ||
|
||
#java -jar SnpSift.jar filter "ANN[0].EFFECT = 'missense_variant'" examples/test.chr22.ann.vcf > examples/test.chr22.ann.filter_missense_first.vcf | ||
|
||
#java -jar SnpSift.jar filter "ANN[*].EFFECT = 'missense_variant'" examples/test.chr22.ann.vcf > examples/test.chr22.ann.filter_missense_any.vcf | ||
|
||
#java -jar SnpSift.jar filter "(ANN[*].EFFECT = 'missense_variant') && (ANN[*].GENE = 'TRMT2A')" examples/test.chr22.ann.vcf > examples/test.chr22.ann.filter_missense_any_TRMT2A.vcf | ||
|
||
#java -jar SnpSift.jar filter "( GEN[HG00096].DS > 0.2 ) & ( GEN[HG00097].DS > 0.5 )" examples/1kg.head_chr1.vcf.gz > examples/1kg.head_chr1.filtered.vcf | ||
#gzip examples/1kg.head_chr1.filtered.vcf | ||
|
||
#--- | ||
# SnpSift extractFields examples | ||
#--- | ||
|
||
#java -jar SnpSift.jar extractFields -s "," -e "." examples/test.chr22.ann.vcf CHROM POS REF ALT "ANN[*].EFFECT" "ANN[*].HGVS_P" > examples/test.chr22.ann.txt | ||
|
||
#java -jar SnpSift.jar extractFields examples/test.chr22.ann.vcf CHROM POS REF ALT "ANN[*].EFFECT" > examples/test.chr22.ann.txt | ||
|
||
#cat examples/test.chr22.ann.vcf \ | ||
# | ./scripts/vcfEffOnePerLine.pl \ | ||
# | java -jar SnpSift.jar extractFields - CHROM POS REF ALT "ANN[*].EFFECT" \ | ||
# > examples/test.chr22.ann.one_per_line.txt | ||
|
||
# java -jar SnpSift.jar extractFields examples/1kg.head_chr1.vcf.gz CHROM POS REF ALT "GEN[HG00096].DS" "GEN[HG00097].DS" #> examples/1kg.head_chr1.txt |
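Review note: the output names `filter_missense_first` and `filter_missense_any` in the SnpSift filter examples reflect the difference between `ANN[0]` (first annotation only) and `ANN[*]` (any annotation). A small Python sketch of that distinction follows. It does not use SnpSift; the helper names are hypothetical, and the annotation strings are truncated to their first four subfields.

```python
def effects(ann_value):
    """Return the Annotation subfield (second column) of every ANN entry."""
    return [entry.split("|")[1] for entry in ann_value.split(",")]


def first_is(ann_value, effect):
    """Analogue of ANN[0].EFFECT = '<effect>': test only the first entry."""
    return effects(ann_value)[0] == effect


def any_is(ann_value, effect):
    """Analogue of ANN[*].EFFECT = '<effect>': test every entry."""
    return effect in effects(ann_value)


# Annotations of the 1:69849 record in cancer_pedigree.ann.vcf, truncated
# to their first four subfields for readability.
ann = "A|stop_gained|HIGH|OR4F5,C-A|stop_lost|HIGH|OR4F5,C|missense_variant|MODERATE|OR4F5"

print(first_is(ann, "missense_variant"))
print(any_is(ann, "missense_variant"))
```

A record like this one would be dropped by the `ANN[0]` filter and kept by the `ANN[*]` filter, because its missense annotation is not listed first.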
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
2L 10000 10999 | ||
2L 12000 12999 | ||
2L 14000 14999 | ||
2L 16000 16999 | ||
2L 18000 18999 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
1 10000 20000 MY_ANNOTATION |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
Patient_01_Germline Patient_01_Somatic | ||
Patient_02_Germline Patient_02_Somatic | ||
Patient_03_Germline Patient_03_Somatic | ||
Patient_04_Germline Patient_04_Somatic |
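Review note: this file is what `-cancerSamples` expects in the cancer examples: one germline/somatic pair per line, with the original (germline) sample first and the derived (somatic) sample second. A minimal Python sketch of reading it and locating each pair in a VCF `#CHROM` header. The function names are hypothetical, and splitting on any whitespace is an assumption about the separator.

```python
def read_cancer_samples(lines):
    """Return a list of (original, derived) sample-name pairs."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        original, derived = line.split()
        pairs.append((original, derived))
    return pairs


def sample_columns(chrom_header, pairs):
    """Map each pair to its 0-based column indices in the #CHROM line."""
    columns = chrom_header.split()
    index = {name: i for i, name in enumerate(columns)}
    # Pairs whose samples are not present in this VCF are left out.
    return {
        pair: (index[pair[0]], index[pair[1]])
        for pair in pairs
        if pair[0] in index and pair[1] in index
    }


samples_file = [
    "Patient_01_Germline Patient_01_Somatic",
    "Patient_02_Germline Patient_02_Somatic",
    "Patient_03_Germline Patient_03_Somatic",
    "Patient_04_Germline Patient_04_Somatic",
]
header = ("#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT "
          "Patient_01_Germline Patient_01_Somatic")

pairs = read_cancer_samples(samples_file)
print(sample_columns(header, pairs))
```

Only the first pair appears in the `cancer.vcf` header shown earlier in this commit, so it is the only one returned here.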
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Patient_01_Germline Patient_01_Somatic |