   ▄▄▄▄▄▄▄▄▄▄▄  ▄▄▄▄▄▄▄▄▄▄▄        ▄▄▄▄▄▄▄▄▄▄▄  ▄▄▄▄▄▄▄▄▄▄▄  ▄▄       ▄▄  ▄ 
  ▐░░░░░░░░░░░▌▐░░░░░░░░░ ░░▌▐░░░░░░░ ░░░░▌▐░░░░░░░░░░░▌▐░░▌     ▐░░▌▐░▌
  ▐░█▀▀▀▀▀▀▀▀▀ ▐░█▀▀▀▀▀▀▀▀▀ ▐░█▀▀▀▀▀▀▀▀▀    ▀▀▀▀█░█▀▀▀▀ ▐░▌░▌   ▐░ ▐░▌▐░▌
  ▐░▌          ▐░▌          ▐░▌               ▐░▌     ▐░▌▐░▌ ▐░▌▐░▌▐░▌
  ▐░▌ ▄▄▄▄▄▄▄▄ ▐░▌ ▄▄▄▄▄▄▄▄ ▐░█▄▄▄▄▄▄▄▄▄       ▐░▌     ▐░▌ ▐░▐░▌ ▐░▐░▌
  ▐░▌▐░░░░░░░░▌▐░▌▐░░░░░░░░▌▐░░░░░░░░░░ ░▌     ▐░▌     ▐░▌  ▐░▌  ▐░▌▐░▌
  ▐░▌ ▀▀▀▀▀▀█░▌▐░▌ ▀▀▀▀▀▀█░▌ ▀▀▀▀▀▀▀▀▀█░▌     ▐░▌     ▐░▌   ▀   ▐░▌▐░▌
  ▐░▌       ▐░▌▐░▌       ▐░▌          ▐░▌     ▐░▌     ▐░▌       ▐░▌ ▀ 
  ▐░█▄▄▄▄▄▄▄█░▌▐░█▄▄▄▄▄▄▄█░▌ ▄▄▄▄▄▄▄▄▄█░▌ ▄▄▄▄█░█▄▄▄▄ ▐░▌       ▐░▌ ▄ 
  ▐░░░░░░░░░░░▌▐░░░░░░░░░░░▌▐░░░░░░░░░░░▌▐░░░░░░░░░░░▌▐░▌       ▐░▌▐░▌
   ▀▀▀▀▀▀▀▀▀▀▀  ▀▀▀▀▀▀▀▀▀▀▀  ▀▀▀▀▀▀▀▀▀▀▀  ▀▀▀▀▀▀▀▀▀▀▀  ▀         ▀  ▀ 
                                                                    

Robust simulation of whole-genome gGraphs, featuring:

  • Coverage of tumor and normal tracks
  • SNP phasing
  • Junction phasing
  • Coverage of SNPs

Tutorial

ggsim is the main function in this package; it simulates a genome. Given a set of junctions and single-nucleotide polymorphisms (SNPs), ggsim creates sex-informed, junction-balanced phased and unphased gGraphs, with corresponding coverage tracks that match the user's input purity and ploidy.

The parameters junctions, vcf, bias, and nbias are required. junctions is a Junctions object, vcf defines the SNP profile of the simulated genome, and bias and nbias are read-depth bias vectors (for example, binned read depth from normal samples) by which the tumor and normal coverage, respectively, are multiplied to simulate real-world fluctuations in read depth. The function parameters are described in more detail in the table below.
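As a rough illustration of how the four required parameters might be supplied, the sketch below builds an argument list and calls ggsim only when the package and the inputs are present. The file names are hypothetical placeholders, and two things are assumptions rather than documented behavior: that ggsim is exported from the ggSim package, and that vcf, bias, and nbias accept file paths while junctions is passed as an in-memory object. The optional argument names are taken from the parameter table.

```r
# Hypothetical input files; replace with your own. None of these ship with this README.
junction_file <- "junctions.rds"           # assumed to hold a GRangesList of junctions
vcf_file      <- "phased_germline.vcf.gz"  # phased germline heterozygous SNPs
bias_file     <- "tumor_bias.rds"          # binned read depth bias for the tumor track
nbias_file    <- "normal_bias.rds"         # binned read depth bias for the normal track
inputs <- c(junction_file, vcf_file, bias_file, nbias_file)

# Required inputs plus a few optional settings, named as in the parameter table.
sim_args <- list(
  vcf       = vcf_file,
  bias      = bias_file,
  nbias     = nbias_file,
  coverage  = 60,    # target tumor base coverage
  ncoverage = 40,    # target normal base coverage
  width     = 1000,  # bin width of read depth
  outdir    = "./"
)

# Run only if the package is installed and every placeholder file exists.
if (requireNamespace("ggSim", quietly = TRUE) && all(file.exists(inputs))) {
  sim_args$junctions <- readRDS(junction_file)  # assumption: junctions is an object, not a path
  sim <- do.call(ggSim::ggsim, sim_args)        # assumption: ggsim is exported by ggSim
} else {
  message("ggSim or the example inputs are not available; skipping the run.")
}
```

Collecting the arguments in a list keeps the required inputs visible in one place and makes it easy to vary a single setting between runs.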

| Parameter | Default value | Description/notes |
| --- | --- | --- |
| junctions | | Junctions to add to the gGraph, as a GRangesList |
| vcf | | Phased VCF of germline heterozygous SNPs. Can use any pileup of a genome or a [Platinum Genome from Illumina](https://github.com/Illumina/PlatinumGenomes).<br><br>NOTE: the sex of this input determines the sex of the simulated genome. The presence or absence of heterozygous SNPs defines the genome as female (F) or male (M), respectively, with subsequent effects on the defined CN and haplotyping of the sex chromosomes. |
| bias | | .rds of binned read depth bias for the tumor sample, e.g. read depth from a random normal sample |
| nbias | | .rds of binned read depth bias for the normal sample, e.g. read depth from a random normal sample |
| snps | NULL | Optional comprehensive VCF of reference SNPs, e.g. HapMap |
| unmappable | NULL | Optional .rds of a GRanges of CN-unmappable regions |
| coverage | 60 | Target tumor base coverage |
| ncoverage | 40 | Target normal base coverage |
| alpha (purity) | 1.00 | Target purity |
| tau (ploidy) | 1.00 | Target ploidy |
| poisson | TRUE | Add shot noise to read depth? |
| numbreaks | 10 | Number of additional breaks to add in CN-unmappable regions |
| width | 1000 | Bin width of read depth |
| cnloh | FALSE | Add a copy-neutral loss of heterozygosity edge? |
| standard.chr | c(1:22, "X", "Y") | Chromosomes to simulate; defaults to the human chromosomes |
| outdir | ./ | Directory in which to save the output |
| par.path | system.file("extdata", "PAR_hg19.rds", package = 'ggSim') | GRanges identifying the pseudoautosomal regions of the X and Y chromosomes. Used to make the bias/nbias vectors sex-agnostic. |
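
To make the roles of coverage, purity, ploidy, bias, width, and poisson more concrete, here is a minimal base R sketch of one way a binned tumor coverage track could be simulated. It is not ggSim's internal code: the purity-mixing formula is the standard tumor/normal mixture model, and the copy-number vector, the bias values, the purity of 0.8, and the read length are all made-up values chosen for illustration.

```r
# Illustrative only: NOT ggSim's implementation.
set.seed(42)                    # make the Poisson draw reproducible

n_bins   <- 8
coverage <- 60                  # target tumor base coverage (table default)
width    <- 1000                # bin width in bp (table default)
alpha    <- 0.8                 # purity; hypothetical value for this example
read_len <- 150                 # assumed read length; not a ggsim parameter

# Hypothetical total tumor copy number per bin, and a multiplicative bias
# vector near 1 standing in for normalized depth from a normal sample.
cn   <- c(2, 2, 3, 3, 1, 1, 2, 2)
bias <- c(1.05, 0.95, 1.10, 0.90, 1.00, 1.02, 0.98, 1.00)

# Ploidy is the mean copy number (all bins have equal width here).
tau <- mean(cn)

# Purity mixing: tumor cells contribute cn copies, normal cells contribute 2.
mix <- alpha * cn + (1 - alpha) * 2

# Scale so the mean per-base depth before bias equals the target coverage,
# then multiply by the bias vector.
expected_depth <- coverage * mix / (alpha * tau + 2 * (1 - alpha)) * bias

# Convert per-base depth to expected reads per bin and add Poisson shot noise.
expected_reads <- expected_depth * width / read_len
observed_reads <- rpois(n_bins, lambda = expected_reads)

sim <- data.frame(bin = seq_len(n_bins), cn, bias, expected_depth, observed_reads)
print(sim)
```

Skipping the rpois step and keeping expected_reads would correspond to a noise-free track, which is the kind of behavior the poisson flag in the table suggests when it is set to FALSE.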