multivariateLassosum

This R package extends the lassosum package (https://github.com/tshmak/lassosum) to simultaneously analyze summary statistics for multiple genetically correlated traits and produce polygenic scores for each of them. It also allows the user to specify weights for the Lasso penalty on each coefficient, which makes it possible to apply adaptive versions of the Lasso.

This package was developed by Meriem Bahda with contributions from Jasmin Ricard and Alexandre Bureau.

To install, from within R, type:

library(devtools)
install_github("abureau/multivariateLassosum")

Tutorial

Data

library(multivariateLassosum)
library(data.table)
library(ggplot2)

#Path to the multivariateLassosum package data.
mvl.data <- system.file("data", package="multivariateLassosum")

#Import fictive summary statistics for trait 1 and 2 over 1800 SNPs.
#We provide fictive GWAS sample sizes.
ss.1 <- fread(paste0(mvl.data, "/sumstats.trait1.txt"))
ss.2 <- fread(paste0(mvl.data, "/sumstats.trait2.txt"))
size.1 <- 340000
size.2 <- 320000

#Path to the fictive reference and test PLINK bfiles
ref.bfile <- paste0(mvl.data, "/refpanel")
test.bfile <- paste0(mvl.data, "/testsample")

#We will use LD regions as defined in Berisa and Pickrell (2015) 
#for the European population and the hg19 genome.
LDblocks <- "EUR.hg19"

#From p-value to correlation
cor.1 <- p2cor(p = ss.1$p, n = size.1, sign = ss.1$beta)
cor.2 <- p2cor(p = ss.2$p, n = size.2, sign = ss.2$beta)

#In this tutorial, we assume that the variance of each phenotype is 1.
#We also provide a fictive genetic covariance matrix of the phenotypes, constant for every SNP.
Var.phenotypic <- c(1,1)
phenotypic.genetic.Var.Cov.matrix <- matrix(c(0.5784, 0.2946, 0.2946, 0.4519),  nrow = 2, ncol = 2)
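
The p2cor calls above turn two-sided p-values into signed SNP-trait correlations. As a rough illustration of the idea, the base R sketch below inverts the t-test for a correlation coefficient. The function name p2cor_sketch and the toy inputs are hypothetical, and the package's own p2cor may differ in details such as input checks.

```r
# Minimal sketch (not the package's implementation): recover a signed
# correlation from a two-sided p-value through the t statistic of a
# correlation test with n - 2 degrees of freedom.
p2cor_sketch <- function(p, n, sign = rep(1, length(p))) {
  stopifnot(length(p) == length(sign), all(p > 0 & p <= 1))
  # |t| such that the two-sided tail probability equals p
  t_abs <- qt(p / 2, df = n - 2, lower.tail = FALSE)
  # r = t / sqrt(n - 2 + t^2), signed by the direction of the effect
  sign(sign) * t_abs / sqrt(n - 2 + t_abs^2)
}

# Toy inputs (hypothetical values, not taken from the package data)
p_toy    <- c(0.5, 1e-4, 1e-10)
beta_toy <- c(0.01, -0.02, 0.05)
r_toy <- p2cor_sketch(p = p_toy, n = 340000, sign = beta_toy)
print(r_toy)
```

With a GWAS sample size in the hundreds of thousands, even very small p-values map to small correlations, which is why the sample size has to be supplied alongside the p-values.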

Reference: Berisa and Pickrell (2015)
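
Because the genetic covariance matrix is typed in by hand, a quick sanity check can catch entry errors before the pipeline is run. The base R sketch below verifies that the matrix used in this tutorial is symmetric and positive definite, and derives the genetic correlation it implies. The variable names gen_cov and gen_cor are illustrative and are not part of the package.

```r
# Same values as the tutorial's phenotypic.genetic.Var.Cov.matrix
gen_cov <- matrix(c(0.5784, 0.2946, 0.2946, 0.4519), nrow = 2, ncol = 2)

# A covariance matrix must be symmetric with strictly positive eigenvalues
stopifnot(isSymmetric(gen_cov))
eig <- eigen(gen_cov, symmetric = TRUE, only.values = TRUE)$values
stopifnot(all(eig > 0))

# Genetic correlation between the two traits implied by the matrix
gen_cor <- cov2cor(gen_cov)[1, 2]
print(round(gen_cor, 3))
```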

multivariateLassosum

These fictive summary statistics are already matched with the reference and test bfiles. We highly recommend that users check the matching between data sets before using multivariateLassosum, with a function such as matchpos, even though the main multivariateLassosum function performs a matching itself.
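
One way to do a first check by hand is to compare the chromosome and position of each summary-statistic SNP with those listed in the .bim file of a PLINK bfile. The base R sketch below does this on toy data frames. The helper name check_snp_match and the toy values are hypothetical, and the sketch ignores allele coding, so it is not a substitute for matchpos.

```r
# Hypothetical helper: share of summary-statistic SNPs found in a reference
# by chromosome and position, and whether both sets are in the same order.
check_snp_match <- function(ss, bim) {
  key_ss  <- paste(ss$CHR, ss$POS, sep = ":")
  key_bim <- paste(bim$CHR, bim$POS, sep = ":")
  found <- key_ss %in% key_bim
  list(prop_found = mean(found),
       same_order = identical(key_ss, key_bim))
}

# Toy data (made up for illustration)
ss_toy  <- data.frame(CHR = c(1, 1, 2), POS = c(1000, 2000, 1500))
bim_toy <- data.frame(CHR = c(1, 1, 2), POS = c(1000, 2000, 1500))
res <- check_snp_match(ss_toy, bim_toy)
print(res)
```

With real data, the second argument could be built by reading the .bim file of the bfile, whose six columns are chromosome, variant identifier, genetic distance, base-pair position and the two alleles, and renaming the chromosome and position columns to CHR and POS.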

#Let's run multivariateLassosum with the default lambda and s values.
outMulti <- lassosum.pipeline(cor = list(cor.1, cor.2),
                              phenotypic.genetic.Var.Cov.matrix = phenotypic.genetic.Var.Cov.matrix,
                              Var.phenotypic = Var.phenotypic,
                              chr = list(ss.1$CHR, ss.2$CHR),
                              pos = list(ss.1$POS, ss.2$POS),
                              A1 = list(ss.1$A1, ss.2$A1),
                              A2 = list(ss.1$A2, ss.2$A2),
                              sample_size = c(size.1, size.2),
                              ref.bfile = ref.bfile,
                              LDblocks = LDblocks)
