A v-SVR based noise constrained Recursive Feature Extraction algorithm for robust deconvolution of cell-types mixture from molecular signatures
Given the significant impact of immunotherapy in cancer, estimating the immune cell-type proportions present in a tumor has become crucial. Several analytic tools currently deconvolve the cell mixture content of a tumor, yet the accuracy of the inferred cell-type proportions has room for improvement. We improve the characterization of the tumor immune environment by developing MIXTURE, an analytical method based on noise-constrained recursive variable selection for support vector regression. For details, please see the bioRxiv 2018 manuscript or the Briefings in Bioinformatics 2020 manuscript.
The MIXTURE Shiny app has only been tested on Linux. The RUN_MIXTURE code was tested on Linux, Windows and Mac. On Windows, only one CPU core can be used.
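Because of the one-core limit on Windows, it can help to pick the core count from the platform rather than hard-coding it. The following is a minimal sketch, not part of MIXTURE itself; keeping one core free on non-Windows systems is an assumption made here for illustration.

```r
library(parallel)

# detectCores() may return NA when the count cannot be determined.
detected <- detectCores()
if (is.na(detected)) detected <- 1L

# Windows: a single core. Elsewhere: leave one core free (illustrative choice).
num.cores <- if (.Platform$OS.type == "windows") 1L else max(1L, detected - 1L)
```

The resulting `num.cores` can then be passed as the `useCores` argument shown in the example further down.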
New! MIXTURE in Python
You may test our shiny app on the free shiny server
- You may use the following Excel file. Download it to your computer
- Launch MIXTURE Shiny web app
- How to use presentation
The current "functional like" version of the software requires the following libraries (we recommend using the R library version):
- data.table
- ComplexHeatmap
- ade4
- ggplot2
- circlize
- e1071
- preprocessCore
- nnls
- plyr
- abind
- openxlsx
- stringr
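A quick way to see which of the libraries listed above are still missing is sketched below. It only reports the missing ones and installs nothing. Note that ComplexHeatmap and preprocessCore are distributed through Bioconductor rather than CRAN, so they are usually installed with `BiocManager::install()`.

```r
# The libraries listed above.
required <- c("data.table", "ComplexHeatmap", "ade4", "ggplot2", "circlize",
              "e1071", "preprocessCore", "nnls", "plyr", "abind",
              "openxlsx", "stringr")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing.pkgs <- required[!installed]

if (length(missing.pkgs) > 0) {
  message("Missing packages: ", paste(missing.pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```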
Here you will find information on how to use the Shiny application as well as the command-line option. Please check it to see how to prepare your data file.
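As a rough illustration of the layout the command-line example expects, the sketch below builds a toy table with gene symbols in the first column and one column per sample, then averages rows that share a gene symbol. The gene names, sample names and values are hypothetical, and the averaging uses base R only, so it is a stand-in for the duplicate handling in the example and not the code MIXTURE uses.

```r
# Hypothetical toy input: first column holds gene symbols, the rest are samples.
expr <- data.frame(
  GeneSymbol = c("CD19", "CD3E", "CD19", "MS4A1"),
  Sample1 = c(2, 5, 4, 1),
  Sample2 = c(10, 3, 6, 8),
  stringsAsFactors = FALSE
)

genes <- expr$GeneSymbol
values <- as.matrix(expr[, -1])

# Sum the rows sharing a symbol, then divide by the number of rows per symbol.
sums <- rowsum(values, group = genes, reorder = FALSE)
counts <- as.vector(table(genes)[rownames(sums)])
expr.mat <- sums / counts  # gene symbols end up as row names
```

Here the two `CD19` rows collapse into a single row holding their per-sample mean, so every row name is a unique gene symbol.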
This example estimates the proportions of the cell types defined in the LM22 signature matrix from Newman et al. for some TCGA BRCA RNA-seq samples.
## Please verify your current working directory. It should be the one where you downloaded the files,
## something like
##My favorite dir/Utils
##My favorite dir/Data
library(openxlsx)
load("Data/LM22.RData")
source("Utils/MIXTURE.DEBUG_V0.1.R")
## Choose your sample file
sgm <- read.xlsx("Data/BRCA.subsample.xlsx")
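## Check for duplicated gene symbols and average them if present
if (any(duplicated(sgm[, 1]))) {
  ## avereps() comes from the limma package; it is called with limma:: here
  ## on the assumption that the sourced script does not attach limma
  m <- limma::avereps(sgm[, -1], ID = sgm[, 1])
  rownames(m) <- unique(sgm[, 1])
  sgm <- m
} else {
  rownames(sgm) <- sgm[, 1]
  sgm <- sgm[, -1]
}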
### multicore
## Verify your available cpu cores
num.cores <- 3L # on Windows, only one core is possible
##
mix.test <- MIXTURE(expressionMatrix = sgm, # N x ncol(signatureMatrix) gene expression matrix to evaluate
                    ## rownames(M) should be the gene symbols
                    signatureMatrix = LM22, # the gene signature matrix (W) such that M = W*betas' (i.e. the LM22 from Newman et al.)
                    iter = 0L, # number of iterations in the statistical test (null distribution)
                    functionMixture = nu.svm.robust.RFE, # cibersort, nu.svm.optim.rfe, nnls = the cibersort model,
                    ## nu-svm Recursive Feature Extraction and non-negative least squares
                    useCores = num.cores, # cores for parallel processing
                    verbose = TRUE, # TRUE or FALSE: print messages
                    nullDist = "PopulationBased", # "none" or "PopulationBased": whether the statistical test should be performed
                    fileSave = "TETS_MIXTURE_RESULTS.xlsx") # Excel file name to store the results
save(mix.test, file = "TEST_MIXTURE_FILE_LM22.RData") # save the full list as an RData object
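The comment on `signatureMatrix` describes the underlying model, M = W*betas'. The toy sketch below illustrates that model only: it mixes a small, made-up signature matrix with known proportions and recovers them with non-negative least squares from the `nnls` package (one of the libraries listed above). It does not reproduce the nu-SVR recursive feature selection used by `nu.svm.robust.RFE`; the matrix `W`, the gene and cell-type names and the proportions are all hypothetical.

```r
library(nnls)

# Hypothetical signature matrix W: 4 genes (rows) x 3 cell types (columns).
W <- matrix(c(10, 1, 1,
               1, 8, 2,
               2, 1, 9,
               5, 5, 5),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("gene", 1:4), c("typeA", "typeB", "typeC")))

# Known mixing proportions (betas) and the resulting noise-free bulk profile.
beta.true <- c(typeA = 0.5, typeB = 0.3, typeC = 0.2)
m <- as.vector(W %*% beta.true)

# Non-negative least squares fit, then rescale so the coefficients sum to 1.
fit <- nnls(W, m)
beta.est <- fit$x / sum(fit$x)
names(beta.est) <- colnames(W)
```

With noise-free data such as this, the estimated proportions match the true ones up to numerical tolerance; real expression data are noisy, which is the situation the noise-constrained approach in MIXTURE is designed for.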
Download the file BRCA_TCGA_MIXTURE_paper.R and open it. Follow the code. You will need to install the following packages:
The TCGA BRCA data ready to use by the BRCA_TCGA_MIXTURE_paper.R script can be downloaded from here
Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.
Elmer A Fernández, Yamil D Mahmoud, Florencia Veigas, Darío Rocha, Matías Miranda, Joaquín Merlo, Mónica Balzarini, Hugo D Lujan, Gabriel A Rabinovich, María Romina Girotti. Unveiling the immune infiltrate modulation in cancer and response to immunotherapy by MIXTURE—an enhanced deconvolution method. Briefings in Bioinformatics, bbaa317, https://doi.org/10.1093/bib/bbaa317. Published: 16 December 2020.
This project is licensed under the MIT License - see the LICENSE.md file for details