-
I think there are two options for adjusting for batch effects between the Broad and Sanger data: 1) as a pre-processing step, or 2) as a covariate in the model. The paper referenced above experimented with normalizing using various pre-processing steps, but I think I would prefer the latter method. Below is a mock model with the variables adjusting for batch effects:
The first parameter is a standard batch effect covariate, but the source (either Sanger or Broad) is added as an additional hierarchical level. The second parameter is a batch effect correction specific to the Sanger data, applied per gene. The |
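To make the structure of those two terms concrete, here is a minimal generative sketch in Python. It is not the mock model itself, only one way the description could be read: batch offsets are drawn around an institute-level mean (the extra hierarchical level), and a per-gene shift is applied to Sanger observations only. The batch names, gene names, function names, and every numeric scale are assumptions made up for illustration.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical layout: two batches per institute, three placeholder genes.
INSTITUTES = {
    "Broad": ["broad_b1", "broad_b2"],
    "Sanger": ["sanger_b1", "sanger_b2"],
}
GENES = ["GENE_A", "GENE_B", "GENE_C"]

# Top level: one mean offset per institute (assumed scale 0.5).
institute_mean = {inst: random.gauss(0.0, 0.5) for inst in INSTITUTES}

# Second level: each batch offset is drawn around its institute's mean.
batch_effect = {
    batch: random.gauss(institute_mean[inst], 0.2)
    for inst, batches in INSTITUTES.items()
    for batch in batches
}

# Sanger-only correction, one value per gene; Broad rows never receive it.
sanger_gene_shift = {gene: random.gauss(0.0, 0.3) for gene in GENES}

# Stand-in for the rest of the model (whatever else predicts the score).
gene_effect = {gene: random.gauss(-0.5, 0.5) for gene in GENES}


def expected_score(gene, batch, institute):
    """Expected score for one gene measured in one batch at one institute."""
    mu = gene_effect[gene] + batch_effect[batch]
    if institute == "Sanger":
        mu += sanger_gene_shift[gene]
    return mu


def simulate(n_per_cell=200, noise_sd=0.1):
    """Draw noisy observations for every gene/batch combination."""
    rows = []
    for inst, batches in INSTITUTES.items():
        for batch in batches:
            for gene in GENES:
                mu = expected_score(gene, batch, inst)
                for _ in range(n_per_cell):
                    rows.append((gene, batch, inst, random.gauss(mu, noise_sd)))
    return rows


rows = simulate()
```

In a real analysis these offsets would be parameters estimated from the data rather than simulated values; the sketch only shows which observations each term touches.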
-
When I presented Pacini et al. at the Park Lab Cancer Subgroup journal club, Peter and Hu were hesitant about using both data sets; they thought the confounding from batch effects would outweigh the benefit of the additional data. Therefore, it may be more trouble than it is worth to use this dataset.
-
The Sanger Institute performed a CRISPR/Cas9 screen independently of the Broad (DepMap), but the data are available through the DepMap portal. I should use this data for this analysis, too.
These two data sources were integrated in Pacini et al. (2021), and the authors specifically discuss adjusting for batch effects and bias. They may even provide an adjusted/harmonized data set that could be worth using directly. Some notes from their paper:
y ~ ... + (1|batch|institute)
Pacini, Clare, Joshua M. Dempster, Isabella Boyle, Emanuel Gonçalves, Hanna Najgebauer, Emre Karakoc, Dieudonne van der Meer, et al. 2021. “Integrated Cross-Study Datasets of Genetic Dependencies in Cancer.” Nature Communications 12 (1): 1661.