Osteosarcoma annotation (SCPCP000023) #587
-
This sounds great! I'll just note here that as we have already done some QC filtering, you may be able to skip some of the initial steps within this analysis (and the others you have proposed) if you use those results. In general, we would like to keep uniform filtering and QC across modules where we can.

We do not yet have doublet filtering as part of our workflow, however, so it would be great to have those results available for other modules. Perhaps we could break that out into a separate module to run across all samples? Note that we have started to play with other demultiplexing methods in https://github.com/AlexsLemonade/OpenScPCA-analysis/tree/main/analyses/doublet-detection (see also the discussion in #364), but we have not yet finalized those analyses. If

Finally, I'll note that because we do not have aligned bam files as part of the data generally available for these projects, we would have to figure out how best to integrate such results if you find that
-
Hi @patelgrp. Just chiming in to say thank you for sharing your analysis ideas! The rest of the team will be reviewing your proposed analyses, and we're looking forward to discussing further. We will set up an AWS account for you shortly. Once we do, you should receive an email with an invitation to finish setting up your account. I'll reach out again when you should expect to see it. We'll be back in touch with next steps within 3 business days. In the meantime, let us know if you have any questions. Looking forward to working together!
-
Thanks for putting this together @patelgrp! I had a few follow-up questions, but generally this approach and outline sound good to me.

It sounds like you might actually have a pipeline of sorts in place for this? Is this something that already exists somewhere and you plan to apply it to this dataset and the other two datasets you filed discussion posts for (#588 and #589)?

For the CNV calling, how do you plan on identifying the normal cells to use as the reference in

You also mention using the LISI index on clusters, which I really like! I just wanted to note that although we do provide cluster assignments in the processed

You also mention running this pipeline before doing expert-guided refinement. Do you know what that would entail at this point? I assume this is the part that will be more disease-specific and involve more manual inspection and validation of the cell-type assignments.

I would also encourage you to check out these helpful sections of our documentation to get started: Technical setup

Please let me know if you have specific questions on how to get started!
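On the LISI point raised above, here is a minimal sketch of how a per-cell score of that kind could be computed, in case it helps the discussion. This is a simplification and not the proposed pipeline: the published LISI weights neighbors with a perplexity-based Gaussian kernel, whereas this version takes an unweighted inverse Simpson's index over a fixed number of nearest neighbors. The function name `simple_lisi`, its parameters, and the choice of `k` are all hypothetical.

```python
import numpy as np


def simple_lisi(embedding, labels, k=30):
    """Unweighted inverse Simpson's index over each cell's k nearest neighbors.

    Simplified stand-in for LISI (assumption: fixed k, no Gaussian weighting).
    embedding: (n_cells, n_dims) array, e.g. PCA coordinates.
    labels:    length n_cells array of cluster or batch labels.
    """
    embedding = np.asarray(embedding, dtype=float)
    labels = np.asarray(labels)
    n_cells = embedding.shape[0]
    if labels.shape[0] != n_cells:
        raise ValueError("labels must have one entry per cell")
    if not 1 <= k < n_cells:
        raise ValueError("k must satisfy 1 <= k < number of cells")

    # Pairwise squared Euclidean distances (dense; fine for small examples).
    sq_norms = (embedding ** 2).sum(axis=1)
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * embedding @ embedding.T
    np.fill_diagonal(dists, np.inf)  # a cell is not its own neighbor

    # Indices of the k closest cells for every cell (order within k is arbitrary).
    neighbors = np.argpartition(dists, k - 1, axis=1)[:, :k]

    # Encode labels as integers so they can be counted with bincount.
    _, codes = np.unique(labels, return_inverse=True)
    n_labels = int(codes.max()) + 1

    scores = np.empty(n_cells)
    for i in range(n_cells):
        # Label proportions in the neighborhood, then inverse Simpson's index.
        p = np.bincount(codes[neighbors[i]], minlength=n_labels) / k
        scores[i] = 1.0 / np.sum(p ** 2)
    return scores
```

A cell whose neighbors all share one label scores 1, and a cell whose neighbors are split evenly across m labels scores m. Applied to cluster labels, values near 1 suggest well-separated clusters; applied to sample or batch labels after integration, higher values suggest better mixing.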
-
Hi @patelgrp, I just wanted to follow up to help you get started. I know Josh started to mention this in an earlier comment, but in an effort to keep the analysis across modules uniform and transparent, we ask that you start your analysis with the processed objects available on the ScPCA Portal (

When you are ready to get started, you can download the data using the As part of

If you need to use filtering, normalization, or doublet detection methods that are different from the methods that are already defined within the project, please provide your rationale and evidence that supports the alternative method.

To get started, we recommend that you pick one of the three projects you proposed and start an analysis module using the data from that project. Once the code has been developed and added to the repository, it should be easy to apply it to all the projects you plan to work on.

When you are ready to start your analysis, please follow the steps below to start contributing to the project:
After you have initiated your module, you will be ready to continue with the rest of the analysis that you proposed. I would recommend that you break up your work into the following steps to start, where each bullet point would be an issue and at least one subsequent pull request:
For more information on contributing to the project, I recommend you review these sections of the documentation:
-
Proposed analysis
We will annotate the collection of osteosarcoma datasets covered in SCPCP000023 (n=54 samples). Our pipeline involves a series of cleanup, QC filtering, and automated annotation steps prior to expert-guided refinement. Specifically:
Finally, we will use Seurat's layer integration pipeline to generate an integrated dataset (in our hands scVI or reciprocal PCA perform the best).
Scientific goals
To annotate the non-malignant and malignant cells within SCPCP000023
Methods or approach
Finally, we will use Seurat's layer integration pipeline to generate an integrated dataset (in our hands scVI or reciprocal PCA perform the best).
Existing modules
Yes, this module is based on an existing module of Ewing sarcoma samples in #292 (comment).
Input data
The analysis will use count matrices extracted from the SingleCellExperiment objects for SCPCP000023
Scientific literature
https://www.nature.com/articles/s41467-020-20059-6
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10515803/
https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2022.732862/full
Other details
No response