Account for mismatches between old cell typing results and current processed object and handle missing UMAPs #596

Merged on Nov 27, 2023 (6 commits)

allyhawkins
Member

Closes #591

Based on discussion in #591, we decided to handle any potential mismatches in barcodes between the processed object and the existing CellAssign or SingleR results by labeling any cells not present in the results as Unclassified cells. This PR makes that adjustment and fixes some other smaller errors I discovered during test runs.
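
As a rough illustration of the labeling rule described above (not the code in this PR, which lives in the R script add_celltypes_to_sce.R), the idea can be sketched in Python. The function names, the barcodes, and the exact label text "Unclassified cell" are assumptions made for the example.

```python
# Illustrative sketch only; all names here are hypothetical.
UNCLASSIFIED = "Unclassified cell"  # assumed label text


def assign_celltypes(sce_barcodes, celltype_results, missing_label=UNCLASSIFIED):
    """Return one label per barcode in the processed object, in order.

    `celltype_results` maps barcode -> label from an earlier cell typing run.
    Barcodes that are absent from those results get `missing_label`
    rather than a missing value (NA).
    """
    return [celltype_results.get(bc, missing_label) for bc in sce_barcodes]


def missing_barcodes(sce_barcodes, celltype_results):
    """List barcodes in the processed object that have no earlier result."""
    return [bc for bc in sce_barcodes if bc not in celltype_results]


if __name__ == "__main__":
    processed = ["AAAC", "AAAG", "AAAT"]
    old_results = {"AAAC": "T cell", "AAAT": "B cell", "CCCC": "NK cell"}
    print(assign_celltypes(processed, old_results))
    print(missing_barcodes(processed, old_results))
```

Barcodes that appear only in the old results (such as "CCCC" here) are ignored, because labels are looked up per cell in the processed object.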

  • I accounted for potential failures in density() following the suggestion in a comment on #591 ("Figure out what to do when the numbers of cells don't match up between CellAssign results and processed SCE object").
  • All of the missing cells have been annotated as Unclassified cells in the processed SCE object. This avoids any cells being labeled as NA accidentally. This happens as part of add_celltypes_to_sce.R.
  • In the QC report, I check for any missing cells and output a warning that some cells may be missing from the cell type results. I also remove any Unclassified cells before continuing with plotting.
  • I also had some issues where UMAP was not being calculated, causing a failure when rendering the cell type section of the report. Updating scpcaTools resolved those issues (Update renv & python scpcaTools#242). However, we probably want to handle the case where cell type results exist but UMAP results do not. The main report already handles missing UMAPs, but the cell type report did not, so I added that here. This involved updating the function that creates celltype_df to account for potentially missing UMAP results, and then using has_umap throughout the report so that no UMAP plots are printed when UMAP is missing.
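
The report-side handling in the last two bullets can be sketched in the same spirit. This is a minimal Python illustration under stated assumptions, not the R Markdown code in celltypes_qc.rmd: the function names, column names, and the "Unclassified cell" label text are hypothetical, and only has_umap and celltype_df echo names from the description above.

```python
# Illustrative sketch only; the real report is an R Markdown template.
UNCLASSIFIED = "Unclassified cell"  # assumed label text


def build_celltype_table(barcodes, labels, umap=None):
    """Build row dicts (a stand-in for celltype_df) plus a has_umap flag.

    `umap` is an optional list of (x, y) pairs, one per barcode.
    When it is missing, no UMAP columns are added to the rows.
    """
    has_umap = umap is not None and len(umap) == len(barcodes)
    rows = []
    for i, (barcode, label) in enumerate(zip(barcodes, labels)):
        row = {"barcode": barcode, "celltype": label}
        if has_umap:
            row["UMAP1"], row["UMAP2"] = umap[i]
        rows.append(row)
    return rows, has_umap


def rows_for_plotting(rows):
    """Drop unclassified cells before any plotting step."""
    return [r for r in rows if r["celltype"] != UNCLASSIFIED]


if __name__ == "__main__":
    rows, has_umap = build_celltype_table(
        ["AAAC", "AAAG"], ["T cell", UNCLASSIFIED]
    )
    if any(r["celltype"] == UNCLASSIFIED for r in rows):
        print("Warning: some cells may be missing from the cell type results.")
    if has_umap:
        print("UMAP plots would be drawn here.")
    print(rows_for_plotting(rows))
```

Computing the flag once and checking it before each UMAP section keeps the rest of the report renderable when no embedding is available.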

Here's a copy of a rendered main and supplemental report with these changes:
SCPCL000495_qc.html.zip

SCPCL000495_celltype-report.html.zip

Member

@jashapiro left a comment

I think this all looks good. I had a number of small suggestions, but I don't think any of them should require another look.

Resolved review comments (now outdated): 2 on bin/add_celltypes_to_sce.R, 7 on templates/qc_report/celltypes_qc.rmd
@allyhawkins merged commit 01a45ca into development on Nov 27, 2023
3 checks passed
@allyhawkins deleted the allyhawkins/celltype-NAs branch on November 27, 2023 at 22:06