-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Figure out what to do when the numbers of cells don't match up between CellAssign results and processed SCE object #591
Comments
The easy solution in that particular code is to use But that would still leave |
My interpretation of this comment is to leave everything as is, but to add a separate label for not classified by SingleR and not classified by CellAssign? And then I think when creating our diagnostic plots we remove those before plotting. |
I think to avoid two kinds of NA's, we want to grab the cell barcodes that are in the SCE object and not in the SingleR/CellAssign results and label them as "Unclassified" before even adding them into the SCE object. This would happen in |
That makes sense to me. But now I'm confused by the error again, and wondering if missing cells are really the issue? Presumably if all values were |
Also, as to timing of this error: I think we could call this a known bug and come back to it later. Running this release without any cell type skipping doesn't seem like a terrible idea. I don't mind a clean run... |
It's because we are labeling cells that are
I'm not opposed to this! The only reason I picked this up is because I went to run the workflow on the samples we've been using for development in |
It's because we have a function that sets the unknowns for both singler/ CellAssign. So we are setting both scpca-nf/templates/qc_report/utils/celltype_functions.rmd Lines 66 to 83 in c6930fe
|
Oh, I wonder if the bug may actually be different?
The above code may fail if
|
That makes sense. Though we probably shouldn't do that in the future... Marking the "missing" calls because we were repeating cell typing seems like what we want to do in the future. And we probably do still want to add the filter/ Or maybe just go for it as part of this: change the missing cells to "Untyped" or "Unassigned" in |
Note I updated the code above to handle |
Definitely making this change now.
I'm trying what I think will be an easy fix for this, if it takes too long then I'll just make the density change. |
Closed by #596 |
I was testing the workflow and ran into an issue with rendering the supplemental cell type report. The problem is in creating the CellAssign density plots. When running a library we already had processed, there were now
NA
's for the CellAssign prediction score causing an error in:scpca-nf/templates/qc_report/celltypes_supplemental_report.rmd
Lines 710 to 715 in c6930fe
After some investigation, it looks like the NA's are coming from barcodes that are not present in the CellAssign results, but are present in the processed object. The problem that I think we're going to have moving forward is sometimes the exact cells present in the final object may change slightly when filtering. I double checked that we set seeds in all the scripts where we need it (including when running CellAssign), but we have seen scenarios where the exact number of cells is altered slightly from run to run.
So we need to decide what to do about repeating CellAssign in that scenario. I think we have a few options:
I don't love option 1, but I think it's the only thing I can think of to be sure that the barcodes are the same. The only other thing would be actually checking that the barcodes in the files are the same before skipping it.
The other thing to note is that this is based on the assumption that CellAssign returns a score for every single cell that is provided as input. This is my understanding and looking back at some of the analysis I did in
sc-data-integration
this seemed to be the case. So it seems unlikely that CellAssign just didn't assign a score and also didn't return it in the predictions matrix.Tagging @jashapiro @sjspielman for any thoughts on this.
The text was updated successfully, but these errors were encountered: