The number of droplets in sEV_recognizer output #25

Open
HZT55 opened this issue Aug 2, 2024 · 1 comment

HZT55 commented Aug 2, 2024

Hi,
Thank you for developing a useful tool.
I am confused about the output of the sEV_recognizer function. The raw_SEVtras.h5ad file has an n_obs of 1,129,431, nearly 10 times my original cell count. The code is as follows:
import SEVtras
SEVtras.sEV_recognizer(sample_file='sample_file', out_path='./', species='Homo', predefine_threads=10, dir_origin=True)

There are 9 samples in this sample_file. I compared the raw data of one sample with its SEVtras output, and I'm sure the sample ran successfully.

SEVtras output
import scanpy as sc
adata_ev = sc.read_h5ad('./01.sEV_recognizing/tmp_out/cellranger_SRR17008554/cellranger_SRR17008554.h5ad')

cellRanger output
adata_o = sc.read_10x_mtx('/home/cellranger/SRR17008554/outs/filtered_feature_bc_matrix')

The number of barcodes in these two files is quite different, and I don't know why.

More surprisingly, when I compared the original barcodes (cellRanger output) with the barcodes in the sample's sEV_SEVtras.h5ad file, the two sets had no barcodes in common.
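
A comparison like the one described above can be sketched as follows. This is a minimal illustration only: `barcode_overlap` is a hypothetical helper (not part of SEVtras or scanpy), and the barcodes are made up rather than taken from the real sample. With the objects loaded earlier, it could be called as `barcode_overlap(adata_ev.obs_names, adata_o.obs_names)`, assuming both files use the same barcode naming (for example the same `-1` suffix).

```python
def barcode_overlap(barcodes_a, barcodes_b):
    """Return the sizes of two barcode collections and of their intersection."""
    set_a, set_b = set(barcodes_a), set(barcodes_b)
    return {
        "n_a": len(set_a),
        "n_b": len(set_b),
        "n_shared": len(set_a & set_b),
    }


# Toy barcodes, invented for illustration (not from SRR17008554).
sev_barcodes = ["AAACCCAAGAAACACT-1", "AAACCCAAGAAACCAT-1"]
cell_barcodes = ["AAACCCAAGGAGAGTA-1", "AAACGAAAGTTGCATC-1"]

print(barcode_overlap(sev_barcodes, cell_barcodes))
```

If `n_shared` is zero while both counts are large, the two files contain entirely different barcodes, which is the situation reported here.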

I don't know whether this result is normal. Could you please explain it?

@RuiqiaoHe
Member

The raw gene-barcode matrix includes all valid barcodes from the GEMs (Gel Beads-in-emulsion) captured in the data. Because most GEMs do not actually contain cells, most barcodes in the raw matrix do not correspond to cells, and these non-cell barcodes are the ones that may correspond to sEVs.
The filtered gene-barcode matrix includes only barcodes whose GEMs are likely to contain cells. There would therefore be no overlapping barcodes between potential cells and potential sEVs.
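
To make the relationship concrete, here is a toy sketch in plain Python. It assumes, as a simplification of the explanation above, that candidate sEV barcodes come from the raw barcodes that are not in the filtered (cell-containing) set. `split_raw_barcodes` and the barcode names are hypothetical and do not reflect SEVtras's actual implementation.

```python
def split_raw_barcodes(raw_barcodes, filtered_barcodes):
    """Split raw barcodes into cell-associated and non-cell barcodes.

    Assumption: the filtered matrix holds the barcodes called as cells,
    and everything else in the raw matrix is a non-cell barcode.
    """
    raw = set(raw_barcodes)
    cells = raw & set(filtered_barcodes)
    non_cells = raw - cells
    return cells, non_cells


# Made-up barcodes: six in the raw matrix, two of them called as cells.
raw_barcodes = ["BC1", "BC2", "BC3", "BC4", "BC5", "BC6"]
filtered_barcodes = ["BC2", "BC5"]

cells, non_cells = split_raw_barcodes(raw_barcodes, filtered_barcodes)

# The two groups are disjoint by construction, so a barcode cannot be
# both a potential cell and a potential sEV under this assumption.
assert cells.isdisjoint(non_cells)
print(len(cells), "cell barcodes;", len(non_cells), "non-cell barcodes")
```

Under this assumption, a raw-based output with far more barcodes than the filtered matrix, and with no barcodes shared between the two, is what one would expect.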
