ValueError: Bin edges must be unique #35

Open
tfujian opened this issue Jan 14, 2025 · 5 comments

@tfujian

tfujian commented Jan 14, 2025

Hello, I have a few questions. I ran the following command:

SEVtras.ESAI_calculator(adata_ev_path='outputs/raw_cellranger_patient1.h5ad', adata_cell_path='seurat.h5ad', out_path='./outputs', Xraw=False, OBScelltype='celltype')

I obtained the seurat.h5ad file with cell type information using Scanpy analysis. There is only one sample, so I didn't use OBSsample='batch'. However, after running the above command, I encountered the following error:

ValueError: Bin edges must be unique: Index([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan], dtype='float64'). You can drop duplicate edges by setting the 'duplicates' kwarg

The method requires the raw_feature_bc_matrix, but the seurat.h5ad file with cell type information has already been filtered. Does this mean the two files contain different numbers of cells? Could that be the cause of the error, and how should it be resolved?

Additionally, do we need to perform filtering during Scanpy analysis?

Thank you very much for your response.

@RuiqiaoHe
Member

Thank you for testing. Could you manually add a 'batch' column to the obs of the cell object? Its values should be the sample names, which you can confirm by reading 'outputs/raw_cellranger_patient1.h5ad' and viewing its adata.obs['batch']. Then re-run and see whether it still reports an error.
If it does, please copy the full error message here for reference.
The adata_cell input does not need the raw_feature_bc_matrix. Instead, follow the regular single-cell analysis workflow and use the filtered_feature_bc_matrix. Only the first step requires the raw_feature_bc_matrix.
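
The suggestion above could be sketched roughly as follows. This is only an illustration, assuming a single-sample dataset: the helper names `single_sample_name` and `add_batch_to_cells`, and the output path in the usage note, are hypothetical and not part of SEVtras. Only `anndata.read_h5ad`, `AnnData.obs`, and `AnnData.write_h5ad` are real library calls.

```python
def single_sample_name(batch_values):
    """Return the one sample name found in batch_values, or raise if there is not exactly one."""
    names = sorted({str(v) for v in batch_values})
    if len(names) != 1:
        raise ValueError(f"expected exactly one sample name, found {names}")
    return names[0]


def add_batch_to_cells(adata_ev_path, adata_cell_path, out_path):
    """Copy the single sample name from the EV object into the cell object's obs['batch']."""
    # Imported lazily so the helper above can be used without anndata installed.
    import anndata as ad

    adata_ev = ad.read_h5ad(adata_ev_path)
    adata_cell = ad.read_h5ad(adata_cell_path)

    # Confirm the sample name used by the first step.
    sample = single_sample_name(adata_ev.obs["batch"])

    # Assigning a scalar fills the column for every cell.
    adata_cell.obs["batch"] = sample
    adata_cell.write_h5ad(out_path)
    return sample
```

For example, `add_batch_to_cells('outputs/raw_cellranger_patient1.h5ad', 'seurat.h5ad', 'seurat_with_batch.h5ad')` would write a new cell file (the output name is made up here), which could then be passed as `adata_cell_path` together with `OBSsample='batch'`.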

@tfujian
Author

tfujian commented Feb 5, 2025

I'm very sorry to trouble you again with another issue.

When running the following command:
SEVtras.ESAI_calculator(adata_ev_path='outputs/sEV_SEVtras.h5ad', adata_cell_path='outputs/integrated_data.h5ad', out_path='./outputs', Xraw=False, OBSsample='batch',OBScelltype='celltype')

I encountered the following errors:

/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/SEVtras/functional.py:81: FutureWarning: Use anndata.concat instead of AnnData.concatenate, AnnData.concatenate is deprecated and will be removed in the future. See the tutorial for concat at: https://anndata.readthedocs.io/en/latest/concatenation.html
adata_combined = adata_cell_raw.concatenate(adata_ev, batch_key = OBSev)
/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/scanpy/preprocessing/_normalization.py:196: UserWarning: Some cells have zero counts
warn(UserWarning('Some cells have zero counts'))
/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/scanpy/preprocessing/_highly_variable_genes.py:226: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
disp_grouped = df.groupby("mean_bin")["dispersions"]
/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/scanpy/preprocessing/_pca.py:229: ImplicitModificationWarning: Setting element .obsm['X_pca'] of view, initializing view as actual.
adata.obsm['X_pca'] = X_pca
2025-01-25 19:42:46.798297: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI AVX512_BF16 AVX_VNNI AMX_TILE AMX_INT8 AMX_BF16 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-01-25 19:42:46.852532: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2025-01-25 19:43:12,132 No gene sets passed through filtering condition!!! Try to set min_size or max_size parameters again!
Note: check gene name, gmt format, or filtering size.
Traceback (most recent call last):
  File "/work/tanfj/work/SEVtras/breast/run_tets.py", line 10, in <module>
    SEVtras.ESAI_calculator(adata_ev_path='outputs/sEV_SEVtras.h5ad', adata_cell_path='integrated_data.h5ad',out_path='./outputs', Xraw=False, OBSsample='batch',OBScelltype='celltype')
  File "/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/SEVtras/main.py", line 188, in ESAI_calculator
    celltype_e_number, adata_evS, adata_com = deconvolver(adata_ev, adata_cell, species, OBSsample, OBScelltype, OBSev, OBSMpca, cellN, Xraw, normalW)
  File "/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/SEVtras/functional.py", line 118, in deconvolver
    gsea_pval_dat = source_biogenesis(adata_cell, species, OBScelltype=OBScelltype, Xraw = Xraw, normalW=normalW)
  File "/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/SEVtras/functional.py", line 42, in source_biogenesis
    res = gp.prerank(rnk=gene_rank, gene_sets=gmt_path)
  File "/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/gseapy/__init__.py", line 359, in prerank
    pre.run()
  File "/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/gseapy/gsea.py", line 373, in run
    gmt = self.load_gmt(gene_list=dat2.index.values, gmt=self.gene_sets)
  File "/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/gseapy/base.py", line 207, in load_gmt
    raise Exception("No gene sets passed through filtering condition")
Exception: No gene sets passed through filtering condition

Could you help me to solve this issue? I would greatly appreciate your response.

@RuiqiaoHe
Member

May I ask what species you analyzed? Currently only human and mouse analyses are supported. You can refer to #8 (comment) for further direction.
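
One possible reading of why the species matters: the log says to "check gene name, gmt format, or filtering size", which suggests that a gene set is kept only if enough of its genes also appear in the ranked gene list. The sketch below illustrates that kind of overlap filter in plain Python. The function name `sets_passing_filter`, the toy gene set, and the `min_size`/`max_size` values are assumptions for illustration, not GSEApy's or SEVtras's actual code or defaults.

```python
def sets_passing_filter(gene_list, gene_sets, min_size, max_size):
    """Return {set_name: overlap_size} for gene sets whose overlap with
    gene_list falls within [min_size, max_size]. Matching is case-sensitive."""
    genes = set(gene_list)
    passing = {}
    for name, members in gene_sets.items():
        overlap = genes.intersection(members)
        if min_size <= len(overlap) <= max_size:
            passing[name] = len(overlap)
    return passing


# Toy gene set written with human-style symbols (illustrative only).
demo_sets = {"demo_set": ["CHD8", "TP53", "CD63"]}

# Human-style symbols overlap the set; mouse-style symbols do not.
human_hits = sets_passing_filter(["CHD8", "TP53", "ACTB"], demo_sets, min_size=2, max_size=500)
mouse_hits = sets_passing_filter(["Chd8", "Trp53", "Actb"], demo_sets, min_size=2, max_size=500)
```

Under this reading, an empty result for every gene set would correspond to the "No gene sets passed through filtering condition" exception, whether the mismatch comes from the species setting or from the gene naming in the input file.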

@tfujian
Author

tfujian commented Feb 6, 2025

Thank you very much for your reply. The species I'm working on is human, and I still get the same error when I run the command with species='Homo':

SEVtras.ESAI_calculator(adata_ev_path='outputs/sEV_SEVtras.h5ad', adata_cell_path='outputs/integrated_data.h5ad', out_path='./outputs', Xraw=False, OBSsample='batch',OBScelltype='celltype',species='Homo')

2025-02-06 13:48:14,551 No gene sets passed through filtering condition!!! Try to set min_size or max_size parameters again!
Note: check gene name, gmt format, or filtering size.
Traceback (most recent call last):
  File "/work/tanfj/work/SEVtras/breast/run_tets.py", line 10, in <module>
    SEVtras.ESAI_calculator(adata_ev_path='outputs/sEV_SEVtras.h5ad', adata_cell_path='outputs/integrated_data.h5ad', out_path='./outputs', Xraw=False, OBSsample='batch',OBScelltype='celltype',species='Homo')
  File "/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/SEVtras/main.py", line 188, in ESAI_calculator
    celltype_e_number, adata_evS, adata_com = deconvolver(adata_ev, adata_cell, species, OBSsample, OBScelltype, OBSev, OBSMpca, cellN, Xraw, normalW)
  File "/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/SEVtras/functional.py", line 118, in deconvolver
    gsea_pval_dat = source_biogenesis(adata_cell, species, OBScelltype=OBScelltype, Xraw = Xraw, normalW=normalW)
  File "/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/SEVtras/functional.py", line 42, in source_biogenesis
    res = gp.prerank(rnk=gene_rank, gene_sets=gmt_path)
  File "/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/gseapy/__init__.py", line 359, in prerank
    pre.run()
  File "/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/gseapy/gsea.py", line 373, in run
    gmt = self.load_gmt(gene_list=dat2.index.values, gmt=self.gene_sets)
  File "/home/ubuntu/miniconda3/envs/santo_env/lib/python3.10/site-packages/gseapy/base.py", line 207, in load_gmt
    raise Exception("No gene sets passed through filtering condition")
Exception: No gene sets passed through filtering condition

Have you ever encountered this problem? I would greatly appreciate your response.

@RuiqiaoHe
Member

RuiqiaoHe commented Feb 6, 2025

I've run many Homo sapiens datasets and none of them produced this error. Could you please check the var names of the input adata? They should be gene symbols in the format of "CHD8" etc.
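
A quick way to run that check might look like the sketch below. It assumes the most likely mismatch is that the var names are Ensembl gene IDs rather than symbols, which is a guess and not something confirmed in this thread. The names `summarize_var_names` and `check_h5ad` are hypothetical helpers; `anndata.read_h5ad` and `AnnData.var_names` are real.

```python
import re

# Matches Ensembl-style gene IDs such as ENSG00000100888 or ENSMUSG00000000001,
# optionally followed by a version suffix like ".12".
ENSEMBL_GENE_ID = re.compile(r"^ENS[A-Z]*G\d{11}(\.\d+)?$")


def summarize_var_names(var_names):
    """Count how many names look like Ensembl gene IDs and show a few examples."""
    names = [str(n) for n in var_names]
    n_ensembl = sum(1 for n in names if ENSEMBL_GENE_ID.match(n))
    return {
        "total": len(names),
        "ensembl_like": n_ensembl,
        "examples": names[:5],
    }


def check_h5ad(path):
    """Load an .h5ad file and summarize its var names."""
    # Imported lazily so summarize_var_names works without anndata installed.
    import anndata as ad

    adata = ad.read_h5ad(path)
    return summarize_var_names(adata.var_names)
```

Calling `check_h5ad('outputs/integrated_data.h5ad')` and finding a large `ensembl_like` count, or examples that are not symbols like "CHD8", would point to the gene naming as the thing to fix before re-running.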
