IndexError while running sEVtras calculator #30

songnsnow · 2024-11-04T19:38:14Z

Hi, I've been trying to run the second part (SEVtras.ESAI_calculator) of the tool, but I am getting the error below.

/home/kleinman/seol.song/projects/kleinman/seol.song/from_hydra/SEVtras/venv_3.11.5/lib/python3.11/site-packages/SEVtras/functional.py:81: FutureWarning: Use anndata.concat instead of AnnData.concatenate, AnnData.concatenate is deprecated and will be removed in the future. See the tutorial for concat at: https://anndata.readthedocs.io/en/latest/concatenation.html
  adata_combined = adata_cell_raw.concatenate(adata_ev, batch_key = OBSev)
/home/kleinman/seol.song/projects/kleinman/seol.song/from_hydra/SEVtras/venv_3.11.5/lib/python3.11/site-packages/scanpy/preprocessing/_pca.py:317: ImplicitModificationWarning: Setting element `.obsm['X_pca']` of view, initializing view as actual.
  adata.obsm[key_obsm] = X_pca
/home/kleinman/seol.song/projects/kleinman/seol.song/from_hydra/SEVtras/venv_3.11.5/lib/python3.11/site-packages/SEVtras/functional.py:126: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
  near_neighbor_dat.loc[i, 'times'] = tmp_times[0]
Traceback (most recent call last):
  File "/project/kleinman/seol.song/from_hydra/SEVtras/scripts/hgg/sevtras_calc.py", line 25, in <module>
    SEVtras.ESAI_calculator(
  File "/home/kleinman/seol.song/projects/kleinman/seol.song/from_hydra/SEVtras/venv_3.11.5/lib/python3.11/site-packages/SEVtras/main.py", line 188, in ESAI_calculator
    celltype_e_number, adata_evS, adata_com = deconvolver(adata_ev, adata_cell, species, OBSsample, OBScelltype, OBSev, OBSMpca, cellN, Xraw, normalW)
                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kleinman/seol.song/projects/kleinman/seol.song/from_hydra/SEVtras/venv_3.11.5/lib/python3.11/site-packages/SEVtras/functional.py", line 126, in deconvolver
    near_neighbor_dat.loc[i, 'times'] = tmp_times[0]
                                        ~~~~~~~~~^^^
  File "/home/kleinman/seol.song/projects/kleinman/seol.song/from_hydra/SEVtras/venv_3.11.5/lib/python3.11/site-packages/pandas/core/series.py", line 1118, in __getitem__
    return self._values[key]
           ~~~~~~~~~~~~^^^^^
IndexError: index 0 is out of bounds for axis 0 with size 0

I looked into the deconvolver function, and I believe the error is caused by empty tmp_times Series. I think some of the rows (cells) in near_neighbor_dat are not tracing back to a celltype value, which might be causing the error.

Also, I noticed that the X_pca and X_umap values of adata_cell is not being utilized at least up until the deconvolver function, since adata_combined is being reprocessed with scanpy downstream analysis. However, I would like to use the X_pca and X_umap values that I imported from my analysis through Seurat.

Here is my code used for converting my SeuratObject to h5ad. I used the code provided by the SEVtras tutorial (there was a code for exporting the PCA and UMAP values from Seurat, there was no code to input the values into the scanpy h5ad object, so I wrote my own for importing PCA and UMAP values).

# Load 
import scanpy as sc
import anndata
from scipy import io
from scipy.sparse import coo_matrix, csr_matrix
import os
import pandas as pd

# Provided by SEVtras
save_path = '/home/kleinman/seol.song/projects/kleinman/seol.song/from_hydra/SEVtras/data/'
X = io.mmread(os.path.join(save_path, "hgg_processed_matrix.mtx"))
adata = anndata.AnnData(X=X.transpose().tocsr())
metadata = pd.read_csv(os.path.join(save_path, "hgg_processed_metadata.csv"))

with open(os.path.join(save_path, "hgg_processed_gene_names.csv"),'r') as f:
    gene_names = f.read().splitlines()

adata.obs = metadata
adata.obs.index = adata.obs["barcode"]
adata.var.index = gene_names

# Load PCA csv
pca_df = pd.read_csv(os.path.join(save_path, "hgg_processed_pca.csv"), index_col=0)

# Add PCA and UMAP values to the AnnData object
adata.obsm['X_pca'] = pca_df.values
adata.obsm['X_umap'] = metadata[['UMAP_1', 'UMAP_2']].values

# Remove the 'barcode' column from adata.obs
adata.obs = adata.obs.drop(columns=['barcode'])

In conclusion, these are my final questions:

What exactly is causing the IndexError?
Can I use PCA and UMAP values from my Seurat analysis for running the code (+ drawing the UMAP)? This is crucial to me since I would like to reduce batch effects, which would require integrative analysis and altering of PCA and UMAP.

Thanks for your help in advance!

The text was updated successfully, but these errors were encountered:

RuiqiaoHe · 2024-11-05T00:43:32Z

Could you please check the shape of adata_ev and adata_cell that you input for SEVtras.ESAI_calculator by following code?
print(adata_ev)
print(adata_cell)
There may be a mismatch problem between the two adata objects.

songnsnow · 2024-11-05T01:53:38Z

Here is the output from print(adata_ev) and print(adata_cell).

>>> print(adata_ev)
AnnData object with n_obs × n_vars = 4383 × 24040
    obs: 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'total_counts_ribo', 'pct_counts_ribo', 'n_genes', 'score', 'batch', 'sEV', 'score1'
>>> print(adata_cell)
AnnData object with n_obs × n_vars = 36076 × 31706
    obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'raw_batch_number', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'total_counts_ribo', 'pct_counts_ribo', 'n_genes', 'score', 'batch', 'sEV', 'score1', 'barcode_prefix', 'batch_number', 'cell.barcode', 'sample.id', 'sample.id.new', 'Paper_ID', 'Sample', 'percent.mito', 'percent.ribo', 'G1.S.score_Whitfield', 'G2.M.score_Whitfield', 'S.Score', 'G2M.Score', 'Phase', 'Paper_group', 'Paper_subgroup', 'Location', 'Age_group', 'Sex', 'Technology', 'Consensus_Jessa', 'Consensus_Pombo_Raw', 'CAWPE_Score_Pombo', 'Class', 'Celltype', 'celltype', 'Level2_Celltype', 'clst_itg', 'seurat_clusters', 'UMAP_1', 'UMAP_2'
    obsm: 'X_pca', 'X_umap'

RuiqiaoHe · 2024-11-05T07:06:28Z

At the moment I can't figure out what the bug is. Would it be convenient to share testing adata objects for me? I'll test it in my environment. The address is ruiqiaohe#gmail.com (change # to @).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

IndexError while running sEVtras calculator #30

IndexError while running sEVtras calculator #30

songnsnow commented Nov 4, 2024 •

edited

Loading

RuiqiaoHe commented Nov 5, 2024

songnsnow commented Nov 5, 2024

RuiqiaoHe commented Nov 5, 2024 •

edited

Loading

IndexError while running sEVtras calculator #30

IndexError while running sEVtras calculator #30

Comments

songnsnow commented Nov 4, 2024 • edited Loading

RuiqiaoHe commented Nov 5, 2024

songnsnow commented Nov 5, 2024

RuiqiaoHe commented Nov 5, 2024 • edited Loading

songnsnow commented Nov 4, 2024 •

edited

Loading

RuiqiaoHe commented Nov 5, 2024 •

edited

Loading