Issues about importing data in SEVtras #20
The parameter In the case of your dataset, it would be: For the "max() arg" error, please first check if your dataset was generated by scRNA-seq? SEVtras doesn't support single nucleus RNA-seq data. Then, you can try to lower the parameter |
Thank you very much for your prompt response. The correct import command allowed the program to run successfully. Additionally, I saw in the guide that you suggest using multiple sample data to identify SEVs. Can I merge all my samples into one Seurat object, convert it to h5ad format, and then analyze it? If so, would clustering the cells in my object before using SEVtras affect the determination of the SEV sources? |
By the way, I found that analyzing a single sample theoretically takes an entire day. Is this normal for this software? |
Please refer to Question 9 in Troubleshooting. You can obtain the cell type information based on your own processing procedure, such as filtering or regressing. SEVtras only needs the cell type information, instead of the processed expression matrix. |
Here I have an additional question. When performing data quality control, I use this code to remove poor-quality cells: test.seu <- subset(test.seu, subset = nFeature_RNA > 200 & nFeature_RNA < 6000 & percent.mt < 5). Will this step affect the identification of sEVs (small extracellular vesicles)?
The identification of sEVs is a separate step without the information of the cell matrix (SEVtras.sEV_recognizer). Cell matrix pre-processing only affects the calculation of ESAI (SEVtras.ESAI_calculator). |
I have encountered some new difficulties. After completing the calculations in part 1, I obtained 15 "itera_gene.txt" and "raw_file.h5ad" files. Before running part 2, I noticed that you mentioned, "The first two parameters represent the path to sEV- and cell- anndata objects" and "With the output of SEVtras.sEV_recognizer in Part I sEVs recognizing and cell matrix with cell type, SEVtras can track each sEV to the original cell type and calculate the sEV secretion activity index (ESAI)." However, the data I included in the first step is single-cell sequencing data directly from patients' original samples, without any clustering, meaning I do not have a cell matrix with cell type, and there is no "celltype" metadata. I would like to ask if this means I should first merge the raw_file.h5ad obtained from the first step into a Seurat object, process and cluster it, and then convert it back to an h5ad object to get the "test_cell.h5ad" file mentioned in "SEVtras.ESAI_calculator(adata_ev_path='./tests/sEV_SEVtras.h5ad', adata_cell_path='./tests/test_cell.h5ad', out_path='./outputs', Xraw=False, OBSsample='batch', OBScelltype='celltype')". But where does the "sEV_SEVtras.h5ad" file come from? |
"sEV_SEVtras.h5ad" is generated by SEVtras.sEV_recognizer and is located in the outputs directory.
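To make the hand-off between Part 1 and Part 2 concrete, here is a minimal sketch that looks for sEV_SEVtras.h5ad in the output directory of sEV_recognizer and only then calls ESAI_calculator. The keyword arguments are the ones quoted from the guide earlier in this thread; `missing_inputs` and `run_esai` are hypothetical helpers, not part of SEVtras, and the cell file is assumed to be an h5ad that you prepared yourself with cell type annotations.

```python
from pathlib import Path


def missing_inputs(adata_ev_path, adata_cell_path):
    """Return the input files that do not exist (empty list if both exist)."""
    paths = (Path(adata_ev_path), Path(adata_cell_path))
    return [str(p) for p in paths if not p.is_file()]


def run_esai(out_dir, adata_cell_path, sample_col="batch", celltype_col="celltype"):
    """Run Part 2 on the sEV file that Part 1 wrote into out_dir."""
    # sEV_SEVtras.h5ad is the Part 1 output named in this thread.
    adata_ev_path = Path(out_dir) / "sEV_SEVtras.h5ad"
    missing = missing_inputs(adata_ev_path, adata_cell_path)
    if missing:
        raise FileNotFoundError(f"missing input files: {missing}")

    import SEVtras  # imported lazily so the file check works without it

    # Keyword arguments as quoted from the guide in this thread.
    SEVtras.ESAI_calculator(
        adata_ev_path=str(adata_ev_path),
        adata_cell_path=str(adata_cell_path),
        out_path=str(out_dir),
        Xraw=False,
        OBSsample=sample_col,
        OBScelltype=celltype_col,
    )
```

For example, `run_esai('./outputs', './tests/test_cell.h5ad')` mirrors the call shown in the guide.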
Dear esteemed author, In your guide, regarding the second part, you mentioned, "With the output of SEVtras.sEV_recognizer in Part I sEVs recognizing and cell matrix with cell type, SEVtras can track each sEV to original cell type and calculate sEV secretion activity index (ESAI)." Could you clarify whether the "cell matrix with cell type" referred to here should be constructed using the filtered feature bc matrix from single-cell data output or if it would be more appropriate to use the raw feature bc matrix to construct an object containing cell clustering information? Thank you for your guidance. |
Please refer to question 9 in Troubleshooting. A raw feature bc matrix with cell clustering information is preferred, regardless of how you generate the cell clustering information. |
I am encountering an issue while running the SEVtras.ESAI_calculator function.
/home/yeziyang/miniconda3/envs/SEVtras_env/lib/python3.7/site-packages/anndata/_core/anndata.py:1785: FutureWarning: X.dtype being converted to np.float32 from float64. In the next version of anndata (0.9) conversion will not be automatic. Pass dtype explicitly to avoid this warning. Pass
Since I am a beginner in Python, I cannot understand what went wrong with the converted data. Additionally, it doesn't seem to be a problem with the samples, as I have confirmed that both adata_ev and adata_cell use the same samples. Below is the code I used in Python to convert a Seurat object to H5AD format. I greatly need your help, thank you!
import scanpy as sc
directory = '/home/yeziyang/Sc/'
X = io.mmread(directory + 'matrix.mtx')
metadata = pd.read_csv(directory + 'metadata.csv')
with open(directory + 'gene_names.csv', 'r') as f:
adata.obs = metadata
adata.obs['celltype'] = pd.Categorical(adata.obs['celltype'].astype(str))
adata.write_h5ad(directory + 'PTCALL_RAW.h5ad')
adata_ev 'obs' columns: adata_cell 'obs' columns: |
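One check that can be run before calling ESAI_calculator is whether both objects carry the obs columns named in `OBSsample` and `OBScelltype`, and whether their sample names actually match. The sketch below is only an illustration of that check, not a diagnosis of the error above; `check_esai_inputs` is a hypothetical helper, and the default column names `batch` and `celltype` are taken from the example call in the guide.

```python
def check_esai_inputs(ev_columns, cell_columns, ev_samples, cell_samples,
                      sample_col="batch", celltype_col="celltype"):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if sample_col not in ev_columns:
        problems.append(f"adata_ev.obs has no '{sample_col}' column")
    if sample_col not in cell_columns:
        problems.append(f"adata_cell.obs has no '{sample_col}' column")
    if celltype_col not in cell_columns:
        problems.append(f"adata_cell.obs has no '{celltype_col}' column")

    # Compare sample names as strings so that 1 and "1" count as the same sample.
    ev_set = {str(s) for s in ev_samples}
    cell_set = {str(s) for s in cell_samples}
    if ev_set - cell_set:
        problems.append(f"samples only in adata_ev: {sorted(ev_set - cell_set)}")
    if cell_set - ev_set:
        problems.append(f"samples only in adata_cell: {sorted(cell_set - ev_set)}")
    return problems
```

With AnnData objects loaded, this could be called as `check_esai_inputs(adata_ev.obs.columns, adata_cell.obs.columns, adata_ev.obs['batch'].unique(), adata_cell.obs['batch'].unique())`.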
Sorry, that was my mistake. Please use the following code:
Thank you very much for your suggestion. That part of the program is now running normally. However, an error occurred during the downstream analysis when converting to a Seurat object:
This seems to indicate an error in generating the data in h5ad format. I wonder if you have encountered this error before. I would greatly appreciate any further advice you could provide. |
Please refer to question 13 on the Troubleshooting page. The code there reliably converts a Seurat object to h5ad format.
Dear Author, Upon investigation, we found that the error originates from the version issue of the Seurat-disk we are using. It seems feasible to extract information from this h5ad format dataset directly in Python and then construct the Seurat object directly in R. However, we found that in the newly constructed Seurat object, the number of features and cells changed. Below is the information of the Seurat object we initially used:
After processing with SEVtras, the number of features decreased to 4542, while the number of cells increased to 192177. We have two questions: first, is this a normal phenomenon that could occur during the computation process? Second, can we extract the EVs determination information in the metadata of the newly generated Seurat object and transfer it to the initial Seurat object, 'testAB.integrated', so that we can retain the EVs information while also preserving the feature information? |
Since I don't know the full picture of your data, I can only answer with what I assume to be the case.
|
Dear Author,
I noticed your response in the forum regarding your understanding of EVs droplets. I am curious whether this implies that further transcriptome analysis of the identified droplets is actually infeasible. Or, if I import the SEVtras results from Part 1 into the tissue cell sequencing data composed of the raw_feature_matrix, and then obtain the SEVtras_sEVs.h5 in Part 2, and subsequently import it into the tissue sequencing data composed of the filter_feature_matrix, can I ensure that there is no overlap between EVs and cells as much as possible? This is because a large number of droplets carried by the raw_feature_matrix are excluded during the process, which is one of the reasons why SEVtras analysis based on the latter cannot yield results, while the former can. From your understanding, is this operation feasible?
At 2024-06-25 22:02:46, "RuiqiaoHe" wrote:
Since I don't know the full picture of your data, I can only answer with what I assume to be the case.
The number of cells depends only on your input to the parameter adata_cell of SEVtras.ESAI_calculator, so it will not increase. If you mean the cell number in the output of SEVtras.sEV_recognizer (raw_SEVtras.h5ad), those are not the real cells in your analysis; I only control the total UMI in that file. As for the number of features, the difference can arise from different feature selection parameters between Seurat and Scanpy.
It is OK to extract the sEV information from the adata or Seurat object. You can just save it in adata.obs and use the original data for other analysis.
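As a rough illustration of saving the sEV information alongside the original data, the sketch below copies one label column from the SEVtras output metadata into a copy of the original metadata, matching rows by barcode. It assumes both tables are pandas DataFrames indexed by unique barcodes that use the same naming scheme; `transfer_sev_labels` and the column name `sEV` are hypothetical and not part of SEVtras.

```python
import pandas as pd


def transfer_sev_labels(original_obs, sev_obs, label_col="sEV", default="cell"):
    """Return a copy of original_obs with label_col copied from sev_obs by barcode.

    Barcodes missing from sev_obs receive the default label.
    """
    out = original_obs.copy()
    # Cast to object so categorical labels can take the default value too.
    labels = sev_obs[label_col].astype(object).reindex(out.index)
    out[label_col] = labels.where(labels.notna(), default)
    return out
```

With AnnData objects this would be used as `adata_original.obs = transfer_sev_labels(adata_original.obs, adata_sevtras.obs)`, after which the original expression matrix and features stay untouched.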
SEVtras.ESAI_calculator() in Part 2 does not care whether the input cell matrix is filtered; what matters is the number of genes in the matrix. The input for SEVtras.ESAI_calculator() can be the raw_feature_matrix in your statement, or any other preprocessed cell matrix (such as filter_feature_matrix) that you want to analyze (with the parameter Xraw). Please refer to point 9 in Troubleshooting.
Moreover, functional enrichment analysis of these sEV-containing droplets highlighted pathways associated with sEV formation and release (P < 0.05, hypergeometric test) (Extended Data Fig. 5d).
The functional enrichment analysis used highly abundant genes in the identified sEV droplets.
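Dear author, by comparing raw_SEVtras.h5ad with sEV_SEVtras.h5ad, we found that among the droplets that were not identified as EVs, a considerable number have a score greater than 15. As a result, when we compared the expression of the two markers CD63 and CD9 between EVs and cells, expression in the cell fraction was significantly higher than in the EV fraction. Could we manually label the droplets in raw_SEVtras.h5ad, marking those with a score above 15 as EVs, and save the result in h5ad format to replace sEV_SEVtras.h5ad in downstream analysis? We have already done this and found that, with this classification, expression of both markers in the EV fraction is significantly higher than in the cell fraction. However, as beginners with this software, we are not sure whether this classification is consistent with the logic you had in mind when you designed it.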
The operation of manual filtering is actually equivalent to modifying the parameter |
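The manual labeling described in the question can be sketched as a simple threshold on the score column of the droplet metadata. This assumes the metadata is a pandas DataFrame (as `adata.obs` is) and that the score is stored in a column called `score`; that column name, the new column `manual_sEV`, and the helper `label_by_score` are assumptions made for illustration, while the threshold of 15 comes from the question.

```python
import pandas as pd


def label_by_score(obs, threshold=15.0, score_col="score", label_col="manual_sEV"):
    """Return a copy of obs with a boolean column marking droplets above threshold."""
    out = obs.copy()
    # Strictly greater than the threshold, as in "score greater than 15".
    out[label_col] = out[score_col] > threshold
    return out
```

Applied to an AnnData object, `adata.obs = label_by_score(adata.obs)` followed by `adata[adata.obs['manual_sEV'].values].copy()` would give the subset of droplets to write out with `write_h5ad`.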
Hi TATABOX99, can I get your email? I want to get the code to generate adata_cell in ESAI_calculator.(My email is [email protected]) |
Dear Author,
I am using this script for analysis:
import SEVtras
SEVtras.sEV_recognizer(input_path='/home/yeziyang/Sc', sample_file='/home/yeziyang/Sc/Sc1_LN', out_path='/home/yeziyang/Sc/outputs', species='Homo')
My 10x_mtx formatted data is stored in this directory: /home/yeziyang/Sc/Sc1_LN/outs/raw_feature_bc_matrix/matrix.mtx.gz. I have ensured it is extracted from the raw_feature_bc_matrix. However, I encountered the following error:
File "run_SEVtras.py", line 2, in
SEVtras.sEV_recognizer(input_path='/home/yeziyang/Sc', sample_file='/home/yeziyang/Sc/Sc1_LN', out_path='/home/yeziyang/Sc/outputs', species='Homo')
File "/home/yeziyang/miniconda3/envs/SEVtras_env/lib/python3.7/site-packages/SEVtras/main.py", line 79, in sEV_recognizer
sample_log = get_sample(sample_file)
File "/home/yeziyang/miniconda3/envs/SEVtras_env/lib/python3.7/site-packages/SEVtras/utils.py", line 18, in get_sample
with open(sample_log, 'r') as f:
IsADirectoryError: [Errno 21] Is a directory: '/home/yeziyang/Sc/Sc1_LN'
I am not sure why this error occurs. Could you please give me some advice? Thank you very much!
Additionally, when I used echo "Sc1_LN" > sample_file and ran the above code again, setting alpha to 0.08, it showed max() arg is an empty sequence. I am not sure if this is due to an error in my import process.
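The traceback shows SEVtras calling `open(sample_log, 'r')` on the value of sample_file, which suggests that sample_file should be a text file listing sample names, not the sample directory itself. A minimal sketch of that setup follows, assuming one sample name per line (consistent with the `echo "Sc1_LN" > sample_file` attempt) and that each sample is a directory under input_path; `write_sample_file` and `run_recognizer` are hypothetical helpers, not part of SEVtras.

```python
from pathlib import Path


def write_sample_file(input_path, sample_names, file_name="sample_file"):
    """Write one sample name per line and return the path of the text file."""
    base = Path(input_path)
    # Assumption: every sample is a directory directly under input_path.
    missing = [s for s in sample_names if not (base / s).is_dir()]
    if missing:
        raise FileNotFoundError(f"sample directories not found: {missing}")
    sample_file = base / file_name
    sample_file.write_text("\n".join(sample_names) + "\n")
    return sample_file


def run_recognizer(input_path, sample_names, out_path, species="Homo"):
    """Call SEVtras.sEV_recognizer with a file, not a directory, as sample_file."""
    import SEVtras  # imported lazily so the helper above works without it

    sample_file = write_sample_file(input_path, sample_names)
    # Keyword arguments as used in the script in this post.
    SEVtras.sEV_recognizer(
        input_path=str(input_path),
        sample_file=str(sample_file),
        out_path=str(out_path),
        species=species,
    )
```

For the paths in this post, the call would be `run_recognizer('/home/yeziyang/Sc', ['Sc1_LN'], '/home/yeziyang/Sc/outputs')`.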