SaveSeuratRds(): Is there a way to alter directory address of 'on-disk' matrices in Seurat object #198

Open
Dazcam opened this issue Apr 9, 2024 · 3 comments

Dazcam commented Apr 9, 2024

Hello,

I have been running an analysis in Seurat 5, partly on a local machine and partly on a remote server, and I've been trying to work out how to change the recorded location of the count matrices generated by BPCells. After moving Seurat objects generated remotely to my local machine, I encounter errors when running certain functions, because the root directory for the count data is still set to the remote directory rather than the local one. See here for more details.

The SaveSeuratRds() function looks as if it may be able to change this directory address, but there does not appear to be functionality to just change it without moving the on-disk layers from their original location. I've tried running SaveSeuratRds() with move = F, but when you reload the object the layers are missing:

> seurat_object
An object of class Seurat 
53590 features across 144380 samples within 2 assays 
Active assay: RNA (26795 features, 2000 variable features)
 0 layers present: 
 1 other assay present: sketch
 6 dimensional reductions calculated: pca, umap, harmony, umap.harmony, harmony.full, umap.full

When running SaveSeuratRds() with move = T, obviously the original path is not found (this also occurs when setting relative = T or relative = F).

SaveSeuratRds(seurat_object, paste0(R_dir, '02seurat_', region, '_test.rds'))
Error:
! Can't find path:
...

I've also tried digging around the Seurat object, but I can't put my finger on where this address is stored. The best I could do was find a list of 28 identical items containing the following, but I'm not convinced this is worth changing, as it looks like a log:

Matrix_list
seurat_object@assays$RNA@layers$counts@matrix@matrix@matrix_list[[1]]
26795 x 5706 IterableMatrix object with class RenameDims

Row names: TTR, LINC01821 ... PARVG
Col names: 10X356_4:GGTGAAGCAGGTGACA, 10X356_4:TGGATGTCACGACAAG ... 10X356_4:AGTGATCAGGCCCAAA

Data type: double
Storage order: column major

Queued Operations:
1. Load compressed matrix from directory /scratch/results/01R_objects/CBL_BP
2. Select rows: 1, 5 ... 59357 and cols: 1, 2 ... 28010
3. Reset dimnames
4. Reset dimnames
5. Reset dimnames
6. Reset dimnames
7. Reset dimnames
8. Reset dimnames
9. Reset dimnames
10. Reset dimnames
11. Reset dimnames

So a couple of questions then:

  1. Is there a way to alter the address of the root directory within Seurat, either using SaveSeuratRds() or otherwise?
  2. If not, could this functionality be added to SaveSeuratRds() to handle local / remote analyses?

Many thanks.

@Dazcam Dazcam changed the title SaveSeuratRds(): Is there a way to alter root directory of Seurat object SaveSeuratRds(): Is there a way to alter directory address of 'on-disk' matrices in Seurat object Apr 9, 2024

jvelghe commented Jul 4, 2024

Hi Dazcam, here's an example of where the directory address is stored in the Seurat V5 object. Here you can see an example path of 1 of 3 joined datasets in this BPCells Seurat object. It is stored in BP_object@assays[["RNA"]]@layers[["counts"]]@matrix@matrix_list[[1]]@matrix@dir, where 1 represents the first of the joined layers.

You can change the stored file path for each of the layers like this:

> BP_object@assays[["RNA"]]@layers[["counts"]]@matrix@matrix_list[[1]]@matrix@dir
[1] "/path/to/your/old/dir"
> BP_object@assays[["RNA"]]@layers[["counts"]]@matrix@matrix_list[[1]]@matrix@dir <- "/path/to/your/new/dir"
> BP_object@assays[["RNA"]]@layers[["counts"]]@matrix@matrix_list[[1]]@matrix@dir
[1] "/path/to/your/new/dir"
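
When the layer joins several matrices, the same assignment has to be repeated for every entry of matrix_list, not only the first. A minimal sketch of that loop follows. The Toy* classes are hypothetical stand-ins that only mimic the slot path quoted above, so the example runs without Seurat or BPCells; swap_root() is likewise a hypothetical helper, and the old and new roots are placeholders. On a real object the nesting depth may differ, so inspect the slots first.

```r
library(methods)

# Hypothetical stand-ins mimicking: layer@matrix@matrix_list[[i]]@matrix@dir
setClass("ToyMatrixDir", representation(dir = "character"))
setClass("ToyRenameDims", representation(matrix = "ToyMatrixDir"))
setClass("ToyColBind", representation(matrix_list = "list"))
setClass("ToyLayer", representation(matrix = "ToyColBind"))

make_entry <- function(path) {
  new("ToyRenameDims", matrix = new("ToyMatrixDir", dir = path))
}

layer <- new("ToyLayer", matrix = new("ToyColBind", matrix_list = list(
  make_entry("/path/to/your/old/dir/sample1"),
  make_entry("/path/to/your/old/dir/sample2"),
  make_entry("/path/to/your/old/dir/sample3")
)))

# Hypothetical helper: replace a leading old_root with new_root,
# leaving paths that do not start with old_root untouched.
swap_root <- function(path, old_root, new_root) {
  hit <- !is.na(path) & startsWith(path, old_root)
  if (any(hit)) {
    path[hit] <- paste0(new_root, substring(path[hit], nchar(old_root) + 1L))
  }
  path
}

# Update every joined matrix, not just matrix_list[[1]].
for (i in seq_along(layer@matrix@matrix_list)) {
  old <- layer@matrix@matrix_list[[i]]@matrix@dir
  layer@matrix@matrix_list[[i]]@matrix@dir <-
    swap_root(old, "/path/to/your/old", "/path/to/your/new")
}

dirs <- vapply(layer@matrix@matrix_list, function(m) m@matrix@dir, character(1))
print(dirs)
```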

I'm also curious if you know how to save and then load the saved joined object as a Seurat object again?

Screenshot 2024-07-04 at 2 35 44 AM

Dazcam commented Jul 4, 2024

Hi @jvelghe,

Many thanks for this. I'll give it a go.

Regarding your question, if I understand it correctly, I use the following for saving and loading data:

  • Save: saveRDS(seurat_object, paste0(R_dir, '02seurat_', region, '.rds'))
  • Read: seurat_object <- readRDS(paste0(R_dir, '02seurat_', region, '.rds'))
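
As far as I understand it, saveRDS() serialises only the R object, so for an on-disk layer the .rds file carries the directory path rather than the matrix data, and readRDS() restores that path verbatim. A small sketch with a plain list standing in for the on-disk matrix object (the list and the path are hypothetical placeholders, not Seurat or BPCells structures):

```r
# Hypothetical stand-in for an object that only points at on-disk data.
pointer <- list(dir = "/path/on/remote/server/counts_BP")

rds_file <- tempfile(fileext = ".rds")
saveRDS(pointer, rds_file)
restored <- readRDS(rds_file)

# The stored path comes back unchanged, whether or not it exists locally.
print(restored$dir)
print(dir.exists(restored$dir))
```

This is why copying only the .rds file to another machine leaves the object pointing at the original location.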

Dazcam commented Jul 25, 2024

@jvelghe The directory name of the BPCells object must be stored in multiple places. After changing the location as you describe, certain procedures, such as converting the on-disk matrix to an in-memory dgCMatrix, still report the old directory.

> seurat_obj[["RNA"]]$counts
#> 27379 x 66782 IterableMatrix object with class RenameDims

#> Row names: ABCA13, PENK-AS1 ... SLC7A7
#> Col names: 10X318_7:GGGTTTAGTTACGATC, 10X318_8:CCCGGAAGTGACTGAG ... 10X145_3:AACAGGGCAGCCGTCA

#> Data type: double
#> Storage order: column major

#> Queued Operations:
#> 1. Concatenate cols of 12 matrix objects with classes: RenameDims, RenameDims ... RenameDims (threads=0)
#> 2. Select rows: 1, 2 ... 27379 and cols: 1, 5345 ... 49485
#> 3. Reset dimnames

> as(object = seurat_obj[["RNA"]]$counts, Class = "dgCMatrix")
#> Error: Missing directory: /scratch/c.cXXXXXX/results/01R_objects/CaB_BP

> seurat_obj@assays[["RNA"]]@layers[["counts"]]@matrix@matrix@matrix_list[[1]]@matrix@matrix@matrix@matrix@matrix@matrix@matrix@matrix@matrix@matrix@dir
#> [1] "/scratch/c.cXXXXXX/results/01R_objects/CaB_BP"

> seurat_obj@assays[["RNA"]]@layers[["counts"]]@matrix@matrix@matrix_list[[1]]
#> 27379 x 5344 IterableMatrix object with class RenameDims

#> Row names: ABCA13, PENK-AS1 ... SLC7A7
#> Col names: 10X318_7:GGGTTTAGTTACGATC, 10X318_7:TGTGTGAGTTCCGCTT ... 10X318_7:GGGCTCATCCACAGGC

#> Data type: double
#> Storage order: column major

#> Queued Operations:
#> 1. Load compressed matrix from directory /scratch/c.cXXXXXX/results/01R_objects/CaB_BP
#> 2. Select rows: 1, 3 ... 59357 and cols: 1, 3 ... 32673
#> 3. Reset dimnames
#> 4. Reset dimnames
#> 5. Reset dimnames
#> 6. Reset dimnames
#> 7. Reset dimnames
#> 8. Reset dimnames
#> 9. Reset dimnames
#> 10. Reset dimnames
#> 11. Reset dimnames

> seurat_obj@assays[["RNA"]]@layers[["counts"]]@matrix@matrix@matrix_list[[1]]@matrix@matrix@matrix@matrix@matrix@matrix@matrix@matrix@matrix@matrix@dir <- '/Users/XXXXXX/Desktop/results/01R_objects/CaB_BP'

> seurat_obj@assays[["RNA"]]@layers[["counts"]]@matrix@matrix@matrix_list[[1]]
#> 27379 x 5344 IterableMatrix object with class RenameDims

#> Row names: ABCA13, PENK-AS1 ... SLC7A7
#> Col names: 10X318_7:GGGTTTAGTTACGATC, 10X318_7:TGTGTGAGTTCCGCTT ... 10X318_7:GGGCTCATCCACAGGC

#> Data type: double
#> Storage order: column major

#> Queued Operations:
#> 1. Load compressed matrix from directory /Users/XXXXXX/Desktop/results/01R_objects/CaB_BP
#> 2. Select rows: 1, 3 ... 59357 and cols: 1, 3 ... 32673
#> 3. Reset dimnames
#> 4. Reset dimnames
#> 5. Reset dimnames
#> 6. Reset dimnames
#> 7. Reset dimnames
#> 8. Reset dimnames
#> 9. Reset dimnames
#> 10. Reset dimnames
#> 11. Reset dimnames

> as(object = seurat_obj[["RNA"]]$counts, Class = "dgCMatrix")
#> Error: Missing directory: /scratch/c.cXXXXXX/results/01R_objects/CaB_BP
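
One possible explanation, not confirmed anywhere in this thread: the counts layer concatenates 12 matrices, but only matrix_list[[1]] was edited above, so the remaining entries may still point at the old directory. A minimal sketch of a recursive approach follows. relocate_dirs() is a hypothetical helper, not a Seurat or BPCells function, and the Toy* classes and the /old/root and /new/root paths are stand-ins so the example runs without either package. It assumes the path lives in character slots named dir, as shown in this thread, and it has not been tested on a real BPCells-backed object.

```r
library(methods)

# Hypothetical helper: walk S4 slots and lists, rewriting every character
# slot named "dir" whose value starts with old_root.
relocate_dirs <- function(x, old_root, new_root) {
  if (isS4(x)) {
    for (nm in slotNames(x)) {
      value <- slot(x, nm)
      if (nm == "dir" && is.character(value)) {
        hit <- !is.na(value) & startsWith(value, old_root)
        if (any(hit)) {
          value[hit] <- paste0(new_root, substring(value[hit], nchar(old_root) + 1L))
          slot(x, nm, check = FALSE) <- value
        }
      } else if (isS4(value) || is.list(value)) {
        slot(x, nm, check = FALSE) <- relocate_dirs(value, old_root, new_root)
      }
    }
  } else if (is.list(x)) {
    x[] <- lapply(x, relocate_dirs, old_root = old_root, new_root = new_root)
  }
  x
}

# Toy classes (assumptions) imitating nested @matrix wrappers around a dir.
setClass("ToyDir", representation(dir = "character"))
setClass("ToyWrap", representation(matrix = "ANY"))
setClass("ToyBind", representation(matrix_list = "list"))

wrap <- function(x, times) {
  for (k in seq_len(times)) x <- new("ToyWrap", matrix = x)
  x
}

# Three joined entries, each buried at a different depth.
entries <- lapply(1:3, function(i) {
  wrap(new("ToyDir", dir = "/old/root/CaB_BP"), times = i)
})
obj <- wrap(new("ToyBind", matrix_list = entries), times = 2)

fixed <- relocate_dirs(obj, "/old/root", "/new/root")

# Collect every "dir" slot so the result can be inspected.
collect_dirs <- function(x) {
  if (isS4(x)) {
    unlist(lapply(slotNames(x), function(nm) {
      v <- slot(x, nm)
      if (nm == "dir" && is.character(v)) v else collect_dirs(v)
    }))
  } else if (is.list(x)) {
    unlist(lapply(x, collect_dirs))
  } else {
    NULL
  }
}

print(collect_dirs(fixed))
```

Running collect_dirs() on the real object before and after the change would show whether any entry still holds the old path.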
