"Index requested greater than vector's size" when loading pufferfish index using salmon v1.0 #463

Closed
mdshw5 opened this issue Dec 18, 2019 · 3 comments

Comments

@mdshw5
Contributor

mdshw5 commented Dec 18, 2019

Is the bug primarily related to salmon (bulk mode) or alevin (single-cell mode)?
salmon
Describe the bug
When loading a pufferfish index using salmon v1.0 I encounter errors. I've posted to COMBINE-lab/pufferfish#8 with a full description.

When running salmon v1.0 with a rather large index, I receive the error "Index requested greater than vector's size", raised from the cereal library. The log reads:

-----------------------------------------
| Loading contig table | Time = 12.954 s
-----------------------------------------
size = 35010142
-----------------------------------------
| Loading contig offsets | Time = 269.18 ms
-----------------------------------------
-----------------------------------------
| Loading reference lengths | Time = 7.8427 ms
-----------------------------------------
-----------------------------------------
| Loading eq table | Time = 3.3896 s
-----------------------------------------
-----------------------------------------
| Loading mphf table | Time = 3.8301 s
-----------------------------------------
size = 3567796961
Number of ones: 35010141
Number of ones per inventory item: 512
Inventory entries filled: 68380
-----------------------------------------
| Loading contig boundaries | Time = 11.288 s
-----------------------------------------
size = 3567796961
-----------------------------------------
| Loading sequence | Time = 7.763 s
-----------------------------------------
size = 2517492731
-----------------------------------------
| Loading positions | Time = 171.81 s
-----------------------------------------
size = 3221360466
-----------------------------------------
| Loading reference sequence | Time = 7.9564 s
-----------------------------------------
-----------------------------------------
| Loading reference accumulative lengths | Time = 35.741 ms
-----------------------------------------
Index requested greater than vector's size: 6442720932>6442720932
Index requested greater than vector's size: 6442720996>6442720932
Index requested greater than vector's size: 6442721060>6442720932
Index requested greater than vector's size: 6442721124>6442720932
Index requested greater than vector's size: 6442721188>6442720932
Index requested greater than vector's size: 6442721252>6442720932
Index requested greater than vector's size: 6442721316>6442720932
Index requested greater than vector's size: 6442721380>6442720932
Index requested greater than vector's size: 6442721444>6442720932
...

The index does not finish loading, and so salmon does not enter read quantification routines.
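
The numbers in the error lines can be checked against the sizes printed earlier in the log. Below is a minimal Python sketch of that arithmetic; it is not part of salmon or pufferfish, and the helper name `parse_errors` is hypothetical. It compares the reported vector size with the `size = 3221360466` line printed between the "Loading positions" and "Loading reference sequence" stages, and measures the step between successive requested indices. Any interpretation of the result (for example, that the vector is addressed in bits at 2 bits per base and is being read in 64-bit steps starting exactly at its end) is an assumption, not a confirmed diagnosis.

```python
import re

# Error lines copied verbatim from the log above (first four only).
LOG = """\
Index requested greater than vector's size: 6442720932>6442720932
Index requested greater than vector's size: 6442720996>6442720932
Index requested greater than vector's size: 6442721060>6442720932
Index requested greater than vector's size: 6442721124>6442720932
"""

# Value of the "size = ..." line printed just before the
# "Loading reference sequence" timing box in the log.
REFERENCE_SEQUENCE_SIZE = 3221360466

PATTERN = re.compile(r"Index requested greater than vector's size: (\d+)>(\d+)")


def parse_errors(log):
    """Return (requested_index, vector_size) pairs found in the log text."""
    return [(int(m.group(1)), int(m.group(2))) for m in PATTERN.finditer(log)]


errors = parse_errors(LOG)
requested = [index for index, _ in errors]
limit = errors[0][1]

# Distinct gaps between consecutive requested indices.
strides = {b - a for a, b in zip(requested, requested[1:])}

print("reported vector size:", limit)
print("equals 2 x reference sequence size:", limit == 2 * REFERENCE_SEQUENCE_SIZE)
print("strides between requests:", sorted(strides))
print("first request minus vector size:", requested[0] - limit)
```

For the lines shown, the reported vector size is exactly twice 3221360466, the requests advance by 64 each time, and the first failing request equals the vector size itself.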

To Reproduce

  • Which version of salmon was used? 1.0
  • How was salmon installed (compiled, downloaded executable, through bioconda)?
    GitHub release tarball
  • Which reference (e.g. transcriptome) was used?
    Gencode v32, with additional elements representing genic introns and intergenic spaces.
  • Which read files were used?
    NCBI SRA run accession GSM2392582
  • Which program options were used?
    --no-version-check --libType ISR --threads 4 --seqBias --gcBias --useVBOpt

Expected behavior
I expected index loading to complete successfully.
Desktop (please complete the following information):

  • OS: CentOS6

Additional context
I've uploaded the index file archive here and the transcripts fasta file here.

@mdshw5
Contributor Author

mdshw5 commented Dec 18, 2019

Sorry, I submitted this prematurely - let me edit the description.

@mdshw5
Contributor Author

mdshw5 commented Dec 18, 2019

I'm cross-posting this issue to COMBINE-lab/pufferfish#8, so please feel free to close one or the other. Also, I just noticed the archives of index files and transcripts I uploaded are named "rapmap" - that was definitely just a mental slip and I meant "pufferfish" :).

@mdshw5
Contributor Author

mdshw5 commented Dec 19, 2019

Closing for now as the discussion occurred on the pufferfish repo.

mdshw5 closed this as completed Dec 19, 2019