ValueError: substring not found when running proteome similarity search with pvacseq #1150

Open
lukaas33 opened this issue Sep 16, 2024 · 2 comments · Fixed by #1153

@lukaas33

Installation Type

Docker

pVACtools Version / Docker Image

latest

Python Version

No response

Operating System

Ubuntu

Describe the bug

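When running pvacseq with the reference proteome similarity option (`--run-reference-proteome-similarity`), the run fails at the "Calculating Reference Proteome Similarity" step with `ValueError: substring not found`.
The exact output is shown below.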

How to reproduce this bug

`pvacseq run /shared_dir/neoantigen.stringtie.vcf tumor_sample HLA-A*02:01,HLA-A*23:01,HLA-B*27:05,HLA-B*42:01,HLA-C*01:02,HLA-C*17:01,DPB1*01:01,DPB1*02:01,DQB1*03:01,DQB1*03:01,DRB1*01:03,DRB1*11:02 MHCflurry BigMHC_IM BigMHC_EL NetMHCIIpan NetMHCIIpanEL /shared_dir/pvacseq/ -e1 10 -e2 15 --iedb-install-directory /opt/iedb -t 60 --run-reference-proteome-similarity`

Input files

neoantigen.stringtie (2).zip

Log output

Combining Parsed Prediction Files
Completed
Creating aggregated report
Tumor clonal VAF estimated as 0.5 (estimated from Tumor DNA VAF data). Assuming variants with VAF < 0.25 are subclonal
Completed
Calculating Manufacturability Metrics
Completed
Running Binding Filters
Completed
Running Coverage Filters
Completed
Running Transcript Support Level Filter
Complete
Running Top Score Filter
Completed
Calculating Reference Proteome Similarity
Traceback (most recent call last):
  File "/usr/local/bin/pvacseq", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/pvactools/tools/pvacseq/main.py", line 123, in main
    args[0].func.main(args[1])
  File "/usr/local/lib/python3.7/site-packages/pvactools/tools/pvacseq/run.py", line 142, in main
    pipeline.execute()
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/pipeline.py", line 484, in execute
    PostProcessor(**post_processing_params).execute()
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/post_processor.py", line 65, in execute
    self.calculate_reference_proteome_similarity()
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/post_processor.py", line 247, in calculate_reference_proteome_similarity
    aggregate_metrics_file=aggregate_metrics_file,
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/calculate_reference_proteome_similarity.py", line 595, in execute
    unique_peptides = pymp.shared.list(self._get_unique_peptides(mt_records_dict, wt_records_dict))
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/calculate_reference_proteome_similarity.py", line 575, in _get_unique_peptides
    peptide, full_peptide = self._get_peptide(line, mt_records_dict, wt_records_dict)
  File "/usr/local/lib/python3.7/site-packages/pvactools/lib/calculate_reference_proteome_similarity.py", line 314, in _get_peptide
    subpeptide_position = full_peptide.index(epitope)
ValueError: substring not found
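
For context, the last frame of the traceback is a plain `str.index` call, which raises `ValueError: substring not found` whenever the epitope does not occur in the full peptide string. The following is a minimal sketch of that behaviour, using made-up sequences and a hypothetical `find_epitope` helper. It is not pVACtools code and is not necessarily how the fix is implemented.

```python
# Illustration only: the sequences and the helper name are assumptions,
# not taken from pVACtools.
from typing import Optional


def find_epitope(full_peptide: str, epitope: str) -> Optional[int]:
    """Return the 0-based start of epitope in full_peptide, or None if absent."""
    position = full_peptide.find(epitope)  # str.find returns -1 instead of raising
    return None if position == -1 else position


full_peptide = "MKTAYIAKQRQISFVKSHFSRQ"  # made-up sequence
present = "AYIAKQRQI"                     # occurs in full_peptide
absent = "AYIAKQRQL"                      # differs in the last residue

# str.index raises the same error as in the traceback when the substring is missing.
raised = False
message = ""
try:
    full_peptide.index(absent)
except ValueError as err:
    raised = True
    message = str(err)

print(find_epitope(full_peptide, present))
print(find_epitope(full_peptide, absent))
print(raised, message)
```

A lookup that returns `None` rather than raising would let a caller report which epitope and which peptide sequence failed to match, which can help narrow down the offending record.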

Output files

No response

@susannasiebert
Contributor

susannasiebert commented Sep 16, 2024

Thank you for this bug report. Did your run produce any of the main output files (all_epitopes.tsv or aggregated.tsv)? If so, could you please attach them to this ticket, along with the .fasta file from your run? That would speed up debugging considerably, since we would not have to redo the predictions.

@lukaas33
Author

tumor_sample.all_epitopes.aggregated.zip
tumor_sample.zip

This is the MHC class I output for the run described above.
