TopHits from Binary I/O #81

arajkovic · 2024-12-02T21:31:22Z

Hi there, awesome work!
While working on improving jackhmmer performance on ebi.ac.uk/Tools/hmmer here at EBI, I realised that there is no way to construct a TopHits object from a binary I/O stream of any kind.
I think it would be a sensible addition to pyhmmer. We definitely need it, since we want to read and deserialise hits from two different machines, and the script that initiates the search and communicates with the daemons is written in Perl, so pickle is not an option at the moment.
Let me know what you think, and also whether I have overlooked parts of the documentation that make all of this unnecessary.
Cheers!

arajkovic · 2024-12-02T21:35:06Z

Forgot to say that I'd be happy to make a PR for this!

althonos · 2024-12-02T22:36:15Z

Hi @arajkovic! I visited the group while Nicolo was working on this and implemented the daemon client in Python as well, so you can retrieve the hits you get as a result, but indeed that may not be as fine-grained as what you're looking for.

The current problem with implementing serialization/deserialization of a TopHits is that, contrary to the HMMER code, a TopHits in Python stores the hits, the pipeline parameters, and a reference to the query, while only the hits are stored in the binary format. So I'm not sure it would be completely possible to implement.

althonos added the question Further information is requested label Jan 10, 2025
