TopHits from Binary I/O #81

Open
arajkovic opened this issue Dec 2, 2024 · 2 comments
Labels
question Further information is requested

Comments

@arajkovic
Contributor

arajkovic commented Dec 2, 2024

Hi there, awesome work!
While working on improving jackhmmer performance on ebi.ac.uk/Tools/hmmer here at EBI, I realised that there is no way to construct a TopHits object from a binary I/O stream of any kind.
I think it would be a sensible addition to pyhmmer. We definitely need it: we want to read and deserialise hits from two different machines, and the script that initiates the search and communicates with the daemons is written in Perl, so pickle is not an option at the moment.
Let me know what you think, and also whether I overlooked parts of the documentation that make all of this unnecessary.
Cheers!

@arajkovic
Contributor Author

Forgot to say that I'd be happy to make a PR for this!

@althonos
Owner

althonos commented Dec 2, 2024

Hi @arajkovic! I visited the group while Nicolo was working on this and implemented the daemon client in Python as well, so you can retrieve the hits you get as a result, but indeed that may not be the fine-grained addition you're looking for.

The problem currently with implementing serialization/deserialization of a TopHits is that, contrary to the HMMER code, the TopHits in Python stores the hits, the pipeline parameters, and a reference to the query, while only the hits are stored in the binary format. So I'm not sure it would be completely possible to implement.
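
As an illustration of the mismatch described above, here is a minimal sketch in plain Python. It does not use pyhmmer or the real HMMER binary format: the `Hit` and `Results` classes, the `write_hits` and `read_results` helpers, and the byte layout are all hypothetical stand-ins. It only shows that a stream carrying hits alone cannot repopulate the pipeline parameters or the query reference, so the caller would have to supply them separately.

```python
import io
import struct
from dataclasses import dataclass
from typing import BinaryIO, List, Optional


@dataclass
class Hit:
    # Hypothetical, heavily simplified hit record.
    name: str
    score: float


@dataclass
class Results:
    # Hypothetical stand-in for a TopHits-like container; not pyhmmer's class.
    hits: List[Hit]
    pipeline_params: Optional[dict] = None  # never present in the byte stream
    query_name: Optional[str] = None        # never present in the byte stream


def write_hits(hits: List[Hit], stream: BinaryIO) -> None:
    # Assumed layout (NOT the HMMER format): little-endian uint32 hit count,
    # then for each hit a uint16 name length, the UTF-8 name, a float64 score.
    stream.write(struct.pack("<I", len(hits)))
    for hit in hits:
        name = hit.name.encode("utf-8")
        stream.write(struct.pack("<H", len(name)))
        stream.write(name)
        stream.write(struct.pack("<d", hit.score))


def read_results(
    stream: BinaryIO,
    pipeline_params: Optional[dict] = None,
    query_name: Optional[str] = None,
) -> Results:
    # Only the hits come from the stream; everything else must be passed in.
    (count,) = struct.unpack("<I", stream.read(4))
    hits = []
    for _ in range(count):
        (size,) = struct.unpack("<H", stream.read(2))
        name = stream.read(size).decode("utf-8")
        (score,) = struct.unpack("<d", stream.read(8))
        hits.append(Hit(name, score))
    return Results(hits, pipeline_params, query_name)


if __name__ == "__main__":
    buffer = io.BytesIO()
    write_hits([Hit("seqA", 42.5), Hit("seqB", 7.25)], buffer)
    buffer.seek(0)
    print(read_results(buffer))
```

In this sketch, the result read back has `pipeline_params` and `query_name` set to `None` unless the caller passes them in, which is the gap a real binary constructor would need to address somehow.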

@althonos added the "question" label (Further information is requested) on Jan 10, 2025