-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MAINT: refactor/clean up GenomeData types #338
Conversation
Hey @lizgehret, do you think you could look this over? I think I'll still need to add some tests for the new type introduced here but the rest of the code should be ready. Just FYI, @nbokulich - these are changes we talked about today, in case you want to check those out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this all looks reasonable to me @misialq! feel free to assign to me again for a final review once you've added the additional tests 🙂
Hey @lizgehret, I'm done with the changes here - I added one test for the new directory format and refactored some paths to test files for consistency. |
Hmmm this is very weird - some tests are failing due to some directories supposedly not being there. I looked through setup.py and the tests themselves but cannot really see what the problem is. They also pass locally for me. Do you spot what's wrong @lizgehret? 👀 |
i think i found the issue(s) @misialq - letting the tests run now to confirm! |
Aaaaaaah, all these commas 🤦 Thanks for fixing! Feel free to merge any time 🚀 |
…2-types` (#334) * migrated partition and collating actions from moshpit to types * added everything but bug cicular imports * removed import * SQUASH: fix type drift from #338 * typos in plugin setup * lint * added package data * added collate_ortholog_annotations from moshpit * added registration for collate_ortholog_annotations * added package data for ortholog-annotations-collating --------- Co-authored-by: Evan Bolyen <[email protected]>
This PRs refactors some of the
GenomeData
types and introduces new types which will be required in q2-moshpit. Below is the summary of the changes:SampleData[BLAST6]
type - this was used to store orthologs detected by EggNOG but will be replaced bySampleData[Orthologs]
(see below)SampleData[Orthologs]
andGenomeData[Orthologs]
to store the orthologs produced by EggNOG+Diamond; the BLAST6 variant was not really the best choice as the table produced by EggNOG bears only a degree of similarity to a "proper" BLAST6 result table; using the separate type for those orthologs will allow us to be more explicit and clear about the contents (the type actually already existed but we did not use it)OG
andKEGG
types - we are not using those anywhere and will not; they had been designed in the beginning of the project, before we knew exactly what outputs from functional annotation we would want to store and howGenomeData[DNASequence]
to represent collections of genome sequences fetched from various sources; we want to be able to store each genome in a separate file so theFeatureData[Sequence]
was not a good candidate