Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Extend TransformersQueryClassifier: clean version #2965

Merged
merged 14 commits into from
Aug 9, 2022
Merged

feat: Extend TransformersQueryClassifier: clean version #2965

merged 14 commits into from
Aug 9, 2022

Conversation

anakin87
Copy link
Member

@anakin87 anakin87 commented Aug 4, 2022

Related Issue(s): fixes #2587

Proposed changes:

Pre-flight checklist

Two main doubts:

@anakin87 anakin87 requested review from a team as code owners August 4, 2022 10:27
@ZanSara ZanSara self-requested a review August 4, 2022 11:02
@ZanSara
Copy link
Contributor

ZanSara commented Aug 4, 2022

Does #2850 impact this PR? How?

It standardizes the way nodes can declare to have a variable number of outputs. Check out this method:

def _calculate_outgoing_edges(cls, component_params: Dict[str, Any]) -> int:

Right now you just declare 10 outgoing edges, as other nodes were doing, because there was no good way to tell the pipeline that your node could have a variable number of outputs. Now there is, so I recommend implementing that part in the new way.

You can have a look at FileTypeClassifier https://github.com/deepset-ai/haystack/blob/master/haystack/nodes/file_classifier/file_type.py or RouteDocuments https://github.com/deepset-ai/haystack/blob/master/haystack/nodes/other/route_documents.py to see how is it done. Documentation will follow 🙈

@ZanSara
Copy link
Contributor

ZanSara commented Aug 4, 2022

Should we provide an example of zero-shot-classification in Tutorial 14? (impacts deepset-ai/haystack-tutorials#34)

That would be fantastic! But maybe worth a separate PR. @TuanaCelik @mkkuemmel what do you think?

haystack/nodes/query_classifier/transformers.py Outdated Show resolved Hide resolved
test/nodes/test_query_classifier.py Outdated Show resolved Hide resolved
@mkkuemmel
Copy link
Member

Should we provide an example of zero-shot-classification in Tutorial 14? (impacts deepset-ai/haystack-tutorials#34)

That would be fantastic! But maybe worth a separate PR. @TuanaCelik @mkkuemmel what do you think?

Yes, I like the idea! A small section at the bottom of Tutorial 14. We just need to make sure it remains understandable (but good markdown formatting would go a long way already). I do agree that a separate PR would be good for this.

@anakin87
Copy link
Member Author

anakin87 commented Aug 5, 2022

@ZanSara I've tried to implement the lightweight approach and move labels validation in _calculate_outgoing_edges.
WDYT?

@anakin87 anakin87 requested a review from ZanSara August 6, 2022 08:33
Copy link
Contributor

@ZanSara ZanSara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thank you once again 😊

@ZanSara
Copy link
Contributor

ZanSara commented Aug 8, 2022

There has been another PR targeting this node with the goal of adding progress bars to the run method. I'm afraid I can't help too much with the merge, but it doesn't seem too messy either. Feel free to tag @vblagoje (PR author) if something is unclear or you need help

@anakin87
Copy link
Member Author

anakin87 commented Aug 8, 2022

I hope to have correctly fixed the conflict now.

@vblagoje 👋!
Related to #2864: in the run_batch method, I had to replace all_predictions.extend(predictions) with all_predictions.extend([predictions]) to avoid very unexpected effects (appending the keys instead of the whole prediction dictionary), which made my new tests fail; probably, the CI didn't show errors because the Query Classifier was largely uncovered by tests...

@anakin87 anakin87 changed the title Extend TransformersQueryClassifier: clean version feat: Extend TransformersQueryClassifier: clean version Aug 8, 2022
@ZanSara ZanSara merged commit 4a63484 into deepset-ai:master Aug 9, 2022
@anakin87 anakin87 deleted the extend_TransformersQueryClassifier_clean branch August 9, 2022 07:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve TransformersQueryClassifier
4 participants