Release 1.0.0 #3

x-tabdeveloping · 2024-08-21T12:15:40Z

Changed the API completely to make the library more comfortable to use.
Main changes:

`Trooper`

The brand new `Trooper` interface means you no longer have to specify which model type you wish to use.
Stormtrooper automatically detects the model type from the specified name.

from stormtrooper import Trooper

# This loads a SetFit model
model = Trooper("all-MiniLM-L6-v2")

# This loads an OpenAI model
model = Trooper("gpt-4")

# This loads a Text2Text model
model = Trooper("google/flan-t5-base")
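To give a rough idea of what name-based detection can look like, here is a minimal sketch. It is not the actual Stormtrooper logic: `Backend`, `guess_backend`, and the prefix and hint tuples are hypothetical names invented for this illustration, and a real implementation would more likely inspect the model's configuration than match on substrings.

```python
from enum import Enum


class Backend(Enum):
    """Hypothetical backend kinds, named only for this illustration."""

    OPENAI = "openai"
    TEXT2TEXT = "text2text"
    SETFIT = "setfit"


# Assumed heuristics; these are NOT taken from the Stormtrooper source.
OPENAI_PREFIXES = ("gpt-",)
TEXT2TEXT_HINTS = ("t5",)


def guess_backend(model_name: str) -> Backend:
    """Guess a backend from the model name alone."""
    name = model_name.lower()
    # Names starting with a known API prefix are treated as OpenAI models.
    if name.startswith(OPENAI_PREFIXES):
        return Backend.OPENAI
    # Names containing a known seq2seq family are treated as Text2Text models.
    if any(hint in name for hint in TEXT2TEXT_HINTS):
        return Backend.TEXT2TEXT
    # Everything else falls back to the sentence-encoder (SetFit) path.
    return Backend.SETFIT
```

With these assumed rules, the three names used above map to the OpenAI, Text2Text, and SetFit branches respectively.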

Unified zero- and few-shot classification

You no longer have to specify whether a model should be a few-shot or a zero-shot classifier when initialising it.
If you do not pass any training examples, the model is automatically assumed to be zero-shot.

# This is a zero-shot model
model.fit(None, ["dog", "cat"])

# This is a few-shot model
model.fit(["he was a good boy", "just lay down on my laptop"], ["dog", "cat"])
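The dispatch on the first argument of `fit` can be sketched roughly as follows. This is a toy stand-in, not a Stormtrooper class: `UnifiedClassifier` and its `mode_`, `classes_`, and `examples_` attributes are hypothetical names chosen for the example.

```python
from typing import Optional, Sequence


class UnifiedClassifier:
    """Toy stand-in showing how one fit() call can cover both modes."""

    def fit(self, X: Optional[Sequence[str]], y: Sequence[str]) -> "UnifiedClassifier":
        if X is None:
            # No training texts: y is simply the list of candidate labels.
            self.mode_ = "zero-shot"
            self.examples_ = []
        else:
            # Training texts given: y holds one label per example.
            if len(X) != len(y):
                raise ValueError("X and y must have the same length")
            self.mode_ = "few-shot"
            self.examples_ = list(zip(X, y))
        # Keep unique labels in first-seen order.
        self.classes_ = list(dict.fromkeys(y))
        return self
```

Calling `fit(None, ["dog", "cat"])` on this sketch selects the zero-shot branch, while passing example texts selects the few-shot branch.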

Reimplemented SetFit

The SetFit library caused me many headaches and has unfixed bugs, and I didn't want to downgrade everything.
Instead, I implemented the SetFit algorithm from scratch using the training utilities in sentence-transformers.
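SetFit fine-tunes a sentence encoder contrastively on pairs of labelled examples, so one of the building blocks is pair generation. The snippet below is a minimal sketch of that step only, under the assumption that every unordered pair is used; `generate_pairs` is a hypothetical helper and not the code in this PR.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def generate_pairs(
    texts: Sequence[str], labels: Sequence[str]
) -> List[Tuple[str, str, float]]:
    """Build (text_a, text_b, similarity) triples for contrastive training.

    Pairs with the same label get similarity 1.0, all others get 0.0.
    """
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    pairs = []
    # Enumerate every unordered pair of examples exactly once.
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        similarity = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, similarity))
    return pairs
```

Note that with only one example per label this sketch yields negative pairs only, which is the kind of edge case a real implementation has to handle deliberately.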

New docs in MkDocs

Switched to MkDocs, because it's simpler to deal with and looks better.

Test suite

Added tests to ensure that the library will not break overnight.

Multiple GPU support

Run models on multiple GPUs by passing `device_map="auto"`: `Trooper(model_name, device_map="auto")`

KennethEnevoldsen

Next time I would probably split the PR into multiple smaller ones.

CI:

It might be a good idea to add a Makefile so that the same checks run both locally and in CI.

I would like to know more about how it selects models. Mixing the APIs with HF models is also a bit confusing; I would split them up.

You are also hiding a lot of complexity from the user. It might be worth letting people know what is going on under the hood.

README.md

mkdocs.yml

stormtrooper/set_fit.py

KennethEnevoldsen · 2024-09-06T14:55:34Z

tests/test_all.py

Rename to test_trooper.py

stormtrooper/trooper.py

x-tabdeveloping added 13 commits August 9, 2024 10:42

fix: OpenAI FewShotClassifier fixed

6d9caca

Restructured the entire library, added Trooper interface

d41fa06

Fixed zero-shot predictions for setfit

40fe12b

Adjusted dependencies

a649321

Added integration tests

202ce31

Removed old docs

e976c0c

Added doc dependencies

5ef848d

Added docstrings for trooper

7491ec0

Added docstrings

af1fe60

Added documentation in MKDocs

b829fea

Updated readme with changes

690b8ed

Version bump

0be1f72

Added actions for deploying docs and running tests

7ef7069

x-tabdeveloping requested a review from KennethEnevoldsen August 21, 2024 12:15

x-tabdeveloping added 9 commits August 21, 2024 14:21

Bumped setfit version

090f6d0

Added setfit 1.0 as test dependency

d843b86

Rewrote SetFit from scratch as the original library is a complete mess

cc7b0c4

Adjusted dependencies

100e23f

Troopers can now detect sentence transformer models

abaa5d8

Fixed pair generation when there is only one example per label

653da10

Added new dependencie to test

bcb08a1

Removed irrelevant information about setfit in docs

f8d2e08

Fixed dependencies in workflows

e464414

x-tabdeveloping requested review from linguist89, rdkm89 and MartinBernstorff August 22, 2024 13:17

x-tabdeveloping added 4 commits September 6, 2024 15:42

Made OpenAI classifier async

10ac8d1

Bumped OpenAI dependency and made it not optional

39179cc

Added option for providing device_map argument to generative models

6d0aabf

Updated docs with instructions on how to run inference on multiple GPUs.

023a879

KennethEnevoldsen approved these changes Sep 6, 2024

x-tabdeveloping added 5 commits September 7, 2024 13:46

fix: added device_map attribute to Trooper

7f0a19d

Added figure explaining how models are loaded.

65ede4a

Updated readme

456cff4

Added example to fuzzy_match docs

b892cd0

Renamed tests

774d37c

x-tabdeveloping merged commit 5e9757b into main Sep 7, 2024
1 of 2 checks passed
