Implement model quantization #84

Open

mlsw wants to merge 2 commits into main from model-quantization

Collaborator

mlsw commented Feb 26, 2025

This PR implements W8A8 model quantization (8-bit weights and 8-bit activations) for the TransformersModelForTokenClassificationNerStep. Because the feature is experimental, it is currently gated by environment variables and stays off unless they are set.
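For reviewers unfamiliar with the idea, here is a minimal sketch of what W8A8 behind an environment-variable gate can look like. It is not the code in this PR: the flag name `EXAMPLE_ENABLE_W8A8` and all function names are hypothetical, and the quantization scheme shown (symmetric, per-tensor, pure Python) is an assumption chosen for readability.

```python
import os

# Hypothetical flag name for illustration only; the PR does not state the real variable names.
FLAG = "EXAMPLE_ENABLE_W8A8"


def quantization_enabled(env=os.environ):
    """Return True only when the (hypothetical) flag is set to a truthy value."""
    return env.get(FLAG, "").strip().lower() in {"1", "true", "yes"}


def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to the int8 range [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def w8a8_dot(weights, activations):
    """Dot product with both weights and activations quantized to 8 bits."""
    qw, sw = quantize_int8(weights)
    qa, sa = quantize_int8(activations)
    acc = sum(w * a for w, a in zip(qw, qa))  # integer accumulation
    return acc * sw * sa  # rescale the integer result back to float


def dot(weights, activations, env=os.environ):
    """Use the quantized path only when the flag is set; otherwise stay in float."""
    if quantization_enabled(env):
        return w8a8_dot(weights, activations)
    return sum(w * a for w, a in zip(weights, activations))
```

With the flag unset, `dot` returns the exact float result. With it set, the result carries a small quantization error, which is why gating an experimental path like this behind an opt-in switch is a reasonable default.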


feat: implement model quantization

4019dbe

mlsw force-pushed the model-quantization branch from 4c485e3 to 4019dbe Compare

February 26, 2025 12:39

mlsw marked this pull request as ready for review

February 26, 2025 12:47

mlsw requested a review from paluchasz

February 26, 2025 12:47


fix: update documentation link

766d076
