Performance Enhancement #30

Open
poa010101 opened this issue Dec 18, 2023 · 1 comment

Comments

@poa010101

Andy and the team:

We made two performance enhancements, Flash Attention and Int8 quantization, which make execution 4-5 times faster. Please let us know whether we may contribute the source code back to the community.

Regards

Founder of ReparteeAI

Danny

@justinphan3110cais
Collaborator

justinphan3110cais commented Dec 29, 2023

Hi @poa010101, feel free to open a PR
