GGML / GGUF formats #35

Open · damodharanj opened this issue on Dec 28, 2023 · 2 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)

@damodharanj

The model is pretty amazing, and thanks a lot for open-sourcing it. Is there a way to size it down and run it on hardware like Apple Silicon using GGML?

Would this improve the inference times? For me, on an Apple M2 it takes 12 seconds to translate a single sentence. If you can guide me on how to do this, I would be willing to help!

@jaygala24 added the enhancement and help wanted labels on Jan 2, 2024
@jaygala24
Member

Hi @damodharanj,

Yes, this should improve the inference time. However, it would require writing the model definitions in C++, similar to llama.cpp, and converting the model weights to the ggml format.
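
For a rough idea of what that port involves, here is a minimal, hypothetical sketch (not taken from this repository) of how a single layer could be expressed with the ggml C API; the tensor sizes and values are purely illustrative. A real port would build the full translation model graph this way and load the converted weights from a GGUF file.

```c
// toy_ggml.c — illustrative sketch of a tiny "layer" y = W * x using the ggml C API.
#include <stdio.h>
#include "ggml.h"

int main(void) {
    // All tensors and the compute graph live inside one ggml context.
    struct ggml_init_params params = {
        /*.mem_size   =*/ 16 * 1024 * 1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);

    // Weights and input (in a real port these would come from the converted checkpoint).
    struct ggml_tensor * W = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 4);
    struct ggml_tensor * x = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 4);
    ggml_set_f32(W, 0.5f);
    ggml_set_f32(x, 2.0f);

    // Define the computation symbolically, then evaluate the graph.
    struct ggml_tensor * y  = ggml_mul_mat(ctx, W, x);
    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, y);
    ggml_graph_compute_with_ctx(ctx, gf, /*n_threads=*/4);

    printf("y[0] = %f\n", ggml_get_f32_1d(y, 0)); // 4 * 0.5 * 2.0 = 4.0

    ggml_free(ctx);
    return 0;
}
```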

Currently, we don't have the bandwidth, experience, or hardware resources to help you port the models to ggml. Please let us know if there is any progress on this thread.

@damodharanj
Author

Thanks a lot for the response! Sure, I will update you on what I can do from my end once I take this up.
