- Thanks! I'm glad you've been enjoying it!
- Thanks Justine! I'm using it to help kids learn Python, as well as to use Python with LEGO SPIKE education robotics. Running llava-v1.5 locally on my ThinkPad E14 Gen 4 gives around 7 tokens per second. Really grateful to you and to everyone who has contributed to open science.
- 106 tokens per second on llava v1.5 7B. All I needed to do was run the .exe with this flag (from CMD): llava-v1.5-7b-q4.llamafile.exe -ngl 999
It's unbelievable that it's this easy! I'm trying Mixtral 8x7B next. I wish one of the mega-context-window open source models was available as a llamafile! Claude or Gemini. Or ideally foundation models; they'd be fun to explore.
Thank you so much for making this tech so approachable for beginners!
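For anyone following along, the `-ngl` flag in the command above is llama.cpp's GPU layer offload setting, which llamafile inherits: it sets how many model layers run on the GPU, and an oversized value like 999 simply offloads every layer. A minimal sketch of the same invocation on each platform, assuming the llava-v1.5-7b-q4 llamafile is in the current directory:

```shell
# Windows (CMD): llamafiles run directly as .exe;
# -ngl 999 offloads as many layers as the model has to the GPU.
llava-v1.5-7b-q4.llamafile.exe -ngl 999

# Linux/macOS: the same file is a portable executable;
# mark it executable once, then run it with the same flag.
chmod +x llava-v1.5-7b-q4.llamafile
./llava-v1.5-7b-q4.llamafile -ngl 999
```

Lowering `-ngl` (e.g. `-ngl 20`) is the usual fallback when the full model does not fit in GPU memory.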