- Thanks! I'm glad you've been enjoying it!
- Thanks Justine! I'm using it to help kids learn Python, as well as to use Python with LEGO SPIKE education robotics. Running llava-v1.5 locally on my ThinkPad E14 Gen 4 gives around 7 tokens per second. Really grateful to you and to everyone who has contributed to open science.
- 106 tokens per second on llava v1.5 7B. All I needed to do was run the .exe with this flag (from CMD): llava-v1.5-7b-q4.llamafile.exe -ngl 999
It's unbelievable that it's this easy! I'm trying Mixtral 8x7B next. I wish one of the mega-context-window open source models was available as a llamafile! Claude or Gemini. Or ideally foundation models; they'd be fun to explore.
Thank you so much for making this tech so approachable for beginners!
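For anyone following along, the `-ngl` flag in the command above is llama.cpp's GPU layer offload setting, which llamafile inherits: it sets how many model layers run on the GPU, and an oversized value like 999 simply offloads every layer. A minimal sketch of the same invocation on each platform, assuming the llava-v1.5-7b-q4 llamafile is in the current directory:

```shell
# Windows (CMD): llamafiles run directly as .exe;
# -ngl 999 offloads as many layers as the model has to the GPU.
llava-v1.5-7b-q4.llamafile.exe -ngl 999

# Linux/macOS: the same file is a portable executable;
# mark it executable once, then run it with the same flag.
chmod +x llava-v1.5-7b-q4.llamafile
./llava-v1.5-7b-q4.llamafile -ngl 999
```

Lowering `-ngl` (e.g. `-ngl 20`) is the usual fallback when the full model does not fit in GPU memory.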