Use of all the CPU cores on Raspberry Pi 5 with llamafile #337
Replies: 4 comments
- Maybe @jart can help here?
- Normally it uses all cores by default; no configuration is needed. (At least it does on x86.)
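If it doesn't pick up all cores automatically, you can pin the thread count explicitly: llamafile inherits llama.cpp's `-t`/`--threads` flag. A minimal sketch, where `model.llamafile` is a placeholder name for whatever model you downloaded:

```shell
# Pi 5 has 4 Cortex-A76 cores; nproc reports the count at runtime.
# -t sets the number of CPU threads used for inference.
# "model.llamafile" is a placeholder for your actual model file.
./model.llamafile -t "$(nproc)" -p "Why is the sky blue?"
```

You can also pass a smaller number than `nproc` if you want to keep a core free for other work on the Pi.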
- You can speed up prompt processing by using F16 weights on the RPI5, which is fast. For faster token generation speeds, your best bet at the moment would probably be
- Do we have a survey mechanism, perhaps? E.g. what if we had a binary that runs through different benchmarks, then writes a single sentence telling the user which mode they should use? Perhaps even better if it had an optional "do you want to upload your spec?" step feeding a leaderboard. In the README you could tell users to download the benchmark first, run it through, and then pick a model from Hugging Face that matches their machine's capabilities.
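The flow above could be sketched as a small script: time a short run at each thread count and report the fastest. This is a hypothetical sketch only; the actual llamafile invocation is stubbed out with `sleep` so the harness itself runs anywhere, and you would swap in a real model call to measure for real.

```shell
#!/bin/sh
# Hypothetical benchmark-then-recommend sketch (not a real llamafile tool).
best_t=1
best_ms=999999999
for t in 1 2 4; do                 # thread counts to try (Pi 5 has 4 cores)
  start=$(date +%s%N)
  sleep 0.1                        # stub for: ./model.llamafile -t "$t" -n 16 -p "test"
  end=$(date +%s%N)
  ms=$(( (end - start) / 1000000 ))
  if [ "$ms" -lt "$best_ms" ]; then
    best_ms=$ms
    best_t=$t
  fi
done
echo "Fastest run used -t $best_t (${best_ms} ms)"
```

A real version would also vary quantization (F16 vs. Q4, say) and print the model-size recommendation in the same sentence.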
- Hi,
To use all the CPU cores on the Raspberry Pi 5 to speed up inference, what configuration is needed with llamafile?