Ollama works well for prompting a knowledge base offline, but it can get very slow, often too slow to be useful in a chatbot app.
Average response time per model running locally (a sketch for timing a local request follows the list):
- llama-3-405b: 30 seconds
- mistral-nemo: > 4 minutes
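
Here is a minimal sketch of timing one of these local requests against Ollama's HTTP API, assuming the default endpoint on `localhost:11434` and a model you have already pulled; the prompt is just an illustrative placeholder:

```python
import time
import requests

# Ollama's local HTTP API listens on port 11434 by default.
OLLAMA_URL = "http://localhost:11434/api/generate"

def timed_prompt(model: str, prompt: str) -> tuple[str, float]:
    """Send a single non-streaming prompt and return (response text, seconds)."""
    start = time.perf_counter()
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=600,  # local models can take minutes, as measured above
    )
    resp.raise_for_status()
    elapsed = time.perf_counter() - start
    return resp.json()["response"], elapsed

answer, seconds = timed_prompt("mistral-nemo", "Summarize the returns policy.")
print(f"{seconds:.1f}s: {answer[:80]}")
```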
Deploying to the cloud offloads all of the computation instead.
Average response time per model on the cloud:
- llama-3-405b-instruct: 3 seconds
The number of models to pick and choose from in the cloud is impressive. Scaling gets expensive, but IBM Cloud's free trial lets you test with small token volumes.
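
As a rough sketch of what the cloud call looks like, assuming the `ibm-watsonx-ai` Python SDK; the URL, API key, and project ID below are placeholders you get from your IBM Cloud account:

```python
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

# Placeholder credentials; create these in your IBM Cloud / watsonx.ai account.
credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key="YOUR_IBM_CLOUD_API_KEY",
)

model = ModelInference(
    model_id="meta-llama/llama-3-405b-instruct",  # the ~3 s model measured above
    credentials=credentials,
    project_id="YOUR_PROJECT_ID",
)

# Single non-streaming generation; the free trial bills per token.
print(model.generate_text(prompt="Summarize the returns policy."))
```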
Other available models:
| Model | Model ID |
| --- | --- |
| MT0_XXL | bigscience/mt0-xxl |
| CODELLAMA_34B_INSTRUCT_HF | codellama/codellama-34b-instruct-hf |
| FLAN_T5_XL | google/flan-t5-xl |
| FLAN_T5_XXL | google/flan-t5-xxl |
| FLAN_UL2 | google/flan-ul2 |
| MERLINITE_7B | ibm-mistralai/merlinite-7b |
| GRANITE_13B_CHAT_V2 | ibm/granite-13b-chat-v2 |
| GRANITE_13B_INSTRUCT_V2 | ibm/granite-13b-instruct-v2 |
| GRANITE_20B_CODE_INSTRUCT | ibm/granite-20b-code-instruct |
| GRANITE_20B_MULTILINGUAL | ibm/granite-20b-multilingual |
| GRANITE_34B_CODE_INSTRUCT | ibm/granite-34b-code-instruct |
| GRANITE_3B_CODE_INSTRUCT | ibm/granite-3b-code-instruct |
| GRANITE_7B_LAB | ibm/granite-7b-lab |
| GRANITE_8B_CODE_INSTRUCT | ibm/granite-8b-code-instruct |
| LLAMA_2_13B_CHAT | meta-llama/llama-2-13b-chat |
| LLAMA_2_70B_CHAT | meta-llama/llama-2-70b-chat |
| LLAMA_3_405B_INSTRUCT | meta-llama/llama-3-405b-instruct |
| LLAMA_3_70B_INSTRUCT | meta-llama/llama-3-70b-instruct |
| LLAMA_3_8B_INSTRUCT | meta-llama/llama-3-8b-instruct |
| MISTRAL_LARGE | mistralai/mistral-large |
| MIXTRAL_8X7B_INSTRUCT_V01 | mistralai/mixtral-8x7b-instruct-v01 |
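
The names in the Model column appear to match the `ModelTypes` enum shipped with the `ibm-watsonx-ai` SDK (an assumption based on the naming convention); if so, the same list can be printed programmatically:

```python
from ibm_watsonx_ai.foundation_models.utils.enums import ModelTypes

# Each enum member maps a constant name to its model_id string.
for model in ModelTypes:
    print(f"{model.name} | {model.value}")
```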