Install ollama models tinyllama and phi3:mini in dev hub for experimentation #5824
Conversation
I'm sure we will want the ollama binary and the supported models to exist only once on a hub's file system; we don't want 3-4 GB of redundant files in every user's homedir. If we want ollama to be used entirely within a user's pod and accessed via the openai libraries from a notebook, it has to run in that pod in server mode. That can be done by simply running the binary in server mode; otherwise it would need to be installed as a Linux system service in the user's pod.
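For concreteness, here is a minimal sketch of what the notebook side could look like, assuming an ollama server is already listening on its default port (11434) inside the pod and that the tinyllama model has been pulled; the port, model name, and prompt are assumptions for illustration:

```python
# Minimal sketch: talk to an in-pod ollama server through its
# OpenAI-compatible endpoint using the regular openai client.
# Assumes `ollama serve` is listening on the default port 11434
# and that the tinyllama model has already been pulled.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # ollama's OpenAI-compatible API
    api_key="ollama",                      # any non-empty string; ollama ignores it
)

response = client.chat.completions.create(
    model="tinyllama",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```

Because the endpoint is OpenAI-compatible, the notebook code stays the same whether the server runs inside the user's pod or on a shared host; only the base_url would change.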
I don't understand the git clone. The standard install is to download the binary, or to download & run the little installer script that does that plus makes it a system service. Don't use "ollama run", as that will start the simple conversation app; use "ollama pull" to get the models. But models should be stored centrally; I seem to recall you can just tell ollama where to find the models locally, or we could symlink.
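A rough sketch of the "store models centrally" idea, assuming a hypothetical shared directory and that ollama honors the OLLAMA_MODELS environment variable for relocating its model store:

```python
# Sketch of central model storage: point ollama at a shared model
# directory instead of each user's homedir, then pull the models.
# The /srv/ollama/models path is hypothetical; OLLAMA_MODELS is the
# environment variable ollama documents for overriding where models live.
import os
import subprocess

env = os.environ.copy()
env["OLLAMA_MODELS"] = "/srv/ollama/models"  # hypothetical shared location

# `ollama pull` only downloads the model blobs; it does not start
# the interactive conversation app the way `ollama run` does.
for model in ("tinyllama", "phi3:mini"):
    subprocess.run(["ollama", "pull", model], env=env, check=True)
```

Symlinking the default ~/.ollama/models directory to the shared location should achieve the same effect without relying on the environment variable.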
@gmerritt Thank you! I updated the installation to use a Docker image I found for Ollama. Currently, the Ollama models are only installed in the dev hub, so this setup is specifically for our team's experimentation (and possibly a demo) rather than for all users on the datahub. I would like to have the app run by default so users can access the chat service directly from Jupyter notebooks without needing to run any commands in a terminal.
lgtm!
I don't believe running docker in our container will work in this manner. You'll probably need to specify what mounts those containers will need access to, what ports they should listen on, etc. I'm not sure what client programs are going to connect to those containers. If it's just whatever is in the existing singleuser environment, then that's fine.
If I were using a datahub ollama instance, I might prefer to access a shared instance running a 5x bigger model on bigger hardware, shared by 10 or 15 students (or some math kind of like that), rather than each & every person having to bother running their own super-tiny model. I recognize that it becomes a new & different scaling problem, of course, but note that an ollama server caches nothing in terms of input/output data, so it is a super-scalable kind of resource, imho!
Thanks @ryanlovett! I will check the gradebook example and update the commit. @gmerritt Fair point - I would love to have the setup you describe in one of our hubs. I just thought we could take baby steps: start by running a smaller model on a single-user server in the dev hub, then scale to a bigger model in a shared setting once there is user demand, infra admin bandwidth, better policies around model deployment, etc.
@ryanlovett Added the ollama model images (phi3 and tinyllama) as part of the extra container stanza, running on ports 5000 and 5001 respectively. From my limited due diligence, I couldn't find any other service running on those ports.
That looks like the correct syntax, but I tried to find the phi3 and tinyllama containers and they don't exist.
From Fox in my email
Thanks all for your inputs. My latest commit adds a postBuild file which downloads the ollama binary and runs the phi3 and tinyllama models (in the dev hub's default image).
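For reference, a sketch of how a notebook (or a startup hook) might bring the server up before using it, assuming the postBuild step has left the ollama binary on PATH; the 60-second timeout and the default port 11434 are assumptions:

```python
# Sketch: start the ollama server in the background from the single-user
# environment and wait until its port answers, so a notebook can then use
# the OpenAI-compatible endpoint.
import socket
import subprocess
import time

server = subprocess.Popen(["ollama", "serve"])

deadline = time.time() + 60
while time.time() < deadline:
    try:
        with socket.create_connection(("localhost", 11434), timeout=1):
            break  # server is accepting connections
    except OSError:
        time.sleep(1)
else:
    raise RuntimeError("ollama server did not come up within 60 seconds")
```

Once the port answers, the openai client snippet from earlier in the thread should work unchanged.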
That linked GitHub script is just a wrapper around the standard click-through .sh installer for Linux, as shown here: https://ollama.com/download/linux
...and the local server mode of ollama does work just fine on datahub...
Thanks folks! I will merge the postBuild script for now and do some testing in the dev-staging hub.
After a lengthy conversation with Greg and Eric in the ucb-datahub-staff channel about ollama, I thought it would be nice to install a couple of small ollama models in the dev hub. The success criterion would be to run the ollama notebook from https://github.com/pamelafox/ollama-python-playground/tree/main in the dev hub as a proof of concept. I am also curious about the memory/CPU requirements for running these models in one of our hubs.
I don't know whether the image build will succeed, as the local build stalled after 10 minutes. I can revert this if you all think there are better options for installing ollama. Thanks
Ref:
https://uctech.slack.com/archives/C04NEF48SCR/p1719593690539859
https://ollama.com/library/tinyllama
https://ollama.com/library/phi3