Install ollama models tinyllama and phi3:mini in dev hub for experimentation #5824

Merged: 6 commits merged into berkeley-dsep-infra:staging from the ollama branch on Jul 3, 2024

Conversation

@balajialg (Contributor) commented Jun 28, 2024

After a lengthy conversation with Greg and Eric in the ucb-datahub-staff channel about Ollama, I thought it would be nice to install a couple of small Ollama models in the dev hub. The success criterion would be to run the Ollama notebook from https://github.com/pamelafox/ollama-python-playground/tree/main in the dev hub as a proof of concept. I am also curious about the memory/CPU requirements to run these models in one of our hubs.

I don't know whether the image build will succeed, as the local build stalled after 10 minutes. I can revert this if you all think there are better options for installing Ollama. Thanks!

Ref:
https://uctech.slack.com/archives/C04NEF48SCR/p1719593690539859
https://ollama.com/library/tinyllama
https://ollama.com/library/phi3
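As a rough sketch of what the proof-of-concept notebook could look like: Ollama exposes an OpenAI-compatible API (by default on port 11434), so a notebook should be able to use the openai client against a locally running server. The model name and endpoint below are assumptions based on the models proposed in this PR, not a tested configuration.

```python
# Minimal sketch: talk to a local Ollama server through its OpenAI-compatible API.
# Assumes an Ollama server is already running in the user pod on its default port
# (11434) and that the phi3:mini model has been pulled.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                      # any non-empty string; Ollama ignores it
)

response = client.chat.completions.create(
    model="phi3:mini",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```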

@balajialg changed the title from "Installing tinyllama and phi3:mini in dev hub for experimentation" to "Installing ollama models tinyllama and phi3:mini in dev hub for experimentation" on Jun 28, 2024
@gmerritt (Contributor) commented Jun 28, 2024

I’m sure we will want the ollama binary and the supported models to exist only once on a hub’s file system. We don’t want 3-4 GB of redundant files in every user’s homedir.

If we want Ollama to run entirely within a user’s pod and be accessed via the openai libraries from a notebook, it has to run in that pod in server mode. That may be as simple as launching the binary with its server command; otherwise it would need to be installed as a Linux system service in the user’s pod.
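For illustration, a minimal sketch of "just running the binary in server mode" from inside the pod, assuming the ollama binary is already installed on the image and on PATH; nothing here is part of the current PR.

```python
# Sketch: start the ollama binary in server mode from inside the user pod.
# Assumes the `ollama` binary is already present on the image and on PATH.
import subprocess

server = subprocess.Popen(
    ["ollama", "serve"],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)
# The server listens on localhost:11434 until this process (or the pod) exits.
```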

@gmerritt (Contributor)

I don’t understand the git clone. The standard install is to download the binary, or to download & run the little installer script that does that plus makes it a system service.

Don’t use “ollama run”, as that will start the simple conversation app. Use “ollama pull” to get models.

But models should be stored centrally; I seem to recall you can just tell ollama where to find the models locally, or we could symlink.
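As one hedged sketch of the "store centrally" idea: for a plain user-level install, Ollama looks for models under ~/.ollama/models by default (and the OLLAMA_MODELS environment variable can override that), so a symlink to a shared, read-only copy might be enough. The shared path below is hypothetical.

```python
# Sketch: point each user's default Ollama model directory at a shared copy,
# so 3-4 GB of model files are not duplicated in every homedir.
# The shared path is hypothetical; adjust to wherever models actually live.
from pathlib import Path

shared_models = Path("/srv/shared/ollama-models")   # hypothetical shared mount
local_models = Path.home() / ".ollama" / "models"   # Ollama's default user location

local_models.parent.mkdir(parents=True, exist_ok=True)
if not local_models.exists():
    local_models.symlink_to(shared_models)
```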

@balajialg (Contributor, Author)

@gmerritt Thank you! I updated the installation to use a Docker image I found for Ollama. Currently, the Ollama models are only installed in the dev hub, so this setup is specifically for our team's experimentation (and possibly a demo) rather than for all users on the datahub.

I would like to have the app run by default so users can directly access the chat service from Jupyter notebooks without needing to run any commands in terminals.

@shaneknapp (Contributor) left a comment

lgtm!

@ryanlovett (Collaborator)

I don't believe running Docker in our container will work in this manner. This specifies executing docker run within the container build, whereas I think you want to actually run containers alongside the user pod. You can specify additional containers with c.KubeSpawner.singleuser_extra_containers. I set up something like this in the gradebook hub, where we run another app in the user pod.

You'll probably need to specify what mounts those containers will need access to, what ports they should listen on, etc. I'm not sure what client programs are going to connect to those containers. If it's just whatever is in the existing singleuser environment then that's fine.
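For reference, a minimal sketch of that sidecar approach, assuming the current KubeSpawner trait name extra_containers (older KubeSpawner versions used singleuser_-prefixed aliases) and the official ollama/ollama image. The tag, port, and volume name are placeholders, not tested settings.

```python
# Sketch: run an Ollama sidecar container alongside the single-user notebook container.
# Containers in the same pod share a network namespace, so the notebook can reach
# the sidecar at localhost:11434. Image tag and volume name below are placeholders.
c.KubeSpawner.extra_containers = [
    {
        "name": "ollama",
        "image": "ollama/ollama:0.1.48",  # pin a tag instead of :latest for reproducibility
        "ports": [{"containerPort": 11434}],
        "volumeMounts": [
            # shared model storage, so models are not duplicated per user
            {"name": "ollama-models", "mountPath": "/root/.ollama"},
        ],
    }
]
```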

@gmerritt (Contributor) commented Jul 2, 2024

If I were using a datahub Ollama instance, I might prefer to access a shared resource: a 5x bigger model running on bigger hardware, shared by 10 or 15 students (or some similar ratio), rather than each and every person having to bother running their own super-tiny model. I recognize that becomes a new and different scaling problem, of course, but note that an Ollama server caches nothing in terms of input/output data, so it is a super-scalable kind of resource, imho!

@balajialg (Contributor, Author) commented Jul 2, 2024

Thanks @ryanlovett! I will check the gradebook example and update the commit.

@gmerritt Fair point - I would love to have a setup like the one you describe in one of our hubs. I just thought we could take baby steps: start by running a smaller model on a single-user server in the dev hub, then scale to a bigger model in a shared setting once there is user demand, infra admin bandwidth, better policies around model deployment, etc.

@balajialg (Contributor, Author) commented Jul 2, 2024

@ryanlovett Added the ollama model images (phi3 and tinyllama) to the extra-container stanza, running on ports 5000 and 5001 respectively. From my limited due diligence, I couldn't find any other service running on those ports.

@ryanlovett (Collaborator)

That looks like the correct syntax.

I tried to find the phi3 and tinyllama containers, but they don't exist - e.g. try docker pull phi3:latest or docker pull tinyllama:latest on your own machine. I think the image specs need to be fixed? I looked this up in order to understand how those containers expect to be communicated with. I would also suggest not using the latest tag, to ensure reproducibility.

@ericvd-ucb (Contributor)

From Fox in my email


> "Just chatting with my team member about this.
> 
> I wonder if you could try adding ollama to your Docker image using a similar approach as:
> https://github.com/prulloac/devcontainer-features/blob/main/src/ollama/install.sh
> 
> And then seeing if you can do !ollama run ?
> 
> You could then make a wrapper that use subprocess to call that. Or you could try seeing if the actual local server works, that seems a bit dubious. 
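A hedged sketch of the subprocess-wrapper idea from that email, assuming the ollama binary is installed, its server is running, and the model has been pulled; the function name and prompt are purely illustrative.

```python
# Sketch: wrap the ollama CLI in a small helper callable from a notebook,
# instead of shelling out with `!ollama run` each time.
import subprocess

def ask_ollama(prompt: str, model: str = "phi3:mini") -> str:
    """Run a one-shot prompt through `ollama run` and return its stdout."""
    result = subprocess.run(
        ["ollama", "run", model, prompt],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()

# Example usage (only once ollama and the model are available in the image):
# print(ask_ollama("Summarize what a JupyterHub is in one sentence."))
```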

@balajialg (Contributor, Author)

Thanks all for your inputs. My latest commit adds a postbuild file that downloads the ollama binary and runs the phi3 and tinyllama models (in the dev hub's default image).

@gmerritt (Contributor) commented Jul 3, 2024

From Fox's email (quoted in full above):

> I wonder if you could try adding ollama to your Docker image using a similar approach as:
> https://github.com/prulloac/devcontainer-features/blob/main/src/ollama/install.sh

That linked GitHub script is just a wrapper around the standard click-through .sh installer for Linux, as shown here: https://ollama.com/download/linux

@gmerritt (Contributor) commented Jul 3, 2024

From Fox's email (quoted in full above):

> Or you could try seeing if the actual local server works, that seems a bit dubious.

...and the local server mode of ollama does work just fine on datahub...

@balajialg changed the title from "Installing ollama models tinyllama and phi3:mini in dev hub for experimentation" to "Install ollama models tinyllama and phi3:mini in dev hub for experimentation" on Jul 3, 2024
@balajialg (Contributor, Author)

Thanks folks! I will merge the postbuild script for now and do some testing in the dev-staging hub.

@balajialg merged commit 9e4e3f1 into berkeley-dsep-infra:staging on Jul 3, 2024 (22 checks passed).
@balajialg deleted the ollama branch on July 3, 2024 at 22:35.