From 9083aea378c44722fe4a2d1eba1fee94667a7577 Mon Sep 17 00:00:00 2001
From: Neko Ayaka
Date: Tue, 16 Apr 2024 17:40:32 +0800
Subject: [PATCH] docs: updated README.md with getting started docs

Signed-off-by: Neko Ayaka
---
 README.md | 146 ++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 93 insertions(+), 53 deletions(-)

diff --git a/README.md b/README.md
index 9309669..3c5b778 100644
--- a/README.md
+++ b/README.md
@@ -33,56 +33,25 @@ The journey to large language models, AIGC, localized agents, [🦜🔗 Langchai
 - ✅ Easy to expose with existing Kubernetes services, ingress, etc.
 - ✅ Doesn't require any additional dependencies, just Kubernetes
 
-## Description
+## Getting started
 
-Unlock the abilities to run the following models with the Ollama Operator over Kubernetes:
-
-> [!TIP]
-> By the power of [`Modelfile`](https://github.com/ollama/ollama/blob/main/docs/modelfile.md) backed by Ollama, you can create and bundle any of your own model. **As long as it's a GGUF formatted model.**
-
-| Model              | Parameters | Size  | Model image         | Full model image URL                           |
-| ------------------ | ---------- | ----- | ------------------- | ---------------------------------------------- |
-| Llama 2            | 7B         | 3.8GB | `llama2`            | `registry.ollama.ai/library/llama2`            |
-| Mistral            | 7B         | 4.1GB | `mistral`           | `registry.ollama.ai/library/mistral`           |
-| Dolphin Phi        | 2.7B       | 1.6GB | `dolphin-phi`       | `registry.ollama.ai/library/dolphin-phi`       |
-| Phi-2              | 2.7B       | 1.7GB | `phi`               | `registry.ollama.ai/library/phi`               |
-| Neural Chat        | 7B         | 4.1GB | `neural-chat`       | `registry.ollama.ai/library/neural-chat`       |
-| Starling           | 7B         | 4.1GB | `starling-lm`       | `registry.ollama.ai/library/starling-lm`       |
-| Code Llama         | 7B         | 3.8GB | `codellama`         | `registry.ollama.ai/library/codellama`         |
-| Llama 2 Uncensored | 7B         | 3.8GB | `llama2-uncensored` | `registry.ollama.ai/library/llama2-uncensored` |
-| Llama 2 13B        | 13B        | 7.3GB | `llama2:13b`        | `registry.ollama.ai/library/llama2:13b`        |
-| Llama 2 70B        | 70B        | 39GB  | `llama2:70b`        | `registry.ollama.ai/library/llama2:70b`        |
-| Orca Mini          | 3B         | 1.9GB | `orca-mini`         | `registry.ollama.ai/library/orca-mini`         |
-| Vicuna             | 7B         | 3.8GB | `vicuna`            | `registry.ollama.ai/library/vicuna`            |
-| LLaVA              | 7B         | 4.5GB | `llava`             | `registry.ollama.ai/library/llava`             |
-| Gemma              | 2B         | 1.4GB | `gemma:2b`          | `registry.ollama.ai/library/gemma:2b`          |
-| Gemma              | 7B         | 4.8GB | `gemma:7b`          | `registry.ollama.ai/library/gemma:7b`          |
-
-Full list of available images can be found at [Ollama Library](https://ollama.com/library).
-
-> [!WARNING]
-> You should have at least 8 GB of RAM available on your node to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
+### Install the operator
 
-> [!WARNING]
-> The actual size of downloaded large language models are huge by comparing to the size of general container images.
->
-> 1. Fast and stable network connection is recommended to download the models.
-> 2. Efficient storage is required to store the models if you want to run models larger than 13B.
+```shell
+kubectl apply -f https://raw.githubusercontent.com/nekomeowww/ollama-operator/main/dist/install.yaml
+```
 
-## Getting Started
+### Wait for the operator to be ready
 
-```yaml
-apiVersion: ollama.ayaka.io/v1
-kind: Model
-metadata:
-  name: phi
-spec:
-  image: phi
+```shell
+kubectl wait --for=jsonpath='{.status.readyReplicas}'=2 deployment/ollama-operator-controller-manager -n ollama-operator-system
 ```
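+
+To double-check that the controller is actually running, you can list the pods in the operator's namespace (the same namespace used by the `kubectl wait` command above):
+
+```shell
+# Both controller-manager pods should show STATUS Running
+kubectl get pods -n ollama-operator-system
+```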
+
+### Create a model
+
 > [!IMPORTANT]
 > Working with `kind`?
-> 
+>
 > The default provisioned `StorageClass` in `kind` is `standard`, and will only work with `ReadWriteOnce` access mode, therefore if you would need to run the operator with `kind`, you should specify `persistentVolume` with `accessMode: ReadWriteOnce` in the `Model` CRD:
 > ```yaml
 > apiVersion: ollama.ayaka.io/v1
@@ -95,6 +64,41 @@ spec:
 >     accessMode: ReadWriteOnce
 > ```
 
+Define the following `Model` resource:
+
+```yaml
+apiVersion: ollama.ayaka.io/v1
+kind: Model
+metadata:
+  name: phi
+spec:
+  image: phi
+```
+
+Save it as `ollama-model-phi.yaml`, then apply it to your Kubernetes cluster:
+
+```shell
+kubectl apply -f ollama-model-phi.yaml
+```
+
+Wait for the model to be ready:
+
+```shell
+kubectl wait --for=jsonpath='{.status.readyReplicas}'=1 deployment/ollama-model-phi
+```
+
+### Access the model
+
+1. Ready! Now let's forward the port to access the model:
+
+```shell
+kubectl port-forward svc/ollama-model-phi ollama
+```
+
+2. Interact with the model:
+
+```shell
+ollama run phi
+```
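+
+While the port-forward is active, you can also query the model directly over HTTP, since the service in front of it speaks the standard Ollama API. A minimal sketch, assuming the service's `ollama` port resolves to Ollama's default port 11434 (which is also what lets `ollama run phi` above reach the forwarded service):
+
+```shell
+# Ask the phi model for a completion through the forwarded port
+curl http://localhost:11434/api/generate -d '{
+  "model": "phi",
+  "prompt": "Why is the sky blue?"
+}'
+```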
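+
+When you are done experimenting, the model can be removed by deleting the `Model` resource you applied earlier:
+
+```shell
+kubectl delete -f ollama-model-phi.yaml
+```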
+
 ### Full options
 
 ```yaml
 apiVersion: ollama.ayaka.io/v1
 kind: Model
 metadata:
   name: phi
 spec:
   image: phi
@@ -116,6 +120,42 @@ spec:
     accessMode: ReadWriteOnce
 ```
 
+## Supported models
+
+Unlock the ability to run the following models with the Ollama Operator over Kubernetes:
+
+> [!TIP]
+> With the power of Ollama's [`Modelfile`](https://github.com/ollama/ollama/blob/main/docs/modelfile.md), you can create and bundle any model of your own, **as long as it is a GGUF formatted model.**
+
+| Model              | Parameters | Size  | Model image         | Full model image URL                           |
+| ------------------ | ---------- | ----- | ------------------- | ---------------------------------------------- |
+| Llama 2            | 7B         | 3.8GB | `llama2`            | `registry.ollama.ai/library/llama2`            |
+| Mistral            | 7B         | 4.1GB | `mistral`           | `registry.ollama.ai/library/mistral`           |
+| Dolphin Phi        | 2.7B       | 1.6GB | `dolphin-phi`       | `registry.ollama.ai/library/dolphin-phi`       |
+| Phi-2              | 2.7B       | 1.7GB | `phi`               | `registry.ollama.ai/library/phi`               |
+| Neural Chat        | 7B         | 4.1GB | `neural-chat`       | `registry.ollama.ai/library/neural-chat`       |
+| Starling           | 7B         | 4.1GB | `starling-lm`       | `registry.ollama.ai/library/starling-lm`       |
+| Code Llama         | 7B         | 3.8GB | `codellama`         | `registry.ollama.ai/library/codellama`         |
+| Llama 2 Uncensored | 7B         | 3.8GB | `llama2-uncensored` | `registry.ollama.ai/library/llama2-uncensored` |
+| Llama 2 13B        | 13B        | 7.3GB | `llama2:13b`        | `registry.ollama.ai/library/llama2:13b`        |
+| Llama 2 70B        | 70B        | 39GB  | `llama2:70b`        | `registry.ollama.ai/library/llama2:70b`        |
+| Orca Mini          | 3B         | 1.9GB | `orca-mini`         | `registry.ollama.ai/library/orca-mini`         |
+| Vicuna             | 7B         | 3.8GB | `vicuna`            | `registry.ollama.ai/library/vicuna`            |
+| LLaVA              | 7B         | 4.5GB | `llava`             | `registry.ollama.ai/library/llava`             |
+| Gemma              | 2B         | 1.4GB | `gemma:2b`          | `registry.ollama.ai/library/gemma:2b`          |
+| Gemma              | 7B         | 4.8GB | `gemma:7b`          | `registry.ollama.ai/library/gemma:7b`          |
+
+The full list of available images can be found at the [Ollama Library](https://ollama.com/library).
+
+> [!WARNING]
+> You should have at least 8 GB of RAM available on your node to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
+
+> [!WARNING]
+> Downloaded large language models are huge compared to typical container images.
+>
+> 1. A fast and stable network connection is recommended for downloading the models.
+> 2. Sufficient, performant storage is required to hold the models if you want to run models larger than 13B.
+
 ## Architecture Overview
 
 There are two major components that the Ollama Operator will create for:
 
 The detailed resources it creates, and the relationships between them are shown in the following diagram:
 
 ## Contributing