diff --git a/README.md b/README.md
index 9309669..3c5b778 100644
--- a/README.md
+++ b/README.md
@@ -33,56 +33,25 @@ The journey to large language models, AIGC, localized agents, [🦜🔗 Langchai
- ✅ Easy to expose with existing Kubernetes services, ingress, etc.
- ✅ Doesn't require any additional dependencies, just Kubernetes
-## Description
+## Getting started
-Unlock the abilities to run the following models with the Ollama Operator over Kubernetes:
-
-> [!TIP]
-> By the power of [`Modelfile`](https://github.com/ollama/ollama/blob/main/docs/modelfile.md) backed by Ollama, you can create and bundle any of your own model. **As long as it's a GGUF formatted model.**
-
-| Model | Parameters | Size | Model image | Full model image URL |
-| ----------------------- | ---------- | ----- | ------------------- | ---------------------------------------------- |
-| Llama 2 | 7B | 3.8GB | `llama2` | `registry.ollama.ai/library/llama2` |
-| Mistral | 7B | 4.1GB | `mistral` | `registry.ollama.ai/library/mistral` |
-| Dolphin Phi | 2.7B | 1.6GB | `dolphin-phi` | `registry.ollama.ai/library/dolphin-phi` |
-| Phi-2 | 2.7B | 1.7GB | `phi` | `registry.ollama.ai/library/phi` |
-| Neural Chat | 7B | 4.1GB | `neural-chat` | `registry.ollama.ai/library/neural-chat` |
-| Starling | 7B | 4.1GB | `starling-lm` | `registry.ollama.ai/library/starling-lm` |
-| Code Llama | 7B | 3.8GB | `codellama` | `registry.ollama.ai/library/codellama` |
-| Llama 2 Uncensored | 7B | 3.8GB | `llama2-uncensored` | `registry.ollama.ai/library/llama2-uncensored` |
-| Llama 2 13B | 13B | 7.3GB | `llama2:13b` | `registry.ollama.ai/library/llama2:13b` |
-| Llama 2 70B | 70B | 39GB | `llama2:70b` | `registry.ollama.ai/library/llama2:70b` |
-| Orca Mini | 3B | 1.9GB | `orca-mini` | `registry.ollama.ai/library/orca-mini` |
-| Vicuna | 7B | 3.8GB | `vicuna` | `registry.ollama.ai/library/vicuna` |
-| LLaVA | 7B | 4.5GB | `llava` | `registry.ollama.ai/library/llava` |
-| Gemma | 2B | 1.4GB | `gemma:2b` | `registry.ollama.ai/library/gemma:2b` |
-| Gemma | 7B | 4.8GB | `gemma:7b` | `registry.ollama.ai/library/gemma:7b` |
-
-Full list of available images can be found at [Ollama Library](https://ollama.com/library).
-
-> [!WARNING]
-> You should have at least 8 GB of RAM available on your node to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
+### Install operator
-> [!WARNING]
-> The actual size of downloaded large language models are huge by comparing to the size of general container images.
->
-> 1. Fast and stable network connection is recommended to download the models.
-> 2. Efficient storage is required to store the models if you want to run models larger than 13B.
+```shell
+kubectl apply -f https://raw.githubusercontent.com/nekomeowww/ollama-operator/main/dist/install.yaml
+```
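+
+Optionally, verify that the `Model` custom resource definition was registered by the install manifest (the CRD name below assumes the conventional `models` plural for the `ollama.ayaka.io` group):
+
+```shell
+kubectl get crd models.ollama.ayaka.io
+```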
-## Getting Started
+### Wait for the operator to be ready
-```yaml
-apiVersion: ollama.ayaka.io/v1
-kind: Model
-metadata:
- name: phi
-spec:
- image: phi
+```shell
+kubectl wait --for=jsonpath='{.status.readyReplicas}'=1 deployment/ollama-operator-controller-manager -n ollama-operator-system
```
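+
+If you would rather not depend on the exact replica count, waiting on the deployment's `Available` condition is an equivalent check:
+
+```shell
+kubectl wait --for=condition=Available deployment/ollama-operator-controller-manager -n ollama-operator-system --timeout=300s
+```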
+### Create model
+
> [!IMPORTANT]
> Working with `kind`?
->
+>
> The default `StorageClass` provisioned by `kind` is `standard`, which only supports the `ReadWriteOnce` access mode. If you need to run the operator on `kind`, specify a `persistentVolume` with `accessMode: ReadWriteOnce` in the `Model` resource:
> ```yaml
> apiVersion: ollama.ayaka.io/v1
@@ -95,6 +64,41 @@ spec:
> accessMode: ReadWriteOnce
> ```
+```yaml
+apiVersion: ollama.ayaka.io/v1
+kind: Model
+metadata:
+ name: phi
+spec:
+ image: phi
+```
+
+Apply the `Model` resource to your Kubernetes cluster:
+
+```shell
+kubectl apply -f ollama-model-phi.yaml
+```
+
+Wait for the model to be ready:
+
+```shell
+kubectl wait --for=jsonpath='{.status.readyReplicas}'=1 deployment/ollama-model-phi
+```
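+
+You can also inspect the `Model` resource itself. Assuming the operator registers the CRD with the conventional `models` plural, its status can be checked with:
+
+```shell
+kubectl get models
+kubectl describe model phi
+```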
+
+### Access the model
+
+1. Ready! Now let's forward the port to access the model:
+
+```shell
+kubectl port-forward svc/ollama-model-phi 11434:11434
+```
+
+2. Interact with the model:
+
+```shell
+ollama run phi
+```
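+
+The `ollama` CLI talks to `127.0.0.1:11434` by default, which is exactly where the port-forward above is listening. Alternatively, with the port-forward still running, you can call Ollama's REST API directly:
+
+```shell
+curl http://localhost:11434/api/generate -d '{
+  "model": "phi",
+  "prompt": "Why is the sky blue?"
+}'
+```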
+
### Full options
```yaml
@@ -116,6 +120,42 @@ spec:
accessMode: ReadWriteOnce
```
+## Supported models
+
+Unlock the ability to run the following models with the Ollama Operator over Kubernetes:
+
+> [!TIP]
+> By the power of [`Modelfile`](https://github.com/ollama/ollama/blob/main/docs/modelfile.md), backed by Ollama, you can create and bundle any model of your own. **It just has to be a GGUF-formatted model.**
+
+| Model | Parameters | Size | Model image | Full model image URL |
+| ----------------------- | ---------- | ----- | ------------------- | ---------------------------------------------- |
+| Llama 2 | 7B | 3.8GB | `llama2` | `registry.ollama.ai/library/llama2` |
+| Mistral | 7B | 4.1GB | `mistral` | `registry.ollama.ai/library/mistral` |
+| Dolphin Phi | 2.7B | 1.6GB | `dolphin-phi` | `registry.ollama.ai/library/dolphin-phi` |
+| Phi-2 | 2.7B | 1.7GB | `phi` | `registry.ollama.ai/library/phi` |
+| Neural Chat | 7B | 4.1GB | `neural-chat` | `registry.ollama.ai/library/neural-chat` |
+| Starling | 7B | 4.1GB | `starling-lm` | `registry.ollama.ai/library/starling-lm` |
+| Code Llama | 7B | 3.8GB | `codellama` | `registry.ollama.ai/library/codellama` |
+| Llama 2 Uncensored | 7B | 3.8GB | `llama2-uncensored` | `registry.ollama.ai/library/llama2-uncensored` |
+| Llama 2 13B | 13B | 7.3GB | `llama2:13b` | `registry.ollama.ai/library/llama2:13b` |
+| Llama 2 70B | 70B | 39GB | `llama2:70b` | `registry.ollama.ai/library/llama2:70b` |
+| Orca Mini | 3B | 1.9GB | `orca-mini` | `registry.ollama.ai/library/orca-mini` |
+| Vicuna | 7B | 3.8GB | `vicuna` | `registry.ollama.ai/library/vicuna` |
+| LLaVA | 7B | 4.5GB | `llava` | `registry.ollama.ai/library/llava` |
+| Gemma | 2B | 1.4GB | `gemma:2b` | `registry.ollama.ai/library/gemma:2b` |
+| Gemma | 7B | 4.8GB | `gemma:7b` | `registry.ollama.ai/library/gemma:7b` |
+
+A full list of available images can be found in the [Ollama Library](https://ollama.com/library).
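+
+For example, to run one of the tagged images from the table, reference either the short model image name or the full image URL in the `Model` spec (the manifest shape follows the `phi` example above):
+
+```yaml
+apiVersion: ollama.ayaka.io/v1
+kind: Model
+metadata:
+  name: gemma-2b
+spec:
+  image: registry.ollama.ai/library/gemma:2b
+```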
+
+> [!WARNING]
+> You should have at least 8 GB of RAM available on your node to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
+
+> [!WARNING]
+> The actual size of a downloaded large language model is huge compared to that of a typical container image.
+>
+> 1. A fast and stable network connection is recommended for downloading the models.
+> 2. Sufficient storage is required to hold the models if you want to run models larger than 13B.
+
## Architecture Overview
There are two major components that the Ollama Operator creates:
@@ -128,17 +168,17 @@ There are two major components that the Ollama Operator will create for:
The detailed resources it creates, and the relationships between them are shown in the following diagram:
+*(Architecture diagram: the resources created by the operator and the relationships between them)*
## Contributing