@makllama

MaKllama

MaK(Mac+Kubernetes)llama: running and orchestrating large language models (LLMs) on Kubernetes with Mac nodes.

MaKllama Organization

The following video demonstrates these steps:

  1. Add a Mac node with an Apple Silicon chip to a Kubernetes cluster (in seconds!).
  2. Manually start Bronze Willow (BW) on the Mac node (top-right terminal).
  3. Deploy tinyllama with 2 replicas.
  4. Access the OpenAI API-compatible endpoint through mods (a Python sketch of this step follows the demo).

Demo
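
As a rough illustration of step 4, the sketch below queries the OpenAI API-compatible endpoint with the official openai Python client rather than mods. The base URL, port, and model name are assumptions; substitute whatever address the tinyllama Service is exposed on in your cluster (for example via kubectl port-forward).

    # Minimal sketch, assuming the tinyllama Service has been port-forwarded
    # to localhost:8080 and serves the OpenAI chat-completions API.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8080/v1",  # assumed local endpoint address
        api_key="not-needed",                 # local endpoints usually ignore the key
    )

    response = client.chat.completions.create(
        model="tinyllama",  # assumed model name served by the Deployment
        messages=[{"role": "user", "content": "Say hello from a Mac node."}],
    )
    print(response.choices[0].message.content)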

Popular repositories

  1. makllama

    MaK(Mac+Kubernetes)llama - Running and orchestrating large language models (LLMs) on Kubernetes with macOS nodes.

    Go · 39 stars · 3 forks

  2. llama.cpp

    Forked from ggml-org/llama.cpp

    LLM inference in C/C++

    C++ · 3 stars

  3. containerd

    Forked from containerd/containerd

    An open and reliable container runtime

    Go · 1 star

  4. cri

    Forked from virtual-kubelet/cri

    Go · 1 star · 1 fork

  5. ktransformers

    Forked from kvcache-ai/ktransformers

    A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

    Python · 1 star

  6. .github
