GenAI App Demo with Docker Model Runner

A modern, full-stack chat application demonstrating how to integrate a React frontend with a Go backend and run local Large Language Models (LLMs) using Docker's Model Runner.

Overview

This project showcases a complete Generative AI interface that includes:

  • React/TypeScript frontend with a responsive chat UI
  • Go backend server for API handling
  • Integration with Docker's Model Runner to run Llama 3.2 locally
  • Comprehensive observability with metrics, logging, and tracing

Features

  • πŸ’¬ Interactive chat interface with message history
  • πŸ”„ Real-time streaming responses (tokens appear as they're generated)
  • πŸŒ“ Light/dark mode support based on user preference
  • 🐳 Dockerized deployment for easy setup and portability
  • 🏠 Run AI models locally without cloud API dependencies
  • πŸ”’ Cross-origin resource sharing (CORS) enabled
  • πŸ§ͺ Integration testing using Testcontainers
  • πŸ“Š Metrics and performance monitoring
  • πŸ“ Structured logging with zerolog
  • πŸ” Distributed tracing with OpenTelemetry
  • πŸ“ˆ Grafana dashboards for visualization

Architecture

The application consists of these main components:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Frontend  β”‚ >>> β”‚   Backend   β”‚ >>> β”‚ Model Runnerβ”‚
β”‚  (React/TS) β”‚     β”‚    (Go)     β”‚     β”‚ (Llama 3.2) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
      :3000              :8080               :12434
                          β”‚  β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”˜  └─────┐     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Grafana   β”‚ <<< β”‚ Prometheus  β”‚     β”‚   Jaeger    β”‚
β”‚ Dashboards  β”‚     β”‚  Metrics    β”‚     β”‚   Tracing   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
      :3001              :9091              :16686

Connection Methods

There are two ways to connect to Model Runner:

1. Using Internal DNS (Default)

This method uses Docker's internal DNS resolution to connect to the Model Runner:

  • Connection URL: http://model-runner.docker.internal/engines/llama.cpp/v1/
  • Configuration is set in backend.env

2. Using TCP

This method uses host-side TCP support:

  • Connection URL: host.docker.internal:12434
  • Requires updates to the environment configuration
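With either method, the backend reaches Model Runner through its OpenAI-compatible API. The sketch below is illustrative only (not the project's actual client code): it reads BASE_URL and MODEL from the environment and issues a single chat completion request, with error handling kept minimal.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	baseURL := os.Getenv("BASE_URL") // e.g. http://model-runner.docker.internal/engines/llama.cpp/v1/
	model := os.Getenv("MODEL")      // e.g. ignaciolopezluna020/llama3.2:1B

	// Build a minimal OpenAI-style chat completion request.
	payload, _ := json.Marshal(map[string]any{
		"model": model,
		"messages": []map[string]string{
			{"role": "user", "content": "Hello!"},
		},
	})

	resp, err := http.Post(baseURL+"chat/completions", "application/json", bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Print the raw JSON response returned by the model.
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body))
}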

Prerequisites

  • Docker and Docker Compose
  • Git
  • Go 1.19 or higher (for local development)
  • Node.js and npm (for frontend development)

Before starting, pull the required model:

docker model pull ignaciolopezluna020/llama3.2:1B

Quick Start

  1. Clone this repository:

    git clone https://github.com/ajeetraina/genai-app-demo.git
    cd genai-app-demo
  2. Start the application using Docker Compose:

    docker compose up -d --build
  3. Access the frontend at http://localhost:3000

  4. Access the observability dashboards:

     • Grafana: http://localhost:3001
     • Prometheus: http://localhost:9091
     • Jaeger: http://localhost:16686

Development Setup

Frontend

The frontend is built with React, TypeScript, and Vite:

cd frontend
npm install
npm run dev

This will start the development server at http://localhost:3000.

Backend

The Go backend can be run directly:

go mod download
go run main.go

Make sure to set the required environment variables from backend.env:

  • BASE_URL: URL for the model runner
  • MODEL: Model identifier to use
  • API_KEY: API key for authentication (defaults to "ollama")
  • LOG_LEVEL: Logging level (debug, info, warn, error)
  • LOG_PRETTY: Whether to output pretty-printed logs
  • TRACING_ENABLED: Enable OpenTelemetry tracing
  • OTLP_ENDPOINT: OpenTelemetry collector endpoint
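A typical backend.env might look like the following. The values are illustrative (the OTLP endpoint in particular depends on the collector service name and port in your Compose file), so adjust them to your setup:

# Example backend.env (illustrative values; adjust to your environment)
BASE_URL=http://model-runner.docker.internal/engines/llama.cpp/v1/
MODEL=ignaciolopezluna020/llama3.2:1B
API_KEY=ollama
LOG_LEVEL=info
LOG_PRETTY=true
TRACING_ENABLED=true
# OpenTelemetry collector endpoint; depends on your Compose service name and port
OTLP_ENDPOINT=jaeger:4318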

How It Works

  1. The frontend sends chat messages to the backend API
  2. The backend formats the messages and sends them to the Model Runner
  3. The LLM processes the input and generates a response
  4. The backend streams the tokens back to the frontend as they're generated (see the streaming sketch below)
  5. The frontend displays the incoming tokens in real-time
  6. Observability components collect metrics, logs, and traces throughout the process
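The streaming in step 4 works by flushing each chunk to the client as soon as it arrives instead of buffering the whole response. The sketch below shows the general pattern using Go's http.Flusher with server-sent events; it is an illustration with a hard-coded token slice, not the project's actual handler, which forwards tokens from the Model Runner's streaming response.

package main

import (
	"fmt"
	"net/http"
	"time"
)

func streamHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")

	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}

	// Stand-in for tokens produced by the LLM.
	for _, token := range []string{"Hello", ", ", "world", "!"} {
		fmt.Fprintf(w, "data: %s\n\n", token)
		flusher.Flush() // push the chunk to the client immediately instead of buffering
		time.Sleep(100 * time.Millisecond)
	}
}

func main() {
	http.HandleFunc("/chat", streamHandler)
	http.ListenAndServe(":8080", nil)
}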

Project Structure

β”œβ”€β”€ compose.yaml           # Docker Compose configuration
β”œβ”€β”€ backend.env            # Backend environment variables
β”œβ”€β”€ main.go                # Go backend server
β”œβ”€β”€ frontend/              # React frontend application
β”‚   β”œβ”€β”€ src/               # Source code
β”‚   β”‚   β”œβ”€β”€ components/    # React components
β”‚   β”‚   β”œβ”€β”€ App.tsx        # Main application component
β”‚   β”‚   └── ...
β”œβ”€β”€ pkg/                   # Go packages
β”‚   β”œβ”€β”€ logger/            # Structured logging
β”‚   β”œβ”€β”€ metrics/           # Prometheus metrics
β”‚   β”œβ”€β”€ middleware/        # HTTP middleware
β”‚   β”œβ”€β”€ tracing/           # OpenTelemetry tracing
β”‚   └── health/            # Health check endpoints
β”œβ”€β”€ prometheus/            # Prometheus configuration
β”œβ”€β”€ grafana/               # Grafana dashboards and configuration
β”œβ”€β”€ observability/         # Observability documentation
└── ...

Observability Features

The project includes comprehensive observability features:

Metrics

  • Model performance (latency, time to first token)
  • Token usage (input and output counts)
  • Request rates and error rates
  • Active request monitoring
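As a rough illustration of how such metrics can be recorded, the sketch below registers a time-to-first-token histogram with the Prometheus Go client and exposes it on /metrics. The metric name and helper are hypothetical, not the project's actual metric definitions.

package main

import (
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Histogram tracking how long it takes until the first token is emitted.
var firstTokenLatency = promauto.NewHistogram(prometheus.HistogramOpts{
	Name:    "genai_time_to_first_token_seconds",
	Help:    "Time from request start until the first token is emitted.",
	Buckets: prometheus.DefBuckets,
})

// observeFirstToken records the elapsed time since the request started.
func observeFirstToken(start time.Time) {
	firstTokenLatency.Observe(time.Since(start).Seconds())
}

func main() {
	// Expose /metrics for Prometheus to scrape.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}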

Logging

  • Structured JSON logs with zerolog
  • Log levels (debug, info, warn, error, fatal)
  • Request logging middleware
  • Error tracking
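A minimal zerolog sketch, with illustrative field names rather than the project's actual log schema:

package main

import (
	"os"

	"github.com/rs/zerolog"
	"github.com/rs/zerolog/log"
)

func main() {
	// Honor LOG_LEVEL by setting the global level (info shown here).
	zerolog.SetGlobalLevel(zerolog.InfoLevel)

	// Pretty, human-readable output for local development (LOG_PRETTY=true);
	// omit this line to emit structured JSON instead.
	log.Logger = log.Output(zerolog.ConsoleWriter{Out: os.Stderr})

	log.Info().
		Str("path", "/chat").
		Int("status", 200).
		Msg("request completed")
}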

Tracing

  • Request flow tracing with OpenTelemetry
  • Integration with Jaeger for visualization
  • Span context propagation
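Creating and annotating a span with the OpenTelemetry Go API looks roughly like the sketch below. The tracer name and attribute key are illustrative; in the real application a tracer provider exporting to the configured OTLP endpoint is set up at startup, whereas without one this code falls back to the no-op default provider.

package main

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
)

func handleChat(ctx context.Context, model string) {
	tracer := otel.Tracer("genai-backend") // instrumentation scope name (illustrative)
	ctx, span := tracer.Start(ctx, "chat-completion")
	defer span.End()

	span.SetAttributes(attribute.String("llm.model", model))
	_ = ctx // pass ctx onward so child spans join the same trace
}

func main() {
	handleChat(context.Background(), "llama3.2")
}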

For more information, see Observability Documentation.

Customization

You can customize the application by:

  1. Changing the model in backend.env to use a different LLM
  2. Modifying the frontend components for a different UI experience
  3. Extending the backend API with additional functionality
  4. Customizing the Grafana dashboards for different metrics

Testing

The project includes integration tests using Testcontainers:

cd tests
go test -v
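A trimmed-down example of what such a test can look like with testcontainers-go is shown below; the image tag and health endpoint are assumptions for illustration, not the repository's actual test code.

package tests

import (
	"context"
	"testing"

	"github.com/testcontainers/testcontainers-go"
	"github.com/testcontainers/testcontainers-go/wait"
)

func TestBackendStarts(t *testing.T) {
	ctx := context.Background()

	// Start the backend image and wait until its health endpoint responds.
	req := testcontainers.ContainerRequest{
		Image:        "genai-app-demo-backend:latest", // assumed local image tag
		ExposedPorts: []string{"8080/tcp"},
		WaitingFor:   wait.ForHTTP("/health").WithPort("8080/tcp"),
	}

	backend, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
		ContainerRequest: req,
		Started:          true,
	})
	if err != nil {
		t.Fatalf("failed to start backend container: %v", err)
	}
	defer backend.Terminate(ctx)
}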

Troubleshooting

  • Model not loading: Ensure you've pulled the model with docker model pull
  • Connection errors: Verify Docker network settings and that Model Runner is running
  • Streaming issues: Check CORS settings in the backend code
  • Metrics not showing: Verify that Prometheus can reach the backend metrics endpoint

License

MIT

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request
