A modern, full-stack chat application demonstrating how to integrate React frontend with a Go backend and run local Large Language Models (LLMs) using Docker's Model Runner.
This project showcases a complete Generative AI interface that includes:
- React/TypeScript frontend with a responsive chat UI
- Go backend server for API handling
- Integration with Docker's Model Runner to run Llama 3.2 locally
- Comprehensive observability with metrics, logging, and tracing
- Interactive chat interface with message history
- Real-time streaming responses (tokens appear as they're generated)
- Light/dark mode support based on user preference
- Dockerized deployment for easy setup and portability
- Run AI models locally without cloud API dependencies
- Cross-origin resource sharing (CORS) enabled
- Integration testing using Testcontainers
- Metrics and performance monitoring
- Structured logging with zerolog
- Distributed tracing with OpenTelemetry
- Grafana dashboards for visualization

The application consists of these main components:
┌──────────────┐      ┌──────────────┐      ┌──────────────┐
│   Frontend   │ >>>  │   Backend    │ >>>  │ Model Runner │
│  (React/TS)  │      │     (Go)     │      │ (Llama 3.2)  │
└──────────────┘      └──────────────┘      └──────────────┘
     :3000                 :8080                :12434
                             │                     │
┌──────────────┐      ┌──────────────┐      ┌──────────────┐
│   Grafana    │ <<<  │  Prometheus  │      │    Jaeger    │
│  Dashboards  │      │   Metrics    │      │   Tracing    │
└──────────────┘      └──────────────┘      └──────────────┘
     :3001                 :9091               :16686
There are two ways to connect to Model Runner:

Using Docker's internal DNS resolution:
- Connection URL: http://model-runner.docker.internal/engines/llama.cpp/v1/
- Configuration is set in backend.env

Using host-side TCP support:
- Connection URL: host.docker.internal:12434
- Requires updates to the environment configuration
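For reference, here is a minimal sketch of what a chat request to the internal DNS URL could look like from Go code running inside the Compose network. It assumes the Model Runner exposes an OpenAI-compatible chat completions endpoint under the `/v1/` base URL shown above; the request body and paths are illustrative only.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// BASE_URL from backend.env plus the OpenAI-style chat completions path.
	// This hostname only resolves from inside a container; with host-side TCP
	// enabled, the :12434 endpoint would be targeted instead.
	url := "http://model-runner.docker.internal/engines/llama.cpp/v1/chat/completions"

	body := []byte(`{
		"model": "ignaciolopezluna020/llama3.2:1B",
		"messages": [{"role": "user", "content": "Hello!"}]
	}`)

	resp, err := http.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```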
- Docker and Docker Compose
- Git
- Go 1.19 or higher (for local development)
- Node.js and npm (for frontend development)
Before starting, pull the required model:
docker model pull ignaciolopezluna020/llama3.2:1B
- Clone this repository:
  git clone https://github.com/ajeetraina/genai-app-demo.git
  cd genai-app-demo
- Start the application using Docker Compose:
  docker compose up -d --build
- Access the frontend at http://localhost:3000
- Access observability dashboards:
  - Grafana: http://localhost:3001 (admin/admin)
  - Jaeger UI: http://localhost:16686
  - Prometheus: http://localhost:9091
The frontend is built with React, TypeScript, and Vite:
cd frontend
npm install
npm run dev
This will start the development server at http://localhost:3000.
The Go backend can be run directly:
go mod download
go run main.go
Make sure to set the required environment variables from backend.env:
- BASE_URL: URL for the model runner
- MODEL: Model identifier to use
- API_KEY: API key for authentication (defaults to "ollama")
- LOG_LEVEL: Logging level (debug, info, warn, error)
- LOG_PRETTY: Whether to output pretty-printed logs
- TRACING_ENABLED: Enable OpenTelemetry tracing
- OTLP_ENDPOINT: OpenTelemetry collector endpoint
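A minimal sketch of reading these variables at startup; the helper and fallback values below are illustrative rather than the project's actual configuration code.

```go
package main

import (
	"log"
	"os"
)

// getenv returns the value of key, falling back to def when the variable is unset.
func getenv(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

func main() {
	// Defaults are illustrative; the real values come from backend.env.
	baseURL := getenv("BASE_URL", "http://model-runner.docker.internal/engines/llama.cpp/v1/")
	model := getenv("MODEL", "ignaciolopezluna020/llama3.2:1B")
	apiKey := getenv("API_KEY", "ollama")
	logLevel := getenv("LOG_LEVEL", "info")

	log.Printf("model=%s base_url=%s log_level=%s api_key_set=%t",
		model, baseURL, logLevel, apiKey != "")
}
```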
- The frontend sends chat messages to the backend API
- The backend formats the messages and sends them to the Model Runner
- The LLM processes the input and generates a response
- The backend streams the tokens back to the frontend as they're generated
- The frontend displays the incoming tokens in real-time
- Observability components collect metrics, logs, and traces throughout the process
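The streaming step can be pictured with the sketch below, which assumes the backend relays tokens to the browser over Server-Sent Events; the handler shape and the token channel are illustrative, not the project's actual code.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// streamTokens relays generated tokens to the client as Server-Sent Events.
// The tokens channel would be fed by the goroutine reading the Model Runner response.
func streamTokens(w http.ResponseWriter, r *http.Request, tokens <-chan string) {
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	w.Header().Set("Access-Control-Allow-Origin", "*") // CORS for the React frontend

	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}

	for {
		select {
		case <-r.Context().Done():
			return // client disconnected
		case tok, open := <-tokens:
			if !open {
				return // generation finished
			}
			fmt.Fprintf(w, "data: %s\n\n", tok) // one SSE event per token
			flusher.Flush()                     // push it to the browser immediately
		}
	}
}

func main() {
	http.HandleFunc("/chat/stream", func(w http.ResponseWriter, r *http.Request) {
		demo := make(chan string, 3) // stand-in for real model output
		demo <- "Hello"
		demo <- ", "
		demo <- "world"
		close(demo)
		streamTokens(w, r, demo)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```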
├── compose.yaml        # Docker Compose configuration
├── backend.env         # Backend environment variables
├── main.go             # Go backend server
├── frontend/           # React frontend application
│   └── src/            # Source code
│       ├── components/ # React components
│       ├── App.tsx     # Main application component
│       └── ...
├── pkg/                # Go packages
│   ├── logger/         # Structured logging
│   ├── metrics/        # Prometheus metrics
│   ├── middleware/     # HTTP middleware
│   ├── tracing/        # OpenTelemetry tracing
│   └── health/         # Health check endpoints
├── prometheus/         # Prometheus configuration
├── grafana/            # Grafana dashboards and configuration
├── observability/      # Observability documentation
└── ...
The project includes comprehensive observability features:

Metrics:
- Model performance (latency, time to first token)
- Token usage (input and output counts)
- Request rates and error rates
- Active request monitoring
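The metrics above could be modelled with the Prometheus Go client roughly as follows; the metric names, labels, and buckets are illustrative, not the project's actual definitions.

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

var (
	// RequestDuration tracks end-to-end chat request latency per model.
	RequestDuration = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "chat_request_duration_seconds",
			Help:    "Duration of chat requests.",
			Buckets: prometheus.DefBuckets,
		},
		[]string{"model"},
	)

	// FirstTokenLatency tracks time to first token in a streamed response.
	FirstTokenLatency = prometheus.NewHistogram(
		prometheus.HistogramOpts{
			Name: "chat_first_token_seconds",
			Help: "Time to first streamed token.",
		},
	)

	// TokensProcessed counts input and output tokens.
	TokensProcessed = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "chat_tokens_total",
			Help: "Number of tokens processed.",
		},
		[]string{"direction"}, // "input" or "output"
	)
)

func init() {
	prometheus.MustRegister(RequestDuration, FirstTokenLatency, TokensProcessed)
}
```

The backend then only needs to expose promhttp.Handler() on a /metrics route for the Prometheus instance in the Compose stack to scrape.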
Logging:
- Structured JSON logs with zerolog
- Log levels (debug, info, warn, error, fatal)
- Request logging middleware
- Error tracking
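A zerolog setup along these lines would produce the structured JSON logs described above; the service name and level handling are illustrative.

```go
package logger

import (
	"os"

	"github.com/rs/zerolog"
)

// New builds a structured JSON logger; level and pretty-printing follow the
// LOG_LEVEL and LOG_PRETTY settings from backend.env.
func New(level string, pretty bool) zerolog.Logger {
	lvl, err := zerolog.ParseLevel(level)
	if err != nil {
		lvl = zerolog.InfoLevel // fall back to info on an unknown level
	}

	logger := zerolog.New(os.Stdout)
	if pretty {
		logger = zerolog.New(zerolog.ConsoleWriter{Out: os.Stdout})
	}

	return logger.Level(lvl).With().Timestamp().Str("service", "genai-backend").Logger()
}
```

A handler can then log with fields, e.g. log.Info().Str("path", "/chat").Int("status", 200).Msg("request completed").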
Tracing:
- Request flow tracing with OpenTelemetry
- Integration with Jaeger for visualization
- Span context propagation
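A span around the model call might be created like this with the OpenTelemetry Go SDK; the tracer name, span name, and attribute key are illustrative. When TRACING_ENABLED is off, the global tracer is a no-op, so the wrapper costs nothing.

```go
package tracing

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
)

// TraceModelCall wraps a call to the Model Runner in a span so that the
// request flow shows up in Jaeger alongside the HTTP server spans.
func TraceModelCall(ctx context.Context, model string, call func(context.Context) error) error {
	tracer := otel.Tracer("genai-backend")

	ctx, span := tracer.Start(ctx, "model.chat_completion")
	defer span.End()

	span.SetAttributes(attribute.String("llm.model", model))

	err := call(ctx)
	if err != nil {
		span.RecordError(err)
	}
	return err
}
```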
For more information, see Observability Documentation.
You can customize the application by:
- Changing the model in backend.env to use a different LLM
- Modifying the frontend components for a different UI experience
- Extending the backend API with additional functionality
- Customizing the Grafana dashboards for different metrics
The project includes integration tests using Testcontainers:
cd tests
go test -v
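A test in that directory might look roughly like the sketch below; the image name and health-check path are hypothetical, and the real tests may wire up the stack differently.

```go
package tests

import (
	"context"
	"testing"

	"github.com/testcontainers/testcontainers-go"
	"github.com/testcontainers/testcontainers-go/wait"
)

func TestBackendStarts(t *testing.T) {
	ctx := context.Background()

	req := testcontainers.ContainerRequest{
		Image:        "genai-app-demo-backend:latest", // hypothetical image name
		ExposedPorts: []string{"8080/tcp"},
		WaitingFor:   wait.ForHTTP("/health").WithPort("8080/tcp"), // hypothetical health path
	}

	backend, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
		ContainerRequest: req,
		Started:          true,
	})
	if err != nil {
		t.Fatalf("failed to start backend container: %v", err)
	}
	t.Cleanup(func() { _ = backend.Terminate(ctx) })

	endpoint, err := backend.Endpoint(ctx, "http")
	if err != nil {
		t.Fatalf("failed to resolve endpoint: %v", err)
	}
	t.Logf("backend reachable at %s", endpoint)
}
```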
- Model not loading: Ensure you've pulled the model with docker model pull
- Connection errors: Verify Docker network settings and that Model Runner is running
- Streaming issues: Check CORS settings in the backend code
- Metrics not showing: Verify that Prometheus can reach the backend metrics endpoint
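For the CORS and streaming items in particular, the backend has to send permissive headers on its API routes; a generic middleware sketch (not the project's actual middleware package) looks like this:

```go
package middleware

import "net/http"

// CORS lets the React frontend on :3000 call the backend from the browser,
// including the streaming endpoint, and answers preflight requests directly.
func CORS(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Access-Control-Allow-Origin", "*")
		w.Header().Set("Access-Control-Allow-Methods", "GET, POST, OPTIONS")
		w.Header().Set("Access-Control-Allow-Headers", "Content-Type, Authorization")

		if r.Method == http.MethodOptions {
			w.WriteHeader(http.StatusNoContent)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```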
This project is licensed under the MIT License.
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request