A Next.js application that uses a large language model to control a computer, either through direct local system control or inside a virtualized (Docker) environment.
🚧 Work in Progress
This project is under active development. Some features may be incomplete or subject to change.
The overall goal is to build a tool that lets a user control their computer with any large language model from Node.js. Anthropic's Computer Use demo is the main inspiration for this project.
Roadmap:

- ✅ Docker container management
- ✅ VNC integration
- ✅ Chat interface
- 🔳 (Generic) LLM integration
  - ✅ Base architecture
  - ✅ Model selection
  - ✅ Model tracking
  - ✅ Message history
  - ✅ Local model support
  - ✅ Model download tracking
  - 🔳 Context management
  - 🔳 Function calling
  - ⬜ Streaming support
- ⬜ Computer use tooling
  - ⬜ File management
  - ⬜ Screenshot analysis
  - ⬜ Mouse and keyboard control
  - ⬜ Bash command execution
- 🔳 Launch options
  - ⬜ CLI
  - ✅ Web server
  - ⬜ Electron app
- 🔳 Computer Use modes
  - ✅ Virtual (Docker)
  - ⬜ Local (direct control)
- ⬜ Conversation history
- ⬜ Multi-agent support
- ⬜ Memory management
Please check back later for updates or feel free to contribute!
Features:

- Screenshot analysis
- Mouse and keyboard control
- Bash command execution
- File management
- Chat interface for LLM interaction
- VNC-based graphical interactions
- Local Mode: direct system control
- Docker Mode: virtualized control via Docker containers
- Multiple launch options:
  - Web browser (Next.js server)
  - Desktop application (Electron)
  - CLI for specific LLM tasks
Docker Management:

- Real-time container management
- Build progress streaming
- Container lifecycle control (start, stop, delete; see the sketch after this list)
- Status monitoring and detailed logging
- noVNC integration for web-based access
- Automated environment setup

User Interface:

- Responsive split-view layout
- Settings sidebar
- Real-time Docker status indicators
- Expandable log entries
- Copy-to-clipboard functionality
- Auto-scrolling chat interface
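
To give a feel for the container lifecycle control above, here is a minimal Dockerode sketch. It is illustrative only: the image tag, container name, and port mapping are placeholder assumptions, not the project's actual values.

```typescript
import Docker from "dockerode";

// Talk to the local Docker daemon over its default Unix socket.
const docker = new Docker({ socketPath: "/var/run/docker.sock" });

async function runDesktopContainer(): Promise<void> {
  // "computer-use-env" and "llm-computer" are placeholder names.
  const container = await docker.createContainer({
    Image: "computer-use-env",
    name: "llm-computer",
    ExposedPorts: { "6080/tcp": {} }, // noVNC's web port
    HostConfig: {
      PortBindings: { "6080/tcp": [{ HostPort: "6080" }] },
    },
  });

  await container.start();
  console.log("Container started:", container.id);

  // The same handle covers the rest of the lifecycle.
  await container.stop();
  await container.remove();
}

runDesktopContainer().catch(console.error);
```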
Tech Stack:

- Frontend: Next.js with TypeScript
- UI Components: Radix UI, Tailwind CSS
- Container Management: Dockerode
- Remote Access: VNC, SSH2
- LLM Integration: LangChain.js
- Desktop Packaging: Electron
- Terminal: node-pty, xterm.js
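
Because LLM calls go through LangChain.js, swapping providers is largely a matter of swapping the chat model class. A rough sketch, assuming the @langchain/ollama package and a locally pulled model (the model name and prompt are illustrative):

```typescript
import { ChatOllama } from "@langchain/ollama";

// Base URL and model name are illustrative assumptions.
const model = new ChatOllama({
  baseUrl: process.env.NEXT_PUBLIC_OLLAMA_URL ?? "http://localhost:11434",
  model: "llama3.2",
});

async function main(): Promise<void> {
  // Every LangChain chat model exposes the same invoke() surface,
  // which is what keeps the LLM integration generic.
  const reply = await model.invoke("Describe what you see on the screen.");
  console.log(reply.content);
}

main().catch(console.error);
```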
Prerequisites:

- Node.js (LTS version)
- Docker
- Python 3.11.6 (for certain features)
- Ollama (for local models; see the Ollama Setup section)
Installation:

```bash
git clone [repository-url]
cd llm-controlled-computer
npm install
cp .env.example .env
```

Edit `.env` with your configuration.
Start the development server:

```bash
npm run dev
```

Create a production build:

```bash
npm run build
```

Build the Electron desktop app:

```bash
npm run build:electron
```
Docker Environment:

The application includes a custom Docker environment with:

- Ubuntu 22.04 base
- Python environment managed with pyenv
- Desktop environment with VNC access
- Firefox ESR with pre-configured extensions
- Various utility applications
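
For the build progress streaming mentioned earlier, Dockerode exposes the raw build stream plus a followProgress helper. A minimal sketch, assuming the Dockerfile lives in a ./docker directory and using a placeholder image tag:

```typescript
import Docker from "dockerode";

const docker = new Docker();

async function buildWithProgress(): Promise<void> {
  // "./docker" and "computer-use-env" are placeholder assumptions.
  const stream = await docker.buildImage(
    { context: "./docker", src: ["Dockerfile"] },
    { t: "computer-use-env" }
  );

  await new Promise<void>((resolve, reject) => {
    docker.modem.followProgress(
      stream,
      (err) => (err ? reject(err) : resolve()), // called once the build ends
      (event) => {
        // Each event is one JSON line of Docker build output.
        if (event.stream) process.stdout.write(event.stream);
      }
    );
  });
}

buildWithProgress().catch(console.error);
```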
Ollama Setup:

macOS:

```bash
# Using Homebrew
brew install ollama

# Start the Ollama service
ollama serve
```

Linux:

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Start the Ollama service
systemctl start ollama
```

Windows (WSL2):

- Install WSL2 if not already installed:

```bash
wsl --install
```

- Install Ollama in WSL2:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

- Start the Ollama service in WSL2:

```bash
ollama serve
```
Add the following to your `.env` file:

```bash
# Ollama Configuration
NEXT_PUBLIC_OLLAMA_URL=http://localhost:11434
```
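
To verify this URL from the application side, one option is to hit Ollama's /api/tags endpoint, which lists the models pulled onto the machine. The helper below is a sketch, not code from this project:

```typescript
// Illustrative helper; the name and types are assumptions.
const OLLAMA_URL =
  process.env.NEXT_PUBLIC_OLLAMA_URL ?? "http://localhost:11434";

interface OllamaModel {
  name: string;
  size: number;
}

async function listLocalModels(): Promise<OllamaModel[]> {
  // GET /api/tags is Ollama's endpoint for locally available models.
  const res = await fetch(`${OLLAMA_URL}/api/tags`);
  if (!res.ok) throw new Error(`Ollama unreachable: HTTP ${res.status}`);
  const data = (await res.json()) as { models: OllamaModel[] };
  return data.models;
}

listLocalModels().then((models) => console.log(models.map((m) => m.name)));
```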
Troubleshooting:

- Check whether Ollama is running:

```bash
curl http://localhost:11434/api/version
```

- If it is not running, start the service:

```bash
# macOS/Linux
ollama serve

# Windows (in WSL2)
wsl -d Ubuntu -u root ollama serve
```

- Common issues:
  - Port 11434 already in use
  - Insufficient disk space
  - GPU drivers not properly installed (for GPU acceleration)
Contributing:

- Ensure you follow the project's coding standards:
  - Use TypeScript with strict typing
  - Follow clean code principles
  - Write comprehensive tests
  - Add proper documentation
- Submit pull requests with:
  - A clear description of the changes
  - Test coverage
  - Documentation updates
License: ISC