A Next.js application that uses a large language model to control a computer through both local system control and virtual machine (Docker) environments.
🚧 Work in Progress
This project is under active development. Some features may be incomplete or subject to change.
The overall goal is to create a tool that lets a user control their computer with any large language model from Node.js. Anthropic's Computer Use Demo is the main inspiration for this project.
Roadmap:
- ✅ Docker container management
- ✅ VNC integration
- ✅ Chat interface
- 🚧 (Generic) LLM integration
  - ✅ Base architecture
  - ✅ Model selection
  - ✅ Model tracking
  - ✅ Message history
  - ✅ Local model support
  - ✅ Model download tracking
  - 🚧 Context management
  - 🚧 Function calling
  - ⬜ Streaming support
- ⬜ Computer use tooling
  - ⬜ File management
  - ⬜ Screenshot analysis
  - ⬜ Mouse and keyboard control
  - ⬜ Bash command execution
- 🚧 Launch options
  - ⬜ CLI
  - ✅ Web server
  - ⬜ Electron app
- 🚧 Computer Use modes
  - ✅ Virtual (Docker)
  - ⬜ Local (direct control)
- ⬜ Conversation history
- ⬜ Multi Agent support
- ⬜ Memory management
Please check back later for updates or feel free to contribute!
Features:
- Screenshot analysis
- Mouse and keyboard control
- Bash command execution
- File management
- Chat interface for LLM interaction
- VNC-based graphical interactions
- Local Mode: Direct system control
- Docker Mode: Virtual machine control via Docker containers
- Multiple Launch Options:
  - Web browser (Next.js server)
  - Desktop application (Electron)
  - CLI for specific LLM tasks
- Real-time container management
- Build progress streaming
- Container lifecycle control (start, stop, delete; a sketch follows this feature list)
- Status monitoring and detailed logging
- NoVNC integration for web-based access
- Automated environment setup
- Responsive split-view layout
- Settings sidebar
- Real-time Docker status indicators
- Expandable log entries
- Copy-to-clipboard functionality
- Auto-scrolling chat interface
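As a rough illustration of the container lifecycle control listed above, here is a minimal sketch using Dockerode (the container-management library from the tech stack below). The image tag, socket path, and options are illustrative assumptions, not the project's actual configuration.

```typescript
// Sketch: start, stop, and delete a container with Dockerode.
// The image tag and socket path are assumptions for illustration.
import Docker from "dockerode";

const docker = new Docker({ socketPath: "/var/run/docker.sock" });

async function lifecycleDemo(): Promise<void> {
  // Create and start a container from a previously built image.
  const container = await docker.createContainer({
    Image: "llm-controlled-computer:latest", // hypothetical image tag
    Tty: true,
  });
  await container.start();

  // ... the LLM would interact with the environment here (VNC, shell, etc.) ...

  // Stop and delete the container when the session ends.
  await container.stop();
  await container.remove();
}

lifecycleDemo().catch(console.error);
```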
Tech Stack:
- Frontend: Next.js with TypeScript
- UI Components: Radix UI, Tailwind CSS
- Container Management: Dockerode
- Remote Access: VNC, SSH2
- LLM Integration: Langchain.js
- Desktop Packaging: Electron
- Terminal: node-pty, xterm.js
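The terminal entry pairs node-pty on the server with xterm.js in the browser. A minimal sketch of that wiring is below; the shell choice and the transport back to xterm.js are assumptions, not the project's actual implementation.

```typescript
// Sketch: spawn a pseudo-terminal with node-pty and forward its output
// to an xterm.js client (the WebSocket transport is assumed, not shown).
import { spawn } from "node-pty";

const shell = spawn(process.platform === "win32" ? "powershell.exe" : "bash", [], {
  name: "xterm-color",
  cols: 80,
  rows: 30,
  cwd: process.env.HOME,
  env: process.env as { [key: string]: string },
});

// PTY output would be sent to the frontend, where xterm.js renders it
// via terminal.write(data); here it is simply echoed to stdout.
shell.onData((data) => {
  process.stdout.write(data);
});

// Input coming back from xterm.js is written straight into the PTY.
shell.write('echo "hello from the pty"\r');
```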
Prerequisites:
- Node.js (LTS version)
- Docker
- Python 3.11.6 (for certain features)
- Ollama (for local models; see the Ollama Setup section)
Installation:
```bash
git clone [repository-url]
cd llm-controlled-computer
npm install
cp .env.example .env
```
Edit `.env` with your configuration.

Start the development server:
```bash
npm run dev
```
For a production build:
```bash
npm run build
```
For the Electron desktop app:
```bash
npm run build:electron
```
The application includes a custom Docker environment with:
- Ubuntu 22.04 base
- Python environment with pyenv
- Desktop environment with VNC access
- Firefox ESR with pre-configured extensions
- Various utility applications
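Building this custom image and streaming its build progress (one of the Docker integration features above) could look roughly like the sketch below using Dockerode. The build context path, image tag, and progress handling are assumptions, not the project's actual code.

```typescript
// Sketch: build the custom Docker image and stream build progress events.
// Assumes a local Docker socket and a Dockerfile in ./docker (illustrative path).
import Docker from "dockerode";

const docker = new Docker({ socketPath: "/var/run/docker.sock" });

async function buildEnvironmentImage(): Promise<void> {
  const stream = await docker.buildImage(
    { context: "./docker", src: ["Dockerfile"] },
    { t: "llm-controlled-computer:latest" } // hypothetical tag
  );

  await new Promise<void>((resolve, reject) => {
    docker.modem.followProgress(
      stream,
      (err) => (err ? reject(err) : resolve()), // build finished (or failed)
      (event) => {
        // Each event is a progress message from the Docker daemon; the app
        // would forward these to the UI instead of printing them.
        if (event.stream) process.stdout.write(event.stream);
      }
    );
  });
}

buildEnvironmentImage().catch(console.error);
```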
Ollama Setup:

macOS:
```bash
# Using Homebrew
brew install ollama

# Start Ollama service
ollama serve
```

Linux:
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Start Ollama service
systemctl start ollama
```

Windows (WSL2):
- Install WSL2 if not already installed:
  ```bash
  wsl --install
  ```
- Install Ollama in WSL2:
  ```bash
  curl -fsSL https://ollama.com/install.sh | sh
  ```
- Start the Ollama service in WSL2:
  ```bash
  ollama serve
  ```
Add the following to your `.env` file:
```
# Ollama Configuration
NEXT_PUBLIC_OLLAMA_URL=http://localhost:11434
```
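With that URL in place, the app can reach a local model through Langchain.js. The sketch below uses an Ollama chat model; the package import, model name, and prompt are illustrative assumptions rather than the project's actual wiring.

```typescript
// Sketch: call a locally served Ollama model through Langchain.js,
// reading the base URL from the env var configured above.
import { ChatOllama } from "@langchain/ollama"; // package name assumed

const model = new ChatOllama({
  baseUrl: process.env.NEXT_PUBLIC_OLLAMA_URL ?? "http://localhost:11434",
  model: "llama3", // illustrative; any model pulled into Ollama works
});

async function main(): Promise<void> {
  const reply = await model.invoke("Summarize what you can see on the screen.");
  console.log(reply.content);
}

main().catch(console.error);
```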
Troubleshooting:
- Check if Ollama is running:
  ```bash
  curl http://localhost:11434/api/version
  ```
- If not running, start the service:
  ```bash
  # macOS/Linux
  ollama serve

  # Windows (in WSL2)
  wsl -d Ubuntu -u root ollama serve
  ```
- Common issues:
  - Port 11434 is already in use
  - Insufficient disk space
  - GPU drivers not properly installed (for GPU acceleration)
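The same liveness check can be done from the app itself. A minimal sketch, assuming the version endpoint above is used as the probe:

```typescript
// Sketch: verify the Ollama server is reachable before offering local models.
async function isOllamaUp(
  baseUrl: string = process.env.NEXT_PUBLIC_OLLAMA_URL ?? "http://localhost:11434"
): Promise<boolean> {
  try {
    const res = await fetch(`${baseUrl}/api/version`);
    return res.ok;
  } catch {
    return false;
  }
}

isOllamaUp().then((up) => console.log(up ? "Ollama reachable" : "Ollama not reachable"));
```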
Contributing:
- Ensure you follow the project's coding standards:
  - Use TypeScript with strict typing
  - Follow clean code principles
  - Write comprehensive tests
  - Add proper documentation
- Submit pull requests with:
  - Clear description of changes
  - Test coverage
  - Documentation updates
License: ISC