Infollama is a Python server that manages a token-protected proxy for Ollama.
Infollama also retrieves and displays, in a real-time UI, useful details about the Ollama server, including available models, running models, file sizes, RAM usage, and more. It also provides hardware information, in particular GPU and RAM usage.
- Run a proxy to access your Ollama API server on localhost, LAN and WAN
- Protect your Ollama server with one token per user or usage
- Display useful details about the Ollama server (models, running models, sizes) and hardware information (CPU, GPUs, RAM and VRAM usage)
- Log Ollama API calls in a log file (HTTP log format) with different levels: NEVER, ERROR, INFO, PROMPT and ALL, including the full JSON prompt request
- Python 3.10 or higher
- Ollama server running on your local machine (See Ollama repository)
- Tested on Linux Ubuntu, Windows 10/11, and macOS with Apple Silicon (Mx) chips
- Clone the repository:

```bash
git clone https://github.com/toutjavascript/infollama-proxy.git
cd infollama-proxy
```

- Create and activate a virtual environment:

```bash
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
```

- Install the required dependencies:

```bash
pip install -r requirements.txt
```
Run the script with the following command:

```bash
python proxy.py
```

Open your browser and navigate to http://localhost:11430/info to access the Infollama Proxy web UI.
You can modify the launch configuration with these parameters:

```
usage: proxy.py [-h] [--base_url BASE_URL] [--host HOST] [--port PORT] [--cors CORS] [--anonym ANONYM] [--log LOG]

--base_url BASE_URL  The base_url of the localhost Ollama server (default: http://localhost:11434)
--host HOST          The host name for the proxy server (default: 0.0.0.0)
--port PORT          The port for the proxy server (default: 11430)
--cors CORS          The CORS policy for the proxy server (default: *)
--anonym ANONYM      Authorize anonymous access to the proxy server, without a token (default: False)
--log LOG            Define the log level stored in proxy.log (default: PROMPT; can be NEVER|ERROR|INFO|PROMPT|ALL)
```
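For example, to run the proxy on a different port with a less verbose log level (the values below are only illustrative):

```bash
python proxy.py --port 11431 --log INFO
```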
This repository is under heavy construction. To update the source code from GitHub, open a terminal in the infollama-proxy folder and pull the latest changes:

```bash
git pull
```
Infollama is not only a proxy server but also a powerful web UI that displays hardware status, such as GPU usage and temperatures, memory usage, and other information.

You can now use the proxy to chat with your Ollama server. Infollama works as an OpenAI-compatible LLM server; you must set the base URL to use port 11430:

- base_url is now http://localhost:11430/v1

Do not forget to provide a valid token, starting with `pro_`, as defined in the `users.conf` file:

- api_key = "pro_xxxxxxxxxxxxxx"
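As an illustration, here is a minimal sketch using the `openai` Python package pointed at the proxy (the model name and token are placeholders):

```python
from openai import OpenAI

# Point the OpenAI client at the Infollama proxy instead of api.openai.com
client = OpenAI(
    base_url="http://localhost:11430/v1",
    api_key="pro_xxxxxxxxxxxxxx",  # a token defined in users.conf
)

# Chat with any model served by the Ollama instance behind the proxy
response = client.chat.completions.create(
    model="falcon3:1b",  # example model; use any model pulled in Ollama
    messages=[{"role": "user", "content": "Hello, who are you?"}],
)
print(response.choices[0].message.content)
```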
Token definitions are set in the `users.conf` file. On first launch, `users.conf` is created from the `users.default.conf` file. This text file lists the tokens line by line with this format:

```
user_type:user_name:token
```

- `user_type` can be `user` or `admin`. An `admin` user can access more APIs (pull, delete, copy, ...) and can view the full log file in the web UI.
- `user_name` is a simple text string.
- `token` is a string that must start with `pro_`.
- Parameters are separated with `:`.

If the `--anonym` parameter is set at startup, `users.conf` is ignored and all accesses are authorised. The user name is then set to `openbar`.
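For example, a `users.conf` file might look like this (the user names and tokens below are purely illustrative):

```
admin:alice:pro_a1b2c3d4e5f6
user:bob:pro_f6e5d4c3b2a1
```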
You can log every prompt that is sent to the server. Note that responses are not logged, to preserve privacy and disk space. This proxy app has several levels of logging:

- `NEVER`: No logs at all.
- `ERROR`: Log only errors and unauthorised requests.
- `INFO`: Log useful accesses (not api/ps, api/tags, ...), excluding prompts.
- `PROMPT`: Log useful accesses (not api/ps, api/tags, ...), including prompts.
- `ALL`: Log every event, including prompts.

By default, the level is set to `PROMPT`.
The log file uses the Apache server log format. For example, one line at the `PROMPT` level looks like this:

```
127.0.0.1 - user1 [16/Jan/2025:15:53:10] "STREAM /v1/chat/completions HTTP/1.1" 200 {'model': 'falcon3:1b', 'messages': [{'role': 'system', 'content': "You are a helpful web developer assistant and you obey to user's commands"}, {'role': 'user', 'content': ' Give me 10 python web servers. Tell me cons and pros. Conclude by choosing the easiest one. Do not write code.'}], 'stream': True, 'max_tokens': 1048}
```
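As a quick illustration, here is a minimal sketch (not part of the project) that counts proxy.log entries per user, assuming the Apache-like format shown above:

```python
from collections import Counter

# Tally requests per user from proxy.log.
# In the format above, the user name is the third whitespace-separated field.
counts = Counter()
with open("proxy.log", encoding="utf-8") as log:
    for line in log:
        fields = line.split()
        if len(fields) >= 3:
            counts[fields[2]] += 1

for user, total in counts.most_common():
    print(f"{user}: {total} requests")
```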
Correcting bugs and user issues is the priority.
- Add buttons to start and stop models
- Add dark/light display mode
- Secure token storage with HTTPOnly cookie or browser keychain if available
- Add a GPU database to compare LLM performances
- Create a more efficient installation process (Docker and .bat)
- Add a simple API that returns the current usage from the server (running models, hardware details, free available VRAM, ...)
- Add a web UI to view or export logs (by user or full log if admin is connected)
- Add integrated support for tunneling to web
- Add a fallback system to access another LLM provider if the current one is down
- Add an easy LLM speed benchmark
- Add a log file size checker
Because I needed two functionalities:

- Access to the Ollama server on the LAN and over the web. As Ollama is not protected by token access, I needed to manage it in a simple way.
- A real-time view of the Ollama server status
If you see the error message `Error get_device_info(): no module name 'distutils'`, try updating your install with:

```bash
pip install -U pip setuptools wheel
```
Fully tested with solutions like:

- ngrok

```bash
ngrok http http://localhost:11430
```

- bore.pub (but no SSL support)

```bash
bore local 11430 --to bore.pub
```
IF YOU OPEN INFOLLAMA OVER THE WEB, DO NOT FORGET TO CHANGE THE DEFAULT TOKENS IN THE `users.conf` FILE.

With web access, the diagram shows access from outside your LAN.
We welcome contributions from the community. Please feel free to open an issue or a pull request.