ramalama serve doesn't work when podman is installed #442

Closed
grillo-delmal opened this issue Nov 11, 2024 · 2 comments

Comments

@grillo-delmal
Collaborator

grillo-delmal commented Nov 11, 2024

What is the problem:

When running ramalama on a computer with podman installed, if you try to set up a server through ramalama serve, the server will be inaccessible from the host computer.

How to reproduce

  • Install ramalama and podman on the same computer
  • Start ramalama
ramalama serve tiny
  • Run the following script on the host machine (outside the container)
import requests

url = "http://localhost:8080/v1/chat/completions"
headers = {
    "Content-Type": "application/json"
}
data = {
    "messages": [
        {"role": "user", "content": "Hello, world!"}
    ]
}

# Send a chat completion request to the server on the host's port 8080
response = requests.post(url, headers=headers, json=data)
print(response.status_code, response.text)

What is happening?

First, here is the difference between running ramalama with and without podman installed.

Without podman installed

ramalama --dryrun serve tiny
> os.execvp(llama-server, ['llama-server', '--port', '8080', '-m', '/path/to/model'])

With podman installed

ramalama --dryrun serve tiny
> podman run --rm -i --label RAMALAMA ... quay.io/ramalama/ramalama:latest llama-server --port 8080 -m /path/to/mod

As can be seen, both invocations run the llama-server command without the --host parameter, so llama-server defaults to listening on 127.0.0.1. That makes sense in the first case, but in the second case it means the server cannot accept requests coming from the host environment.

I was able to verify this by entering the container through podman exec and successfully running the Python script from inside it.
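
For reference, here is a rough sketch of a container invocation that would be reachable from the host (simplified flags, not the exact command ramalama generates): publishing the port and passing --host 0.0.0.0 lets llama-server accept connections that arrive through the published port.

podman run --rm -i -p 8080:8080 quay.io/ramalama/ramalama:latest llama-server --host 0.0.0.0 --port 8080 -m /path/to/model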

@grillo-delmal
Collaborator Author

I believe that this problem is a byproduct of 8ed6f48.
Before that commit, ramalama ran inside the container and processed the arguments from within it, so it could properly determine that the llama-server command was going to run in a container, since the check itself was executed inside the container.

Here is the check:
https://github.com/containers/ramalama/blob/main/ramalama/model.py#L320-L321

Here is how that check is being evaluated right now:
https://github.com/containers/ramalama/blob/main/ramalama/common.py#L17-L21

Adding an extra check for whether the command will end up running inside a container would help fix this issue. The decision to run in a container is currently being made inside the exec_model_in_container method:
https://github.com/containers/ramalama/blob/main/ramalama/model.py#L219-L223
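
As a minimal sketch of that extra evaluation (hypothetical names, not the actual ramalama code, assuming the usual container marker files and that the caller knows whether the command is about to be wrapped by a container engine):

import os

def will_run_in_container(wrapped_in_engine):
    # Already inside a container (podman creates /run/.containerenv, docker creates /.dockerenv)...
    already_inside = os.path.exists("/run/.containerenv") or os.path.exists("/.dockerenv")
    # ...or the command is about to be launched through podman/docker.
    return already_inside or wrapped_in_engine

def build_serve_args(port, model_path, wrapped_in_engine):
    args = ["llama-server", "--port", port, "-m", model_path]
    if will_run_in_container(wrapped_in_engine):
        # Bind to all interfaces so the published port is reachable from the host.
        args += ["--host", "0.0.0.0"]
    return args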

@grillo-delmal
Collaborator Author

Fixed by merging #444 ^^
