
Set connecttimeout with user option (Fixes #318) #321

Open
wants to merge 3 commits into main

Conversation

CorradoLanera

The suggestion in #318 (comment) by @hadley solves #318: my reported query succeeded, as did all the other queries of mine that had previously failed.

I've:

  • added the corresponding code in the same place as `timeout`;
  • included the corresponding NEWS item.
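
For reference, the connect timeout maps onto libcurl's `connecttimeout` option, which httr2 exposes through `req_options()`. A minimal sketch of the idea, where the `getOption()` names and default values are hypothetical stand-ins rather than the exact code in this PR:

```r
library(httr2)

# Sketch: bound the connection phase separately from the total timeout.
# The option names "ellmer.timeout_s" / "ellmer.connecttimeout_s" are
# hypothetical, shown only to illustrate a user-configurable default.
req <- request("https://api.openai.com/v1/chat/completions") |>
  req_timeout(getOption("ellmer.timeout_s", 60)) |>
  req_options(connecttimeout = getOption("ellmer.connecttimeout_s", 10))
```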

Collaborator

@atheriel left a comment


Looks right to me.

@atheriel
Collaborator

Many of the snapshots need to be updated with this change, unfortunately.

Co-authored-by: Aaron Jacobs <[email protected]>
@hadley
Member

hadley commented Feb 12, 2025

@atheriel what are you checking for in the tests when you print the request object? Maybe it would be better to test it with req_dry_run()?

@atheriel
Collaborator

I was largely testing if the body and headers look right. Very open to suggestions on a better way to do that.

@hadley
Member

hadley commented Feb 12, 2025

@atheriel maybe with req_dry_run()? There are two current drawbacks to using it:

  1. It's slow because it starts and stops an httpuv server on each run.
  2. It includes a few headers that are known to change each run (like Date).

It wouldn't be hard to fix those problems in httr2.
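
For context, a snapshot test built on `req_dry_run()` might look roughly like this; the request contents are invented for illustration, and as noted above, Date-style headers would still vary between runs:

```r
library(httr2)
library(testthat)

test_that("request body and headers look right", {
  req <- request("http://example.com/v1/chat/completions") |>
    req_headers(Authorization = "Bearer <redacted>") |>
    req_body_json(list(model = "gpt-4o", stream = FALSE))

  # req_dry_run() sends the request to a local httpuv server and
  # prints the raw HTTP message, which expect_snapshot() can capture.
  expect_snapshot(req_dry_run(req))
})
```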

@hadley
Member

hadley commented Mar 6, 2025

I'm not convinced that this is the correct solution because it means that (e.g.) chat_openai(base_url = "https://doesntexist.com")$chat("Hi") will take 60s to time out.

@schelhorn

schelhorn commented Mar 17, 2025

I just wanted to add as a user perspective here: the flexible timeout setting is very important when using local models (as for instance served by llama.cpp's llama-server).

The reason is that even with Metal execution on an M1 Max, larger models such as the new gemma-3-27b need more than the default 60-second {httr2} timeout to generate a complete response when tool use via {ragnar} is specified.

This is because, as far as I could find out, tool use does not work with a streaming response, so I have to use ellmer::chat_openai with echo = 'none'. Consequently, llama-server sends back the response only once it has been fully generated, which often takes longer than 60 seconds for a 2048-token response, and in the meantime {httr2} times out.

So I (and probably other local LLM users) would certainly appreciate some sort of configurable option here, since unfortunately neither {httr2} nor {curl} allows setting a global timeout option in the user's R session.
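
As a stopgap under those constraints, the total timeout can at least be raised per request when building one with httr2 directly; a sketch, with the 300-second bound chosen arbitrarily for illustration:

```r
library(httr2)

# Slow, non-streaming local generation: raise libcurl's total timeout
# for this one request instead of relying on a global setting.
req <- request("http://localhost:8080/v1/chat/completions") |>
  req_body_json(list(model = "gemma-3-27b", stream = FALSE)) |>
  req_timeout(300)
```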

@hadley
Member

hadley commented Mar 17, 2025

@schelhorn I get that, but I'm not convinced that this is the right timeout to set: even if it takes llama-server a long time to generate the complete response, it should still connect quickly.
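
The distinction here corresponds to two separate libcurl options, which httr2 lets you set independently; a sketch with illustrative values:

```r
library(httr2)

req <- request("http://localhost:8080/v1/chat/completions") |>
  # Bound only the TCP/TLS connection phase: a dead host fails fast.
  req_options(connecttimeout = 10) |>
  # Bound the whole transfer: generation may legitimately take minutes.
  req_timeout(600)
```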
