Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

httpstan fails to give a response for large outputs #652

Open
wm1995 opened this issue Dec 6, 2023 · 1 comment
Open

httpstan fails to give a response for large outputs #652

wm1995 opened this issue Dec 6, 2023 · 1 comment

Comments

@wm1995
Copy link

wm1995 commented Dec 6, 2023

This is following on from stan-dev/pystan#396, which ultimately seems to root from an httpstan issue. I'm working on an MWE, but the case I've identified so far seems to happen with a very large dataset. I used curl to interact with httpstan, to isolate whether it was a Pystan or httpstan error.

In the case I've identified, I'm able to compile and sample a large model, which appears to complete successfully according to the operation endpoint. However, when I do the final GET request to retrieve the model results, httpstan sends a header for the response, but then closes the connection without sending any data. There's no visible error on the console running httpstan (which I started using python -m httpstan), but the curl request fails:

$ curl localhost:8080/v1/models/ex6bdm7x/fits/wxft2x3c
curl: (18) transfer closed with 3144360709 bytes remaining to read

and packet sniffing suggests that httpstan is closing the connection:

12	5.048341	127.0.0.1	127.0.0.1	TCP	68	54876 → 8080 [SYN] Seq=0 Win=65535 Len=0 MSS=16344 WS=64 TSval=3703300602 TSecr=0 SACK_PERM
13	5.048735	127.0.0.1	127.0.0.1	TCP	68	8080 → 54876 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=16344 WS=64 TSval=157578390 TSecr=3703300602 SACK_PERM
14	5.048759	127.0.0.1	127.0.0.1	TCP	56	54876 → 8080 [ACK] Seq=1 Ack=1 Win=408256 Len=0 TSval=3703300602 TSecr=157578390
15	5.048771	127.0.0.1	127.0.0.1	TCP	56	[TCP Window Update] 8080 → 54876 [ACK] Seq=1 Ack=1 Win=408256 Len=0 TSval=157578390 TSecr=3703300602
16	5.048832	127.0.0.1	127.0.0.1	HTTP	165	GET /v1/models/ex6bdm7x/fits/wxft2x3c HTTP/1.1 
17	5.048851	127.0.0.1	127.0.0.1	TCP	56	8080 → 54876 [ACK] Seq=1 Ack=110 Win=408128 Len=0 TSval=157578390 TSecr=3703300602
34	17.511368	127.0.0.1	127.0.0.1	TCP	216	8080 → 54876 [PSH, ACK] Seq=1 Ack=110 Win=408128 Len=160 TSval=157590864 TSecr=3703300602 [TCP segment of a reassembled PDU]
35	17.511463	127.0.0.1	127.0.0.1	TCP	56	54876 → 8080 [ACK] Seq=110 Ack=161 Win=408128 Len=0 TSval=3703313076 TSecr=157590864
36	17.513067	127.0.0.1	127.0.0.1	HTTP	56	HTTP/1.1 200 OK 
37	17.513113	127.0.0.1	127.0.0.1	TCP	56	54876 → 8080 [ACK] Seq=110 Ack=162 Win=408128 Len=0 TSval=3703313077 TSecr=157590865
38	17.513293	127.0.0.1	127.0.0.1	TCP	56	54876 → 8080 [FIN, ACK] Seq=110 Ack=162 Win=408128 Len=0 TSval=3703313077 TSecr=157590865
39	17.513350	127.0.0.1	127.0.0.1	TCP	56	8080 → 54876 [ACK] Seq=162 Ack=111 Win=408128 Len=0 TSval=157590865 TSecr=3703313077

Packet 36 has [FIN, ACK] set.

The corresponding TCP stream is

GET /v1/models/ex6bdm7x/fits/wxft2x3c HTTP/1.1
Host: localhost:8080
User-Agent: curl/8.1.2
Accept: */*

HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8
Content-Length: 3144360709
Date: Wed, 06 Dec 2023 16:26:16 GMT
Server: Python/3.10 aiohttp/3.8.4
@riddell-stan
Copy link
Contributor

I wonder if there isn't an aiohttp timeout that's being reached. You mentioned earlier that the data transfer didn't start for 9 seconds. That seems like a long time.

There are lots of timeouts in https://docs.aiohttp.org/en/stable/client_reference.html.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants