
tcp server load test #7

Open
dvv opened this issue Oct 9, 2012 · 28 comments

Comments

@dvv
Contributor

dvv commented Oct 9, 2012

Hi!
I've drafted a simple hello-world HTTP server to test luv under load via ab -n100000 -c500 ...

The result is that the server stopped responding after circa 50000 requests. What could be wrong?

I wonder, do we have any explicit or implicit limit on the number of concurrent coroutines?

@richardhundt
Owner

Hey, thanks for the feedback.

No, there's no limit on the number of coroutines. If you're throwing 500 concurrent connections at it, then you need to increase your listen backlog:

server:listen(1024) -- default is 128

I don't think that was the problem though; ab would just get a connection refused and have to back off.

So if you try calling client:shutdown() after client:write(), I think you'll get better behaviour from ab. close is a bit harsh, so it's probably getting incomplete responses. I've also seen these 20 second pauses (although node.js was much worse under the same workload), which might have to do with file-descriptor reuse and lingering on them.
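A minimal handler along those lines might look like this (a sketch only, written against the coroutine-style luv API used in this thread; the response string is a placeholder):

```lua
-- Sketch: write the response, then half-close so the peer sees a
-- clean FIN and can read the full body before the socket goes away.
local function handle(client)
   local http_response = "HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nOK"
   client:write(http_response)
   client:shutdown() -- flush pending writes; gentler than close()
   client:close()
end
```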

Finally, you can do ab -k for keepalive and try this guy:

https://gist.github.com/3857202

and run with:

ab -k -r -n 10000 -c 500 http://127.0.0.1:8080/

I'm getting in the order of 16k req/sec with very few I/O errors.

@miko

miko commented Oct 9, 2012

Seems I have the same issue:
$ ab -k -r -n 10000 -c 500 http://127.0.0.1:8080/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
apr_poll: The timeout specified has expired (70007)
Total of 3955 requests completed

The server printed lots of:
got zero, EOF?

I have modified the line to:
print("got zero, EOF?", str)

and then got:

accept: userdata<luv.net.tcp>: 0x945124c
accept: userdata<luv.net.tcp>: 0x94517ec
enter child
enter child
got zero, EOF? GET / HTTP/1.0
Connection: Keep-Alive
Host: 127.0.0.1:8080
User-Agent: ApacheBench/2.3
Accept: /

ApacheBench/2.3
Accept: /

got zero, EOF? GET / HTTP/1.0
Connection: Keep-Alive
Host: 127.0.0.1:8080
User-Agent: ApacheBench/2.3
Accept: /

Accept: /

got zero, EOF? GET / HTTP/1.0
Connection: Keep-Alive
Host: 127.0.0.1:8080
User-Agent: ApacheBench/2.3
Accept: /

So clearly there is some buffering issue: request data gets mixed up.

@richardhundt
Owner

I've pushed a change to git head which should fix it. I wasn't pushing the length of the read buffer if there was pending data.

@richardhundt
Owner

Try git head now.


@miko

miko commented Oct 9, 2012

Much better now: I get all the request data. But when it runs successfully, I get at exit:
lua: src/unix/stream.c:934: uv__read: Assertion `!uv__io_active(&stream->read_watcher)' failed.

But most of the time it just hangs as in the original report, until ab times out.

@miko

miko commented Oct 9, 2012

Repeated the test with luajit instead of lua (and modified print line), got:
got zero, EOF? GET / HTTP/1.0
Connection: Keep-Alive
Host: 127.0.0.1:8080
User-Agent: ApacheBench/2.3
Accept: /

got zero, EOF? GET / HTTP/1.0
Connection: Keep-Alive
Host: 127.0.0.1:8080
User-Agent: ApacheBench/2.3
Accept: /

�� �
got zero, EOF? GET / HTTP/1.0
Connection: Keep-Alive
Host: 127.0.0.1:8080
User-Agent: ApacheBench/2.3
Accept: /

�� �
got zero, EOF? GET / HTTP/1.0
Connection: Keep-Alive
Host: 127.0.0.1:8080
User-Agent: ApacheBench/2.3
Accept: /

got zero, EOF? GET / HTTP/1.0
Connection: Keep-Alive
Host: 127.0.0.1:8080
User-Agent: ApacheBench/2.3
Accept: /

�� �
got zero, EOF? GET / HTTP/1.0
Connection: Keep-Alive
Host: 127.0.0.1:8080
User-Agent: ApacheBench/2.3
Accept: /

(in case it is not visible: the data has some binary bytes appended, which my terminal interprets as unicode)

After another modification:
print("got zero, EOF?", #str, str)
I can see that all the data has a size of 65536 bytes, which can't be true, so some garbage is added somewhere to the received data.

@richardhundt
Owner

Ah, okay, buf.len is the size of the buffer and not how much is read. Try now.


@richardhundt
Owner

This error comes when you try to read after an error has occurred.

Try doing:

if client:write(http_response) == -1 then
   client:close()
end

I'll have to fix it so that it doesn't crash like that.

Thanks again for the report.


@dvv
Contributor Author

dvv commented Oct 9, 2012

seems to be behaving much better, thank you.
am still not fluent in the coro-based async way -- could you tweak my example to add a delay to the response logic, to simulate async work? tia

@miko

miko commented Oct 9, 2012

The buffer size issue is now resolved: when using ab everything works great! I do still get the uv__read assertion, and I do use client:close(), as in your httpd-k.lua example. The reason may be that ab just drops the connection, but still the server should not crash like this. I get this even for ab -k -n 1 -c 1 http://127.0.0.1:8080/

@miko

miko commented Oct 9, 2012

dvv: I have put my version of httpd-k.lua with timers in https://gist.github.com/3857202#comments

@richardhundt
Owner

Hi miko,

can you try git head now? I've made stream:close() yield and made a couple of other changes which hopefully should fix it.

Thanks for all your support.


@richardhundt
Owner

I've been throwing ab at this thing all morning and it seems it's a bit pathological. It'll pump in the same headers repeatedly even without keepalive, so I'm seeing some 20k buffers coming in in a single libuv read_cb from the socket.

So basically libuv will keep happily reading from the socket as long as ab is writing, and ab doesn't stop writing as long as libuv is reading. So you get these big chunks occasionally.

What I've done now is give stream:read a parameter which specifies the buffer size to allocate. So stream:read(1024) stops libuv reading past that size and fires the callback to rouse the coroutine.
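As a sketch, a read loop under that new parameter might look like this (hypothetical handler code, assuming only the stream:read(size) form just described):

```lua
-- Sketch: cap each read at 1024 bytes so libuv stops reading past
-- the buffer size and wakes the coroutine once per chunk.
local function read_headers(client)
   local buf = ""
   while true do
      local chunk = client:read(1024) -- buffer size to allocate
      if not chunk or #chunk == 0 then break end
      buf = buf .. chunk
      -- stop once the header terminator arrives
      if buf:find("\r\n\r\n", 1, true) then break end
   end
   return buf
end
```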

Another issue is that I'm getting occasional 20 second stalls, and I think it's related to this discussion:

joyent/libuv#498

Unfortunately there's no uv_tcp_linger implementation (just the keepalive probes) :(

Another problem is reading EOF. Sometimes, immediately after a socket is accepted, reading from it gets an EOF from libuv. I think the only sane thing to do is to propagate this back to the caller and wake the coroutine, but what's tricky then is knowing whether to close the socket or to try again.

If you close the socket, ab will barf with EPIPE or ECONNRESET, which sucks.

To summarize, I'm finding that libuv + ab aren't really happy with each other. Perhaps it's just my code. I'll keep at it, since you guys seem hell-bent on building an HTTP daemon out of this :)

@dvv
Contributor Author

dvv commented Oct 9, 2012

i switched to https://github.com/wg/wrk and things are going well so far. ab is too slow and dumb to test luv :)

@richardhundt
Owner

awesome, thanks for the tip!


@dvv
Contributor Author

dvv commented Oct 9, 2012

That's what I got on my slow setup:

$ wrk -t8 -c2048 -r1m http://localhost:8080/
Making 1000000 requests to http://localhost:8080/
  8 threads and 2048 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   214.97ms    2.25s    1.88m    98.43%
    Req/Sec     0.07      8.53     1.00k    99.99%
  1000035 requests in 4.85m, 60.08MB read
  Socket errors: connect 0, read 2406, write 0, timeout 159084
Requests/sec:   3438.60
Transfer/sec:    211.55KB

@miko

miko commented Oct 9, 2012

Thanks, that fixed it for me! And yes, http daemon built into an application is nice ;)

@dvv
Contributor Author

dvv commented Oct 9, 2012

i believe we just need http-parser binding and the lua-level http request/response logic. i wonder if @creationix 's luvit/web collection would fit

@creationix

I want luvit/web to be as portable as possible. But I don't think I can get away from having a defined spec for readable and writable streams as well as a data format (currently lua strings). We can probably add support for multiple data formats (lua strings, ffi cdata, and lev cdata buffers).

My stream interface is very simple, but is callback based. I'm not sure how that fits into this project.

Readable stream: any table that has a :read()(callback(err, chunk)) method. Read is a method that returns a function that accepts a callback that gets err and chunk.

Writable stream: any table that has a :write(chunk)(callback(err)) method. Write is a method that accepts the chunk and returns a function that accepts the callback that gets the err.

Since the continuable is returned from the methods, it should be easy to write wrappers for this for other systems. I have coro sugar in my continuable library where you can do things like chunk = await(stream:read())
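That shape can be sketched like so (hypothetical names throughout; await is the coro sugar from my continuable library mentioned above, not part of luv):

```lua
-- Sketch of the continuable shape: read() returns a function that
-- accepts a callback(err, chunk).
local function wrap_readable(next_chunk)
   return {
      read = function(self)
         return function(callback)
            local ok, chunk = pcall(next_chunk)
            if ok then
               callback(nil, chunk) -- success: no error, pass the chunk
            else
               callback(chunk)      -- failure: pass the error
            end
         end
      end
   }
end

-- with coroutine sugar, usage reads linearly:
--   local chunk = await(stream:read())
```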

@creationix

For data encoding, we could add encoding arguments to both :read(encoding) and :write(chunk, encoding) to allow supporting multiple data types. :write could probably even auto-detect the type and convert for you.

The harder issue for "luv" is I use continuables (functions that return functions that accept callbacks)

@richardhundt
Owner


Yeah, I think we'd need to start with https://github.com/joyent/http-parser.git and work our way up from there, as you did.

@dvv
Contributor Author

dvv commented Oct 9, 2012

wonder how to write a wrapper for callback-style logic to be used in this project's fibers?
we could easily keep the http stuff external then

@dvv
Contributor Author

dvv commented Oct 9, 2012

just fyi i explored common logic for http_parser in C at https://github.com/dvv/luv/blob/master/src/uhttp.c#L270 a while ago -- got stuck at C memory management. but if we want a kinda generic http parser which reports data/end, i believe most of its callbacks may be hardcoded in C.

@richardhundt
Owner

You could do something like:

my.wrapper.read = function()
   local curr = luv.self() -- store the current state
   local data
   some.event.library.read(fd, function(...)
      data = ...
      curr:ready()
   end)
   curr:suspend() -- go to sleep until the callback fires
   return data
end


@richardhundt
Owner

I'd start at the top. I think the API should look something like this:

local req = httpd:recv()

where req is a RACK/PSGI [1] like environment table with:

req = {
   REQUEST_METHOD = …,
   SCRIPT_NAME = …,
   CONTENT_LENGTH = …,

   … the rest of the CGI environment vars …

   luv_input  = <input stream>,
   luv_output = <output stream>,
   luv_errors = <errors stream>,
}

The body of the request (if any) is read, by the application, from the luv_input stream once the headers are parsed and you know your CONTENT_LENGTH. For all the rest, the parsing and its C callbacks should be internal. There's no need to expose that. You can still do non-blocking reads via libuv callbacks and feed the HTTP parser with chunks; you just don't call luvL_state_ready() until you've got the headers, wired up the pipes and built the req table. There does not need to be a 1-to-1 correspondence between a libuv callback and waking a suspended fiber. You can run all the C callbacks you like until you're ready.

A response can be either streamed out via luv_output (streaming video, or whatever), or sent as:

httpd:send({ 200, { ["Content-Type"] = "text/html", … }, })

[1] http://search.cpan.org/~miyagawa/PSGI-1.101/PSGI.pod#The_Environment
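A handler against this proposed API might then read something like the following (a sketch only: httpd:recv, httpd:send, and the env keys are the proposal above, not an existing luv API, and the response body is made up):

```lua
-- Sketch of the proposed request/response flow (hypothetical API).
local req = httpd:recv()
local body
if tonumber(req.CONTENT_LENGTH or 0) > 0 then
   -- the application reads the body itself, once headers are known
   body = req.luv_input:read(tonumber(req.CONTENT_LENGTH))
end
httpd:send({ 200, { ["Content-Type"] = "text/html" }, { "<h1>hello</h1>" } })
```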


@dvv
Contributor Author

dvv commented Oct 10, 2012

that looks interesting, i'd love to try it

@miko

miko commented Oct 10, 2012

After updating to the latest head I no longer get segfaults on archlinux. Thanks! I think this issue can be closed now.

Regarding parsing http, I suggest opening a new issue (feature request), as it is hard to follow.

@dvv
Contributor Author

dvv commented Oct 10, 2012

indeed. #10
