Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nighthawk doesn't correctly drain the connection pool #873

Open
mum4k opened this issue Jul 14, 2022 · 3 comments
Open

Nighthawk doesn't correctly drain the connection pool #873

mum4k opened this issue Jul 14, 2022 · 3 comments
Labels
bug Something isn't working tech-debt

Comments

@mum4k
Copy link
Collaborator

mum4k commented Jul 14, 2022

The pool termination code is here.

While the code does register an idle callback, that callback only gets called if the pool is draining. Nighthawk doesn't seem to trigger pool drain anywhere. We should call drainConnections(DrainAndDelete) while waiting for the pool to drain.

The current state can cause Nighthawk to crash, because the shutdown thread owns the client dispatcher and may end up reading packets while shutting down resulting in a call to pure virtual method.

This can be reproduced by running high QPS test using HTTP3/QUIC, inducing a situation where Nighthawk has to deal with some late responses after it started shutting down.

@mum4k mum4k added bug Something isn't working tech-debt labels Jul 14, 2022
@oschaaf
Copy link
Member

oschaaf commented Jul 14, 2022

We used to call drainConnections() .. but checking this call was removed in 15b86be#diff-d638190c1ff04b6501032412ab8ae3d4322513fd75e43006ac399fc829be8328L112

That was a while ago, and I have no recollection of any reason why that call had to be removed. Therefore, I think that it may have been dropped by accident :-(

@mum4k
Copy link
Collaborator Author

mum4k commented Jul 16, 2022

Thank you for adding the details @oschaaf. This gives us reassurance that adding it back it should do no harm and should be an easy fix.

@dubious90
Copy link
Contributor

Did a quick check into this and this might require an upstream envoy change. https://github.com/envoyproxy/envoy/pull/16544/files obfuscated the connection pool behind a new object called PoolData. https://github.com/envoyproxy/envoy/pull/16865/files is an example change made previously by @mum4k that re-exposed some methods. A similar change is probably required to allow us to drain connections.

alyssawilk pushed a commit to envoyproxy/envoy that referenced this issue Aug 4, 2022
#22554)

Exposing the drainConnections method on the connection pool, so that Nighthawk can correctly drain the connection pools used by its individual workers.

Fixes #22527.
Works on envoyproxy/nighthawk#873.

/cc @alyssawilk

Risk Level: low, existing functionality isn't affected.
Testing: unit tests.
Docs Changes: n/a.
Release Notes: n/a.
Platform Specific Features: n/a.

Signed-off-by: Jakub Sobon <[email protected]>
oschaaf pushed a commit to maistra/envoy that referenced this issue Oct 26, 2022
…. (#22554)

Exposing the drainConnections method on the connection pool, so that Nighthawk can correctly drain the connection pools used by its individual workers.

Fixes #22527.
Works on envoyproxy/nighthawk#873.

/cc @alyssawilk

Risk Level: low, existing functionality isn't affected.
Testing: unit tests.
Docs Changes: n/a.
Release Notes: n/a.
Platform Specific Features: n/a.

Signed-off-by: Jakub Sobon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working tech-debt
Projects
None yet
Development

No branches or pull requests

3 participants