Update h2 windowing algo & Http Client benchmark #388

TingDaoK · 2022-08-26T00:04:56Z

Initial build of our HTTP client benchmark.
The test uses a local host: our HTTP client connects to the host, and we count how many requests are made within a fixed amount of time.
To run:
- start the python local host: https://github.com/awslabs/aws-c-http/tree/main/tests/py_localhost
- build aws-c-http with the AWS_BUILD_CANARY=ON CMake flag
- cd build/bin/canary && ./canary

ISSUE FOUND & FIXED

We were updating the connection window for every DATA frame received, which really slows us down when we receive many small DATA frames.
- We fixed it by only updating the connection window once it drops to 50% of the max.
- The same issue may happen with stream windows
- Or with the padding of the connection window.
"Providing tiny increments to flow control in WINDOW_UPDATE frames can cause a sender to generate a large number of DATA frames." (from here)
- Is it the client's responsibility to make sure this doesn't happen, even when the user does manual window updates and makes them small?
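
To make the fix concrete, here is a rough Python model of the batching idea. It is not the aws-c-http implementation: the class and function names are hypothetical, and the threshold is simply passed in. It counts how many WINDOW_UPDATE frames a receiver would emit for a run of small DATA frames, with and without a threshold.

```python
class WindowUpdateBatcher:
    """Toy model of a receiver that batches WINDOW_UPDATE increments."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold  # bytes to accumulate before sending
        self.pending = 0            # bytes received but not yet credited back
        self.frames_sent = 0        # WINDOW_UPDATE frames emitted so far

    def on_data(self, payload_len: int) -> None:
        self.pending += payload_len
        # A threshold of 0 models the old behavior: one update per DATA frame.
        if self.pending >= self.threshold:
            self.frames_sent += 1
            self.pending = 0


def count_updates(threshold: int, frame_len: int, frame_count: int) -> int:
    """Feed frame_count frames of frame_len bytes and count the updates sent."""
    batcher = WindowUpdateBatcher(threshold)
    for _ in range(frame_count):
        batcher.on_data(frame_len)
    return batcher.frames_sent


if __name__ == "__main__":
    print(count_updates(threshold=0, frame_len=100, frame_count=10_000))
    print(count_updates(threshold=32_767, frame_len=100, frame_count=10_000))
```

In this model, 10,000 frames of 100 bytes trigger 10,000 updates without a threshold and 30 with a 32767-byte threshold.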

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

lgtm-com · 2022-09-12T21:49:46Z

This pull request introduces 1 alert when merging 4cd7338 into f81ee94 - view on LGTM.com

new alerts:

1 for Variable defined multiple times

lgtm-com · 2022-09-12T22:02:00Z

This pull request introduces 1 alert when merging f445de0 into f81ee94 - view on LGTM.com

new alerts:

1 for Variable defined multiple times

lgtm-com · 2022-09-12T22:09:04Z

This pull request introduces 1 alert when merging f86a9f7 into f81ee94 - view on LGTM.com

new alerts:

1 for Variable defined multiple times

lgtm-com · 2022-09-12T22:17:37Z

This pull request introduces 1 alert when merging ca0c7c5 into f81ee94 - view on LGTM.com

new alerts:

1 for Variable defined multiple times

bin/canary/CMakeLists.txt

bin/canary/main.c

DmitriyMusatkin · 2022-09-16T18:36:01Z

source/h2_connection.c

@@ -1762,6 +1767,8 @@ static void s_handler_installed(struct aws_channel_handler *handler, struct aws_
        aws_linked_list_push_back(
            &connection->thread_data.outgoing_frames_queue, &connection_window_update_frame->node);
        connection->thread_data.window_size_self += initial_window_update_size;
+        /* For automatic window management, we only update connectio windows when it droped blow 50% of MAX. */
+        connection->thread_data.window_size_self_dropped_threshold = AWS_H2_WINDOW_UPDATE_MAX / 2;


nit: pull this magic number into a constant?

it's derived from a constant... and this makes it clearer where it comes from.

codecov-commenter · 2025-04-01T22:31:35Z

Codecov Report

Attention: Patch coverage is 89.69072% with 10 lines in your changes missing coverage. Please review.

Project coverage is 79.52%. Comparing base (6586c80) to head (a2cee45).

Files with missing lines	Patch %	Lines
source/h2_connection.c	89.28%	6 Missing ⚠️
source/h2_stream.c	90.00%	4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #388      +/-   ##
==========================================
+ Coverage   79.48%   79.52%   +0.03%     
==========================================
  Files          27       27              
  Lines       11686    11701      +15     
==========================================
+ Hits         9289     9305      +16     
+ Misses       2397     2396       -1


TingDaoK · 2025-04-02T22:10:22Z

tests/test_localhost_integ.c

@@ -408,12 +408,6 @@ static int s_localhost_integ_h2_upload_stress(struct aws_allocator *allocator, v
    s_tester.alloc = allocator;

    size_t length = 2500000000UL;
-#ifdef AWS_OS_LINUX


remove this, since it seems to be the flow control window issue.

we sent out too much

the initial window from the local server is small (default to the magic 65535). https://httpwg.org/specs/rfc7540.html#iana-settings

The local server code also updates the window as data is received.

I don't know why it matters so much more on Linux than on the other platforms. (I think the canary showed this windowing issue affecting Linux more before as well, which matches this.)
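
For a sense of scale, here is a back-of-the-envelope calculation. It is plain arithmetic, not a measurement from this PR, and the helper name is made up. With the default 65535-byte initial window, the 2500000000-byte upload in this test needs the server to credit the window back tens of thousands of times.

```python
UPLOAD_LENGTH = 2_500_000_000    # `length` in the upload stress test
DEFAULT_INITIAL_WINDOW = 65_535  # HTTP/2 default initial flow-control window


def min_window_fills(total_bytes: int, window: int) -> int:
    """Smallest number of full windows needed to carry total_bytes."""
    # Integer ceiling division, to avoid any floating-point rounding.
    return (total_bytes + window - 1) // window


if __name__ == "__main__":
    print(min_window_fills(UPLOAD_LENGTH, DEFAULT_INITIAL_WINDOW))
```

With the default window that is 38148 window-fulls. With a window opened up by roughly half of 2^31-1, the same upload fits in 3.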

graebm · 2022-09-14T22:45:02Z

bin/canary/CMakeLists.txt

+
+list(APPEND CMAKE_MODULE_PATH "${CMAKE_INSTALL_PREFIX}/lib/cmake")
+
+file(GLOB ELASTICURL_SRC


~~ELASTICURL~~

graebm · 2022-09-14T23:08:26Z

bin/canary/CMakeLists.txt

+target_link_libraries(${PROJECT_NAME} aws-c-http)
+
+if (BUILD_SHARED_LIBS AND NOT WIN32)
+    message(INFO " elasticurl will be built with shared libs, but you may need to set LD_LIBRARY_PATH=${CMAKE_INSTALL_PREFIX}/lib to run the application")


~~elasticurl~~

graebm · 2025-04-02T21:19:54Z

CMakeLists.txt

+        if (AWS_BUILD_CANARY)
+            add_subdirectory(bin/canary)
+        endif()


why bother having an AWS_BUILD_CANARY option? why not just always build it if we're building tests, like we're already doing with elasticurl

graebm · 2025-04-02T21:22:27Z

bin/canary/CMakeLists.txt

+        EXPORT ${PROJECT_NAME}-targets
+        COMPONENT Runtime
+        RUNTIME
+        DESTINATION bin


Suggested change

DESTINATION bin

DESTINATION ${CMAKE_INSTALL_BINDIR}

graebm · 2025-04-02T21:25:50Z

bin/canary/CMakeLists.txt

@@ -0,0 +1,28 @@
+project(canary C)
+
+list(APPEND CMAKE_MODULE_PATH "${CMAKE_INSTALL_PREFIX}/lib/cmake")


not necessary anymore

Suggested change

list(APPEND CMAKE_MODULE_PATH "${CMAKE_INSTALL_PREFIX}/lib/cmake")

graebm · 2025-04-02T23:47:11Z

integration-testing/http_client_canary.py

+                raise RuntimeError("Return code {code} from: {cmd}".format(
+                    code=process.returncode, cmd=args_str))
+        else:
+            print(output.decode("utf-8"))


wait.
we print the output even if everything went right?
In that case, just DON'T capture output at all. It will print to the console. You don't need to pass anything to subprocess.run() except args and timeout.

and remove the comment about "gather all stderr and stdout to a single string that we print only if things go wrong"

and remove like 80% of the code in this script because the whole reason it's so complicated was to suppress output if the test passed
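
A minimal sketch of what the simplified runner could look like, assuming nothing about the rest of the script. The `run` helper and its arguments are hypothetical; only the error message format is taken from the diff above.

```python
import subprocess
import sys
from typing import List


def run(args: List[str], timeout: float) -> None:
    """Run a command and let its output go straight to the console."""
    # No stdout/stderr arguments, so the child inherits our console.
    process = subprocess.run(args, timeout=timeout)
    if process.returncode != 0:
        raise RuntimeError("Return code {code} from: {cmd}".format(
            code=process.returncode, cmd=" ".join(args)))


if __name__ == "__main__":
    run([sys.executable, "-c", "print('hello from the child')"], timeout=30)
```

If the command hangs past the timeout, `subprocess.run()` raises `subprocess.TimeoutExpired` on its own, so no extra handling is needed for that case.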

graebm · 2025-04-02T23:54:08Z

bin/canary/main.c

I've been looking at this PR a while and have no idea what "canary" does. Can you add a README.md with a very brief description? And very brief usage instructions.

graebm · 2025-04-02T23:57:44Z

tests/py_localhost/server.py

@@ -51,6 +51,7 @@ def __init__(self):
    def connection_made(self, transport: asyncio.Transport):
        self.transport = transport
        self.conn.initiate_connection()
+        self.conn.increment_flow_control_window(int(2147483647/2))


add comment explaining why you're doing this, and why this magic number
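
One possible reading of the magic number, offered as an assumption rather than the author's stated reason: HTTP/2 flow-control windows may not exceed 2^31-1, and the connection window already starts at 65535, so incrementing by the full maximum would overflow, while half of the maximum opens the window wide and stays legal. The arithmetic, with a made-up helper name:

```python
MAX_WINDOW = 2**31 - 1           # largest legal HTTP/2 flow-control window
DEFAULT_INITIAL_WINDOW = 65_535  # window size before any WINDOW_UPDATE

increment = int(2147483647 / 2)  # the value passed in server.py
window_after = DEFAULT_INITIAL_WINDOW + increment


def is_legal_window(size: int) -> bool:
    """True if size is a valid flow-control window size."""
    return 0 <= size <= MAX_WINDOW


if __name__ == "__main__":
    print(increment, window_after, is_legal_window(window_after))
```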

graebm · 2025-04-02T23:59:30Z

tests/py_localhost/server.py

                if isinstance(event, RequestReceived):
                    self.request_received(event.headers, event.stream_id)
+                    self.conn.increment_flow_control_window(
+                        int(2147483647/2), event.stream_id)


why this change? I understand you need to increment the window after DataReceived (below). But why this change?

graebm · 2025-04-03T00:02:20Z

tests/py_localhost/server.py

+                    self.conn.increment_flow_control_window(event.flow_controlled_length)
+                    self.conn.increment_flow_control_window(
+                        event.flow_controlled_length, event.stream_id)
                    self.receive_data(event.data, event.stream_id)


why did you move these calls from the def receive_data() function, out to here? Seems weird to make a function to handle everything related to this event ... and then move some of the code outside that function

graebm · 2025-04-03T18:43:44Z

include/aws/http/private/h2_connection.h

@@ -98,6 +98,17 @@ struct aws_h2_connection {
         * Reduce the space after receiving a flow-controlled frame. Increment after sending WINDOW_UPDATE for
         * connection */
        size_t window_size_self;


trivial / kinda-related: maybe change window_size_self and window_size_peer from ~~size_t~~ to uint32_t

they're not legally allowed to exceed 2^31-1. Having them be variably sized is just ... confusing

graebm · 2025-04-03T19:08:18Z

include/aws/http/private/h2_connection.h

+         * received.
+         * When manual management for connection window is on, the dropped size equals to the size of all the padding in
+         * the data frame received */
+        uint32_t window_size_self_dropped;


trivial / naming: "dropped" doesn't make immediate sense to me. maybe:

pending_window_update_size and window_update_threshold?

window_increment_pending_size and window_increment_threshold_size?

graebm · 2025-04-03T19:33:54Z

source/h2_connection.c

@@ -384,6 +384,8 @@ static struct aws_h2_connection *s_connection_new(
    connection->thread_data.window_size_peer = AWS_H2_INIT_WINDOW_SIZE;
    connection->thread_data.window_size_self = AWS_H2_INIT_WINDOW_SIZE;

+    connection->thread_data.window_size_self_dropped_threshold = 0;


Shouldn't this have a non-zero value?

even if someone is doing "manual window management" on their stream, the HTTP client should still batch up its window update frames

graebm · 2025-04-03T19:36:15Z

source/h2_connection.c

+        CONNECTION_LOGF(TRACE, connection, "%" PRIu32 " Bytes of padding received.", total_padding_bytes);
+    }
+    connection->thread_data.window_size_self_dropped += auto_window_update;
+    if (connection->thread_data.window_size_self_dropped > connection->thread_data.window_size_self_dropped_threshold) {


This logic holds back a WINDOW_UPDATE frame until its size would be > the threshold.

Should there be another threshold, where we also send it immediately if window_size_self gets too low?

also trivial, but maybe >= instead of >, since the threshold is probably going to be a nice round number
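
The two-threshold idea could look roughly like this. It is a sketch of the suggestion, not code from the PR; the function name and `low_water_mark` are hypothetical.

```python
def should_send_window_update(
    pending: int, threshold: int, window_size_self: int, low_water_mark: int
) -> bool:
    """Decide whether to flush the pending WINDOW_UPDATE increment."""
    if pending == 0:
        return False  # a zero increment is a protocol error, so never send it
    if pending >= threshold:
        return True   # enough credit accumulated; >= so a round threshold fires exactly
    # Safety valve: the peer is close to being blocked, so stop holding credit back.
    return window_size_self <= low_water_mark


if __name__ == "__main__":
    print(should_send_window_update(32_768, 32_768, 40_000, 1_000))
    print(should_send_window_update(100, 32_768, 500, 1_000))
```

The first check batches updates in the common case. The second only matters when the window is small relative to the threshold, for example when the user manages the window manually.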

graebm · 2025-04-03T19:41:28Z

source/h2_connection.c

@@ -1762,6 +1767,8 @@ static void s_handler_installed(struct aws_channel_handler *handler, struct aws_
        aws_linked_list_push_back(
            &connection->thread_data.outgoing_frames_queue, &connection_window_update_frame->node);
        connection->thread_data.window_size_self += initial_window_update_size;
+        /* For automatic window management, we only update connection windows when it droped blow 50% of MAX. */


trivial

Suggested change

/* For automatic window management, we only update connection windows when it droped blow 50% of MAX. */

/* For automatic window management, we only update connection window when it drops below 50% of MAX. */

graebm · 2025-04-03T19:52:44Z

source/h2_connection.c

            return aws_h2err_from_last_error();
        }
+        connection->thread_data.window_size_self_dropped = 0;


Should we move this logic into s_connection_send_update_window() ? I know it's only called from this one place, but it seems like if anywhere else ever wanted to call it, they should be minding all this math as well. I guess if we do that, we should rename it just s_connection_update_window()

graebm · 2025-04-03T20:55:40Z

source/h2_connection.c

-
-    if (auto_window_update != 0) {
-        if (s_connection_send_update_window(connection, auto_window_update)) {
+    if (total_padding_bytes) {


utterly trivial: put some whitespace between the if/else and the next if. Otherwise it looks like an if/else-if/else-if chain

Suggested change

if (total_padding_bytes) {

if (total_padding_bytes) {

graebm · 2025-04-03T20:59:52Z

source/h2_connection.c

-    if (auto_window_update != 0) {
-        if (s_connection_send_update_window(connection, auto_window_update)) {
+    if (total_padding_bytes) {
+        CONNECTION_LOGF(TRACE, connection, "%" PRIu32 " Bytes of padding received.", total_padding_bytes);


Maybe just remove this LOG. The old statement was about how the connection window was being updated, but now it's just about how much padding was received, but the decoder is already logging about padding.

source/h2_stream.c

graebm · 2025-04-18T20:29:33Z

include/aws/http/connection.h

+     * drops below the threshold.
+     * Default to half of the initial connection flow-control window size, which is 32767.
+     */
+    uint32_t conn_window_size_threshold_to_send_update;


extremely debatable: you could just leave these out of the public options, until someone actually asks for it

graebm · 2025-04-18T20:34:05Z

include/aws/http/private/h2_connection.h

+         * The client will send the WINDOW_UPDATE frame to the server only valid.
+         * If the pending_window_update_size is too large, we will leave the excess to send it out later.
+         */
+        uint64_t pending_window_update_size_thread;


utterly trivial. It's already in a struct named thread_data, no need to repeat

Suggested change

uint64_t pending_window_update_size_thread;

uint64_t pending_window_update_size;

or

Suggested change

uint64_t pending_window_update_size_thread;

uint64_t pending_window_update_size_self;
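
The doc comment in this diff says an oversized pending update is split, with the excess sent later. A minimal sketch of that capping step, assuming the cap is the 2^31-1 window limit (which is what AWS_H2_WINDOW_UPDATE_MAX appears to stand for). The function name is made up.

```python
from typing import Tuple

WINDOW_MAX = 2**31 - 1  # assumed value of the cap


def split_pending_update(window_size_self: int, pending: int) -> Tuple[int, int]:
    """Return (increment_to_send_now, pending_left_for_later)."""
    # Room left before the window would exceed the legal maximum.
    room = max(WINDOW_MAX - window_size_self, 0)
    increment = min(pending, room)
    return increment, pending - increment


if __name__ == "__main__":
    print(split_pending_update(65_535, 1_000))
    print(split_pending_update(WINDOW_MAX - 10, 25))
```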

graebm · 2025-04-18T20:35:18Z

include/aws/http/private/h2_connection.h

@@ -150,7 +165,7 @@ struct aws_h2_connection {
        bool is_cross_thread_work_task_scheduled;

        /* The window_update value for `thread_data.window_size_self` that haven't applied yet */
-        size_t window_update_size;
+        uint64_t pending_window_update_size_sync;


trivial: it's already in synced_data.

Suggested change

uint64_t pending_window_update_size_sync;

uint64_t pending_window_update_size;

or

Suggested change

uint64_t pending_window_update_size_sync;

uint64_t pending_window_update_size_self;

graebm · 2025-04-18T20:38:37Z

include/aws/http/private/h2_connection.h

@@ -150,7 +165,7 @@ struct aws_h2_connection {
        bool is_cross_thread_work_task_scheduled;

        /* The window_update value for `thread_data.window_size_self` that haven't applied yet */


Suggested change

/* The window_update value for `thread_data.window_size_self` that haven't applied yet */

/* Value for `thread_data.pending_window_update_size` that we haven't applied yet */

graebm · 2025-04-18T20:43:54Z

include/aws/http/private/h2_stream.h

+         * The client will send the WINDOW_UPDATE frame to the server only valid.
+         * If the pending_window_update_size is too large, we will leave the excess to send it out later.
+         **/
+        uint64_t pending_window_update_size_thread;


graebm · 2025-04-18T22:07:38Z

source/h2_connection.c

        }
        s_unlock_synced_data(connection);
    } /* END CRITICAL SECTION */
-    if (err) {


🎉 yay less chance for errors!

graebm · 2025-04-18T22:11:48Z

source/h2_stream.c

@@ -209,24 +209,62 @@ static struct aws_h2err s_check_state_allows_frame_type(
    return aws_h2err_from_h2_code(h2_error_code);
 }

-static int s_stream_send_update_window_frame(struct aws_h2_stream *stream, size_t increment_size) {
+static int s_stream_send_update_window_if_needed(struct aws_h2_stream *stream, uint64_t window_size) {


trivial: again, ~~window_size~~ sounds like the absolute value, but it's a relative value

Suggested change

static int s_stream_send_update_window_if_needed(struct aws_h2_stream *stream, uint64_t window_size) {

static int s_stream_send_update_window_if_needed(struct aws_h2_stream *stream, uint64_t window_update_size) {

graebm · 2025-04-18T22:15:55Z

source/h2_stream.c

+    if (connection->stream_window_size_threshold_to_send_update) {
+        stream->window_size_threshold_to_send_update = connection->stream_window_size_threshold_to_send_update;
+    } else {
+        stream->window_size_threshold_to_send_update =


Suggested change

stream->window_size_threshold_to_send_update =

/* Set reasonable default of: 50% initial window size (same as Netty's DEFAULT_WINDOW_UPDATE_RATIO) */

stream->window_size_threshold_to_send_update =
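
A small sketch of the defaulting rule the comment describes: use the configured threshold if one was set, otherwise half of the initial window. The else branch is truncated in the diff, so the fallback here follows the comment and the documented 32767 default; it is an assumption, and the names are hypothetical.

```python
INITIAL_WINDOW_SIZE = 65_535  # HTTP/2 default initial window


def resolve_threshold(configured: int, initial_window: int = INITIAL_WINDOW_SIZE) -> int:
    """Use the caller's threshold if set, otherwise 50% of the initial window."""
    if configured:
        return configured
    return initial_window // 2


if __name__ == "__main__":
    print(resolve_threshold(0))
    print(resolve_threshold(1_000))
```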

graebm · 2025-04-18T22:18:05Z

source/h2_stream.c


-    if (!stream_window_update_frame) {
+    /* Cap the window to AWS_H2_WINDOW_UPDATE_MAX */
+    int32_t previous_window = stream->thread_data.window_size_self;


trivial: same thing again where we don't need this variable

Suggested change

int32_t previous_window = stream->thread_data.window_size_self;

graebm · 2025-04-18T22:18:34Z

source/h2_stream.c

+                stream,
+                "WINDOW_UPDATE frame on stream failed to be sent, error %s",
+                aws_error_name(aws_last_error()));
+            stream->thread_data.window_size_self = previous_window;


Suggested change

stream->thread_data.window_size_self = previous_window;

graebm

all my feedback was style stuff. Looks good overall

TingDaoK added 4 commits August 25, 2022 15:26

http2 stream manager canary

c28c43b

add an error handling to make sure no request failed in the middle

c0b4ce4

add direct connection test

0247e69

add todo

06ab9a8

TingDaoK changed the title ~~Canary~~ Http Client benchmark Aug 26, 2022

TingDaoK added 3 commits August 26, 2022 11:06

nginx test

def775f

try to get it running from CI

9197dc6

maybe we will have canary as elasticurl in the future

4cd7338

fix build

f445de0

TingDaoK added 2 commits September 12, 2022 15:03

increase timeout

37696e2

print out output

f86a9f7

give me a nicer printout

ca0c7c5

TingDaoK added 4 commits September 12, 2022 22:47

unused arg

53d8d43

not updating the window so frequently

6f2c30a

my computers dead, disable it for now

2199444

try to fix it

2a23d4f

TingDaoK marked this pull request as ready for review September 12, 2022 23:53

TingDaoK added 8 commits September 12, 2022 17:00

why it's committed

a34e410

update window when larger

8e9ffd5

format

fc4769c

check the number is not slower than expected

2f0dcf4

result mem

c6a1676

add documentation and same thing for stream

e4c01fe

fix tests

db1a728

30 secs instead

9c89ade

DmitriyMusatkin reviewed Sep 16, 2022


TingDaoK added 11 commits September 26, 2022 15:36

get rid of the linux only thing

c72d9b0

Merge branch 'main' into canary

9ed287c

Merge branch 'main' into canary

19bfb86

update the port

74c7b06

Merge branch 'main' into canary

28ff84d

Merge branch 'main' into canary

56e84e3

Merge branch 'main' into canary

65c8a7f

trivial

e112d01

just use 32 bit

9287f0b

things has been updated

ed923c3

typo

29ec995

TingDaoK added 3 commits April 2, 2025 10:50

you need venv now for new python

2066110

make the flow control window wide open for server

bd3bef5

increament once at the beginning

e4e6cbe

TingDaoK commented Apr 2, 2025


graebm reviewed Apr 3, 2025


get the benchmark

46445c0

TingDaoK changed the title ~~Http Client benchmark~~ Update windowing algo & Http Client benchmark Apr 7, 2025

TingDaoK changed the title ~~Update windowing algo & Http Client benchmark~~ Update h2 windowing algo & Http Client benchmark Apr 7, 2025

TingDaoK added 2 commits April 7, 2025 16:05

Cap the window to max

e6eaa1d

almost there

7779645

TingDaoK marked this pull request as draft April 8, 2025 22:58

TingDaoK added 3 commits April 17, 2025 10:06

Merge branch 'main' into canary

4c8d477

a bunch of fix

eef6e31

conversion

a2cee45

TingDaoK marked this pull request as ready for review April 18, 2025 20:21

graebm reviewed Apr 18, 2025


graebm approved these changes Apr 18, 2025


