Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mirage: data race in write buffer handling #29

Closed
ansiwen opened this issue Apr 28, 2022 · 4 comments
Closed

mirage: data race in write buffer handling #29

ansiwen opened this issue Apr 28, 2022 · 4 comments

Comments

@ansiwen
Copy link
Contributor

ansiwen commented Apr 28, 2022

When using the "direct" network interface of MirageOS, there is the possibility that the write buffer is overwritten before it is flushed out. For example the following writev sequence from a H2 request (custom debug logging):

2022-04-28 20:59:13 +02:00: DBG [gluten] writev:
2022-04-28 20:59:13 +02:00: DBG [gluten] buffer off:0 len:58 dump:
50 52 49 20 2a 20 48 54  54 50 2f 32 2e 30 0d 0a
0d 0a 53 4d 0d 0a 0d 0a  00 00 0c 04 00 00 00 00
00 00 02 00 00 00 00 00  04 08 00 00 00 00 00 04
08 00 00 00 00 00 07 ff  00 01
2022-04-28 20:59:13 +02:00: DBG [gluten] writev:
2022-04-28 20:59:13 +02:00: DBG [gluten] buffer off:0 len:9 dump:
00 00 08 06 00 00 00 00  00
2022-04-28 20:59:13 +02:00: DBG [gluten] buffer off:0 len:8 dump:
00 00 00 00 00 00 00 00
2022-04-28 20:59:13 +02:00: DBG [gluten] writev:
2022-04-28 20:59:13 +02:00: DBG [gluten] buffer off:0 len:9 dump:
00 00 00 04 01 00 00 00  00
2022-04-28 20:59:13 +02:00: DBG [gluten] writev:
2022-04-28 20:59:13 +02:00: DBG [gluten] buffer off:0 len:9 dump:
00 00 31 01 04 00 00 00  01
2022-04-28 20:59:13 +02:00: DBG [gluten] buffer off:0 len:49 dump:
83 04 90 60 a9 24 88 2d  9d cb 65 71 af 9b 8b 1b
47 54 c5 86 40 82 49 7f  86 4d 83 35 05 b1 1f 5f
90 1d 75 d0 62 0d 26 3d  4c 4d 65 64 ff 75 d8 74
2022-04-28 20:59:13 +02:00: DBG [gluten] writev:
2022-04-28 20:59:13 +02:00: DBG [gluten] buffer off:0 len:9 dump:
00 00 15 00 00 00 00 00  01
2022-04-28 20:59:13 +02:00: DBG [gluten] buffer off:0 len:21 dump:
00 00 00 00 10 0a 0e 2f  2e 64 6f 65 73 6e 6f 74
65 78 69 73 74
2022-04-28 20:59:13 +02:00: DBG [gluten] writev:
2022-04-28 20:59:13 +02:00: DBG [gluten] buffer off:0 len:9 dump:
00 00 00 00 01 00 00 00  01
2022-04-28 20:59:13 +02:00: DBG [gluten] shutdown flow

Resulted in the following frame being sent out:

00000000  50 52 49 20 2a 20 48 54  54 50 2f 32 2e 30 0d 0a   PRI * HT TP/2.0..
00000010  0d 0a 53 4d 0d 0a 0d 0a  00 00 0c 04 00 00 00 00   ..SM.... ........
00000020  00 00 02 00 00 00 00 00  04 08 00 00 00 00 00 04   ........ ........
00000030  08 00 00 00 00 00 07 ff  00 01                     ........ ..
0000003A  00 00 08 06 00 00 00 00  00 00 00 00 00 00 00 00   ........ ........
0000004A  00                                                 .
0000004B  00 00 00 04 01 00 00 00  00                        ........ .
00000054  00 00 00 00 01 00 00 00  01 83 04 90 60 a9 24 88   ........ ....`.$.
00000064  2d 9d cb 65 71 af 9b 8b  1b 47 54 c5 86 40 82 49   -..eq... [email protected]
00000074  7f 86 4d 83 35 05 b1 1f  5f 90 1d 75 d0 62 0d 26   ..M.5... _..u.b.&
00000084  3d 4c 4d 65 64 ff 75 d8  74 9f 00 00 00 00 01 00   =LMed.u. t.......
00000094  00 00 01 00 00 00 00 10  0a 0e 2f 2e 64 6f 65 73   ........ ../.does
000000A4  6e 6f 74 65 78 69 73 74  00 00 00 00 01 00 00 00   notexist ........
000000B4  01                                                 .

As you can see, the last byte sequence written out 00 00 00 00 01 00 00 00 01 shows up not only at the end of the frame, but also in the middle at 00000054 although the writev call before sent 00 00 31 01 04 00 00 00 01 instead, which means that the same buffer is used twice in one frame, resulting in data being overwritten.

If the buffer is copied before Flow.writev is called here

let writev flow iovecs =
let cstruct_iovecs =
List.map
(fun { Faraday.buffer; off; len } ->
Cstruct.of_bigarray ~off ~len buffer)
iovecs
in
Lwt.catch

like this:

   let writev flow iovecs =
     let cstruct_iovecs =
       List.map
         (fun { Faraday.buffer; off; len } ->
-          Cstruct.of_bigarray ~off ~len buffer)
+          let cs = Cstruct.of_bigarray ~off ~len buffer in
+          let new_cs = Cstruct.create_unsafe len in
+          Cstruct.blit cs 0 new_cs 0 len;
+          new_cs)
         iovecs
     in
     Lwt.catch

the issue is gone.

The implementation of Flow.writev for TCP in the direct interface implementation is here
https://github.com/mirage/mirage-tcpip/blob/a7b799cb36435ecb810a92df05ff49fbd264bfbb/src/tcp/flow.ml#L623

@ansiwen
Copy link
Contributor Author

ansiwen commented Apr 29, 2022

There is also a writev_nodelay variant in the TCP implementation (not exposed yet over the Flow interface) which blocks the thread until the buffers are sent, but that creates a lot of small tcp packages: https://github.com/mirage/mirage-tcpip/blob/a7b799cb36435ecb810a92df05ff49fbd264bfbb/src/core/tcp.mli#L64

@haesbaert
Copy link

The buffer is owned by writev and should not be re-used, it's documented here:
https://github.com/mirage/mirage-flow/blob/e661203b4305b6ea34d625ec725337d854c4ae83/src/mirage_flow.mli#L78-L84

@anmonteiro
Copy link
Owner

Aha, thanks for digging further into this. It looks like the correct fix is to modify the Mirage bindings to always make a copy in its writev implementation.

@anmonteiro
Copy link
Owner

This was fixed in #32

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants