POSIX-compliant read on the Mirage_flow.S interface #46
Thanks for bringing this up, although I'm surprised you brought it up: Flow.read seems to be perfectly "POSIX compliant" to me; could you tell me where the below is incorrect so I can understand your issue better?
These two can probably be answered in one go. The Cstruct.t is freshly allocated by the Flow implementation. This is an internal detail, since the driver might use a local pool of buffers, or be passing through buffers from another implementation. Remember that Cstruct.t is a view onto an underlying buffer, so it is cheap to allocate the Cstruct. The difficult question here is the lifetime of the underlying memory buffer, not the Cstruct (which is disposable). Is that where you see difficulties with POSIX?
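A minimal sketch of what that "view" means, assuming only the standard Cstruct API (the names below are illustrative, not mirage-flow internals):

(* An implementation may keep one large underlying buffer (e.g. a page)
   and hand out cheap views into it. *)
let underlying = Cstruct.create 4096

(* If the device wrote 137 bytes, the caller gets a view of exactly that
   length; the page behind it stays larger. *)
let returned_to_caller = Cstruct.sub underlying 0 137

let () =
  assert (Cstruct.length returned_to_caller = 137);
  (* The view shares memory with [underlying]: writing through one is
     visible through the other, which is why ownership and lifetime matter. *)
  Cstruct.set_uint8 underlying 0 0xff;
  assert (Cstruct.get_uint8 returned_to_caller 0 = 0xff)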
The Cstruct is precise: the length of the cstruct (not the underlying buffer) is accurate. So if you get a short read on POSIX, then the implementation will give you back a cstruct of the length read from the socket, but the underlying buffer may be longer (e.g. a page).
Good question! No. Even if we use the ro/rw capabilities in the new cstruct, the call would still, under most circumstances, have to allocate a fresh cstruct. It's pretty hard to have a zero-copy read and write interface that's safe without far stronger lifetime guarantees. I didn't understand the other points about TLS and so on -- perhaps if we focus on "why isn't Mirage_flow.S.read POSIX-compliant" that might help answer the rest.
POSIX compliant means this signature for me:
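The signature itself seems to have been lost here; judging from the revision quoted later in the thread, it is presumably along these lines (the constructor names are taken from that later quote):

val read : flow -> Cstruct.t -> ([ `Input of int | `End_of_flow ], error) result Lwt.t

i.e. the caller allocates and owns the buffer, and the result reports how many bytes were written into it.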
From my point of view, a protocol implementation such as TCP/IP should not implement another mechanism such as a local pool of buffers, or rely on a pre-existing implementation which does the job. A protocol implementation should just decode and encode - and I think that is enough complexity for the developer. But in the case where the protocol handles its own pool of buffers, how is such a thing concretized in the interface?
My main difficulty is about control. When we want to implement a higher protocol such as TLS or Git Smart, we discover one main design which drives several other protocol designs: we know the length of our payload (because these protocols use the variable-length packet design). Even SMTP advises limiting the packet/line to 1000 characters. In that case we know, at most, how many bytes we should get from the underlying protocol (in that case, TCP/IP). The spontaneous way to implement these protocols is to pre-allocate a buffer which corresponds to the maximum number of bytes that our peer can send us - of course, I simplify the problem a lot, but the idea stays the same. In this way, we expect the underlying protocol to fill this buffer, and it's up to us to take care about when we ask for more input and when we start to parse.
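A rough sketch of that spontaneous style, assuming a hypothetical POSIX-style primitive read_into (not part of Mirage_flow.S today):

(* Hypothetical primitive: fill (part of) the caller's buffer and report
   how many bytes were written. *)
type read_into =
  Cstruct.t -> ([ `Input of int | `End_of_flow ], string) result Lwt.t

(* Read exactly [Cstruct.length buf] bytes into [buf], the buffer we
   pre-allocated because the protocol told us the payload length. *)
let rec read_exactly (read : read_into) buf =
  let open Lwt.Infix in
  if Cstruct.length buf = 0 then Lwt.return (Ok ())
  else
    read buf >>= function
    | Ok (`Input n) -> read_exactly read (Cstruct.shift buf n)
    | Ok `End_of_flow -> Lwt.return (Error "unexpected end of flow")
    | Error e -> Lwt.return (Error e)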
From the current status of Mirage_flow.S.read, this is where I have some trouble. We don't have any control over how many bytes we receive. We just receive something which can be larger than what we expect, or smaller. In the first case, we need to keep the remaining bytes in a ring-buffer (and pray we never fill it completely - in that case, we fail or we need to re-allocate a bigger version of our ring-buffer). In the second case, we must re-read until we have enough bytes to start our process. But in any case, to fit into these constraints of protocol implementation, we do a copy, and this small layer is rewritten in many places across our MirageOS ecosystem - I cannot count them, but I saw the same and unique goal several times: fall back to a POSIX-compliant read. But all of this describes only half of the problem, because it comes down to this simple issue: I have a Cstruct.t chosen by the implementation, when what I wanted was to fill a buffer I chose myself.
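That glue ends up looking roughly like the following wherever it is rewritten (a sketch only; Buffered_flow, leftover and read_into are made-up names, not an existing library):

(* Turn a Mirage_flow.S-style read (the implementation picks the buffer)
   into a POSIX-style read (the caller picks the buffer). The price is a
   leftover buffer plus a blit (copy) on every call. *)
module Buffered_flow (Flow : Mirage_flow.S) = struct
  open Lwt.Infix

  type t = { flow : Flow.flow; mutable leftover : Cstruct.t }

  let create flow = { flow; leftover = Cstruct.empty }

  (* Fill [dst] with at most [Cstruct.length dst] bytes; report how many. *)
  let read_into t dst =
    (if Cstruct.length t.leftover > 0 then Lwt.return (Ok (`Data t.leftover))
     else Flow.read t.flow)
    >|= function
    | Ok (`Data src) ->
      let n = min (Cstruct.length src) (Cstruct.length dst) in
      Cstruct.blit src 0 dst 0 n;            (* the unavoidable copy *)
      t.leftover <- Cstruct.shift src n;     (* keep what did not fit *)
      Ok (`Input n)
    | Ok `Eof -> Ok `End_of_flow
    | Error e -> Error e
end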
The other half problem

The other half of the problem is when we have a protocol implementation which does not respect Mirage_flow.S. In such a transition an allocation is required, and in that case we talk about a linger - and, as we can see, it allocates. Such a layer is required for any other protocol which wants to have a chance to be used in the MirageOS ecosystem. And the first entry-point for a protocol implementation to be compatible with MirageOS is Conduit - it's why I made the choice to diverge from Mirage_flow.S there.
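The transition in that direction looks roughly like this (a sketch; posix_read and the 4096-byte size are assumptions standing in for tls.lwt, ssl, and friends):

(* Expose a POSIX-style read (fill my buffer, tell me how many bytes) as a
   Mirage_flow.S-style read (hand the caller a Cstruct.t). The "linger" is
   the freshly allocated buffer on every call. *)
let mirage_style_read
    ~(posix_read : Cstruct.t -> ([ `Input of int | `End_of_flow ], 'e) result Lwt.t)
    () =
  let open Lwt.Infix in
  let linger = Cstruct.create 4096 in          (* allocated for each read *)
  posix_read linger >|= function
  | Ok (`Input n) -> Ok (`Data (Cstruct.sub linger 0 n))
  | Ok `End_of_flow -> Ok `Eof
  | Error e -> Error e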
It's why I consider that we have a split between MirageOS and the rest of the world due to this simple read function.

Several other questions

Again, the documentation is not clear about the pool of buffers, the implementation of it, and where we are able to tweak it. The ownership is another problem, one the POSIX-compliant read does not have, since the caller owns the buffer it passes. About implementations, we can see a divergence between the existing Mirage_flow.S implementations. And, again, if this interface wants to presuppose zero-copy, that is true (and again, I'm not sure) only for mirage-tcpip.

PS: I can go into details and check some code to highlight the existence of ring-buffers and lingers in our code-base, but I think it's true that we tried to fall back to the POSIX-compliant read several times in many contexts (and we must maintain all of them, each with their own details, etc.).
I can confirm that this is an issue and it caused a bug in the read function in gluten-mirage. Also in this case an intermediate buffer must be implemented to fix the issue, because the caller expects at most a given number of bytes. @avsm While implementing that, I also needed to know: how long does the returned Cstruct.t stay valid? This is my implementation, that I'm not sure is safe:
module Gluten_flow (Flow : Mirage_flow.S) :
  S with module Mirage = Flow = struct
  open Lwt.Infix

  module Mirage = Flow

  type t = {
    flow : Flow.flow;
    (* Data already received from [flow] but not yet handed to the caller. *)
    mutable buf : Cstruct.t;
  }

  let create flow = { flow; buf = Cstruct.empty }

  let read flow req_len =
    (* Serve buffered data first; otherwise read from the underlying flow. *)
    (if Cstruct.length flow.buf = 0 then
       Flow.read flow.flow
     else begin
       let res = Ok (`Data flow.buf) in
       flow.buf <- Cstruct.empty;
       Lwt.return res
     end)
    >|= function
    | Ok (`Data buf) when Cstruct.length buf > req_len ->
      (* Hand back at most [req_len] bytes and keep the rest for later. *)
      let head, rest = Cstruct.split buf req_len in
      flow.buf <- rest;
      Ok (`Data head)
    | x -> x
end
As you said, a solution like an intermediate buffer - e.g. a ring-buffer - is needed.
Currently, the story behind the zero-copy is false for Solo5. That mostly means that a fresh copy is made for every received packet anyway. More generally, a change to the Mirage_flow.S interface has big implications for several packages. In other words, a true consensus must be found by everyone before any changes are made.
I'm still not sure: does that mean that my code (see above) is safe, as long as nobody writes to the buffer, because the buffer is never reused by the lower layers?
Yes, but your logic is a bit weird. Otherwise, about ownership: even if it depends on the implementation, it's "safe" to pass the underlying buffer to the upper layer. Again, we do a fresh copy for every ethernet packet.
Ok, we are becoming off-topic now, but I think you got the code wrong. If I'm not mistaken, the split with req_len is always done if the buffer is longer, no matter whether the buffer comes from Flow.read or from the internal buf.
But it's not concurrency-safe if several reads run at the same time.
Long thread; if I read it correctly, the proposal is to revise the read function to something like:

val read : flow -> Cstruct.t -> ([ `Input of int | `End_of_flow ], error) result Lwt.t

My observation in MirageOS is that the lower layers are zero-copy and pretty sane.
Now, we're finally at the FLOW layer - where TCP is responsible for consuming the received network packets and providing data to the application. And the question is: how should this be done? On the one side, the lower layers (packet received at the network card) may need properly aligned buffers, so having a good ownership and lifetime discipline may help to reuse buffers (e.g. in the ixy case). So, my preferred solution would be that the buffer from the network device is owned by the network device, and already the IPv4 fragment cache should do a copy. Now, depending on your TCP implementation, it may be worth copying data over to a user buffer (and keeping an internal buffer) or not. I'm not finally decided on this; in uTCP I am still thinking about what read should return. I did not push on this for 4.0.0 since integrating such a change across the ecosystem is a lot of work.
This issue will be long enough to explain the whole situation across the current MirageOS ecosystem.

Currently, Mirage_flow.S has a strange definition of the read function which does not respect the POSIX one: mirage-flow/src/mirage_flow.mli, line 56 in 4e7df2c.

Indeed, several questions come to a non-seasoned user: who allocates the returned Cstruct.t, and who owns it? Is its length precise? Does read allocate a fresh Cstruct.t on every call? The documentation can help us a bit, but it does not give us a strict and full definition of the behaviour; finally, it leaves the underlying implementation to complete that - if the maintainer is aware of what he should, at least, respect.
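For reference, the definition pointed at above is roughly the following (reconstructed from the signature quoted later in this issue; error and flow are the signature's abstract types):

type 'a or_eof = [ `Data of 'a | `Eof ]

val read : flow -> (Cstruct.t or_eof, error) result Lwt.t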
From this definition, I got many troubles when:

- a protocol or library expects a POSIX-compliant read and I must build it on top of Mirage_flow.S (such as Lwt_io.make, colombe, conduit.3.0.0 or gluten)
- an implementation already provides a POSIX-compliant read and I must expose it as a Mirage_flow.S (like tls.lwt or ssl)

From all of these, it's hard to do the glue from a protocol to an implementation (Mirage_flow.S to something) or from a protocol to our interface (something to Mirage_flow.S). And the solution is not good, it requires:

- a ring-buffer (Mirage_flow.S to something)
- a linger (something to Mirage_flow.S)

Of course, this solution implies a regression in performance because we do some copies in any case. The problem becomes harder when it comes to protocol composition, where we must agree on one and unique interface.
So, if we read between the lines, it seems that we have a systematic and error-prone layer between the MirageOS ecosystem and the rest of the OCaml ecosystem, mostly because

read : flow -> (Cstruct.t or_eof, [> error ]) result Lwt.t

is specific to one project: mirage-tcpip.

From what I know, Mirage_flow.S.read wants to provide a zero-copy read function. I don't know if that's true or not, but even if this function can increase performance on some specific applications (and I said specific!), this choice becomes a pain where the zero-copy assumption is not true for other protocols/flows (at least TLS does, in any case, a copy). So it's strange for me to enforce an assumption made for one protocol on any other protocol and to assume, in this way, a split between the MirageOS ecosystem and the UNIX ecosystem (which should live together - due to the UNIX target). It seems that the Linux kernel provides a zero-copy interface to its TCP/IP stack too - but it seems that this part is specific as well, and it should not drive the development of protocols in the first instance - though it can be an open door to improve performance later.
A possible solution is a major release of Mirage_flow.S and an update of read to be POSIX compliant. It will unlock many issues discovered along the MirageOS ecosystem (which I consider to be technical debt). The result is not a deletion of the zero-copy read, but a move to consider it as specific to mirage-tcpip. Then, it will unlock:

- the ability for the end user to keep using the zero-copy read when he wants
- the ability to merge tls.lwt and tls.mirage into one and unique package

/cc @mirage/core