-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement new format API for JSON output #385
Conversation
When calling `snprintf`, we need to pass the buffer size + 1 to account for the null terminator. Moreover, we should always respect the return value of `snprintf`, because this is the number of bytes actually being written.
The new header `broker/format/json.hh` implements the `encode`-API for `broker::variant`, akin to `broker/format/txt.hh`. The generated JSON format is tailored to the WebSocket interface. The format itself is documented in `web-socket.rst`.
A `data_message` embeds a `data` value. According to the spec, the JSON output contains the `@data-type` and `data` members of the embedded `data` value at the top level, alongside the `topic` and (message) `type` members. Hence, we need to extend the existing `encode` functions with a (template) parameter that configures whether or not to print the enclosing curly braces. The default behavior remains unchanged, we only omit the curly braces from the `encode` overload for `data_message` in order to "merge" to output of the nested `encode` call into the output.
Replace the `caf::json_writer` member with a simple character buffer. Instead of using the writer to generate JSON output, we call into the new `format::json::v1` API. Further, add an `append_field` function to the format API to allow us to render the simple JSON objects for ACK and error messages directly (instead of going through a map that we then convert to JSON).
We skip over the first character in the buffer that we pass to `strftime`. Hence, we also need subtract one from the total size that `strftime` is allowed to write to.
In order to achieve a deterministic output for a `timestamp`, we need to calculate the value dynamically since the exact value depends on the local timezone.
The new JSON rendering is more compact than previously and omits any unnecessary whitespace.
7cc6d9a
to
ef16f51
Compare
ef16f51
to
e75c36f
Compare
if constexpr (std::is_same_v<T, bool>) { | ||
return append_encoded<Policy>("boolean", value ? "true"sv : "false"sv, out); | ||
} else if constexpr (std::is_same_v<T, count>) { | ||
// An integer can at most have 20 digits (UINT64_MAX). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vanilla JavaScript cannot correctly parse numbers so large. Nothing to do, but that would probably be a topic if we had more browsers interact with Zeek.
This even made it into the RFC :-)
https://datatracker.ietf.org/doc/html/rfc7159#section-6
Note that when such software is used, numbers that are integers and
are in the range [-(253)+1, (253)-1] are interoperable in the
sense that implementations will agree exactly on their numeric
values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably because JavaScript will store numbers as 64-bit floating point numbers? Probably not much we could do about that. Except sending integers as strings... 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't think we should worry too much. If someone actually wants to work with numbers that large, they'll need to use BigInt anyhow and there's json-bigint
that can parse this properly.
Where I saw this first:
In newer versions of the API, all large integer values are represented as strings by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Dominik — my suggestions are all about comments or about adding comments to help readers understand what the formatting looks like. I'll leave that to you to handle as you see fit.
Apologies, once more, for the delay.
@ckreibich thanks for the feedback! I've updated the comments with the missing examples / explanations. I hope I didn't miss something. 🙂 |
As discussed in #360: we want to offer a
v2
endpoint for WebSocket clients. The new endpoint will render timestamps in UTC. This PR is prep work that re-implements the current encoding for WebSocket client in the namespacebroker::format::json::v1
and swaps out the rendering logic injson_client
to use the new encoding logic. All of this only touches internals and should not affect clients.Adding a
v2
endpoint is then going to be a followup.Also contains a bugfix for include/broker/format/txt.hh where very large floating point numbers would miss the last character. We use the
txt
format only for internal logging, so I hope it's OK if I leave the commit here. I've noticed it only because I've used thetxt
implementation as a template.@ckreibich I've split the work into multiple commits with a high-level description for each chunk of work. I hope this makes it easier to follow along when reviewing.