Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement new format API for JSON output #385

Merged
merged 10 commits into from
Jan 30, 2024
Merged

Conversation

Neverlord
Copy link
Member

As discussed in #360: we want to offer a v2 endpoint for WebSocket clients. The new endpoint will render timestamps in UTC. This PR is prep work that re-implements the current encoding for WebSocket client in the namespace broker::format::json::v1 and swaps out the rendering logic in json_client to use the new encoding logic. All of this only touches internals and should not affect clients.

Adding a v2 endpoint is then going to be a followup.

Also contains a bugfix for include/broker/format/txt.hh where very large floating point numbers would miss the last character. We use the txt format only for internal logging, so I hope it's OK if I leave the commit here. I've noticed it only because I've used the txt implementation as a template.

@ckreibich I've split the work into multiple commits with a high-level description for each chunk of work. I hope this makes it easier to follow along when reviewing.

When calling `snprintf`, we need to pass the buffer size + 1 to account
for the null terminator. Moreover, we should always respect the return
value of `snprintf`, because this is the number of bytes actually being
written.
The new header `broker/format/json.hh` implements the `encode`-API for
`broker::variant`, akin to `broker/format/txt.hh`. The generated JSON
format is tailored to the WebSocket interface.

The format itself is documented in `web-socket.rst`.
A `data_message` embeds a `data` value. According to the spec, the JSON
output contains the `@data-type` and `data` members of the embedded
`data` value at the top level, alongside the `topic` and (message)
`type` members. Hence, we need to extend the existing `encode` functions
with a (template) parameter that configures whether or not to print the
enclosing curly braces. The default behavior remains unchanged, we only
omit the curly braces from the `encode` overload for `data_message` in
order to "merge" to output of the nested `encode` call into the output.
Replace the `caf::json_writer` member with a simple character buffer.
Instead of using the writer to generate JSON output, we call into the
new `format::json::v1` API. Further, add an `append_field` function to
the format API to allow us to render the simple JSON objects for ACK and
error messages directly (instead of going through a map that we then
convert to JSON).
We skip over the first character in the buffer that we pass to
`strftime`. Hence, we also need subtract one from the total size that
`strftime` is allowed to write to.
In order to achieve a deterministic output for a `timestamp`, we need to
calculate the value dynamically since the exact value depends on the
local timezone.
The new JSON rendering is more compact than previously and omits any
unnecessary whitespace.
@Neverlord Neverlord force-pushed the topic/neverlord/json-format branch 3 times, most recently from 7cc6d9a to ef16f51 Compare December 17, 2023 19:32
@Neverlord Neverlord force-pushed the topic/neverlord/json-format branch from ef16f51 to e75c36f Compare December 17, 2023 19:54
if constexpr (std::is_same_v<T, bool>) {
return append_encoded<Policy>("boolean", value ? "true"sv : "false"sv, out);
} else if constexpr (std::is_same_v<T, count>) {
// An integer can at most have 20 digits (UINT64_MAX).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vanilla JavaScript cannot correctly parse numbers so large. Nothing to do, but that would probably be a topic if we had more browsers interact with Zeek.

This even made it into the RFC :-)

https://datatracker.ietf.org/doc/html/rfc7159#section-6

Note that when such software is used, numbers that are integers and
are in the range [-(253)+1, (253)-1] are interoperable in the
sense that implementations will agree exactly on their numeric
values.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably because JavaScript will store numbers as 64-bit floating point numbers? Probably not much we could do about that. Except sending integers as strings... 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't think we should worry too much. If someone actually wants to work with numbers that large, they'll need to use BigInt anyhow and there's json-bigint that can parse this properly.


Where I saw this first:

In newer versions of the API, all large integer values are represented as strings by default.

https://developer.twitter.com/en/docs/twitter-ids

Copy link
Member

@ckreibich ckreibich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Dominik — my suggestions are all about comments or about adding comments to help readers understand what the formatting looks like. I'll leave that to you to handle as you see fit.

Apologies, once more, for the delay.

include/broker/format/json.hh Outdated Show resolved Hide resolved
include/broker/format/json.hh Outdated Show resolved Hide resolved
include/broker/format/json.hh Outdated Show resolved Hide resolved
include/broker/format/json.hh Show resolved Hide resolved
include/broker/format/json.hh Outdated Show resolved Hide resolved
include/broker/format/json.hh Outdated Show resolved Hide resolved
include/broker/format/txt.hh Show resolved Hide resolved
@Neverlord
Copy link
Member Author

@ckreibich thanks for the feedback! I've updated the comments with the missing examples / explanations. I hope I didn't miss something. 🙂

Neverlord added a commit that referenced this pull request Jan 30, 2024
@Neverlord Neverlord merged commit bc25aec into master Jan 30, 2024
22 checks passed
@Neverlord Neverlord deleted the topic/neverlord/json-format branch January 30, 2024 17:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants