Is it possible to just decode specific parts of a message? #1052

oliveryasuna · 2025-01-09T21:16:20Z

Our API works by sending a serialized protobuf message in the request/response body. No gRPC involved. We have a mobile app that is mainly a web view (using Capacitor). One of our endpoints has begun to return a large response (~30 MB). It turns out that iOS's WebKit has a long-standing bug where it can crash if memory usage spikes. So, when we attempt to deserialize the response body, the web view crashes.

We're discussing many potential solutions. One would be to decode only certain parts of a message at a time, which would avoid a large memory spike.

For example, consider the following message:

message MyResponse {
  repeated Foo foos = 1;
  repeated Bar bars = 2;
  repeated Baz bazs = 3;
}

Is there a way to decode just one or more fields at a time?

Note that we use ts-proto on top of protobuf-es.

timostamm · 2025-01-10T15:16:10Z

I'm not aware of any API in ts-proto that would allow you to parse messages one field at a time.

You can split your message definition. For example:

message MyResponseFoos {
  repeated Foo foos = 1;
}
message MyResponseBars {
  repeated Bar bars = 2;
}
message MyResponseBazs {
  repeated Baz bazs = 3;
}

Then parse the response data with each message in turn. Optionally, create a MyResponse message from the individual messages at the end, so that downstream code does not need to change.

When you parse MyResponseFoos, fields 2 and 3 are considered "unknown fields" and will not be interpreted. Protobuf-ES stores unknown fields in the message, which you wouldn't want in this case, so you'll want to set the option readUnknownFields: false when parsing. I'm not sure if ts-proto stores unknown fields; you'll have to take a look at the documentation or source.
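
To illustrate why skipping fields 2 and 3 is cheap: on the wire, every field starts with a tag that carries its field number and wire type, so a reader can step over payloads it does not care about without interpreting them. The following is a minimal sketch of that idea, written by hand against the protobuf wire format. It is not the Protobuf-ES or ts-proto API, and the names `readVarint` and `scanField` are hypothetical. It assumes the top-level message contains no deprecated group fields.

```typescript
// Read a base-128 varint at `pos`; returns [value, nextPos].
// Uses multiplication instead of bit shifts so values above 2^31 do not wrap.
// Precision is only exact up to Number.MAX_SAFE_INTEGER, which is enough
// for tags and lengths; skipped varint values are never used.
function readVarint(buf: Uint8Array, pos: number): [number, number] {
  let result = 0;
  let multiplier = 1;
  for (;;) {
    if (pos >= buf.length) throw new Error("truncated varint");
    const byte = buf[pos++];
    result += (byte & 0x7f) * multiplier;
    if ((byte & 0x80) === 0) return [result, pos];
    multiplier *= 128;
  }
}

// Hypothetical helper: call `onPayload` with a view of every length-delimited
// occurrence of `fieldNo`, skipping all other fields without decoding them.
// The views share memory with `buf`; nothing is copied.
function scanField(
  buf: Uint8Array,
  fieldNo: number,
  onPayload: (payload: Uint8Array) => void,
): void {
  let pos = 0;
  while (pos < buf.length) {
    let tag: number;
    [tag, pos] = readVarint(buf, pos);
    const wireType = tag % 8;
    const no = Math.floor(tag / 8);
    switch (wireType) {
      case 0: // varint: read and discard
        [, pos] = readVarint(buf, pos);
        break;
      case 1: // 64-bit fixed
        pos += 8;
        break;
      case 2: { // length-delimited (sub-messages, bytes, strings)
        let len: number;
        [len, pos] = readVarint(buf, pos);
        if (pos + len > buf.length) throw new Error("truncated field");
        if (no === fieldNo) onPayload(buf.subarray(pos, pos + len));
        pos += len;
        break;
      }
      case 5: // 32-bit fixed
        pos += 4;
        break;
      default:
        throw new Error(`unsupported wire type ${wireType}`);
    }
    if (pos > buf.length) throw new Error("truncated field");
  }
}
```

With something like this, each payload for field 1 could be handed to the generated decoder for Foo one at a time, rather than materializing all of MyResponse at once. Whether that actually flattens the memory spike in WebKit would need to be measured.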

oliveryasuna · 2025-01-12T17:42:16Z

@timostamm Thank you. That was also a potential solution I considered. In the end, I took all binary data out of the message (it made up almost 50% of the size) and instead appended it to the response body.

Before:

Protobuf:

message Foo {
  bytes data = 1;
}

message MyResponse {
  repeated Foo foos = 1;
  repeated Bar bars = 2;
  repeated Baz bazs = 3;
}

Body:

HTTP/1.1 200 OK
Content-Type: application/protobuf

<serialized MyResponse>

After:

Protobuf:

message Foo {
  uint64 length = 1;
}

message MyResponse {
  repeated Foo foos = 1;
  repeated Bar bars = 2;
  repeated Baz bazs = 3;
}

Body:

HTTP/1.1 200 OK
Content-Type: application/protobuf
X-Message-Length: #####

<serialized MyResponse>
<data1>
<data2>
<...>

This way, I can read the first X-Message-Length bytes and decode them as MyResponse. Then, in a different scope (so that WebKit can garbage-collect in between), I use the lengths to read the files (data#) from the rest of the buffer.
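
The slicing described above can be sketched roughly as follows. This is only an illustration: `messagePart` and `blobParts` are hypothetical names, and it assumes the body is already available as a Uint8Array, that X-Message-Length has been parsed to a number, and that the uint64 lengths from the decoded Foo entries have been converted to plain numbers.

```typescript
// Hypothetical helper: view over the first `messageLength` bytes of the body,
// i.e. the part to decode as MyResponse. No bytes are copied.
function messagePart(body: Uint8Array, messageLength: number): Uint8Array {
  if (messageLength > body.length) {
    throw new Error("X-Message-Length exceeds body size");
  }
  return body.subarray(0, messageLength);
}

// Hypothetical helper: one view per appended data block, in the same order
// as `lengths` (taken from the decoded Foo entries). Views share memory
// with `body`, so slicing itself does not add to the memory spike.
function blobParts(
  body: Uint8Array,
  messageLength: number,
  lengths: number[],
): Uint8Array[] {
  const blobs: Uint8Array[] = [];
  let offset = messageLength;
  for (const len of lengths) {
    if (offset + len > body.length) {
      throw new Error("data block length exceeds body size");
    }
    blobs.push(body.subarray(offset, offset + len));
    offset += len;
  }
  return blobs;
}
```

Because `subarray` returns views rather than copies, the original buffer stays alive for as long as any view is referenced. If individual data blocks need to outlive the response body, they would have to be copied (for example with `slice`).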

Admittedly, the Content-Type header is no longer valid, but this is an internal API, so it's not much of a concern.

timostamm closed this as completed Jan 10, 2025
