Is it possible to just decode specific parts of a message? #1052

Closed
oliveryasuna opened this issue Jan 9, 2025 · 2 comments

Comments

@oliveryasuna

oliveryasuna commented Jan 9, 2025

Our API works by sending a serialized protobuf in the request/response body. No gRPC involved. We have a mobile app that is mainly a web view (using Capacitor). One of our endpoints has begun to return a large response (~30 MB). It turns out that WebKit on iOS has a long-standing bug where a spike in memory usage can crash it. So, when we attempt to deserialize the response body, the web view crashes.

We're discussing many potential solutions. One is to decode only certain parts of a message at a time, which would avoid a large memory spike.

For example, consider the following message:

message MyResponse {
  repeated Foo foos = 1;
  repeated Bar bars = 2;
  repeated Baz bazs = 3;
}

Is there a way to decode just one or more fields at a time?

Note that we use ts-proto on top of protobuf-es.

@timostamm
Member

I'm not aware of any API in ts-proto that would allow you to parse messages one field at a time.

You can split your message definition. For example:

message MyResponseFoos {
  repeated Foo foos = 1;
}
message MyResponseBars {
  repeated Bar bars = 2;
}
message MyResponseBazs {
  repeated Baz bazs = 3;
}

Then parse the response data with each message in turn. Optionally, create a MyResponse message from the individual messages at the end, so that downstream code does not need to change.

When you parse MyResponseFoos, fields 2 and 3 are considered "unknown fields" and will not be interpreted. Protobuf-ES stores unknown fields in the message, which you wouldn't want in this case, so you'll want to set the option readUnknownFields: false when parsing. I'm not sure whether ts-proto stores unknown fields; you'll have to take a look at the documentation or source.
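
To illustrate what "not interpreted" means at the wire level, here is a minimal hand-rolled sketch. It is not a ts-proto or Protobuf-ES API, and `readVarint` and `extractField` are hypothetical names. It walks the top-level fields of a serialized message and returns only the payloads of one length-delimited field number, skipping everything else without decoding it. It assumes no groups (wire types 3 and 4) and that tags and lengths fit in a JavaScript number.

```typescript
// Reads a base-128 varint at `pos`; returns [value, next position].
// Precise only up to 2^53, which is enough for tags and lengths.
function readVarint(buf: Uint8Array, pos: number): [number, number] {
  let result = 0;
  let shift = 0;
  while (pos < buf.length) {
    const byte = buf[pos++];
    result += (byte & 0x7f) * 2 ** shift; // multiply instead of << to avoid 32-bit overflow
    if ((byte & 0x80) === 0) return [result, pos];
    shift += 7;
  }
  throw new Error("truncated varint");
}

// Returns a view (no copy) of every length-delimited payload whose field
// number is `wanted`. All other top-level fields are skipped undecoded.
function extractField(buf: Uint8Array, wanted: number): Uint8Array[] {
  const found: Uint8Array[] = [];
  let pos = 0;
  while (pos < buf.length) {
    const [tag, afterTag] = readVarint(buf, pos);
    pos = afterTag;
    const fieldNo = Math.floor(tag / 8);
    const wireType = tag % 8;
    switch (wireType) {
      case 0: { // varint: read and discard
        const [, next] = readVarint(buf, pos);
        pos = next;
        break;
      }
      case 1: // fixed 64-bit
        pos += 8;
        break;
      case 2: { // length-delimited: sub-messages, bytes, strings
        const [len, start] = readVarint(buf, pos);
        if (start + len > buf.length) throw new Error("truncated field");
        if (fieldNo === wanted) found.push(buf.subarray(start, start + len));
        pos = start + len;
        break;
      }
      case 5: // fixed 32-bit
        pos += 4;
        break;
      default:
        throw new Error(`unsupported wire type ${wireType}`);
    }
    if (pos > buf.length) throw new Error("truncated field");
  }
  return found;
}
```

For the MyResponse example, `extractField(body, 1)` would return one byte range per Foo, and each range could then be passed to the generated decoder for Foo one at a time, so only a single element is materialized at once.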

@oliveryasuna
Author

@timostamm Thank you. That was also a potential solution I considered. In the end, I took all binary data out of the message (it made up almost 50% of the size) and instead appended it to the response body.

Before:

Protobuf:

message Foo {
  bytes data = 1;
}

message MyResponse {
  repeated Foo foos = 1;
  repeated Bar bars = 2;
  repeated Baz bazs = 3;
}

Body:

HTTP/1.1 200 OK
Content-Type: application/protobuf

<serialized MyResponse>

After:

Protobuf:

message Foo {
  uint64 length = 1;
}

message MyResponse {
  repeated Foo foos = 1;
  repeated Bar bars = 2;
  repeated Baz bazs = 3;
}

Body:

HTTP/1.1 200 OK
Content-Type: application/protobuf
X-Message-Length: #####

<serialized MyResponse>
<data1>
<data2>
<...>

This way, I can read the first X-Message-Length bytes and decode them as MyResponse. Then, in a different scope (to ensure WebKit performs GC), I use the lengths to read the files (data#) from the rest of the buffer.

Admittedly, the Content-Type header is no longer valid, but this is an internal API, so it's not much of a concern.
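
The framing described above could be sketched roughly as follows. This is an illustration, not the actual client code: `messageBytes` and `sliceBlobs` are hypothetical helper names, and the sketch assumes the whole body is already in a single Uint8Array and that each Foo's uint64 `length` has been converted to a plain number.

```typescript
// Returns a view of the serialized MyResponse at the front of the body.
// `messageLength` is the value parsed from the X-Message-Length header.
function messageBytes(body: Uint8Array, messageLength: number): Uint8Array {
  if (messageLength > body.length) {
    throw new Error("X-Message-Length exceeds body size");
  }
  return body.subarray(0, messageLength); // view, no copy
}

// Slices the trailing binary payloads out of the body, one per length,
// starting right after the serialized message.
function sliceBlobs(body: Uint8Array, start: number, lengths: number[]): Uint8Array[] {
  const blobs: Uint8Array[] = [];
  let offset = start;
  for (const len of lengths) {
    if (offset + len > body.length) {
      throw new Error("blob length exceeds body size");
    }
    blobs.push(body.subarray(offset, offset + len)); // view, no copy
    offset += len;
  }
  return blobs;
}
```

Because `subarray` returns views onto the same underlying buffer, slicing does not add another copy of the payload; the lengths passed to `sliceBlobs` would come from the `foos` of the MyResponse decoded in the first step.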
