Make framing less error prone and not require buffered IO #2

Open
daniel-j-h opened this issue Jan 13, 2023 · 0 comments

At the moment we implement framing by size-prefixing graph chunk messages with a varint.

file header  # fixed size
graph header  # fixed size
size0
chunk0
size1
chunk1
..
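
The layout above can be sketched in Python as follows. This is a minimal illustration, not the project's actual writer: it assumes the varint is the base-128 encoding used by Protocol Buffers, it leaves out the two fixed-size headers, and `encode_varint` and `write_chunks` are hypothetical names.

```python
import io


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (assumed encoding)."""
    if value < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low seven bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)         # last byte, continuation bit clear
            return bytes(out)


def write_chunks(stream, chunks):
    """Write each chunk as: varint(size) followed by the chunk bytes."""
    for chunk in chunks:
        stream.write(encode_varint(len(chunk)))
        stream.write(chunk)


buf = io.BytesIO()
write_chunks(buf, [b"abc", b""])
framed = buf.getvalue()
```

Note that the empty second chunk contributes exactly one byte (a zero size) to the output.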

This is the recommended way to implement framing; there's just one issue right now:

  • we do not store the number of chunks, either in the file header or in the graph header
  • to check if there are still chunks in the file, we peek ahead and read four bytes; the idea was that we will either hit EOF and be done, or we will find a varint which then tells us the size of the next chunk to decode

This approach has the following downsides:

  • in case the file contains an empty chunk, the varint will be a single byte. If such a chunk is at the very end of the file, the four-byte peek will hit EOF even though a chunk is still left to read
  • we need buffered IO (e.g. a file or BufferedReader) to support peeking without reading; it would be great if we could avoid that
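
A small reproduction of the first downside, as a sketch only. It assumes the check treats "fewer than four bytes available" as EOF, which may not match the real implementation exactly; `has_more_chunks`, `read_varint` and `read_chunks_with_peek` are hypothetical names, and the headers are left out.

```python
import io

PEEK_SIZE = 4  # the four-byte look-ahead described above


def has_more_chunks(reader: io.BufferedReader) -> bool:
    # Assumption: fewer than PEEK_SIZE bytes left is interpreted as EOF.
    # peek() does not advance the position, which is why buffered IO is needed.
    return len(reader.peek(PEEK_SIZE)[:PEEK_SIZE]) >= PEEK_SIZE


def read_varint(reader) -> int:
    """Decode a base-128 varint, one byte at a time."""
    result = 0
    shift = 0
    while True:
        raw = reader.read(1)
        if not raw:
            raise EOFError("stream ended inside a varint")
        byte = raw[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def read_chunks_with_peek(reader):
    chunks = []
    while has_more_chunks(reader):
        size = read_varint(reader)
        chunks.append(reader.read(size))
    return chunks


# Two frames: b"abc" (size 3) followed by an empty chunk (size 0).
data = b"\x03abc\x00"
reader = io.BufferedReader(io.BytesIO(data))
chunks = read_chunks_with_peek(reader)
```

Under that assumption the trailing empty chunk is never decoded: after the first frame only one byte remains, so the look-ahead reports EOF.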

We need to figure out how we can change our approach and iterate some more. This really only came up when implementing the Python lib. Maybe we just encode how many chunks there are in the graph header. Or we use a fixed-size prefix instead of a varint.
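
To make the fixed-size prefix idea concrete, here is a rough sketch of a reader that needs no peeking. The 4-byte little-endian prefix is an assumption chosen for illustration, not a decided format, and `write_chunks_fixed`, `read_exact` and `read_chunks_fixed` are hypothetical names.

```python
import io
import struct

# Assumption: sizes are stored as 4-byte little-endian unsigned integers.
PREFIX = struct.Struct("<I")


def write_chunks_fixed(stream, chunks):
    for chunk in chunks:
        stream.write(PREFIX.pack(len(chunk)))
        stream.write(chunk)


def read_exact(stream, n: int) -> bytes:
    """Read n bytes, or fewer if the stream ends first.

    Loops because a single read() on an unbuffered stream may return short.
    """
    parts = []
    remaining = n
    while remaining:
        part = stream.read(remaining)
        if not part:
            break
        parts.append(part)
        remaining -= len(part)
    return b"".join(parts)


def read_chunks_fixed(stream):
    """Yield chunks until a clean EOF at a frame boundary."""
    while True:
        prefix = read_exact(stream, PREFIX.size)
        if not prefix:
            return  # no bytes at all: clean end of file
        if len(prefix) < PREFIX.size:
            raise EOFError("truncated size prefix")
        (size,) = PREFIX.unpack(prefix)
        chunk = read_exact(stream, size)
        if len(chunk) < size:
            raise EOFError("truncated chunk")
        yield chunk


buf = io.BytesIO()
write_chunks_fixed(buf, [b"abc", b""])
buf.seek(0)
decoded = list(read_chunks_fixed(buf))
```

Because the prefix always has the same length, reading zero bytes means EOF and reading a full prefix means another chunk follows, so a trailing empty chunk is no longer ambiguous and only plain `read()` calls are used. Storing the chunk count in the graph header would presumably remove the EOF check in a similar way.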