-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
force_align
on vectors missing (or: how to align vectors in flatcc?)
#179
Comments
it looks like the google flatbuffers compiler turns the Is there an equivalent on the flatcc builder API? |
Hi Ben, to your last question first. Flatcc does not have ForceVectorAlignment equivalent. Alignment is taken from the schema during construction where supported. (EDIT: not entirely true, see my follow up comment). (Though there is force assignment to table field members similar to what flatcdoes to bypass skipping default values, but that is something different. I don't really recall exactly what is supposed to be supported. You report either suggests a regression or something that hasn't yet been implemented. If it is supported by Googles flatc compiler it certainly should be in flatcc as well. In either case I fully support adding force_align to vectors, I'm just not sure how much time I can commit to that for the time being. |
If this is really important to you, you can call the low level builder interface to create the vector without generated code. This can be freely mixed with other code, you just have to be somewhat careful. flatcc/include/flatcc/flatcc_builder.h Lines 1407 to 1464 in 988b149
Also, if you can help investigate if this is a regression or a missing feature, it would be helpful. |
Another way to work around this is to create a struct with the necessary alignment and store that struct in a vector. It slowly comes back now - I think flatc originally skipped vector force_align because you could do this instead. Using a struct in this way will not physically change the buffer and you can cast from a simple array if you create the vector in one operation from a C array, and likewise when reading. EDIT: I should read more close:
Can you please elaborate on why this doesn't work? You can also use the verifier to test buffer alignment. (For the same reason the verifier must be updated upon changes to force_align support). |
Thanks for the suggestions! I agree that the support for I think I noticed what was throwing me off about my direct Today I'm doing:
flatcc_builder_init(&fbb);
flatcc_builder_start_vector(&fbb, 1, 16, FLATBUFFERS_COUNT_MAX(1));
uint8_t* p = flatbuffers_uint8_vec_extend(&fbb, 1234);
memset(p, 'x', 1234);
flatbuffers_uint8_vec_ref_t data = flatcc_builder_end_vector(&fbb);
MyTable_start_as_root(&fbb);
MyTable_data_add(&fbb, data);
MyTable_end_as_root(&fbb);
// etc The output buffer then has Thoughts? And thanks for the help! If I can get something working I'm happy to write it up in the docs so that others coming along can use it too :) |
I'd be happy to help you progress on this. The following is an explanation of how things work or are supposed to work, but it probably won't help much. But it is a start:
This is not correct. The header is included in the alignment. Are you sure the buffer is placed in sufficiently aligned memory? There are subtle differences in the way a buffer is created if you create it with an identifier or with a null identifier (I don't recall all the details up front). The buffer can also be created with an optional size prefix. The buffer can have a header in a few different ways: there can optionally be a file identifer, and there can optionally be a size prefix. You need to all the verifier in the proper way, and then it will very from the buffer start. There are some variants start_as_root_with_identifier and with _size. The header format is : (personally I think this is a mess, but it is what we have) It is actually rather complex, but when creating a buffer, flatcc adjusts alignment according these different kinds of headers. Technically it builds the buffer backwards, properly aligned relative to a virtual zero point, and then inserts enough prepadding between the header and data to ensure that everything keeps staying aligned from the buffer start based on the largest alignment requirement seen during construction. Related, but not very important here: |
Here is the padding before the header: Line 769 in e327f24
|
Documentation is always welcome. Buf if you dig sufficiently into this, you might consider implementing support for vector force align. I think it might be relatively easy, albeit hard to grasp. Lines 223 to 228 in e327f24
Here struct sets the align field flatcc/src/compiler/semantics.c Lines 668 to 675 in e327f24
In code generation, the align field might be ignored where not supported, i.e. non structs, but it could also already be supported, since e.g. integers have different alignment requirements already, so merely setting the align field could possibly work. Complications: The generated verifier must also support this, and the JSON parser must also, but again, if the align field is already inspected, it might happen automatically. On the other hand, if structs are not aligned correctly in vectors right now, there are bigger problems. Also important to never reduce alignment below natural alignment. flatcc/src/compiler/semantics.c Line 112 in e327f24
|
I wrote the following just before receiving your last comment: I was wondering: there could be a bug in the builder api where it does align vectors correctly but it fails to record the alignment size in the buffer state such that the prepadding works on wrong assumptions. This would also affect 64-bit integers because their alignment are larger than what is minimally required, so it ought to have been detected by now if that were the case. |
This means that if a something is aligned to, say, 16 bytes, then that something is placed at multiple of 16 bytes relative to the start of buffer where the header is placed. At least it is supposed to. |
Ok cool - sounds like a reproducer is worth it! I'll put one together now. |
flatcc/include/flatcc/flatcc_builder.h Lines 407 to 408 in e327f24
This min_align should be updated whenever adding something to the builder. You could try adding a function flatcc_builder_set_min_align(B, align); and see if that helps. It just needs to set min_align to max of existing min_align and the new argument. Call it after starting the buffer and before ending it. See if that helps. |
Reproducer here: benvanik@017023d I tried adding the |
Exit frame calls set_min_align when a stack object pops, so end_vector should pop the frame and effectively call set_min_align at that point, provided the frame context has the proper local aligment stored. Line 623 in e327f24
|
The fact the vectors are correctly aligned to something, here +8, is not surprising, but it confirms the suspicion that the padding between the header and data is incorrect sometimes. The min_align field should control this. It would be helpful with traces of the min_align value. |
I added some prints of it:
It looks like it is staying as 16 after initially set in the first vector, but 1 right before I call the |
See my comments to the repro. That example is not really working as intended, I presume. |
Are you saying it suddently drops down to 1? That seems very odd. Can you track down where that happens? |
(sorry didn't see comment on repro) Yes - and if I set it back to 16 before ending the root then I get the correctly aligned output: B->min_align = 16;
Eclectic_FooBar_end_as_root(B); The verifier does succeed:
Regarding the comment on the repo: I would have expected the vector contents to be aligned to the alignment (the xxxx's). What does the align on start_vector do if not align the contents? (according to something I read - which I'll try to find - the alignment specifies the start of the vector contents but not the size prefix, which will remain 4 byte aligned) |
My comment was misguided - I somehow thought you created a high level example using standard constructs, but you were using the low level features we discussed all along - brain fart. My comments are still relevant if you skip the low level details and want to test struct alignment. I think I found the issue: Line 823 in e327f24
You cannot create the vector before you start the buffer. Usually it is harmless, especially outside of nested buffers, but this is an example of where it does not work. |
Also note: MyTable_start_as_root is really a combined start_buffer and start_table call. In principle you must start the buffer, then create the vector, then start the root table. That is how it works in other languages. But in flatcc you are allowed to create objects while parent objects are open. But you are not allowed to create objects before the buffer is initialized. EDIT: fixed misleading text: changed "create the table" to "create the vector". |
You might want to read here: https://github.com/dvidelabs/flatcc/blob/master/doc/builder.md#buffers |
Ahhhh that did it! I am now calling I had seen the buffer calls but must have just skipped over them as the _as_root was working (well, not as I wanted, but I didn't think to look there!). Very useful to know that the implicit start_as_root behavior may mask issues like this. I'll try these changes in my application and verify it works there too. |
I actually wonder if the following example under Tables is correct:
|
It shouldn't be necessary to use the alignment of 16 in your start buffer call, provided that ordering is correct. 0 should work just as well. |
You're right - I may have a use for flatcc_builder_init(B);
// Start the root table before doing anything else:
Eclectic_FooBar_start_as_root(B);
// Specify an alignment for the vector contents:
flatcc_builder_start_vector(B, 1, /*align=*/16, FLATBUFFERS_COUNT_MAX(1));
uint8_t* p = flatcc_builder_extend_vector(B, 12345);
memset(p, 'x', 12345);
flatcc_builder_ref_t data_vec = flatcc_builder_end_vector(B);
// Finish building the root table:
Eclectic_FooBar_data_add(B, data_vec);
Eclectic_FooBar_end_as_root(B); |
Note that you can use start_as_root and then create you vectors. It is slightly less efficient because the root table is parked on the stack while the vectors are created, but nothing significant, and it can be much more convenient since you can add the vectors to parent one at time, something you cannot do in C++ or other languages. |
On your example: while it probably is correct, a flatcc principle is to always pair start and end calls with _as_root or not. |
Totally correct - updated (and tested to verify my sanity :) |
Also, the example requires use of little endian because the endian conversion is not automatic here. As long as the data is 8 bit units it doesn't matter. But given the nature of the problem, there are probably higher level structures stored in that vector, that isn't platform portable anyway. |
Correct - in my particular case it's an entire file of a format that has its own specifications (WAV, PNG, etc) but you are right that if it was something like a list of floats then it would need special handling. |
I still wonder why it didn' work with vectors of structs? |
Now that I have this reproducer set up I'll give that a try too - it's possible you'll be able to spot what I was doing wrong :) |
This required a minor tweak to start_as_root the flatbuffer root tables prior to recording any data. This was discovered thanks to the gracious help (and patience) of of the flatcc author: dvidelabs/flatcc#179 With this all of our binary blobs embedded into the flatbuffers are now 16-byte aligned with the option to tweak it further via a vm.rodata attribute. We don't need utf8 strings to be 16-byte aligned, for example.
This required a minor tweak to start_as_root the flatbuffer root tables prior to recording any data. This was discovered thanks to the gracious help (and patience) of of the flatcc author: dvidelabs/flatcc#179 With this all of our binary blobs embedded into the flatbuffers are now 16-byte aligned with the option to tweak it further via a vm.rodata attribute. We don't need utf8 strings to be 16-byte aligned, for example.
You are welcome, and thanks for the update. It is a feature that should be added, so I'll leave it open. |
This required a minor tweak to start_as_root the flatbuffer root tables prior to recording any data. This was discovered thanks to the gracious help (and patience) of of the flatcc author: dvidelabs/flatcc#179 With this all of our binary blobs embedded into the flatbuffers are now 16-byte aligned with the option to tweak it further via a vm.rodata attribute. We don't need utf8 strings to be 16-byte aligned, for example.
This required a minor tweak to start_as_root the flatbuffer root tables prior to recording any data. This was discovered thanks to the gracious help (and patience) of of the flatcc author: dvidelabs/flatcc#179 With this all of our binary blobs embedded into the flatbuffers are now 16-byte aligned with the option to tweak it further via a vm.rodata attribute. We don't need utf8 strings to be 16-byte aligned, for example.
[ Summary: flatcc should eventually support force_align on vectors - meanwhile use vectors of structs and force_align structs, or use low level builder calls to create aligned vectors access as discussed below - the problems investigated were eventually decided to be incorrect use of API - low level alignment doesn't work if the buffer isn't started first. ]
Hello! I have some variable-length data that must be aligned on a 16-byte boundary that I'm storing in a
[uint8]
:At some point it looks like the google flatbuffers library started supporting
force_align
on these vector fields, so this works there:The current flatcc compiler doesn't support this new usage of
force_align
yet though:My use requires me being able to ensure alignment for compatibility with DMA operations and SIMD code, but there are also some other big flatbuffers users that are now requiring it: https://github.com/tensorflow/tensorflow/blob/1e66b5790c93a1752fd8ea92a146aff811f2ee95/tensorflow/lite/schema/schema_v3a.fbs
If
force_align
on vectors could be supported in the flatcc parser/builder that'd be fantastic, but I was also wondering if there was a workaround that could be used (I suspect this isn't the first time alignment has come up as a question :). I tried to directly do this by passing an alignment during construction of the vector but that doesn't seem to do what I expected (16 byte alignment of the data in the flatbuffer):I'd expect the
flatbuffers_uint8_vec_t
when loaded from the buffer to be at a 16 byte offset (or, ignoring file header weirdness, would at least be consistently stride 16) however the pointers I get back seem rather arbitrary (4-, 8-, and 12-byte alignment).I've started trying to use vectors of structs (that do support
force_align
in flatcc) but that also doesn't seem to do what I expect, but would love to hear if any of you have seen solutions in the past that are known to work. Also I could be doing something very wrong - I just started digging into the internals of flatcc and don't know much yet :)The text was updated successfully, but these errors were encountered: