Is your feature request related to a problem or challenge?
Follow on to #7064
The `GroupsValues` for aggregates needs to handle "emitTo" for streaming groups so that it can flush groups that have already been built but will never be seen again.
The initial implementation of the specialized accumulator for Utf8/LargeUtf8 in #8827 is inefficient in that it copies / rehashes any strings remaining in the set after emission.
This is likely not a large performance overhead in practice, as most groups should be emitted, so only a few groups will need to be rehashed. However, if it turns out to be a problem, we can make something more optimized.
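To make the cost concrete, here is a minimal sketch of the "emit, then copy and rehash the remainder" approach described above. It is not the code from #8827: `StringGroups`, `intern`, and this `emit_first_n` are hypothetical names, and a plain `HashMap<String, usize>` stands in for the real string set.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a string group set.
/// This is an illustration only, not DataFusion's actual type.
struct StringGroups {
    /// Group values in first-seen order; the position is the group id.
    values: Vec<String>,
    /// Lookup from group value to group id.
    index: HashMap<String, usize>,
}

impl StringGroups {
    fn new() -> Self {
        Self {
            values: Vec::new(),
            index: HashMap::new(),
        }
    }

    /// Returns the group id for `value`, creating a new group if needed.
    fn intern(&mut self, value: &str) -> usize {
        if let Some(&id) = self.index.get(value) {
            return id;
        }
        let id = self.values.len();
        self.values.push(value.to_string());
        self.index.insert(value.to_string(), id);
        id
    }

    /// Emits the first `n` groups and renumbers the remaining ones from zero.
    fn emit_first_n(&mut self, n: usize) -> Vec<String> {
        let n = n.min(self.values.len());
        let emitted: Vec<String> = self.values.drain(..n).collect();

        // The simple approach: discard the lookup table and rebuild it,
        // copying and rehashing every string that was not emitted.
        self.index.clear();
        for (id, value) in self.values.iter().enumerate() {
            self.index.insert(value.clone(), id);
        }
        emitted
    }
}
```

The rebuild loop is proportional to the number of groups left behind, which is why the overhead should be small when most groups are emitted.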
Describe the solution you'd like
Optimize emitTo for binary groups
#9188
Describe alternatives you've considered
I have one proposal in #9188 (look at `ArrowStringSet::emit_first_n`). It works and passes tests, but I think it is very complicated, and it is hard to convince oneself that the `unsafe` usage is sound.
Additional context
No response