Is your feature request related to a problem? Please describe.
Currently, once a SpillableHostBuffer is spilled from memory to disk, every subsequent call to SpillableHostBuffer#getHostBuffer reads and deserializes the data from disk again. This is very costly and is not acceptable in cases where getHostBuffer is called multiple times.
One example is the Kudo shuffle read concat case. Assume the KudoTables that were read are placed into a spillable state (by wrapping each HostMemoryBuffer in a SpillableHostBuffer). During the kudo concat we then have to call SpillableHostBuffer#getHostBuffer frequently and in random order, because kudo concat uses a random read visitor to read all the input KudoTables. It is a performance nightmare if every one of those calls has to read from disk.
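To make the request concrete, here is a minimal sketch of one possible shape for the fix: keep the re-materialized buffer around after the first read so that repeated calls do not hit disk again. Everything in it is hypothetical and for illustration only. `CachingSpillable`, `loadFromDisk`, `release`, and `diskReads` are invented names, a plain `Supplier` stands in for the read-and-deserialize step, and none of it reflects the actual SpillableHostBuffer API or how the spill framework tracks memory.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a spilled buffer handle. Not the spark-rapids API.
final class CachingSpillable<T> {
    // Assumed to be the expensive step: read from disk and deserialize.
    private final Supplier<T> loadFromDisk;
    // Null until the data has been materialized in memory.
    private T cached;
    // Counts how many times the expensive load actually ran.
    private int diskReads = 0;

    CachingSpillable(Supplier<T> loadFromDisk) {
        this.loadFromDisk = loadFromDisk;
    }

    // Returns the in-memory copy, loading it from "disk" only when absent.
    synchronized T get() {
        if (cached == null) {
            cached = loadFromDisk.get();
            diskReads++;
        }
        return cached;
    }

    // Drops the in-memory copy, e.g. under memory pressure. A real host
    // buffer would also need to be closed here; this sketch only drops
    // the reference.
    synchronized void release() {
        cached = null;
    }

    synchronized int diskReads() {
        return diskReads;
    }
}

class CachingSpillableDemo {
    public static void main(String[] args) {
        CachingSpillable<byte[]> buf =
            new CachingSpillable<>(() -> new byte[] {1, 2, 3});

        // Simulate many random-access reads, as in the concat case.
        for (int i = 0; i < 1000; i++) {
            buf.get();
        }
        System.out.println(buf.diskReads()); // prints 1

        // After the copy is released, the next access reloads once.
        buf.release();
        buf.get();
        System.out.println(buf.diskReads()); // prints 2
    }
}
```

The trade-off in a design like this is that the cached copy occupies host memory again, so it would have to remain spillable or be released explicitly once the concat finishes.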