Currently, we keep appending bad rows to conv until we hit the byte limit and then dump them to dropped.txt. When dealing with large tables, we usually end up storing rows from just one table in dropped.txt, because a single issue is occurring across many of its rows.
There is scope for improvement: keep bad rows from different tables by evicting some of the earlier ones, since additional rows caused by the same error provide no new information. It is more efficient to report a few samples of each distinct type of bad row.
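A minimal sketch of the idea, assuming a Go codebase (suggested by the `conv` identifier): cap the number of stored samples per (table, error) pair instead of appending until the byte limit. All names below (`badRowSampler`, `add`, etc.) are hypothetical, not the actual conv/dropped.txt code.

```go
package main

import "fmt"

// badRowSampler keeps at most maxPerKey sample rows per (table, error)
// pair, so dropped.txt reports a few examples of each distinct problem
// instead of spending the whole byte budget on one table's repeated error.
type badRowSampler struct {
	maxPerKey int
	byteLimit int
	usedBytes int
	samples   map[string][]string // key: table + "|" + error message
}

func newBadRowSampler(maxPerKey, byteLimit int) *badRowSampler {
	return &badRowSampler{
		maxPerKey: maxPerKey,
		byteLimit: byteLimit,
		samples:   map[string][]string{},
	}
}

// add records a bad row unless we already hold enough samples of this
// table/error combination, or the overall byte budget is exhausted.
func (s *badRowSampler) add(table, errMsg, row string) {
	key := table + "|" + errMsg
	if len(s.samples[key]) >= s.maxPerKey {
		return // more rows with the same error add no information
	}
	if s.usedBytes+len(row) > s.byteLimit {
		return
	}
	s.samples[key] = append(s.samples[key], row)
	s.usedBytes += len(row)
}

func main() {
	s := newBadRowSampler(2, 1<<20)
	// 1000 failures from one table no longer crowd out everything else.
	for i := 0; i < 1000; i++ {
		s.add("Orders", "invalid timestamp", fmt.Sprintf("row %d", i))
	}
	s.add("Users", "NULL in NOT NULL column", "row A")
	fmt.Println(len(s.samples["Orders|invalid timestamp"]))    // 2, not 1000
	fmt.Println(len(s.samples["Users|NULL in NOT NULL column"])) // 1
}
```

With this shape, the byte limit still bounds total output, but the per-key cap guarantees room for samples from later tables and error types.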