Discussion: Update dataloader to skip rows that don't require training #2344
Labels: best practice, discussion, triage review
#2341
When (a) train_on_input=False and (b) a message is so long that the output is truncated, a batch can end up with no trainable tokens, which raises an error in the loss because of a division by zero. Beyond causing an inconvenient bug, this is a waste of compute, and fixing the loss seems to address a symptom instead of the root cause.
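To make the failure mode concrete, here is a minimal sketch in plain Python. The names build_labels, mean_loss, and max_seq_len are hypothetical and are not the library's API; the ignore index of -100 is an assumption (it matches the PyTorch cross-entropy default). The real loss may fail differently (for example by producing NaN), so treat this only as an illustration.

```python
IGNORE_INDEX = -100  # assumed ignore index, not confirmed by this issue


def build_labels(prompt_ids, response_ids, max_seq_len):
    """Mask prompt tokens (train_on_input=False), then truncate to max_seq_len."""
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return labels[:max_seq_len]


def mean_loss(per_token_loss, labels):
    """Average the loss over trainable (non-ignored) tokens only."""
    trainable = [loss for loss, y in zip(per_token_loss, labels) if y != IGNORE_INDEX]
    # Divides by zero when every label in the row is ignored.
    return sum(trainable) / len(trainable)


# The prompt alone is longer than max_seq_len, so the whole response is cut off
# and every remaining label is IGNORE_INDEX.
labels = build_labels(prompt_ids=[1, 2, 3, 4, 5, 6], response_ids=[7, 8], max_seq_len=4)
```

Calling mean_loss on these labels raises ZeroDivisionError, because no token is left to average over.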
In the dataloader, should we skip rows that don't have any trainable tokens?
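One possible shape for this, as a rough sketch rather than a proposed implementation: filter rows before they reach the collate step. The helper names (has_trainable_tokens, skip_untrainable), the "labels" key, and the -100 ignore index are all assumptions made for illustration.

```python
IGNORE_INDEX = -100  # assumed ignore index, not confirmed by this issue


def has_trainable_tokens(row, ignore_index=IGNORE_INDEX):
    """True if at least one label in the row contributes to the loss."""
    return any(label != ignore_index for label in row["labels"])


def skip_untrainable(rows, ignore_index=IGNORE_INDEX):
    """Yield only the rows that have at least one trainable token."""
    for row in rows:
        if has_trainable_tokens(row, ignore_index):
            yield row


rows = [
    {"tokens": [1, 2, 3, 4], "labels": [-100, -100, -100, -100]},  # fully masked
    {"tokens": [1, 2, 7, 8], "labels": [-100, -100, 7, 8]},
]
kept = list(skip_untrainable(rows))
```

Dropping rows changes the number of samples per epoch, and in a distributed run different ranks could end up with different batch counts, so that would need to be handled if we go this way.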