Discussed in #736

Originally posted by korchi, August 3, 2023

Hi. First of all, thank you for the great tool you are developing. However, I have been puzzled for a week about how to replicate `trainer.evaluate()` results when running inference with the model.

My initial idea was to truncate every session by one item (removing the last `item_id`), call `trainer.predict(truncated_sessions)`, and then compute `recall(last_item_ids, predictions[:20])`. However, I am getting a different recall metric.

The only way I managed to "replicate" the `evaluate()` results was by (1) providing non-truncated inputs to `trainer.predict()` and (2) changing `-1` into `-2` in `Transformers4Rec/transformers4rec/torch/model/prediction_task.py`, line 460 (commit 348c963). I am puzzled why, but this was the only way I could ensure that the `x` at line 464 matches the `x` at line 444 of the same file.

Is it because `trainer.evaluate()` shifts the inputs to the left by one position? Or what am I doing incorrectly? Could anyone give me insights on how to do this "correctly", please?

Thanks a lot.
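For concreteness, here is a minimal sketch of the attempted comparison (plain Python; `recall_at_k`, `last_item_ids`, and `predictions` are illustrative placeholders, not Transformers4Rec APIs — the sketch assumes one ranked list of predicted item ids per session):

```python
# Illustrative helper, not a Transformers4Rec API: fraction of sessions
# whose held-out last item appears in the top-k predicted items.
def recall_at_k(last_item_ids, predictions, k=20):
    hits = sum(
        target in ranked[:k]
        for target, ranked in zip(last_item_ids, predictions)
    )
    return hits / len(last_item_ids)

# Example: two sessions, one hit in the top-20 -> recall@20 = 0.5
print(recall_at_k([5, 9], [[5, 7, 2], [1, 3, 4]], k=20))
```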
@korchi In predict, say your original input sequence is [1, 2, 3, 4, 5], where these are item ids. If you want to compare `trainer.evaluate()` and `trainer.predict()`, the input sequences you feed to the model should be different.

For predict, if your target item is the last item (5), you should feed [1, 2, 3, 4] as the input. The model will use the first four entries to predict the fifth one, which you can then compare with your ground-truth item id (5 in this example):

[1, 2, 3, 4] --> [1, 2, 3, 4, predicted_item_id]

Compare `predicted_item_id` with the ground truth.

For `trainer.evaluate()`, in contrast, you feed the entire sequence [1, 2, 3, 4, 5], and the evaluate function does its job internally: it generates a predicted item id for the last item in the sequence by using the items before it.
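A minimal sketch of the two input conventions just described (plain Python lists only, no Transformers4Rec calls; `session` is an assumed placeholder):

```python
# Placeholder session: a list of item ids, not a Transformers4Rec object.
session = [1, 2, 3, 4, 5]

# trainer.evaluate(): feed the full sequence; the last item serves as
# the label and is predicted from the items that precede it.
evaluate_input = session           # [1, 2, 3, 4, 5]
evaluate_target = session[-1]      # 5

# trainer.predict(): feed the truncated sequence and keep the last item
# aside as the ground truth for the next-item prediction.
predict_input = session[:-1]       # [1, 2, 3, 4]
ground_truth = session[-1]         # 5
```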