It is worth noting that if the number of predictions does not match the number of gold queries (1028 != 1034), the evaluation does not make sense, because the predictions are misaligned with the ground truth.
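A quick sanity check before running the evaluator might look like the sketch below. It assumes both files hold one query per line and that blank lines should be ignored; the function names and the prediction file path are made up for illustration and are not part of IRNet or the Spider scripts.

```python
from pathlib import Path


def count_nonempty_lines(path):
    """Count lines that contain something other than whitespace."""
    with Path(path).open(encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def check_alignment(pred_path, gold_path):
    """Return (n_pred, n_gold, aligned) for a prediction file and a gold file.

    Assumption: one query per line in both files, so equal counts are a
    necessary (not sufficient) condition for a line-by-line comparison.
    """
    n_pred = count_nonempty_lines(pred_path)
    n_gold = count_nonempty_lines(gold_path)
    return n_pred, n_gold, n_pred == n_gold


if __name__ == "__main__":
    # Hypothetical paths; replace with your own output and gold files.
    pred_file, gold_file = Path("predict.sql"), Path("dev_gold.sql")
    if pred_file.exists() and gold_file.exists():
        n_pred, n_gold, ok = check_alignment(pred_file, gold_file)
        print(f"predictions: {n_pred}, gold: {n_gold}, aligned: {ok}")
```

If the counts differ, the scores from a line-by-line evaluator should not be trusted until the missing examples are accounted for.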
Hi, I'm having the same issue. The eval.sh script by default generates an output file of 1028 samples. Any advice on how to have it output 1034 samples so the spider evaluator can be used to replicate the leaderboard result?
I evaluated using `eval.sh` with `IRNet_pretrained.model` and then ran the official Spider script, but I got a strange result. Did I do something wrong?
Thanks!
BTW, the IRNet prediction output has a length of 1028, while the official dev_gold.sql has a length of 1034.