For paper replication purposes: the README states that "Every metric was collected by running the experiment 10 times separately and calculating the average value." Does this apply only to the training/inference speed and GPU-usage measurements, or also to the reported task-specific accuracy scores (e.g., ARC-e, BoolQ)?
We collected the performance metrics (speed and GPU usage) by running the evaluation 10 separate times and averaging the results. For the accuracy scores, multiple trials are unnecessary: our code is reproducible when the random seed is fixed in the same environment and on the same device.
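For anyone replicating the accuracy numbers, the sketch below shows the kind of seeding that "reproducible with a fixed seed in the same environment and on the same device" typically requires in a PyTorch codebase. This is a generic illustration, not the repository's actual code; the `set_seed` helper and the default seed value are hypothetical.

```python
import os
import random

import numpy as np
import torch


def set_seed(seed: int = 42) -> None:
    """Fix every relevant RNG source so repeated runs give identical results.

    Hypothetical helper for illustration; the repository may seed differently.
    """
    random.seed(seed)                 # Python's built-in RNG
    np.random.seed(seed)              # NumPy RNG
    torch.manual_seed(seed)           # PyTorch CPU RNG
    torch.cuda.manual_seed_all(seed)  # PyTorch RNG on every visible GPU
    # Force deterministic cuDNN kernels; disables auto-tuning, which can
    # otherwise pick faster but non-deterministic algorithms per run.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # Some CUDA ops additionally need this env var for full determinism.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
```

Note that even with all of the above, determinism generally holds only within a single environment and device, which is why the maintainers qualify their reproducibility claim that way.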