Hi @charlesfrye, I'm not sure if this repo is accepting PRs, but I spotted a bug in the `data/emnist.py` file. It concerns the `_sample_to_balance` function and its use of `np.bincount` here.
Because the labels are offset by `NUM_SPECIAL_TOKENS` here and here before the subsampling function is called, `np.bincount` returns zero counts for the unused label values from `0` to `y_min_element-1` inclusive. Those zeros bias the mean count towards zero, which could lead to a smaller dataset.
I checked this issue has not been duplicated.
Example behaviour of `np.bincount`:

I have proposed a solution to the described bug in this PR.
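The `np.bincount` behaviour described above can be reproduced with a small sketch. It is not the code from `data/emnist.py`: the labels are toy data, the value of `NUM_SPECIAL_TOKENS` is a hypothetical stand-in, and the `np.unique` workaround at the end is only one possible approach, not necessarily the one taken in the PR.

```python
import numpy as np

# Hypothetical value for illustration; the real constant is defined in the repo.
NUM_SPECIAL_TOKENS = 4

# Toy labels: three classes (0, 1, 2) with 3, 2 and 4 samples respectively.
y = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
y_offset = y + NUM_SPECIAL_TOKENS  # labels now start at 4

# Without the offset, bincount has one entry per class.
counts_raw = np.bincount(y)
mean_raw = int(counts_raw.mean())

# With the offset, bincount also reports zero counts for the
# label values 0..3, which never occur in y_offset.
counts_offset = np.bincount(y_offset)
mean_offset = int(counts_offset.mean())

# Possible workaround (an assumption): count only the labels that are present.
_, counts_present = np.unique(y_offset, return_counts=True)
mean_fixed = int(counts_present.mean())

print(counts_raw, mean_raw)
print(counts_offset, mean_offset)
print(counts_present, mean_fixed)
```

In this toy case the leading zeros drop the mean count from 3 to 1, so a subsampling step that caps each class at the mean would keep far fewer samples than intended.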