Commit
Fix bug in criteo.py that leads to NaN values
The original script added a fixed offset of 3 to each value before taking the log. As a result, during data preprocessing a value of -3 became log(0) = -inf, and any value below -3 became NaN. This problem was reported in facebookresearch/dlrm#363 (comment). This commit changes the preprocessing operation in the tsv_to_npys function to dense_np -= dense_np.min() - 2, which shifts the minimum to 2 so that every input to the log is positive, and the Criteo Kaggle dataset is now handled correctly.
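
The difference between the two offsets can be illustrated with a minimal NumPy sketch. This is not the actual tsv_to_npys code: the sample values are made up for illustration, and the names old, shifted, and new are hypothetical. Only the two offset expressions are taken from the description above.

```python
import numpy as np

# Made-up sample values (an assumption for illustration only);
# -3 and -4 stand in for the problematic entries.
dense_np = np.array([-4.0, -3.0, 0.0, 5.0], dtype=np.float32)

# Behaviour described for the original script: add 3, then take the log.
# Warnings are silenced here because log(0) and log(negative) are expected.
with np.errstate(divide="ignore", invalid="ignore"):
    old = np.log(dense_np + 3, dtype=np.float32)

# Behaviour described for the fix: shift so the minimum becomes 2, then take the log.
shifted = dense_np.copy()
shifted -= shifted.min() - 2
new = np.log(shifted, dtype=np.float32)

print("fixed offset of 3:", old)   # contains nan (from -4) and -inf (from -3)
print("min-based offset: ", new)   # all entries are finite
```

Because the offset is derived from the minimum of the array, the smallest input to the log is always 2, regardless of how negative the raw data is.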