sum_over_batch_size clarification #18818

Closed
jackd opened this issue Nov 23, 2023 · 2 comments

@jackd
Contributor

jackd commented Nov 23, 2023

Losses all take a reduction argument, which can be either sum or sum_over_batch_size. sum is fairly straightforward, and there are no surprises in the implementation: a summation, or a weighted summation if sample_weight or mask is present. sum_over_batch_size is ambiguously named and inconsistent with the implementation.

Ambiguity: I originally thought it meant "sum over the batch dimension". Looking at previous implementations, it appears to mean "sum divided by the batch size". The Keras 3.0 implementation looks like it just computes a weighted mean.

If it's meant to be the mean, why not call it "mean"? If it's not meant to be the mean, then consider this a bug report, because the current implementation just does that.

Note that I'm not just being pedantic: I want to submit a PR that fixes masking (multiplication by zero is not masking when infs and nans are around), but I need to know exactly what the implementation is supposed to be in order to fix it.
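To make the point about infs and nans concrete, here is a minimal pure-Python sketch. It is not Keras code, and both helper names are hypothetical; it only compares "masking" by multiplying with zero against masking by selection.

```python
import math

def masked_sum_by_multiplication(losses, mask):
    # Hypothetical helper: "mask" by multiplying with 0.0 or 1.0.
    # Under IEEE 754, nan * 0.0 and inf * 0.0 are both nan,
    # so a masked-out bad value still poisons the sum.
    return sum(loss * m for loss, m in zip(losses, mask))

def masked_sum_by_selection(losses, mask):
    # Hypothetical helper: drop masked entries entirely,
    # so their values never enter the sum.
    return sum(loss for loss, m in zip(losses, mask) if m)

losses = [1.0, float("nan"), 2.0, float("inf")]
mask = [1.0, 0.0, 1.0, 0.0]  # the nan and inf entries are masked out

by_multiplication = masked_sum_by_multiplication(losses, mask)
by_selection = masked_sum_by_selection(losses, mask)

print(math.isnan(by_multiplication))  # the masked nan/inf leak through
print(by_selection)                   # only the unmasked 1.0 and 2.0 contribute
```

In array code the selection variant would typically be written with a "where"-style operation rather than a Python filter, but the behaviour being illustrated is the same.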

@fchollet
Collaborator

sum_over_batch_size and mean are the same thing. It should more naturally be called mean but we kept the Keras 2 terminology for backwards compatibility.

jackd closed this as completed Nov 24, 2023
@jackd
Contributor Author

jackd commented Nov 30, 2023

For anyone coming back here in the future: I actually prefer sum_over_batch_size to mean now, because the weighted interpretations are different. A sum reduction with sample_weight is interpreted as a weighted sum, whereas a sum_over_batch_size reduction is interpreted as a weighted sum divided by the number of unmasked entries (the "batch size"), not as the weighted mean.
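A small pure-Python sketch of the distinction, under the interpretation described above. This is not the Keras implementation: `reduce_loss` is a hypothetical helper, and `"weighted_mean"` is not a Keras reduction option; it is included only for comparison.

```python
def reduce_loss(losses, weights, mask, reduction):
    # Keep only unmasked entries (selection, not multiplication by zero).
    kept = [(loss, w) for loss, w, m in zip(losses, weights, mask) if m]
    weighted_sum = sum(loss * w for loss, w in kept)

    if reduction == "sum":
        return weighted_sum
    if reduction == "sum_over_batch_size":
        # Divide by the number of unmasked entries, not by the total weight.
        return weighted_sum / len(kept)
    if reduction == "weighted_mean":
        # Comparison only: a true weighted mean divides by the total weight.
        return weighted_sum / sum(w for _, w in kept)
    raise ValueError(f"Unknown reduction: {reduction}")

losses = [1.0, 2.0, 3.0, 4.0]
weights = [1.0, 1.0, 2.0, 3.0]
mask = [True, True, True, False]  # the last entry is masked out

total = reduce_loss(losses, weights, mask, "sum")                 # 1 + 2 + 6 = 9.0
per_entry = reduce_loss(losses, weights, mask, "sum_over_batch_size")  # 9 / 3 = 3.0
mean = reduce_loss(losses, weights, mask, "weighted_mean")        # 9 / 4 = 2.25
```

The two normalised results agree only when the unmasked weights sum to the number of unmasked entries (for example, when all weights are 1). The sketch does not handle a fully masked batch, where both divisions would be by zero.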
