Implementation incorrect #9

Open
j-hartshorn opened this issue Nov 27, 2020 · 1 comment
@j-hartshorn

The ALE plot should not be anchored at the midpoint of each bucket. The ALE value represents the average change in response from the bottom of the bucket to the top, so the accumulated values belong at the bucket edges.

As a simple example:

Suppose the buckets, the average local effect in each bucket, and the observation weights (usually the number of observations) are as follows:

| bucket | average_local_effect | weight |
|--------|----------------------|--------|
| [0, 1] | 22                   | 15     |
| (1, 2] | 36                   | 25     |
| (2, 3] | -10                  | 35     |
| (3, 4] | -41                  | 25     |

Then the (non-centered) ALE will be:

| bucket_edge | ALE |
|-------------|-----|
| 0           | 0   |
| 1           | 22  |
| 2           | 58  |
| 3           | 48  |
| 4           | 7   |

This is because 22 is the change in prediction between 0 and 1, 36 is the change between 1 and 2, and so on.
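The accumulation described above can be sketched in a few lines. This is only an illustration of the arithmetic in this comment, using the example numbers from the table; the variable names are made up for the sketch and are not taken from ALEPython or the original implementation.

```python
from itertools import accumulate

# Example values copied from the table in this comment.
bucket_edges = [0, 1, 2, 3, 4]
average_local_effect = [22, 36, -10, -41]

# The ALE at the lowest edge is 0; each later edge adds the
# average local effect of the bucket that ends at that edge.
uncentered_ale = [0, *accumulate(average_local_effect)]

for edge, value in zip(bucket_edges, uncentered_ale):
    print(edge, value)
```

Note that there is one more edge than there are buckets, which is why the values line up with bucket edges rather than bucket midpoints.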

We then center by adding the constant -1.45 (from `average_local_effect @ weight / sum(weight)`) to every value, which gives the final ALE:

| bucket_edge | ALE   |
|-------------|-------|
| 0           | -1.45 |
| 1           | 20.55 |
| 2           | 56.55 |
| 3           | 46.55 |
| 4           | 5.55  |
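The centering step as written in this comment can be reproduced as follows. This is a minimal sketch that only mirrors the arithmetic shown here, with a plain-Python stand-in for the `@` dot product; it is not a claim about how ALEPython or the original implementation computes the centering constant, and the variable names are illustrative.

```python
# Example values copied from the tables in this comment.
average_local_effect = [22, 36, -10, -41]
weight = [15, 25, 35, 25]
uncentered_ale = [0, 22, 58, 48, 7]

# Plain-Python equivalent of: average_local_effect @ weight / sum(weight)
centering_constant = (
    sum(e * w for e, w in zip(average_local_effect, weight)) / sum(weight)
)

# Shift every bucket-edge value by the constant, as in the table above.
centered_ale = [round(v + centering_constant, 2) for v in uncentered_ale]

print(centering_constant)
print(centered_ale)
```

The weighted sum is 330 + 900 - 350 - 1025 = -145, and the weights sum to 100, which is where the -1.45 comes from.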

See the original implementation here.

@akuhnregnier
Contributor

Hi @j-hartshorn, thanks for bringing this up. I noticed this as well, and I think I fixed it here: https://github.com/akuhnregnier/ALEPython (specifically in this commit).

Does my updated implementation match the original implementation in this regard? It would be great to get a second opinion on this before I invest the time to prepare a pull request (many of my changes are probably not very relevant in general).
