question about computing contribution utility #9

Open
kiranchari opened this issue Dec 25, 2024 · 1 comment
Comments

@kiranchari

Hi, I am trying to understand the calculation of utility in cbp_conv.py:

output_weight_mag = self.out_layer.weight.data.abs().mean(dim=(0, 2, 3))

self.util.data = output_weight_mag * self.features.abs().mean(dim=(0, 2, 3))

Based on the call to mean(dim=(0,2,3)), it appears that the utility is computed per channel in the conv2d output rather than per neuron - is this correct? However, equation 1 in the paper (https://www.nature.com/articles/s41586-024-07711-7#Sec6) mentions that the utility is computed per neuron in a layer.

Could you please elaborate on the utility computation and clarify whether my understanding is correct? Also, could you comment on why the code computes the utility per channel rather than per neuron, as described in the paper?
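For reference, here is a minimal sketch of what the reduction in question does to the shapes. The tensor sizes are made up for illustration and the snippet is not taken from cbp_conv.py; it only contrasts averaging over batch and spatial dimensions (one value per channel) with averaging over the batch alone (one value per spatial unit).

```python
import torch

# Hypothetical activation tensor with layout (batch, channels, height, width).
features = torch.randn(8, 16, 5, 5)

# Reduction used in the snippet above: average over batch and both spatial
# dimensions, leaving one value per channel (feature map).
per_channel = features.abs().mean(dim=(0, 2, 3))

# Alternative for comparison: average over the batch only, leaving one value
# per individual unit (channel, row, column).
per_unit = features.abs().mean(dim=0)

print(per_channel.shape)  # torch.Size([16])
print(per_unit.shape)     # torch.Size([16, 5, 5])
```

Under this reading, each feature map is treated as a single unit, which is what prompts the per-channel vs. per-neuron question.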

Thanks

@kiranchari
Author

@shibhansh I have a question about the utility calculation. I see that the mean of the weights across dimensions (0,2,3) is computed as follows:

output_weight_mag = self.out_layer.weight.data.abs().mean(dim=(0, 2, 3))

As I understand it, the Conv2d weight tensor has shape (in_channels, out_channels, kernel_size, kernel_size). If so, the above mean is computed over in_channels, kernel_size, and kernel_size, and should have a final shape of (out_channels).

The mean over features is computed as follows:

self.util.data = output_weight_mag * self.features.abs().mean(dim=(0, 2, 3))

Assuming self.features has shape (batch_size, in_channels, height, width), the above mean is averaging across batch_size, height and width and should have a final shape of (in_channels).

It appears to me that there is a mismatch in the dimensions being multiplied to compute the utility, i.e. out_channels vs. in_channels. It doesn't raise a runtime error because these two dimensions happen to be the same, given the pooling operation in the forward pass:

x1 = self.cbp1(self.pool(self.act(self.conv1(x))))

Is my understanding correct? Please let me know if I am missing something.
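One way to check this is to print the shapes directly. Note that PyTorch documents the `nn.Conv2d` weight layout as (out_channels, in_channels / groups, kernel_height, kernel_width), which differs from the layout assumed above. The sketch below uses hypothetical layer sizes and stand-in layers (`conv1`, `out_layer`), and assumes `out_layer` is the convolution that consumes the features; it is not the repository's code.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen so that every channel count is different.
in_ch, mid_ch, out_ch = 3, 16, 32

conv1 = nn.Conv2d(in_ch, mid_ch, kernel_size=3)       # produces the features
out_layer = nn.Conv2d(mid_ch, out_ch, kernel_size=3)  # consumes the features

x = torch.randn(4, in_ch, 12, 12)
features = torch.relu(conv1(x))  # shape (4, 16, 10, 10)

# Same reductions as in the snippets quoted above.
output_weight_mag = out_layer.weight.data.abs().mean(dim=(0, 2, 3))
feature_mag = features.abs().mean(dim=(0, 2, 3))

print(out_layer.weight.shape)   # torch.Size([32, 16, 3, 3])
print(output_weight_mag.shape)  # torch.Size([16])
print(feature_mag.shape)        # torch.Size([16])

# Both vectors are indexed by the feature channels (mid_ch), so the
# elementwise product is well defined even though mid_ch != out_ch.
util = output_weight_mag * feature_mag
```

If the layers in the repository follow the same layout, dim 0 of the weight would be the next layer's output channels, and the mean over (0, 2, 3) would leave one value per input channel of `out_layer`, i.e. per feature channel. In that case the two factors would line up by construction rather than by coincidence.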

Thanks!
