Hi, I am trying to understand the calculation of utility in cbp_conv.py:

loss-of-plasticity/lop/algos/cbp_conv.py, lines 89–90 in 63c35f3

Based on the call to mean(dim=(0,2,3)), it appears that the utility is computed per channel in the conv2d output rather than per neuron - is this correct? However, equation 1 in the paper (https://www.nature.com/articles/s41586-024-07711-7#Sec6) states that the utility is computed per neuron in a layer.

Could you please elaborate on the utility computation and clarify whether my understanding is correct? Also, could you comment on why the utility is computed per channel rather than per neuron?

Thanks

The Conv2d weight tensor has shape (out_channels, in_channels, kernel_size, kernel_size), so the weight mean above is computed over in_channels and the two kernel dimensions and should have a final shape of (out_channels,).

Assuming self.features has shape (batch_size, in_channels, height, width), the mean(dim=(0,2,3)) above averages across batch_size, height, and width and should have a final shape of (in_channels,).

It appears to me that there is a mismatch between the dimensions being multiplied to compute the utility, i.e. out_channels vs. in_channels. It does not raise a runtime error only because these two dimensions happen to be the same (due to the pooling operation).
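The shape bookkeeping discussed above can be sketched as follows. This is a minimal illustration, not the repository's code: NumPy stands in for PyTorch (np.mean(axis=...) plays the role of tensor.mean(dim=...)), and all variable names and sizes are illustrative.

```python
import numpy as np

# in_ch == out_ch here, as happens after the pooling operation mentioned above
batch, in_ch, out_ch, k, h, w = 8, 32, 32, 3, 16, 16

# Conv2d weight layout in PyTorch: (out_channels, in_channels, kernel_size, kernel_size)
weight = np.random.randn(out_ch, in_ch, k, k)
# Mean over in_channels and both kernel dims -> one value per OUTPUT channel
weight_util = np.abs(weight).mean(axis=(1, 2, 3))
assert weight_util.shape == (out_ch,)

# Layer input features: (batch_size, in_channels, height, width)
features = np.random.randn(batch, in_ch, h, w)
# mean(dim=(0, 2, 3)) averages over batch and spatial dims -> one value per INPUT channel
feature_util = np.abs(features).mean(axis=(0, 2, 3))
assert feature_util.shape == (in_ch,)

# The elementwise product mixes an (out_channels,)-shaped vector with an
# (in_channels,)-shaped one; it only broadcasts because in_ch == out_ch here.
utility = weight_util * feature_util
assert utility.shape == (out_ch,)
```

If in_ch and out_ch differed, the final product would raise a broadcasting error, which is why the mismatch goes unnoticed in this architecture.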