About the dimension discrepancy of the magnitude vector #11
I've noticed a similar issue regarding the calculation of the norm for a matrix of size $d \times k$ (where $d$ is the output dimension and $k$ is the input dimension, if I'm not mistaken). In the code, the norm is calculated along `dim=1`. Additionally, I observed that the Hugging Face implementation computes `weight_norm = torch.linalg.norm(weight, dim=1)`.
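For concreteness, a minimal sketch (with illustrative shapes, not taken from the repository) of what `dim=1` computes on such a matrix:

```python
import torch

d, k = 3, 4                  # d = output dim, k = input dim
weight = torch.randn(d, k)   # same layout as nn.Linear.weight

# dim=1 reduces over the k (in_features) axis: one norm per output feature.
print(torch.linalg.norm(weight, dim=1).shape)  # torch.Size([3])

# dim=0 would instead give one norm per input feature, i.e. a (1 x k)-style vector.
print(torch.linalg.norm(weight, dim=0).shape)  # torch.Size([4])
```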
Hi, the normalization is always applied along the out_features dimension. In the first figure of the paper, W is not formulated the same way as nn.Linear.weight: the W in that figure has shape (in_features, out_features), while nn.Linear.weight has shape (out_features, in_features), which might be causing your confusion.
Thank you for your response, but I am still confused.
Then isn't the shape of $W$ in the code the transpose of the $W$ described in the paper?
In our paper, W is (in_features, out_features), but in our implementation W is (out_features, in_features), following the convention of nn.Linear.
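A quick check of the nn.Linear convention referred to here (illustrative sizes, not from the repository):

```python
import torch.nn as nn

layer = nn.Linear(in_features=4, out_features=3)
print(layer.weight.shape)  # torch.Size([3, 4]) -> (out_features, in_features)
```

So the paper's $W$ and the implementation's `weight` are transposes of each other, and a norm over the out_features axis of the code tensor corresponds to a column-wise norm of the paper's $W$.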
Hello,
Thank you for sharing your great research results. I enjoyed reading your paper and have been trying to run your code.
However, while reviewing the code, I encountered an issue regarding the dimension of the magnitude vector.
In the paper, it is mentioned that the size of the magnitude vector is $\mathbb{R}^{1 \times k}$. However, in the actual code, the magnitude is calculated as follows:
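Presumably along these lines (a sketch with a hypothetical stand-in module; the exact line in the repository may differ):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the module whose weight is decomposed.
new_module = nn.Linear(in_features=4, out_features=3)  # weight shape (d, k) = (3, 4)

magnitude = torch.linalg.norm(new_module.weight, dim=1)
print(magnitude.shape)  # torch.Size([3]) -> one value per output feature
```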
This results in `magnitude.shape` being $(d \times 1)$ if `new_module.weight.shape` is $(d \times k)$. Therefore, it seems that there is a discrepancy between the description of the magnitude shape in the paper and the actual shape in the code. Could you please explain this?
Thank you.