About the dimension discrepancy of the magnitude vector #11
I've noticed a similar issue regarding the calculation of the norm for a matrix of size $d \times k$ (where $d$ is the output dimension and $k$ is the input dimension, if I'm not mistaken). In the code, the norm is calculated along `dim=1`. Additionally, I observed that the Hugging Face implementation computes `weight_norm = torch.linalg.norm(weight, dim=1)`.
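For concreteness, a minimal sketch (with illustrative shapes, not taken from the repository) of what `dim=1` computes on such a matrix:

```python
import torch

d, k = 3, 4                  # d = output dim, k = input dim
weight = torch.randn(d, k)   # same layout as nn.Linear.weight

# dim=1 reduces over the k (in_features) axis: one norm per output feature.
print(torch.linalg.norm(weight, dim=1).shape)  # torch.Size([3])

# dim=0 would instead give one norm per input feature, i.e. a (1 x k)-style vector.
print(torch.linalg.norm(weight, dim=0).shape)  # torch.Size([4])
```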
Hi, the normalization is always applied along the out_features dimension. In the first figure of the paper, W is not formulated the same way as nn.Linear.weight: the W in that figure has shape (in_features, out_features), while nn.Linear.weight has shape (out_features, in_features), which might be causing your confusion.
Thank you for your response, but I am still confused.
Then isn't the shape of $W$ in the code the transpose of the $W$ described in the paper?
In our paper, W is (in_features, out_features), but in our implementation W is (out_features, in_features), following the convention of nn.Linear.
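A quick check of the nn.Linear convention referred to here (illustrative sizes, not from the repository):

```python
import torch.nn as nn

layer = nn.Linear(in_features=4, out_features=3)
print(layer.weight.shape)  # torch.Size([3, 4]) -> (out_features, in_features)
```

So the paper's $W$ and the implementation's `weight` are transposes of each other, and a norm over the out_features axis of the code tensor corresponds to a column-wise norm of the paper's $W$.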
Hello,
Thank you for sharing your great research results. I enjoyed reading your paper and have been trying to run your code.
However, while reviewing the code, I encountered an issue regarding the dimension of the magnitude vector.
In the paper, it is mentioned that the size of the magnitude vector is $\mathbb{R}^{1 \times k}$. However, in the actual code, the magnitude is calculated as follows:
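Presumably along these lines (a sketch with a hypothetical stand-in module; the exact line in the repository may differ):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the module whose weight is decomposed.
new_module = nn.Linear(in_features=4, out_features=3)  # weight shape (d, k) = (3, 4)

magnitude = torch.linalg.norm(new_module.weight, dim=1)
print(magnitude.shape)  # torch.Size([3]) -> one value per output feature
```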
This results in `magnitude.shape` being $(d \times 1)$ if `new_module.weight.shape` is $(d \times k)$. Therefore, it seems that there is a discrepancy between the description of the magnitude shape in the paper and the actual shape in the code. Could you please explain this?
Thank you.