Figure 7 in your paper plots the weight of each residual branch for post-LN and pre-LN. How exactly can I obtain these weights? I noticed a parameter called plot_variance, which appears to output sqrt(Var[a_j]). Given that, how can I get beta_{i, j}?
Thanks in advance!