In Algorithm 3 of the paper, a Hyena operator of order N has (N+1) projections and N filters.
With order=2, it returns
mlp2(x) * FFTConv(mlp1(x) * FFTConv(mlp0(x), filter0), filter1)
However, in the implementation, e.g.
safari/standalone_hyena.py
Line 244 in 4f5972c
a Hyena operator of order N has (N+1) projections but only (N-1) filters.
In the code, for example, with order=2,
it computes
mlp2(x) * FFTConv(mlp0(x) * mlp1(x), filter0)
i.e., for order=N there are only (N-1) FFTConv applications.
Is this intentional, or am I missing something (the code is quite convoluted)?
Many of the experiments were done with order=2. Does that mean one application of FFTConv per layer is enough?
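
To make the comparison concrete, here is a minimal sketch of the two forms as I read them. It is not the repository's code. The names `causal_conv`, `gate`, `hyena_as_paper` and `hyena_as_code` are hypothetical, a direct causal convolution stands in for FFTConv, and the projections are passed in as precomputed sequences instead of being computed as mlp_i(x).

```python
def causal_conv(u, h):
    """Direct causal convolution y[t] = sum_{s<=t} h[s] * u[t-s].
    Stand-in for FFTConv (assumption: only the causal result matters here)."""
    return [
        sum(h[s] * u[t - s] for s in range(min(t + 1, len(h))))
        for t in range(len(u))
    ]


def gate(a, b):
    """Element-wise product of two equal-length sequences."""
    return [x * y for x, y in zip(a, b)]


def hyena_as_paper(projs, filters):
    """Order N as I read Algorithm 3: N+1 projections, N filters.
    z <- p0; then z <- p_{i+1} * conv(z, f_i) for each filter."""
    assert len(filters) == len(projs) - 1
    z = projs[0]
    for p, f in zip(projs[1:], filters):
        z = gate(p, causal_conv(z, f))
    return z


def hyena_as_code(projs, filters):
    """Order N as I read the implementation: N+1 projections, N-1 filters.
    z <- p0 * p1; then z <- p_{i+2} * conv(z, f_i) for each filter."""
    assert len(filters) == len(projs) - 2
    z = gate(projs[0], projs[1])
    for p, f in zip(projs[2:], filters):
        z = gate(p, causal_conv(z, f))
    return z


# Order 2: three projections; the values are arbitrary illustration data.
projs = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0], [2.0, 1.0, -1.0]]
filt = [0.3, -0.2, 0.1]
delta = [1.0, 0.0, 0.0]  # unit impulse: convolving with it is the identity

out_code = hyena_as_code(projs, [filt])
out_paper = hyena_as_paper(projs, [delta, filt])
```

Under this reading, the code form equals the paper form with the first filter fixed to a unit impulse, so `out_code` and `out_paper` agree. Whether dropping that first filter is intentional is exactly what I am asking.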