Convert sum benchmark to use var<mat> #2

Open
wants to merge 2 commits into master

Conversation

bbbales2

Converted one! As we get the rest of the var<mat> stuff in place we can convert the rest.

FastAD sum:

-----------------------------------------------------------------------------------
Benchmark                         Time             CPU   Iterations UserCounters...
-----------------------------------------------------------------------------------
BM_fastad<SumFunc>/1           6.81 ns         6.81 ns    102241091 N=1
BM_fastad<SumFunc>/2           7.57 ns         7.57 ns     92824386 N=2
BM_fastad<SumFunc>/4           6.88 ns         6.88 ns    101760162 N=4
BM_fastad<SumFunc>/8           7.59 ns         7.59 ns     92351222 N=8
BM_fastad<SumFunc>/16          9.59 ns         9.59 ns     73186002 N=16
BM_fastad<SumFunc>/32          14.0 ns         14.0 ns     49094946 N=32
BM_fastad<SumFunc>/64          27.4 ns         27.3 ns     26011579 N=64
BM_fastad<SumFunc>/128         53.2 ns         53.2 ns     13048065 N=128
BM_fastad<SumFunc>/256         95.4 ns         95.4 ns      7325127 N=256
BM_fastad<SumFunc>/512          179 ns          179 ns      3912272 N=512
BM_fastad<SumFunc>/1024         348 ns          348 ns      2019448 N=1024
BM_fastad<SumFunc>/2048         687 ns          687 ns      1023662 N=2.048k
BM_fastad<SumFunc>/4096        1427 ns         1426 ns       490766 N=4.096k
BM_fastad<SumFunc>/8192        2808 ns         2807 ns       245426 N=8.192k
BM_fastad<SumFunc>/16384       5619 ns         5619 ns       124105 N=16.384k

Stan sum:

-----------------------------------------------------------------------------------------
Benchmark                               Time             CPU   Iterations UserCounters...
-----------------------------------------------------------------------------------------
BM_stan<SumFunc, varmat>/1           26.8 ns         26.8 ns     26350903 N=1
BM_stan<SumFunc, varmat>/2           31.8 ns         31.8 ns     22029484 N=2
BM_stan<SumFunc, varmat>/4           35.8 ns         35.8 ns     19562051 N=4
BM_stan<SumFunc, varmat>/8           41.0 ns         41.0 ns     16943482 N=8
BM_stan<SumFunc, varmat>/16          45.6 ns         45.6 ns     15352545 N=16
BM_stan<SumFunc, varmat>/32          54.2 ns         54.2 ns     12865175 N=32
BM_stan<SumFunc, varmat>/64          66.4 ns         66.4 ns     10597147 N=64
BM_stan<SumFunc, varmat>/128         95.3 ns         95.3 ns      7351159 N=128
BM_stan<SumFunc, varmat>/256          168 ns          168 ns      4160385 N=256
BM_stan<SumFunc, varmat>/512          281 ns          281 ns      2478569 N=512
BM_stan<SumFunc, varmat>/1024         514 ns          514 ns      1358337 N=1024
BM_stan<SumFunc, varmat>/2048        1127 ns         1127 ns       616329 N=2.048k
BM_stan<SumFunc, varmat>/4096        2041 ns         2040 ns       342880 N=4.096k
BM_stan<SumFunc, varmat>/8192        3947 ns         3946 ns       178015 N=8.192k
BM_stan<SumFunc, varmat>/16384       8133 ns         8132 ns        86724 N=16.384k
BM_stan<SumFunc, matvar>/1           41.8 ns         41.8 ns     16774336 N=1
BM_stan<SumFunc, matvar>/2           43.9 ns         43.9 ns     15897233 N=2
BM_stan<SumFunc, matvar>/4           50.3 ns         50.3 ns     14074020 N=4
BM_stan<SumFunc, matvar>/8           63.8 ns         63.8 ns     10681440 N=8
BM_stan<SumFunc, matvar>/16          90.4 ns         90.4 ns      7792413 N=16
BM_stan<SumFunc, matvar>/32           144 ns          144 ns      5011206 N=32
BM_stan<SumFunc, matvar>/64           341 ns          341 ns      2106592 N=64
BM_stan<SumFunc, matvar>/128          627 ns          627 ns      1089710 N=128
BM_stan<SumFunc, matvar>/256         1252 ns         1252 ns       559918 N=256
BM_stan<SumFunc, matvar>/512         2466 ns         2466 ns       284112 N=512
BM_stan<SumFunc, matvar>/1024        4959 ns         4959 ns       140986 N=1024
BM_stan<SumFunc, matvar>/2048        9966 ns         9965 ns        70066 N=2.048k
BM_stan<SumFunc, matvar>/4096       19909 ns        19907 ns        35190 N=4.096k
BM_stan<SumFunc, matvar>/8192       52597 ns        52590 ns        13336 N=8.192k
BM_stan<SumFunc, matvar>/16384     129659 ns       129640 ns         5416 N=16.384k
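
For context on what the conversion means, here is a minimal sketch of the two representations being benchmarked. It assumes a Stan math develop branch where var_value<Eigen::VectorXd> and a matching sum overload are available (which is what this PR exercises); the names here are mine, not code from this repo.

#include <stan/math.hpp>
#include <Eigen/Dense>
#include <iostream>

int main() {
  using stan::math::var;
  using stan::math::var_value;

  Eigen::VectorXd vals = Eigen::VectorXd::Random(16);

  // matvar: an Eigen vector of autodiff scalars, one vari per element
  Eigen::Matrix<var, Eigen::Dynamic, 1> x_matvar = vals.cast<var>();
  var lp1 = stan::math::sum(x_matvar);
  lp1.grad();
  std::cout << "matvar adjoint(0): " << x_matvar(0).adj() << std::endl;
  stan::math::recover_memory();

  // varmat: one autodiff object wrapping the whole Eigen vector, so
  // values and adjoints are stored contiguously
  var_value<Eigen::VectorXd> x_varmat(vals);
  var lp2 = stan::math::sum(x_varmat);
  lp2.grad();
  std::cout << "varmat adjoint(0): " << x_varmat.adj()(0) << std::endl;
  stan::math::recover_memory();

  return 0;
}

The single contiguous value/adjoint storage is presumably where the varmat advantage over matvar in the timings above comes from, especially at large N.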

bbbales2 commented Oct 23, 2020

Actually, maybe I made mat<var> slower with this lol. I think it was only around 80-90 µs before (Edit: at N = 16384).

@bbbales2 (Author)

@JamesYang007 I was converting more of these. In the StochasticVolatility example as-is, I'm getting outputs like:

BM_stan<StochasticVolatilityFunc, matvar>/32          2013 ns         2013 ns       352799 N=35
WARNING (stan-stochastic_volatility) MAX ABS ERROR PROP: 2.50972e-15
WARNING (stan-stochastic_volatility) MAX ABS ERROR PROP: 2.69971e-15
WARNING (stan-stochastic_volatility) MAX ABS ERROR PROP: 0.815661
WARNING (stan-stochastic_volatility) MAX ABS ERROR PROP: 3.19019e-15
WARNING (stan-stochastic_volatility) MAX ABS ERROR PROP: 2.10088e-15
WARNING (stan-stochastic_volatility) MAX ABS ERROR PROP: 1.87386e-15
WARNING (stan-stochastic_volatility) MAX ABS ERROR PROP: 2.452e-15

The 0.815 makes me think something is broken, so I'll look into that. But the way this is written, the h variable is effectively part of the input. Why is it this:

auto operator()(Eigen::Matrix<stan::math::var, Eigen::Dynamic, 1>& x) const
    {
        using namespace stan::math;
        using vec_t = Eigen::Matrix<var, Eigen::Dynamic, 1>;
        size_t N = (x.size() - 3) / 2;
        Eigen::Map<vec_t> h_std(x.data(), N);
        Eigen::Map<vec_t> h(x.data() + N, N);
        auto& phi = x(2*N);
        auto& sigma = x(2*N + 1);
        auto& mu = x(2*N + 2);
        // h is a Map into x, so this writes the transformed parameter
        // back into the input vector in place
        h = h_std * sigma;
        ...;
     }

Not something like:

auto operator()(Eigen::Matrix<stan::math::var, Eigen::Dynamic, 1>& x) const
    {
        using namespace stan::math;
        using vec_t = Eigen::Matrix<var, Eigen::Dynamic, 1>;
        size_t N = x.size() - 3;  // x now only holds h_std and the three scalars
        Eigen::Map<vec_t> h_std(x.data(), N);
        auto& phi = x(N);
        auto& sigma = x(N + 1);
        auto& mu = x(N + 2);
        vec_t h = h_std * sigma;  // h gets its own storage instead of viewing into x
        ...;
     }

I see the default implementation is like this too. I wanna change it :D.

@JamesYang007 (Owner)

Ah, I didn't want to allocate more than I needed to. The parameter x for operator() is supposed to represent the entire parameter vector, and h is a (transformed) parameter. Some libraries (like Stan) allow this kind of "viewer" logic, which generally saves time, so I wanted to give them that advantage where they support it.
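
To make the trade-off concrete, here is a minimal double-only sketch of the two styles (layout and names are illustrative, mirroring the snippets above; the real functors use var):

#include <Eigen/Dense>
#include <iostream>

int main() {
  // Illustrative layout: x = [h_std (N), h (N), phi, sigma, mu]
  const int N = 4;
  Eigen::VectorXd x = Eigen::VectorXd::Random(2 * N + 3);
  double sigma = x(2 * N + 1);

  // "Viewer" style: h is a Map into x's storage, so the transformed
  // parameter is written in place with no extra allocation
  Eigen::Map<Eigen::VectorXd> h_std(x.data(), N);
  Eigen::Map<Eigen::VectorXd> h(x.data() + N, N);
  h = h_std * sigma;

  // Allocating style: h gets its own storage on every call
  Eigen::VectorXd h_alloc = h_std * sigma;

  std::cout << (h - h_alloc).norm() << std::endl;  // 0: same values either way

  return 0;
}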
