-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multinomial sampling is very slow #183
Comments
Did you compile the Rust code with optimizations enabled?
|
Yep. It appears the compiled C that numpy uses employs an algorithm known at BTPE which is significantly faster binomial sampler when p*n > 30. Which is my use case. Is there any interest in an implementation of this algorithm? |
I think this could be useful, but I don't have the background for it. I did notice that More broadly, perhaps we should expose the |
I have a working but not polished implementation for Multinomial for |
@benjamin-lieser as we already have a dependency for I looked at your code - I could not find free access to the paper you mention about the sampling, I attempted reading yours then making my own attempt, but I'm not thinking it's correct, I pushed that attempt to my fork. Seems like it's updating a prior that Let me know if you have a strong way of verifying it. I'm not confident my implementation will have exactly |
The problem is that the binomial implementation in statrs is very slow for large Ideally rand_distr and statrs would build on top of a common numerical implementation of these algorithms. The BTPE part in rand_distr also has still some issues with very small probabilities and large |
It's been quite a while since I looked into this implementation, but I will do my best to offer what I can. I peaked under the numpy hood at the time and found their code for this which ran significantly faster as I noted above:
I just adapted this to my use case in rust, chaining binomial calls:
The above worked for my use case and was much, much, faster than the original rust code:
The get_binomial function in my Rust code is just using the rand_distr::Binomial distribution as it was at that time. So I didn't have to re-write that in any way.
Perhaps there's a whole re-write that includes the Binomial aspect, but for my use case this was totally sufficient. It may be as easy as replacing the multinomial function with approximately what I've included here. I admit I have not sufficiently checked every edge case in my translation from C to Rust. This is also nice because it means that you only really have to optimize Binomial in the future, as Multinomial performance is directly tied to Binomial performance. |
@benjamin-lieser perhaps we can work with rust-random on that. Do you know much about the details of the gsl implementation compared to |
@jamaltas I did something similar, I've got it working now. The benchmark I wrote up has a minor improvement (~10%) sampling binomial (randomly assigned probabilities) instead of against inverting categorical CDF for single samples. Redid the bench with |
I have written some of these references into the source :D I am not that deep into the BTPE part, but I have a good understanding of the BINV part. Apart from rust-random/rand#1484 the implementations of GSL, rand_distr and numpy are very close. |
I don't know if there's a convergent library on where that kind of implementation would go, so I think it makes sense for us to borrow *since we derived from .NET and distribution functions were important, we don't distinguish distributions on open vs closed intervals so perhaps Uniform may change. |
I specifically avoided looking at the GSL implementation when working on the |
Would it make sense to have an optional feature to wrap some of Aside: Unsure about licensing here. Moving to Apache didn't occur from lack of response / contact info for previous contributors. |
Looking into a program I wrote I found the limiting factor was the multinomial sampling:
let mut rng = SmallRng::from_entropy();
let result = Multinomial::new(&weights, 100000).unwrap();
let counts = result.sample(&mut rng);
Specifically the sampling portion of the above code, the SmallRng call is small in comparison.
I rewrote this particular portion of my code in python using numpy.random.multinomial and found an ~400x speed increase. It appears the C code numpy calls on uses an implementation that chains many binomial calls together, whereas the statsrs implementation uses a cdf.
Wonder if there's any plans to change this?
The text was updated successfully, but these errors were encountered: