Faster normal draws with the ziggurat algorithm #326

Merged: 17 commits merged into master from i308-normal on Nov 8, 2021

Conversation

richfitz (Member) commented Nov 8, 2021:

This PR implements the ziggurat algorithm for normally distributed numbers.

There are some follow-on bits of work that I am avoiding in this PR because they're more likely to be disruptive and this PR is large enough as it is.

The current version does compile on a GPU, but runs fairly slowly there because it uses doubles.

Fixes #308
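
For anyone wanting to sanity-check the speed claim locally, here is a hedged micro-benchmark sketch; the sampler is a standard-library stand-in, not a dust function, and the harness name `time_draws` is made up for illustration. Swap the lambda for the dust Box-Muller and ziggurat draws to compare them.

```cpp
// Hedged benchmark harness (illustrative only): times n draws from whatever
// sampler is passed in. std::normal_distribution stands in for the dust draws.
#include <chrono>
#include <cstdio>
#include <random>

template <typename Draw>
double time_draws(Draw draw, std::size_t n) {
  double sink = 0.0;                              // accumulate so the loop is not optimised away
  const auto t0 = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < n; ++i) {
    sink += draw();
  }
  const auto t1 = std::chrono::steady_clock::now();
  std::printf("sum = %g\n", sink);                // keep 'sink' observable
  return std::chrono::duration<double>(t1 - t0).count();
}

int main() {
  std::mt19937_64 rng(42);
  std::normal_distribution<double> dist(0.0, 1.0);
  const double elapsed = time_draws([&] { return dist(rng); }, 10000000);
  std::printf("elapsed: %.3f s for 1e7 draws\n", elapsed);
  return 0;
}
```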

codecov bot commented Nov 8, 2021:

Codecov Report

Merging #326 (57a17bf) into master (29b427b) will not change coverage.
The diff coverage is 100.00%.


@@            Coverage Diff            @@
##            master      #326   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           57        59    +2     
  Lines         3258      3331   +73     
=========================================
+ Hits          3258      3331   +73     
| Impacted Files | Coverage Δ |
| --- | --- |
| inst/include/dust/random/binomial.hpp | 100.00% <ø> (ø) |
| R/rng.R | 100.00% <100.00%> (ø) |
| inst/include/dust/random/normal.hpp | 100.00% <100.00%> (ø) |
| inst/include/dust/random/normal_box_muller.hpp | 100.00% <100.00%> (ø) |
| inst/include/dust/random/normal_ziggurat.hpp | 100.00% <100.00%> (ø) |
| src/dust_rng.cpp | 100.00% <100.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

richfitz marked this pull request as ready for review November 8, 2021 11:08
richfitz requested a review from johnlees November 8, 2021 11:20
return std::sqrt(-2 * std::log(u1)) * std::cos(two_pi * u2);
}

real_type random_normal(rng_state_type& rng_state) {
Member:
Is the algorithm chosen at run time or when compiling the object? Wondering whether we could just template rather than using this function

Member Author:
that is a template - am I missing something?

Member Author:
After discussion: it feels like one could overload this as we know the algorithm is known at compile time but C++ doesn't let us do that
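
For illustration only (the enum, names and signatures below are hypothetical stand-ins, not the dust API): one pattern for fixing the algorithm at compile time while keeping a single entry point is a non-type template parameter, so the untaken branch is known dead at compile time; with C++17 the branch could become `if constexpr`.

```cpp
// Hedged sketch of compile-time algorithm selection; not dust's code.
#include <cmath>
#include <cstdio>
#include <random>

enum class normal_algorithm { box_muller, ziggurat };

// Stand-in samplers so the sketch is self-contained; in dust these would be
// the implementations in normal_box_muller.hpp and normal_ziggurat.hpp.
template <typename real_type, typename rng_state_type>
real_type normal_box_muller(rng_state_type& rng_state) {
  std::uniform_real_distribution<real_type> unif(0, 1);
  const real_type u1 = 1 - unif(rng_state);       // in (0, 1], avoids log(0)
  const real_type u2 = unif(rng_state);
  const real_type two_pi = static_cast<real_type>(2 * 3.141592653589793);
  return std::sqrt(-2 * std::log(u1)) * std::cos(two_pi * u2);
}

template <typename real_type, typename rng_state_type>
real_type normal_ziggurat(rng_state_type& rng_state) {
  // Placeholder body only; the real ziggurat draw lives in normal_ziggurat.hpp.
  return normal_box_muller<real_type>(rng_state);
}

template <typename real_type, normal_algorithm algorithm, typename rng_state_type>
real_type random_normal(rng_state_type& rng_state) {
  // 'algorithm' is a template parameter, so this choice is fixed at compile time.
  if (algorithm == normal_algorithm::ziggurat) {
    return normal_ziggurat<real_type>(rng_state);
  } else {
    return normal_box_muller<real_type>(rng_state);
  }
}

int main() {
  std::mt19937_64 rng(42);
  std::printf("%f\n", random_normal<double, normal_algorithm::ziggurat>(rng));
  return 0;
}
```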

Comment on lines +33 to +35
// TODO: this will not work efficiently for float types because we
// don't have float tables for 'x' and 'y'; getting them is not easy
// without requiring c++14 either. The lower loop using 'x' could
Member:
How come we are able to manage this with binomial draws?

Member Author:
The binomial algorithm doesn't need to use the random numbers alongside the tables; the tables are only used in stirling_approx_tail, so we have fully specialised templates on real_type (and some ugly names like k_tail_values_d and stirling_approx_tail_f). With C++14 we could make these variable templates, which removes the naming problem, but because the algorithm here needs one or two random numbers alongside the constants we would end up trying to write a partially specialised template (providing real_type but leaving rng_state_type open).

There are some solutions, but I'd like to implement these separately and compare timings, to make sure we don't end up paying too much in the CPU case.
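
A hedged sketch of the C++14 direction mentioned above (the table name and the two values are illustrative, taken from the first two Stirling tail corrections, not the dust tables): a variable template specialised on real_type gives a single name for the double and float tables, removing the _d/_f suffixes, although the partial-specialisation problem for the sampler itself remains.

```cpp
// Hedged sketch of per-precision constant tables via a C++14 variable template.
#include <cstdio>

template <typename real_type>
constexpr real_type k_tail_values[2] = {};       // primary template; never used directly

template <>
constexpr double k_tail_values<double>[2] = {
  0.08106146679532726, 0.04134069595540929       // illustrative double entries
};

template <>
constexpr float k_tail_values<float>[2] = {
  0.08106147f, 0.04134070f                       // float copies of the same values
};

// Code templated on real_type can now say k_tail_values<real_type> rather
// than picking a differently-named table per precision.
template <typename real_type>
real_type first_tail_value() {
  return k_tail_values<real_type>[0];
}

int main() {
  std::printf("%f %f\n", first_tail_value<double>(),
              static_cast<double>(first_tail_value<float>()));
  return 0;
}
```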

real_type random_normal_ziggurat(rng_state_type& rng_state) {
using ziggurat::x;
using ziggurat::y;
constexpr size_t n = 256;
Member:
Comment to explain this choice and whether it is tuneable?

Member Author:
My first shot at this was fully tunable, using std::array<real_type, n> for all sorts of useful n! But it was a pain to work with, and we can't have std::array on the GPU...

Member Author:
I've now added a note indicating where this comes from and why.
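
As a hedged aside on why 256 layers is a common choice (this may or may not be the reason the added note gives, and it is not dust's layer-selection code): with n = 256 the layer index can come straight from 8 bits of a single random draw, leaving the remaining bits for the within-layer uniform.

```cpp
// Hedged sketch: one 64-bit draw provides both an 8-bit layer index and a
// 53-bit uniform when there are 256 layers. Not dust's implementation.
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <random>

int main() {
  std::mt19937_64 rng(1);
  const std::uint64_t bits = rng();
  const std::size_t layer = bits & 0xff;                               // low 8 bits: layer in [0, 255]
  const double u = std::ldexp(static_cast<double>(bits >> 11), -53);  // top 53 bits: uniform in [0, 1)
  std::printf("layer = %zu, u = %f\n", layer, u);
  return 0;
}
```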

const auto f1 = std::exp(-0.5 * (x[i + 1] * x[i + 1] - z * z));
const auto u1 = random_real<real_type>(rng_state);
if (f1 + u1 * (f0 - f1) < 1.0) {
ret = z;
Member:
should this break?

Member Author:
ugh, yes

Comment on lines +13 to +14
s <- vapply(z, deparse, "", control = "digits17")
paste(
Member:
Do we know that this is enough precision? (is it based on being < 2.2e-16?)

Member Author:
digits17 is the full precision of the underlying number
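
For anyone wondering where 17 comes from: it is the round-trip precision of an IEEE double, which a quick C++ check (illustrative, unrelated to the R helper itself) confirms.

```cpp
// 17 significant decimal digits are enough to round-trip any IEEE double.
#include <cstdio>
#include <limits>

int main() {
  std::printf("max_digits10 = %d\n", std::numeric_limits<double>::max_digits10);  // 17
  std::printf("%.17g\n", 0.1);   // 0.10000000000000001: the stored value to 17 digits
  return 0;
}
```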



## Helper for root polishing
uniroot2 <- function(f, bounds, ..., scal = 10) {
Member:
Does tolerance need scaling?

Member Author:
Tolerance comes through in the dots here

}


zig_constants <- function(n, tolerance = 1e-10) {
Member:
Enough precision here too?

Member Author:
I think so - this is two orders of magnitude tighter than we get by default. To do this "properly" we'd really want long doubles. I've seen implementations where these numbers are only good to 5 digits though, and ours is more accurate than Doornik's (it only differs in the ~6th place, I think).


zig_constants <- function(n, tolerance = 1e-10) {
## As for intervals but with more robustness to being out of bounds
intervals <- function(n, r, v) {
Member:
I would consider adding a reference here

Member Author:
I've added one, but a fuller derivation is forthcoming (I've got most of a detailed vignette working through the process).
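
Until that vignette lands, a hedged summary of what the constants solve, in the standard equal-area ziggurat construction with the unnormalised density f(x) = exp(-x^2 / 2) (the indexing here is illustrative and may not match the code):

$$
v = r\,e^{-r^2/2} + \int_r^{\infty} e^{-x^2/2}\,dx,
\qquad
e^{-x_{i+1}^2/2} = e^{-x_i^2/2} + \frac{v}{x_i}, \quad x_0 = r,
$$

with r (and hence v) found by root-finding, which is what uniroot2 and intervals are doing above, so that the topmost layer closes exactly at f(0) = 1 to within the stated tolerance.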

richfitz requested a review from johnlees November 8, 2021 14:05
johnlees merged commit 906d994 into master on Nov 8, 2021
johnlees deleted the i308-normal branch November 8, 2021 14:33
Successfully merging this pull request may close: Add alternative normal distribution algorithm (#308).

2 participants: richfitz, johnlees