-
-
Notifications
You must be signed in to change notification settings - Fork 436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve shuffle function documentation #903
Comments
I'm not sure how relevant it is to make all permutations possible. In practice, it would be very hard to show that some permutations are not possible, it would basically require iterating all possible states of the RNG, which is not really feasible for 128 bits of state or more. |
I was thinking more of RNGs with few enough bits to make the states possible to enumerate. |
The default RNG within Rand has 1088 bits of state. Even (When massive parallelization over many CPUs is involved, a state size of at least 256 bits has a more comfortable margin than 64 bits.) |
This is the real concern: if your RNG has a short enough cycle that you can loop through all values, then it is possible to observe non-random behaviour (i.e. correlations between one part of a sequence and another). Note that So how long a cycle is enough? Attempting to generate all 52! permutations of your stack of cards will take a modern computer a long time; too long to be feasible. Modern CPUs run at a few GHz, say approx 3e7 × 3e9 ~= 1e17 clock cycles per year. 52! ~= 8e67, so it would take over 10^50 years at one permutation per clock cycle. On the other hand, an RNG with 64 bits of state has approx 2e19 states; still too long to exhaust, but short enough that if you pick many seed for many short pieces of output, there's a chance of overlap more on that here. Which brings us to...
Not actually true if you check the rand_pcg code; both implementations use 128 bits of state. Check here for our requirements for small RNGs. As for that wikipedia article, it makes the assumption that the RNG state must be large enough to describe all permutations to avoid bias. This is simply wrong; generating one permutation requires enough unbiased random data to describe one permutation (i.e. less than n^2 numbers), not all of them. |
I was looking at the Rand book, where it is claimed that |
That is accurate. Nevertheless, the state is 128 bits (more accurately, 127 bits since 1 is discarded). This influences the sequence, reducing the chance of two randomly seeded instances producing the same sequence. |
In order for all shufflings to be possible "the number of states of the PRNG should exceed the number of permutations by at least several orders of magnitude" (Wikipedia source). I.e. for a slice consisting of n elements the PRNG state bits should ideally exceed log2(n!). Perhaps it could be helpful to people to mention this in the documentation of the shuffle functions.
The text was updated successfully, but these errors were encountered: