Add RngCore::bytes_per_round #396

Closed
wants to merge 4 commits into from
Changes from 3 commits

3 changes: 2 additions & 1 deletion benches/distributions.rs
@@ -1,5 +1,6 @@
 #![feature(test)]
-#![cfg_attr(feature = "i128_support", feature(i128_type, i128))]
+#![cfg_attr(all(feature="i128_support", feature="nightly"), allow(stable_features))] // stable since 2018-03-27
+#![cfg_attr(all(feature="i128_support", feature="nightly"), feature(i128_type, i128))]

 extern crate test;
 extern crate rand;
6 changes: 6 additions & 0 deletions benches/generators.rs
@@ -1,4 +1,6 @@
#![feature(test)]
#![cfg_attr(all(feature="i128_support", feature="nightly"), allow(stable_features))] // stable since 2018-03-27
#![cfg_attr(all(feature="i128_support", feature="nightly"), feature(i128_type, i128))]

extern crate test;
extern crate rand;
@@ -74,6 +76,10 @@ gen_uint!(gen_u64_std, u64, StdRng::new());
gen_uint!(gen_u64_small, u64, SmallRng::new());
gen_uint!(gen_u64_os, u64, OsRng::new().unwrap());

#[cfg(feature = "i128_support")] gen_uint!(gen_u128_xorshift, u128, XorShiftRng::new());
#[cfg(feature = "i128_support")] gen_uint!(gen_u128_hc128, u128, Hc128Rng::new());
#[cfg(feature = "i128_support")] gen_uint!(gen_u128_os, u128, OsRng::new().unwrap());

// Do not test JitterRng like the others by running it RAND_BENCH_N times per
// measurement, because it is way too slow. Only run it once.
#[bench]
32 changes: 31 additions & 1 deletion rand_core/src/lib.rs
@@ -186,6 +186,19 @@ pub trait RngCore {
///
/// [`fill_bytes`]: trait.RngCore.html#method.fill_bytes
fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error>;

/// Number of bytes generated per round of this RNG.
///
/// Some algorithms would benefit from knowing some basic properties about
/// the RNG. In terms of performance an algorithm may want to know whether
/// an RNG is best at generating `u32`s, or could provide `u64`s or more at
/// little to no extra cost.
///
/// For many RNGs a simple definition is: the smallest number of bytes this
/// RNG can generate without throwing away part of the generated value.
///
/// `bytes_per_round` has a default implementation that returns `4` (bytes).
fn bytes_per_round(&self) -> usize { 4 }
Collaborator

Could this be an associated constant?

Contributor Author

I tried something like that in an attempt that was not really thought through: #377 (comment). The problem is that RngCore can then no longer be made into a trait object.

Member

@dhardy, Apr 13, 2018

Weird; doesn't sound like associated constants should prevent a trait from becoming object-safe.

Contributor Author

☹️

error[E0038]: the trait `rand_core::RngCore` cannot be made into an object
    --> src\lib.rs:1189:21
     |
1189 |         let mut r = Box::new(rng) as Box<RngCore>;
     |                     ^^^^^^^^^^^^^ the trait `rand_core::RngCore` cannot be made into an object
     |
     = note: the trait cannot contain associated consts like `BYTES_PER_ROUND`
     = note: required because of the requirements on the impl of `std::ops::CoerceUnsized<std::boxed::Box<rand_core::RngCore>>` for `std::boxed::Box<test::TestRng<StdRng>>`
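
The error above can be reproduced outside of `rand`. What follows is only a minimal sketch: `ConstRng`, `MethodRng`, `Narrow` and `Wide` are hypothetical stand-ins, not types from `rand_core`. It shows that a trait carrying an associated constant cannot be used as a trait object, while the same value exposed through a `&self` method (as this PR does) leaves the trait usable behind `Box<dyn ...>`.

```rust
// Hypothetical stand-ins for `RngCore`; all names here are illustrative.

// Variant 1: an associated constant. The trait itself compiles, but
// `Box<dyn ConstRng>` is rejected with E0038.
#[allow(dead_code)]
trait ConstRng {
    const BYTES_PER_ROUND: usize;
    fn next_u32(&mut self) -> u32;
}

// Variant 2: a `&self` method, which can be dispatched through a vtable.
trait MethodRng {
    fn next_u32(&mut self) -> u32;
    fn bytes_per_round(&self) -> usize { 4 }
}

struct Narrow(u32);
impl MethodRng for Narrow {
    fn next_u32(&mut self) -> u32 {
        self.0 = self.0.wrapping_add(1);
        self.0
    }
}

struct Wide(u64);
impl MethodRng for Wide {
    fn next_u32(&mut self) -> u32 {
        self.0 = self.0.wrapping_add(1);
        self.0 as u32
    }
    fn bytes_per_round(&self) -> usize { 8 }
}

// Different concrete types behind one trait-object type.
fn boxed_rngs() -> Vec<Box<dyn MethodRng>> {
    vec![Box::new(Narrow(0)), Box::new(Wide(0))]
}

fn main() {
    for mut rng in boxed_rngs() {
        let width = rng.bytes_per_round();
        let value = rng.next_u32();
        println!("{} bytes per round, first value {}", width, value);
    }
    // Declaring a `Box<dyn ConstRng>` here would fail to compile (E0038).
}
```

The trade-off is that a method call through a trait object is resolved at run time, whereas a constant would be known at compile time for concrete types.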

Member

Should we really give a default implementation here?

Did you forget to implement for Jitter?

Contributor Author

I was not sure. A disadvantage with a default implementation is that you can easily forget a wrapper.

> Did you forget to implement for Jitter?

JitterRng::next_u32() is about twice as fast as next_u64(), so 4 bytes would be the best fit there.
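
The wrapper pitfall mentioned here can be illustrated with a small sketch. The trait `BytesPerRound` and the types `Wide`, `Forgetful` and `Forwarding` are hypothetical and only mirror the shape of the proposed method; they are not part of `rand`. With a default implementation, a wrapper that forgets to forward the call still compiles and silently reports the default.

```rust
// Hypothetical trait mirroring only the shape of the proposed method.
trait BytesPerRound {
    fn bytes_per_round(&self) -> usize { 4 }
}

// An inner generator that overrides the default.
struct Wide;
impl BytesPerRound for Wide {
    fn bytes_per_round(&self) -> usize { 8 }
}

// A wrapper that forgets to forward: it compiles without any warning
// about the missing method and falls back to the default of 4.
#[allow(dead_code)]
struct Forgetful<R>(R);
impl<R: BytesPerRound> BytesPerRound for Forgetful<R> {}

// A wrapper that forwards to the inner generator.
struct Forwarding<R>(R);
impl<R: BytesPerRound> BytesPerRound for Forwarding<R> {
    fn bytes_per_round(&self) -> usize { self.0.bytes_per_round() }
}

fn main() {
    assert_eq!(Forgetful(Wide).bytes_per_round(), 4); // inner value is lost
    assert_eq!(Forwarding(Wide).bytes_per_round(), 8); // inner value is kept
}
```

Without a default implementation, the `Forgetful` impl would be a compile error, which is the argument for making the method required.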

}

/// A trait for RNGs which do not generate random numbers individually, but in
@@ -384,7 +397,9 @@ pub trait SeedableRng: Sized {
}
}


// Implement `RngCore` for references to an `RngCore`.
// Force inlining all functions, so that it is up to the `RngCore`
// implementation and the optimizer to decide on inlining.
Member

Actually, if R is unsized, these cannot be inlined.

Contributor Author

No, true.

But previously fill_bytes was basically never inlined, because we always use the RNG through this implementation, i.e. through a reference (and for some reason LLVM really does not like our abstractions...). So now we can at least control things when it is not a trait object.

impl<'a, R: RngCore + ?Sized> RngCore for &'a mut R {
#[inline(always)]
fn next_u32(&mut self) -> u32 {
@@ -396,15 +411,24 @@ impl<'a, R: RngCore + ?Sized> RngCore for &'a mut R {
(**self).next_u64()
}

#[inline(always)]
fn fill_bytes(&mut self, dest: &mut [u8]) {
(**self).fill_bytes(dest)
}

#[inline(always)]
fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error> {
(**self).try_fill_bytes(dest)
}

fn bytes_per_round(&self) -> usize {
(**self).bytes_per_round()
}
}

// Implement `RngCore` for boxed references to an `RngCore`.
// Force inlining all functions, so that it is up to the `RngCore`
// implementation and the optimizer to decide on inlining.
#[cfg(feature="alloc")]
impl<R: RngCore + ?Sized> RngCore for Box<R> {
#[inline(always)]
@@ -417,11 +441,17 @@ impl<R: RngCore + ?Sized> RngCore for Box<R> {
(**self).next_u64()
}

#[inline(always)]
fn fill_bytes(&mut self, dest: &mut [u8]) {
(**self).fill_bytes(dest)
}

#[inline(always)]
fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error> {
(**self).try_fill_bytes(dest)
}

fn bytes_per_round(&self) -> usize {
(**self).bytes_per_round()
}
}
99 changes: 42 additions & 57 deletions src/distributions/integer.rs
@@ -13,46 +13,52 @@
 use {Rng};
 use distributions::{Distribution, Standard};

-impl Distribution<isize> for Standard {
-    #[inline]
-    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> isize {
-        rng.gen::<usize>() as isize
-    }
-}
-
-impl Distribution<i8> for Standard {
+impl Distribution<u8> for Standard {
     #[inline]
-    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> i8 {
-        rng.next_u32() as i8
+    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> u8 {
+        rng.next_u32() as u8
     }
 }

-impl Distribution<i16> for Standard {
+impl Distribution<u16> for Standard {
     #[inline]
-    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> i16 {
-        rng.next_u32() as i16
+    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> u16 {
+        rng.next_u32() as u16
     }
 }

-impl Distribution<i32> for Standard {
+impl Distribution<u32> for Standard {
     #[inline]
-    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> i32 {
-        rng.next_u32() as i32
+    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> u32 {
+        rng.next_u32()
     }
 }

-impl Distribution<i64> for Standard {
+impl Distribution<u64> for Standard {
     #[inline]
-    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> i64 {
-        rng.next_u64() as i64
+    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> u64 {
+        rng.next_u64()
     }
 }

 #[cfg(feature = "i128_support")]
-impl Distribution<i128> for Standard {
+impl Distribution<u128> for Standard {
     #[inline]
-    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> i128 {
-        rng.gen::<u128>() as i128
+    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> u128 {
+        if rng.bytes_per_round() < 128 {
+            // Use LE; we explicitly generate one value before the next.
+            let x = rng.next_u64() as u128;
+            let y = rng.next_u64() as u128;
+            (y << 64) | x
+        } else {
+            let mut val = 0u128;
+            unsafe {
+                let ptr = &mut val;
+                let b_ptr = &mut *(ptr as *mut u128 as *mut [u8; 16]);
+                rng.fill_bytes(b_ptr);
+            }
+            val.to_le()
Member

It would be nice to see benchmarks for this on a BE platform

Contributor Author

I sometimes just change things to to_be() and measure on x86_64. But it really only starts to show improvements for things like my SIMD experiment (twice as fast there).
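
To make the endianness point concrete, here is a rough sketch of why the two branches of the `u128` sampling code are meant to agree. It rests on assumptions not stated in the PR: `Counter` is a hypothetical toy generator (not a real RNG), its `fill_bytes` is assumed to emit the little-endian bytes of successive `next_u64` outputs, and the safe `u128::from_le_bytes` is used in place of the pointer cast followed by `to_le()`.

```rust
// Hypothetical toy generator; not a real RNG and not part of `rand`.
struct Counter(u64);

impl Counter {
    fn next_u64(&mut self) -> u64 {
        // Simple LCG step, used only to get varying values.
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0
    }

    // Assumption: the byte stream is the little-endian encoding of
    // successive `next_u64` outputs.
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        for chunk in dest.chunks_mut(8) {
            let bytes = self.next_u64().to_le_bytes();
            let n = chunk.len();
            chunk.copy_from_slice(&bytes[..n]);
        }
    }
}

// First branch: two words, the first one in the low half.
fn u128_from_words(rng: &mut Counter) -> u128 {
    let x = rng.next_u64() as u128;
    let y = rng.next_u64() as u128;
    (y << 64) | x
}

// Second branch: 16 bytes interpreted as little-endian. On a big-endian
// host `from_le_bytes` swaps the bytes, on a little-endian host it is free.
fn u128_from_bytes(rng: &mut Counter) -> u128 {
    let mut buf = [0u8; 16];
    rng.fill_bytes(&mut buf);
    u128::from_le_bytes(buf)
}

fn main() {
    let a = u128_from_words(&mut Counter(1));
    let b = u128_from_bytes(&mut Counter(1));
    assert_eq!(a, b);
    println!("{:#034x}", a);
}
```

One observation, offered tentatively: a `u128` occupies 16 bytes, so a check on `bytes_per_round()` for the `fill_bytes` path would presumably compare against 16. The diff compares against 128, a threshold that, among the values visible in this diff (4 by default, 8 for `StepRng`, 256 for `OsRng`), only `OsRng` reaches.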

+        }
     }
 }

@@ -70,44 +76,23 @@ impl Distribution<usize> for Standard {
     }
 }

-impl Distribution<u8> for Standard {
-    #[inline]
-    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> u8 {
-        rng.next_u32() as u8
+macro_rules! impl_int_from_uint {
+    ($ty:ty, $uty:ty) => {
+        impl Distribution<$ty> for Standard {
+            #[inline]
+            fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> $ty {
+                rng.gen::<$uty>() as $ty
+            }
+        }
     }
 }

-impl Distribution<u16> for Standard {
-    #[inline]
-    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> u16 {
-        rng.next_u32() as u16
-    }
-}
-
-impl Distribution<u32> for Standard {
-    #[inline]
-    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> u32 {
-        rng.next_u32()
-    }
-}
-
-impl Distribution<u64> for Standard {
-    #[inline]
-    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> u64 {
-        rng.next_u64()
-    }
-}
-
-#[cfg(feature = "i128_support")]
-impl Distribution<u128> for Standard {
-    #[inline]
-    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> u128 {
-        // Use LE; we explicitly generate one value before the next.
-        let x = rng.next_u64() as u128;
-        let y = rng.next_u64() as u128;
-        (y << 64) | x
-    }
-}
+impl_int_from_uint! { i8, u8 }
+impl_int_from_uint! { i16, u16 }
+impl_int_from_uint! { i32, u32 }
+impl_int_from_uint! { i64, u64 }
+#[cfg(feature = "i128_support")] impl_int_from_uint! { i128, u128 }
+impl_int_from_uint! { isize, usize }


 #[cfg(test)]
11 changes: 11 additions & 0 deletions src/lib.rs
@@ -849,6 +849,10 @@ impl RngCore for StdRng {
fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error> {
self.0.try_fill_bytes(dest)
}

fn bytes_per_round(&self) -> usize {
self.0.bytes_per_round()
}
}

impl SeedableRng for StdRng {
@@ -936,6 +940,10 @@ impl RngCore for SmallRng {
fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error> {
self.0.try_fill_bytes(dest)
}

fn bytes_per_round(&self) -> usize {
self.0.bytes_per_round()
}
}

impl SeedableRng for SmallRng {
@@ -1017,6 +1025,9 @@ mod test {
fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error> {
self.inner.try_fill_bytes(dest)
}
fn bytes_per_round(&self) -> usize {
self.inner.bytes_per_round()
}
}

pub fn rng(seed: u64) -> TestRng<StdRng> {
2 changes: 2 additions & 0 deletions src/mock.rs
@@ -58,4 +58,6 @@ impl RngCore for StepRng {
fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error> {
Ok(self.fill_bytes(dest))
}

fn bytes_per_round(&self) -> usize { 8 }
}
8 changes: 8 additions & 0 deletions src/os.rs
@@ -131,6 +131,14 @@ impl RngCore for OsRng {
fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error> {
self.0.try_fill_bytes(dest)
}

fn bytes_per_round(&self) -> usize {
// The overhead of doing a syscall is large compared to the time
// it takes to generate the values. Requesting many values at a time is
// often faster than only one at a time.
// 256 is the limit some operating systems have per system call.
256
}
}

#[cfg(all(unix,
4 changes: 4 additions & 0 deletions src/reseeding.rs
@@ -101,6 +101,10 @@ where R: BlockRngCore<Item = u32> + SeedableRng,
fn try_fill_bytes(&mut self, dest: &mut [u8]) -> Result<(), Error> {
self.0.try_fill_bytes(dest)
}

fn bytes_per_round(&self) -> usize {
self.0.bytes_per_round()
}
}

impl<R, Rsdr> CryptoRng for ReseedingRng<R, Rsdr>