Expose `width` on vec types and provide sum() ability. #5

ralfbiedert · 2017-12-02T23:34:40Z

I have code where I want to add / multiply two vectors and compute the sum over all results. It looks somewhat like this:

let mut simd_sum = f32s::splat(0.0f32); 

for (x, y) in zip(xvec.simd_iter(), yvec.simd_iter()) { 
    simd_sum = simd_sum + (x - y) * (x - y);
}

// TODO ... 
let sum = sum_f32s(simd_sum, simd_width);

I haven't found a way to either a) sum the lanes of simd_sum (e.g., sum = simd_sum.0 + simd_sum.1 + ... + simd_sum.n) with an existing function, or b) implement my own sum function generically.

The problem I'm having with implementing sum_f32s myself is that I haven't seen an easy way to get the current width. The hack I'm using looks like this:

let _temp = [0.0f32; 32];
let simd_width = (&_temp[..]).simd_iter().width();

It would be nice if both a) and b) were implemented or, if they already are, if their usage were documented.
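For illustration, the two requests (a known width and a horizontal sum) can be sketched in plain Rust without any SIMD crate. This is a minimal sketch, not faster's API: `Lanes` is a hypothetical stand-in for a vector type such as `f32s`, and its lane count is assumed to be a compile-time constant.

```rust
/// Hypothetical stand-in for a SIMD vector holding `W` lanes of f32.
/// This is NOT faster's `f32s`; the lane count is a const generic here.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Lanes<const W: usize>(pub [f32; W]);

impl<const W: usize> Lanes<W> {
    /// Fill every lane with the same value.
    pub fn splat(v: f32) -> Self {
        Lanes([v; W])
    }

    /// Number of lanes, known at compile time in this sketch.
    pub const fn width(&self) -> usize {
        W
    }

    /// Read a single lane (panics if `i >= W`).
    pub fn extract(&self, i: usize) -> f32 {
        self.0[i]
    }

    /// Horizontal sum: add up all lanes using only `width` and `extract`.
    pub fn sum(&self) -> f32 {
        (0..self.width()).map(|i| self.extract(i)).sum()
    }
}
```

With `width` available, `sum` needs nothing beyond per-lane extraction, which is why exposing the width is enough to write such helpers outside the library.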


AdamNiederer · 2017-12-03T00:06:10Z

Hello,

Thank you for using faster, and apologies for the current lack of documentation and API instability - we're still a ways away from a stable release :)

Faster's vector types are just typedefs over stdsimd's vector types. It looks like stdsimd's docs are in a bit of upheaval right now, but here's the impl applied to all vector types in stdsimd. Unfortunately, one can't get the width of a vector with just that API (although you can get the scalar values with .extract(index)).

I'm not sure if publicly exposing the width of a vector is a smart idea, as it could encourage people to write fragile or nonportable code. I'm still open to feedback on whether that should be part of the public API.

A more elegant solution (imo) to your problem should be arriving in 0.3.0 (late December probably), which would allow you to reduce over both a vector and a collection using SIMD, so your code would look like

xvec.zip(yvec).simd_iter() // returns `PackedZip<f32>`
    .simd_reduce(|acc, (a, b)| acc + (a - b) * (a - b)) // returns `f32s`
    .scalar_reduce(|a, b| a + b) // returns `f32`

with the first closure doing roughly what your for loop does, and the second closure doing roughly what your sum_f32s does. Does that look like it would cover your use cases?
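The two-stage shape described above (a lane-wise reduction followed by a scalar reduction) can be sketched in plain Rust. This is only an approximation under stated assumptions: it does not use faster or stdsimd, the lane count `W` is fixed at 4 purely for illustration, and `squared_distance` is a hypothetical name.

```rust
/// Lane count used for illustration; a real SIMD width depends on the target.
const W: usize = 4;

/// Sum of squared differences, computed in two stages:
/// 1. fold both slices into a W-lane accumulator (the "SIMD" reduction),
/// 2. collapse the lanes into one scalar (the scalar reduction).
pub fn squared_distance(xs: &[f32], ys: &[f32]) -> f32 {
    assert_eq!(xs.len(), ys.len(), "inputs must have equal length");

    let mut acc = [0.0f32; W];
    let x_chunks = xs.chunks_exact(W);
    let y_chunks = ys.chunks_exact(W);
    // Elements left over when the length is not a multiple of W.
    let x_tail = x_chunks.remainder();
    let y_tail = y_chunks.remainder();

    // Stage 1: each lane accumulates its own partial sum.
    for (x, y) in x_chunks.zip(y_chunks) {
        for lane in 0..W {
            let d = x[lane] - y[lane];
            acc[lane] += d * d;
        }
    }

    // Stage 2: add the lanes together, then handle the tail in scalar code.
    let lanes_total: f32 = acc.iter().sum();
    let tail_total: f32 = x_tail
        .iter()
        .zip(y_tail)
        .map(|(a, b)| (a - b) * (a - b))
        .sum();
    lanes_total + tail_total
}
```

Because the per-lane partial sums are added in a different order than a straight scalar loop would use, floating-point results can differ slightly from a sequential sum for general inputs.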

ralfbiedert · 2017-12-03T10:38:39Z

Hi, thank you very much for your swift response!

> I'm not sure if publicly exposing the width of a vector is a smart idea, as it could encourage people to write fragile or nonportable code.

I'm probably too new to Rust to give a good answer.

However, my feeling is that people working with SIMD (and downloading faster) have at least some understanding of what is happening behind the scenes.

Then it's a balance between enabling unforeseen use cases, and preventing happy little accidents.

For example, I used the hack described above ((&_temp[..]).simd_iter().width()) to implement my own sum, as I didn't know how wide my f32s would be. If that were removed I would have felt even more frustrated in the sense 'something important is missing here and I can't find it'.

Given Rust is a systems programming language, I think I would want to have access to low-level information, in the "safest way possible given the circumstances" (e.g., if you later do run time feature detection #2, the API should accurately / dynamically reflect the current type mappings and vector width).

> A more elegant solution (imo) to your problem should be arriving in 0.3.0 [...] Does that look like it would cover your use cases?

Yes, looks very nice!

> the current lack of documentation and API instability

Regarding documentation, when I started looking at faster I was mostly interested in examples, exactly like the one you just provided. I think if you had a dozen or so diverse ones in the README it would greatly help people to pick up faster even faster (sorry :).

AdamNiederer · 2017-12-19T21:00:25Z

width() and sum() on vectors have landed, although I haven't implemented a scalar_reduce-like function yet. I should have this done by 0.4.0.

Because reductive operations are inherently unportable anyway, I see no harm in exposing some nonportable APIs with sufficient warnings in the documentation.

ralfbiedert · 2017-12-20T10:34:15Z

Thanks! Does something like xvec.zip(yvec).simd_iter() exist already? I saw you mentioned PackedZip<f32>, but I didn't find it in the source. What would be the preferred way of doing this with 0.3?

AdamNiederer · 2017-12-20T20:10:09Z

That's also coming in 0.4.0 - I'm sitting on a mostly-working implementation right now, but I ran into the "you can't impl a trait in std on a type in std" problem at the last second. That should land in master alongside or immediately before gathers/scatters.

AdamNiederer · 2018-01-24T02:27:20Z

Resolved in 0.4.0

AdamNiederer added this to the 0.4.0 milestone Dec 19, 2017

AdamNiederer closed this as completed Jan 24, 2018
