From 555608029d095f323d4afea60292625a33474133 Mon Sep 17 00:00:00 2001 From: gnzlbg Date: Fri, 12 Oct 2018 10:43:35 +0200 Subject: [PATCH 1/8] initial simd_ffi RFC --- text/0000-simd-ffi.md | 228 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 228 insertions(+) create mode 100644 text/0000-simd-ffi.md diff --git a/text/0000-simd-ffi.md b/text/0000-simd-ffi.md new file mode 100644 index 00000000000..b55d4f50d73 --- /dev/null +++ b/text/0000-simd-ffi.md @@ -0,0 +1,228 @@ +- Feature Name: `simd_ffi` +- Start Date: 2018-10-12 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +This RFC allows using SIMD types in C FFI. + +# Motivation +[motivation]: #motivation + +The architecture-specific SIMD types provided in [`core::arch`] cannot currently +be used in C FFI. That is, Rust programs cannot interface with C libraries that +use these in their APIs. + +One notable example would be calling into vectorized [`libm`] implementations +like [`sleef`], [`libmvec`], or Intel's [`SVML`]. The [`packed_simd`] crate +relies on C FFI with these fundamental libraries to offer competitive +performance. + +[`core::arch`]: https://doc.rust-lang.org/stable/core/arch/index.html +[`libm`]: https://sourceware.org/glibc/wiki/libm +[`sleef`]: https://sleef.org/ +[`libmvec`]: https://sourceware.org/glibc/wiki/libm +[`SVML`]: https://software.intel.com/en-us/node/524289 +[`packed_simd`]: https://github.com/rust-lang-nursery/packed_simd + +## Why is using SIMD vectors in C FFI currently disallowed? + +Consider the following example +([playground](https://play.rust-lang.org/?gist=b8cfb63bb4e7fb00bb293f6e27061c52&version=nightly&mode=debug&edition=2015)): + +```rust +extern "C" fn foo(x: __m256); + +fn main() { + unsafe { + union U { v: __m256, a: [u64; 4] } + foo(U { a: [0; 4] }.v); + } +} +``` + +In this example, a 256-bit wide vector type, `__m256`, is passed to an `extern +"C"` function via C FFI. Is the behavior of passing `__m256` to the C function +defined? + +That depends on both the platform and how the Rust program was compiled! + +First, let's make the platform concrete and assume that it follows the [x64 SysV +ABI][sysv_abi] which states: + +> **3.2.1 Registers and the Stack Frame** +> +> Intel AVX (Advanced Vector Extensions) provides 16 256-bit wide AVX registers +> (`%ymm0` - `%ymm15`). The lower 128-bits of `%ymm0` - `%ymm15` are aliased to +> the respective 128b-bit SSE registers (`%xmm0` - `%xmm15`). For purposes of +> parameter passing and function return, `%xmmN` and `%ymmN` refer to the same +> register. Only one of them can be used at the same time. +> +> **3.2.3 Parameter Passing** +> +> **SSE** The class consists of types that fit into a vector register. +> +> **SSEUP** The class consists of types that fit into a vector register and can +> be passed and returned in the upper bytes of it. + +[sysv_abi]: https://www.uclibc.org/docs/psABI-x86_64.pdf + +Second, in `C`, the `__m256` type is only available if the current translation +unit is being compiled with `AVX` enabled. + +Back to the example: `__m256` is a 256-bit wide vector type, that is, wider than +128-bit, but it can be passed through a vector register using the lower and +upper 128-bits of a 256-bit wide register, and in C, if `__m256` can be used, +these registers are always available. + +That is, the C ABI requires two things: + +* that Rust passes `__m256` via a 256-bit wide register +* that `foo` has the `#[target_feature(enable = "avx")]` attribute ! + +And this is where things went wrong: in Rust, `__m256` is always available +independently of whether `AVX` is available or not[1](#layout_unspecified), +but we haven't specified how we are actually compiling our Rust program above: + +* if we compile it with `AVX` globally enabled, e.g., via `-C + target-feature=+avx`, then the behavior of calling `foo` is defined because + `__m256` will be passed to C in a single 256-bit wide register, which is what + the C ABI requires. + +* if we compile our program without `AVX` enabled, then the Rust program cannot + use 256-bit wide registers because they are not available, so independently of + how `__m256` will be passed to C, it won't be passed in a 256-bit wide + register, and the behavior is undefined because of an ABI mismatch. + +1: its layout is currently unspecified but that +is not relevant for this issue since if 256-bit registers are not available they +cannot be used anyways, which is what matters here. + +So, first of all, is this a big deal? + +Currently, one cannot use SIMD types in C FFI in stable Rust, so technically, +nothing is broken yet, and no, this is not a big deal: stable Rust is still +safe! However, we would like to be able to call C FFI functions without +introducing undefined behavior independently of which `-C target-features` are +passed, so the example code shown above has to be rejected by the compiler. + +Second, you might be wondering: why is `__m256` available even if `AVX` is not +available? That's a good question. We want to use `__m256` in some parts of +Rust's programs even if `AVX` is not globally enabled, and currently we don't +have great infrastructure for conditionally allowing it in some parts of the +program and not others. + +Ideally, one should only be able to use `__m256` and operations on it if `AVX` +is available. Which leads to how can we fix this ? + +The most trivial solution would be to just always require +`#[target_feature(enable = X)]` in C FFI functions using SIMD types, where +"unblocking" the use of each type requires one or two particular feature to be +enabled, e.g., `avx` or `avx2` in the case of `__m256`. + +That is, the compiler would reject the example above with an error: + +``` +error[E1337]: `__m256` on C FFI requires `#[target_feature(enable = "avx")]` + --> src/main.rs:7:15 + | +7 | fn foo(x: __m25a6) -> __m256; + | ^^^^^^^ +``` + +And the following program would always have defined behavior +([playground](https://play.rust-lang.org/?gist=db651d09441fd16172a5c94711b2ab97&version=nightly&mode=debug&edition=2015)): + +```rust +#[target_feature(enable = "avx")] +extern "C" fn foo(x: __m256) -> __m256; + +fn main() { + unsafe { + union U { v: __m256, a: [u64; 4] } + if is_x86_feature_detected!("avx") { + foo(U { a: [0; 4] }.v); + } + } +} +``` + +Note here that: + +* `extern "C" foo` is compiled with `AVX` enabled, so `foo` takes an `__m256` + like the C ABI expects +* the call to `foo` is guarded with an `is_x86_feature_detected`, that is, `foo` + will only be called if `AVX` is available at run-time +* if the Rust binary is compiled without `AVX`, Rust will insert shims in the + call to `foo` to pass it as a 256-bit register. Rust already does this, and + `#[target_feature]` is what allows it to do it. Without the + `#[target_feature]` annotation, Rust does not know that C expects this. + +# Guide-level and reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +Architecture-specific vector types require `#[target_feature]`s to be FFI safe. +That is, they are only safely usable as part of the signature of `extern` +functions if the function has certain `#[target_feature]`s enabled. + +Which `#[target_feature]`s must be enabled depends on the vector types being +used. + +For the stable architecture-specific vector types the following target features +must be enabled: + +* `x86`/`x86_64`: + * `__m128`, `__m128i`, `__m128d`: `"sse"` + * `__m256`, `__m256i`, `__m256d`: `"avx"` + + +Future stabilizations of architecture-specific vector types must specify the +target features required to use them in `extern` functions. + +# Drawbacks +[drawbacks]: #drawbacks + +TBD. + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +This is an adhoc solution to the problem. This RFC does not explore more general +mechanisms of dealing with, or abstracting over, these kind of problems. + +## Future architecture-specific vector types + +In the future, we might want to stabilize some of the following vector types. +This section explores which target features would they require: + +* `x86`/`x86_64`: + * `__m64`: `mmx` + * `__m512`, `__m512i`, `__m512f`: "avx512f" +* `arm`: `neon` +* `aarch64`: `neon` +* `ppc64`: `altivec` / `vsx` +* `wasm32`: `simd128` + +## Require the feature to be enabled globally for the binary + +Instead of using `#[target_feature]` we could allow vector types on C FFI only +behind `#[cfg(target_feature)]`, e.g., via something like the portability check. + +This would not allow calling C FFI functions with vector types conditionally on, +e.g., run-time feature detection. + +# Prior art +[prior-art]: #prior-art + +In C, the architecture specific vector types are only available if the required +target features are enabled at compile-time. + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +* Should it be possible to use, e.g., `__m128` on C FFI when the `avx` feature + is enabled? Does that change the calling convention and make doing so unsafe ? + We could extern this RFC to also require that to use certain types certain + features must be disabled. From 939a3d00b6aabae20dc519b433b64a30c8e67c96 Mon Sep 17 00:00:00 2001 From: gnzlbg Date: Tue, 23 Oct 2018 16:45:02 +0200 Subject: [PATCH 2/8] typo in rationale --- text/0000-simd-ffi.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/0000-simd-ffi.md b/text/0000-simd-ffi.md index b55d4f50d73..1b9c73f0011 100644 --- a/text/0000-simd-ffi.md +++ b/text/0000-simd-ffi.md @@ -190,7 +190,8 @@ TBD. [rationale-and-alternatives]: #rationale-and-alternatives This is an adhoc solution to the problem. This RFC does not explore more general -mechanisms of dealing with, or abstracting over, these kind of problems. +mechanisms of dealing with, or abstracting over, `target_feature`s associated +with the SIMD vector types for FFI purposes. ## Future architecture-specific vector types From 397821211b5a82e829fe53ad2fe392d18fa8b538 Mon Sep 17 00:00:00 2001 From: gnzlbg Date: Tue, 23 Oct 2018 17:07:30 +0200 Subject: [PATCH 3/8] fix more typos --- text/0000-simd-ffi.md | 33 +++++++++++++-------------------- 1 file changed, 13 insertions(+), 20 deletions(-) diff --git a/text/0000-simd-ffi.md b/text/0000-simd-ffi.md index 1b9c73f0011..509e043c4e8 100644 --- a/text/0000-simd-ffi.md +++ b/text/0000-simd-ffi.md @@ -97,30 +97,20 @@ but we haven't specified how we are actually compiling our Rust program above: register, and the behavior is undefined because of an ABI mismatch. 1: its layout is currently unspecified but that -is not relevant for this issue since if 256-bit registers are not available they -cannot be used anyways, which is what matters here. +is not relevant for this issue - what matters is that 256-bit registers are not +available and therefore they cannot be used. -So, first of all, is this a big deal? - -Currently, one cannot use SIMD types in C FFI in stable Rust, so technically, -nothing is broken yet, and no, this is not a big deal: stable Rust is still -safe! However, we would like to be able to call C FFI functions without -introducing undefined behavior independently of which `-C target-features` are -passed, so the example code shown above has to be rejected by the compiler. - -Second, you might be wondering: why is `__m256` available even if `AVX` is not -available? That's a good question. We want to use `__m256` in some parts of +You might be wondering: why is `__m256` available even if `AVX` is not +available? The reason is that we want to use `__m256` in some parts of Rust's programs even if `AVX` is not globally enabled, and currently we don't have great infrastructure for conditionally allowing it in some parts of the program and not others. Ideally, one should only be able to use `__m256` and operations on it if `AVX` -is available. Which leads to how can we fix this ? - -The most trivial solution would be to just always require -`#[target_feature(enable = X)]` in C FFI functions using SIMD types, where -"unblocking" the use of each type requires one or two particular feature to be -enabled, e.g., `avx` or `avx2` in the case of `__m256`. +is available, and this is exactly what this RFC proposes for using vector types +in C FFI: to always require `#[target_feature(enable = X)]` in C FFI functions +using SIMD types, where "unblocking" the use of each type requires some +particular feature to be enabled, e.g., `avx` or `avx2` in the case of `__m256`. That is, the compiler would reject the example above with an error: @@ -128,8 +118,8 @@ That is, the compiler would reject the example above with an error: error[E1337]: `__m256` on C FFI requires `#[target_feature(enable = "avx")]` --> src/main.rs:7:15 | -7 | fn foo(x: __m25a6) -> __m256; - | ^^^^^^^ +7 | fn foo(x: __m256) -> __m256; + | ^^^^^^ ``` And the following program would always have defined behavior @@ -149,6 +139,9 @@ fn main() { } ``` +independently of the `-C target-feature`s used globally to compile the whole +binary. + Note here that: * `extern "C" foo` is compiled with `AVX` enabled, so `foo` takes an `__m256` From eed41a8410a0c01549dfa47b9cbab584440f58bb Mon Sep 17 00:00:00 2001 From: gnzlbg Date: Tue, 23 Oct 2018 19:22:35 +0200 Subject: [PATCH 4/8] remove confusing wording in rationale --- text/0000-simd-ffi.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/text/0000-simd-ffi.md b/text/0000-simd-ffi.md index 509e043c4e8..e13937f7283 100644 --- a/text/0000-simd-ffi.md +++ b/text/0000-simd-ffi.md @@ -182,9 +182,7 @@ TBD. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives -This is an adhoc solution to the problem. This RFC does not explore more general -mechanisms of dealing with, or abstracting over, `target_feature`s associated -with the SIMD vector types for FFI purposes. +This is an adhoc solution to the problem, but sufficient for FFI purposes. ## Future architecture-specific vector types From 50b25da9a9388a862a3210f26e96e0c4e3952fb5 Mon Sep 17 00:00:00 2001 From: gnzlbg Date: Tue, 23 Oct 2018 19:23:54 +0200 Subject: [PATCH 5/8] update drawbacks --- text/0000-simd-ffi.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-simd-ffi.md b/text/0000-simd-ffi.md index e13937f7283..818556375aa 100644 --- a/text/0000-simd-ffi.md +++ b/text/0000-simd-ffi.md @@ -177,7 +177,7 @@ target features required to use them in `extern` functions. # Drawbacks [drawbacks]: #drawbacks -TBD. +None. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives From e7dabd362cea6bcb514df52d108cc5e45234dc25 Mon Sep 17 00:00:00 2001 From: gnzlbg Date: Wed, 24 Oct 2018 09:21:42 +0200 Subject: [PATCH 6/8] remove wording about inserting shims --- text/0000-simd-ffi.md | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/text/0000-simd-ffi.md b/text/0000-simd-ffi.md index 818556375aa..ded34fea6ca 100644 --- a/text/0000-simd-ffi.md +++ b/text/0000-simd-ffi.md @@ -140,18 +140,14 @@ fn main() { ``` independently of the `-C target-feature`s used globally to compile the whole -binary. - -Note here that: +binary. Note that: * `extern "C" foo` is compiled with `AVX` enabled, so `foo` takes an `__m256` like the C ABI expects * the call to `foo` is guarded with an `is_x86_feature_detected`, that is, `foo` will only be called if `AVX` is available at run-time -* if the Rust binary is compiled without `AVX`, Rust will insert shims in the - call to `foo` to pass it as a 256-bit register. Rust already does this, and - `#[target_feature]` is what allows it to do it. Without the - `#[target_feature]` annotation, Rust does not know that C expects this. +* if the Rust calling convention differs from the calling convention of the + `extern` function, Rust has to adapt these. # Guide-level and reference-level explanation [reference-level-explanation]: #reference-level-explanation From a910d9f0f19138deed2956fc401880cfa930f7f2 Mon Sep 17 00:00:00 2001 From: gnzlbg Date: Wed, 27 Mar 2019 13:28:28 +0100 Subject: [PATCH 7/8] Clarify union transmute in the RFC example --- text/0000-simd-ffi.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/text/0000-simd-ffi.md b/text/0000-simd-ffi.md index ded34fea6ca..ef155e7cb8d 100644 --- a/text/0000-simd-ffi.md +++ b/text/0000-simd-ffi.md @@ -131,9 +131,12 @@ extern "C" fn foo(x: __m256) -> __m256; fn main() { unsafe { - union U { v: __m256, a: [u64; 4] } + #[repr(C)] union U { v: __m256, a: [u64; 4] } if is_x86_feature_detected!("avx") { - foo(U { a: [0; 4] }.v); + // note: this operation is used here for readability + // but its behavior is currently unspecified (see note above). + let vec = U { a: [0; 4] }.v; + foo(vec); } } } @@ -212,5 +215,5 @@ target features are enabled at compile-time. * Should it be possible to use, e.g., `__m128` on C FFI when the `avx` feature is enabled? Does that change the calling convention and make doing so unsafe ? - We could extern this RFC to also require that to use certain types certain + We could extend this RFC to also require that to use certain types certain features must be disabled. From 2ba7466029d3674bd586a925dd5bb1c75856a24a Mon Sep 17 00:00:00 2001 From: Mazdak Farrokhzad Date: Sun, 28 Jul 2019 09:53:49 +0200 Subject: [PATCH 8/8] RFC 2574 --- text/{0000-simd-ffi.md => 2574-simd-ffi.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-simd-ffi.md => 2574-simd-ffi.md} (98%) diff --git a/text/0000-simd-ffi.md b/text/2574-simd-ffi.md similarity index 98% rename from text/0000-simd-ffi.md rename to text/2574-simd-ffi.md index ef155e7cb8d..9ca9f04614e 100644 --- a/text/0000-simd-ffi.md +++ b/text/2574-simd-ffi.md @@ -1,7 +1,7 @@ - Feature Name: `simd_ffi` - Start Date: 2018-10-12 -- RFC PR: (leave this empty) -- Rust Issue: (leave this empty) +- RFC PR: [rust-lang/rfcs#2574](https://github.com/rust-lang/rfcs/pull/2574) +- Rust Issue: [rust-lang/rust#63068](https://github.com/rust-lang/rust/issues/63068) # Summary [summary]: #summary