Commit 199456d

Add suggestions from Josh Triplett and some other small tweaks
Fix the metadata block
Co-authored-by: Josh Triplett <[email protected]>
Add annotations to Clang's output of the example
1 parent e20af0d commit 199456d

1 file changed

+74
-56
lines changed

posts/2024-00-00-i128-layout-update.md renamed to posts/2024-03-30-i128-layout-update.md

@@ -2,12 +2,12 @@
22
layout: post
33
title: "Changes to `u128`/`i128` layout in 1.77 and 1.78"
44
author: Trevor Gross
5-
team: Lang
5+
team: The Rust Lang Team <https://www.rust-lang.org/governance/teams/lang>
66
---
77

8-
Rust has long had an inconsistency with C regarding the alignment of 128-bit integers.
9-
This problem has recently been resolved, but the fix comes with some effects that are
10-
worth being aware of.
8+
Rust has long had an inconsistency with C regarding the alignment of 128-bit integers
9+
on the x86-32 and x86-64 architectures. This problem has recently been resolved, but
10+
the fix comes with some effects that are worth being aware of.
1111

1212
As a user, you most likely do not need to worry about these changes unless you are:
1313

@@ -18,9 +18,9 @@ There are also no changes to architectures other than x86-32 and x86-64. If your
1818
code makes heavy use of 128-bit integers, you may notice runtime performance increases
1919
at a possible cost of additional memory use.
2020

21-
This post is intended to clarify what changed, why it changed, and what to expect. If
22-
you are only looking for a compatibility matrix, jump to the
23-
[Compatibility](#compatibility) section.
21+
This post documents what the problem was, what changed to fix it, and what to expect
22+
with the changes. If you are already familiar with the problem and only looking for a
23+
compatibility matrix, jump to the [Compatibility](#compatibility) section.
2424

2525
# Background
2626

@@ -32,13 +32,13 @@ The size of simple types like primitives is usually unambiguous, being the exact
3232
the data they represent with no padding (unused space). For example, an `i64` always has
3333
a size of 64 bits or 8 bytes.
3434

35-
Alignment, however, can seem less consistent. An 8-byte integer _could_ reasonably be
36-
stored at any memory address (1-byte aligned), but most 64-bit computers will get the
37-
best performance if it is instead stored at a multiple of 8 (8-byte aligned). So, like
38-
in other languages, primitives in Rust have this most efficient alignment by default.
39-
The effects of this can be seen when creating composite types: [^composite-playground]
35+
Alignment, however, can vary. An 8-byte integer _could_ be stored at any memory address
36+
(1-byte aligned), but most 64-bit computers will get the best performance if it is
37+
instead stored at a multiple of 8 (8-byte aligned). So, like in other languages,
38+
primitives in Rust have this most efficient alignment by default. The effects of this
39+
can be seen when creating composite types ([playground link][composite-playground]):
4040

41-
```rust=
41+
```rust
4242
use core::mem::{align_of, offset_of};
4343

4444
#[repr(C)]
@@ -49,7 +49,7 @@ struct Foo {
4949

5050
#[repr(C)]
5151
struct Bar {
52-
a: u8, // 1=byte aligned
52+
a: u8, // 1-byte aligned
5353
b: u64, // 8-byte aligned
5454
}
5555

@@ -69,52 +69,53 @@ Alignment of Bar: 8
6969
```
7070

7171
We see that within a struct, a type will always be placed such that its offset is a
72-
multiple of its alignment.
72+
multiple of its alignment - even if this means unused space (Rust minimizes this by
73+
default when `repr(C)` is not used).
7374

7475
These numbers are not arbitrary; the application binary interface (ABI) says what they
7576
should be. In the x86-64 [psABI] (processor-specific ABI) for System V (Unix & Linux),
7677
_Figure 3.1: Scalar Types_ tells us exactly how primitives should be represented:
7778

78-
| C type | Rust equivalent | `sizeof` | Alignment (bytes) |
79-
| ---------------- | --------------- | -------- | ----------------- |
80-
| `char` | `i8` | 1 | 1 |
81-
| `unsigned char` | `u8` | 1 | 1 |
82-
| `short` | `i16` | 2 | 2 |
83-
| `unsigned short` | `u16` | 2 | 2 |
84-
| `long` | `i64` | 8 | 8 |
85-
| `unsigned long` | `u64` | 8 | 8 |
79+
| C type | Rust equivalent | `sizeof` | Alignment (bytes) |
80+
| -------------------- | --------------- | -------- | ----------------- |
81+
| `char` | `i8` | 1 | 1 |
82+
| `unsigned char` | `u8` | 1 | 1 |
83+
| `short` | `i16` | 2 | 2 |
84+
| **`unsigned short`** | **`u16`** | **2** | **2** |
85+
| `long` | `i64` | 8 | 8 |
86+
| **`unsigned long`** | **`u64`** | **8** | **8** |
8687

8788
The ABI only specifies C types, but Rust follows the same definitions both for
8889
compatibility and for the performance benefits.
8990

9091
# The Incorrect Alignment Problem
9192

92-
It is easy to imagine that if two implementations disagree on the alignment of a data
93-
type, they would not be able to reliably share data containing that type. Well...
93+
If two implementations disagree on the alignment of a data type, they cannot reliably
94+
share data containing that type. Rust had inconsistent alignment for 128-bit types:
9495

95-
```rust=
96+
```rust
9697
println!("alignment of i128: {}", align_of::<i128>());
9798
```
9899

99-
```text=
100+
```text
100101
// rustc 1.76.0
101102
alignment of i128: 8
102103
```
103104

104-
```c=
105+
```c
105106
printf("alignment of __int128: %zu\n", _Alignof(__int128));
106107
```
107108
108-
```text=
109+
```text
109110
// gcc 13.2
110111
alignment of __int128: 16
111112
112113
// clang 17.0.1
113114
alignment of __int128: 16
114115
```
115116

116-
Looks like Rust disagrees![^align-godbolt] Looking back at the [psABI], we can see that
117-
Rust indeed is in the wrong here:
117+
([Godbolt link][align-godbolt]) Looking back at the [psABI], we can see that Rust has
118+
the wrong alignment here:
118119

119120
| C type | Rust equivalent | `sizeof` | Alignment (bytes) |
120121
| ------------------- | --------------- | -------- | ----------------- |
@@ -125,21 +126,22 @@ It turns out this isn't because of something that Rust is actively doing incorre
125126
layout of primitives comes from the LLVM codegen backend used by both Rust and Clang,
126127
among other languages, and it has the alignment for `i128` hardcoded to 8 bytes.
127128

128-
Clang does not have this issue only because of a workaround, where the alignment is
129+
Clang uses the correct alignment only because of a workaround, where the alignment is
129130
manually set to 16 bytes before handing the type to LLVM. This fixes the layout issue
130131
but has been the source of some other minor problems.[^f128-segfault][^va-segfault]
131132
Rust does no such manual adjustment, hence the issue reported at
132133
<https://github.com/rust-lang/rust/issues/54341>.
133134

134135
# The Calling Convention Problem
135136

136-
It happens that there an additional problem: LLVM does not always do the correct thing
137-
when passing 128-bit integers as function arguments. This was a [known issue in LLVM],
138-
before its [relevance to Rust was discovered].
137+
There is an additional problem: LLVM does not always do the correct thing when passing
138+
128-bit integers as function arguments. This was a [known issue in LLVM], before its
139+
[relevance to Rust was discovered].
139140

140-
When calling a function, the arguments get passed in registers until there are no more
141-
slots, then they get "spilled" to the stack. The ABI tells us what to do here as well,
142-
in the section _3.2.3 Parameter Passing_:
141+
When calling a function, the arguments get passed in registers (special storage
142+
locations within the CPU) until there are no more slots, then they get "spilled" to
143+
the stack (the program's memory). The ABI tells us what to do here as well, in the
144+
section _3.2.3 Parameter Passing_:
143145

144146
> Arguments of type `__int128` offer the same operations as INTEGERs, yet they do not
145147
> fit into one general purpose register but require two registers. For classification
@@ -159,12 +161,11 @@ example, inline assembly is used to call `foo(0xaf, val, val, val)` with `val` a
159161
`0x11223344556677889900aabbccddeeff`.
160162
161163
x86-64 uses the registers `rdi`, `rsi`, `rdx`, `rcx`, `r8`, and `r9` to pass function
162-
arguments, in that order (you guessed it, this is also in the ABI). Each argument
163-
fits a word (64 bits), and anything that doesn't fit gets `push`ed to the
164-
stack.
164+
arguments, in that order (you guessed it, this is also in the ABI). Each register
165+
fits a word (64 bits), and anything that doesn't fit gets `push`ed to the stack.
165166
166-
```c=
167-
/* full example at https://godbolt.org/z/zGaK1T96c */
167+
```c
168+
/* full example at <https://godbolt.org/z/5c8cb5cxs> */
168169
169170
/* to see the issue, we need a padding value to "mess up" argument alignment */
170171
void foo(char pad, __int128 a, __int128 b, __int128 c) {
@@ -176,7 +177,9 @@ void foo(char pad, __int128 a, __int128 b, __int128 c) {
176177
177178
int main() {
178179
asm(
179-
"movl $0xaf, %edi \n\t" /* 1st slot (edi): padding char */
180+
/* load arguments that fit in registers */
181+
"movl $0xaf, %edi \n\t" /* 1st slot (edi): padding char (`edi` is the
182+
* same as `rdi`, just a smaller access size) */
180183
"movq $0x9900aabbccddeeff, %rsi \n\t" /* 2nd slot (rsi): lower half of `a` */
181184
"movq $0x1122334455667788, %rdx \n\t" /* 3rd slot (rdx): upper half of `a` */
182185
"movq $0x9900aabbccddeeff, %rcx \n\t" /* 4th slot (rcx): lower half of `b` */
@@ -187,6 +190,7 @@ int main() {
187190
/* reuse our stored registers to load the stack */
188191
"pushq %rdx \n\t" /* upper half of `c` gets passed on the stack */
189192
"pushq %rsi \n\t" /* lower half of `c` gets passed on the stack */
193+
190194
"call foo \n\t" /* call the function */
191195
"addq $16, %rsp \n\t" /* reset the stack */
192196
);
@@ -209,15 +213,17 @@ But running with Clang 17 prints:
209213
0x11223344556677889900aabbccddeeff
210214
0x11223344556677889900aabbccddeeff
211215
0x9900aabbccddeeffdeadbeef4c0ffee0
216+
//^^^^^^^^^^^^^^^^ this should be the lower half
217+
// ^^^^^^^^^^^^^^^^ look familiar?
212218
```
213219
214220
Surprise!
215221
216222
This illustrates the second problem: LLVM expects an `i128` to be passed half in a
217-
register and half on the stack, but this is not allowed by the ABI.
223+
register and half on the stack when possible, but this is not allowed by the ABI.
218224
219-
Since this comes from LLVM and has no reasonable workaround, this is a problem in
220-
both Clang and Rust.
225+
Since the behavior comes from LLVM and has no reasonable workaround, this is a
226+
problem in both Clang and Rust.
221227
222228
# Solutions
223229
@@ -231,17 +237,28 @@ Both of these changes made it into LLVM 18, meaning all relevant ABI issues will
231237
resolved in both Clang and Rust that use this version (Clang 18 and Rust 1.78 when using
232238
the bundled LLVM).
233239
234-
However, `rustc` can also use the version of LLVM installed in the system rather than a
235-
bundled version, which may be older. To mitigate the change of problems from differing
240+
However, `rustc` can also use the version of LLVM installed on the system rather than a
241+
bundled version, which may be older. To mitigate the chance of problems from differing
236242
alignment with the same `rustc` version, [a proposal] was introduced to manually
237-
correct the alignment, like Clang has been doing. This was implemented by Matthew Maurer
243+
correct the alignment like Clang has been doing. This was implemented by Matthew Maurer
238244
in [#11672].
239245
246+
Since these changes, Rust now produces the correct alignment:
247+
248+
```rust
249+
println!("alignment of i128: {}", align_of::<i128>());
250+
```
251+
252+
```text
253+
// rustc 1.77.0
254+
alignment of i128: 16
255+
```
256+
240257
As mentioned above, part of the reason for an ABI to specify the alignment of a datatype
241258
is because it is more efficient on that architecture. We actually got to see that
242259
firsthand: the [initial performance run] with the manual alignment change showed
243260
nontrivial improvements to compiler performance (which relies heavily on 128-bit
244-
integers to store integer literals). The downside of increasing alignment is that
261+
integers to work with integer literals). The downside of increasing alignment is that
245262
composite types do not always fit together as nicely in memory, leading to an increase
246263
in usage. Unfortunately this meant some of the performance wins needed to be sacrificed
247264
to avoid an increased memory footprint.
@@ -252,7 +269,7 @@ to avoid an increased memory footprint.
252269
[D28990]: https://reviews.llvm.org/D28990
253270
[D86310]: https://reviews.llvm.org/D86310
254271
255-
# Compatibilty
272+
# Compatibility
256273
257274
The most important question is how compatibility changed as a result of these fixes. In
258275
short, `i128` and `u128` with Rust using LLVM 18 (the default version starting with
@@ -279,20 +296,21 @@ are summarized in the table below:
279296
280297
# Effects & Future Steps
281298
282-
As mentioned in the introduction, most users will see no effects of this change
299+
As mentioned in the introduction, most users will notice no effects of this change
283300
unless you are already doing something questionable with these types.
284301
285302
Starting with Rust 1.77, it will be reasonably safe to start experimenting with
286303
128-bit integers in FFI, with some more certainty coming with the LLVM update
287304
in 1.78. There is [ongoing discussion] about lifting the lint in an upcoming
288-
version, but it remains to be seen when that will actually happen.
305+
version, but we want to be cautious and avoid introducing silent breakage for users
306+
whose Rust compiler may be built with an older LLVM.
289307
290308
[relevance to Rust was discovered]: https://github.com/rust-lang/rust/issues/54341#issuecomment-1064729606
291309
[initial performance run]: https://github.com/rust-lang/rust/pull/116672/#issuecomment-1858600381
292310
[known issue in llvm]: https://github.com/llvm/llvm-project/issues/41784
293311
[psabi]: https://www.uclibc.org/docs/psABI-x86_64.pdf
294312
[ongoing discussion]: https://github.com/rust-lang/lang-team/issues/255
295-
[^align-godbolt]: https://godbolt.org/z/h94Ge1vMW
296-
[^composite-playground]: https://play.rust-lang.org/?version=beta&mode=debug&edition=2021&gist=c263ae121912284d3ba553290caa6778
313+
[align-godbolt]: https://godbolt.org/z/h94Ge1vMW
314+
[composite-playground]: https://play.rust-lang.org/?version=beta&mode=debug&edition=2021&gist=52f349bdea92bf724bc453f37dbd32ea
297315
[^va-segfault]: https://github.com/llvm/llvm-project/issues/20283
298316
[^f128-segfault]: https://bugs.llvm.org/show_bug.cgi?id=50198
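
A minimal sketch of the field-placement rule the post describes, for readers who want to check their own toolchain. The `Padded` struct and the `round_up` and `expected_value_offset` helpers are illustrative names, not part of the post or of any library, and `offset_of!` needs Rust 1.77 or later.

```rust
use core::mem::{align_of, offset_of, size_of};

/// Hypothetical FFI-style struct (not from the post): a one-byte field
/// followed by a 128-bit integer.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    tag: u8,
    value: i128,
}

/// Round `offset` up to the next multiple of `align`.
/// `align` is assumed to be a power of two, as all Rust alignments are.
fn round_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

/// Offset at which `value` should land under the `repr(C)` rule: the first
/// multiple of its alignment that comes after the preceding `u8` field.
fn expected_value_offset() -> usize {
    round_up(size_of::<u8>(), align_of::<i128>())
}

fn main() {
    println!("alignment of i128: {}", align_of::<i128>());
    println!("offset of Padded::value: {}", offset_of!(Padded, value));
    println!("size of Padded: {}", size_of::<Padded>());

    // The compiler's layout should agree with the hand-computed offset.
    assert_eq!(offset_of!(Padded, value), expected_value_offset());
}
```

With a 16-byte `i128` the computed offset is 16. With the older 8-byte alignment it would be 8, which is the mismatch with C that the post describes.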

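The register-assignment rule from _3.2.3 Parameter Passing_ can be modeled in a few lines. This is a simplified sketch that assumes only integer-class arguments and measures each argument in 8-byte words. `Location` and `assign` are hypothetical names, and the all-or-nothing spill models the psABI behavior as the post describes it; it is not code from LLVM or rustc.

```rust
/// Integer argument registers on x86-64 System V, in the order given in the post.
const INT_REGS: [&str; 6] = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"];

/// Where a single argument ends up in this simplified model.
#[derive(Debug, PartialEq)]
enum Location {
    Registers(Vec<&'static str>),
    Stack,
}

/// Assign each argument (given as its size in 8-byte words) to registers.
/// An argument that does not fit entirely in the remaining registers goes
/// to the stack as a whole; it is never split between the two.
fn assign(words_per_arg: &[usize]) -> Vec<Location> {
    let mut next = 0; // index of the next free register
    words_per_arg
        .iter()
        .map(|&words| {
            if next + words <= INT_REGS.len() {
                let regs = INT_REGS[next..next + words].to_vec();
                next += words;
                Location::Registers(regs)
            } else {
                Location::Stack
            }
        })
        .collect()
}

fn main() {
    // Models foo(char pad, __int128 a, __int128 b, __int128 c).
    let names = ["pad", "a", "b", "c"];
    for (name, loc) in names.iter().zip(assign(&[1, 2, 2, 2])) {
        println!("{name}: {loc:?}");
    }
}
```

For the `foo` example, this model places `pad` in `rdi`, `a` in `rsi` and `rdx`, `b` in `rcx` and `r8`, and `c` entirely on the stack, leaving `r9` unused. The pre-fix LLVM behavior instead expected half of `c` in `r9`, which is why the Clang 17 output picks up a stray value there.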