layout: post
title: "Changes to `u128`/`i128` layout in 1.77 and 1.78"
author: Trevor Gross
- team: Lang
+ team: The Rust Lang Team <https://www.rust-lang.org/governance/teams/lang>
---

- Rust has long had an inconsistency with C regarding the alignment of 128-bit integers.
- This problem has recently been resolved, but the fix comes with some effects that are
- worth being aware of.
+ Rust has long had an inconsistency with C regarding the alignment of 128-bit integers
+ on the x86-32 and x86-64 architectures. This problem has recently been resolved, but
+ the fix comes with some effects that are worth being aware of.

As a user, you most likely do not need to worry about these changes unless you are:

@@ -18,9 +18,9 @@ There are also no changes to architectures other than x86-32 and x86-64. If your
code makes heavy use of 128-bit integers, you may notice runtime performance increases
at a possible cost of additional memory use.

- This post is intended to clarify what changed, why it changed, and what to expect. If
- you are only looking for a compatibility matrix, jump to the
- [Compatibility](#compatibility) section.
+ This post documents what the problem was, what changed to fix it, and what to expect
+ with the changes. If you are already familiar with the problem and only looking for a
+ compatibility matrix, jump to the [Compatibility](#compatibility) section.

# Background

@@ -32,13 +32,13 @@ The size of simple types like primitives is usually unambiguous, being the exact
the data they represent with no padding (unused space). For example, an `i64` always has
a size of 64 bits or 8 bytes.

- Alignment, however, can seem less consistent. An 8-byte integer _could_ reasonably be
- stored at any memory address (1-byte aligned), but most 64-bit computers will get the
- best performance if it is instead stored at a multiple of 8 (8-byte aligned). So, like
- in other languages, primitives in Rust have this most efficient alignment by default.
- The effects of this can be seen when creating composite types: [^composite-playground]
+ Alignment, however, can vary. An 8-byte integer _could_ be stored at any memory address
+ (1-byte aligned), but most 64-bit computers will get the best performance if it is
+ instead stored at a multiple of 8 (8-byte aligned). So, like in other languages,
+ primitives in Rust have this most efficient alignment by default. The effects of this
+ can be seen when creating composite types ([playground link][composite-playground]):

- ```rust=
+ ```rust
use core::mem::{align_of, offset_of};

#[repr(C)]
@@ -49,7 +49,7 @@ struct Foo {

#[repr(C)]
struct Bar {
-     a: u8, // 1=byte aligned
+     a: u8, // 1-byte aligned
    b: u64, // 8-byte aligned
}

@@ -69,52 +69,53 @@ Alignment of Bar: 8
```

We see that within a struct, a type will always be placed such that its offset is a
- multiple of its alignment.
+ multiple of its alignment - even if this means unused space (Rust minimizes this by
+ default when `repr(C)` is not used).

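The padding behavior described above can be observed directly. The following is a minimal sketch, not part of the post: the struct names `InOrder` and `Reorderable` are hypothetical, and the layout of a struct without `repr(C)` is unspecified, so the second size may differ between compiler versions and targets.

```rust
use core::mem::{align_of, size_of};

// Hypothetical example types; the field types mirror the post's `Bar`.
#[allow(dead_code)]
#[repr(C)]
struct InOrder {
    a: u8,  // 1-byte aligned
    b: u64, // typically 8-byte aligned on 64-bit targets, so padding follows `a`
    c: u8,  // trailing padding follows so the size is a multiple of the alignment
}

// Same fields without `repr(C)`: the compiler is free to reorder them.
#[allow(dead_code)]
struct Reorderable {
    a: u8,
    b: u64,
    c: u8,
}

fn main() {
    println!(
        "repr(C): size {}, alignment {}",
        size_of::<InOrder>(),
        align_of::<InOrder>()
    );
    println!(
        "default: size {}, alignment {}",
        size_of::<Reorderable>(),
        align_of::<Reorderable>()
    );
}
```

On x86-64 the `repr(C)` struct is expected to be larger than the default-layout one, because the two `u8` fields cannot share a padding slot when declaration order is fixed.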
These numbers are not arbitrary; the application binary interface (ABI) says what they
should be. In the x86-64 [psABI] (processor-specific ABI) for System V (Unix & Linux),
_Figure 3.1: Scalar Types_ tells us exactly how primitives should be represented:

- | C type           | Rust equivalent | `sizeof` | Alignment (bytes) |
- | ---------------- | --------------- | -------- | ----------------- |
- | `char`           | `i8`            | 1        | 1                 |
- | `unsigned char`  | `u8`            | 1        | 1                 |
- | `short`          | `i16`           | 2        | 2                 |
- | `unsigned short` | `u16`           | 2        | 2                 |
- | `long`           | `i64`           | 8        | 8                 |
- | `unsigned long`  | `u64`           | 8        | 8                 |
+ | C type               | Rust equivalent | `sizeof` | Alignment (bytes) |
+ | -------------------- | --------------- | -------- | ----------------- |
+ | `char`               | `i8`            | 1        | 1                 |
+ | `unsigned char`      | `u8`            | 1        | 1                 |
+ | `short`              | `i16`           | 2        | 2                 |
+ | **`unsigned short`** | **`u16`**       | **2**    | **2**             |
+ | `long`               | `i64`           | 8        | 8                 |
+ | **`unsigned long`**  | **`u64`**       | **8**    | **8**             |

The ABI only specifies C types, but Rust follows the same definitions both for
compatibility and for the performance benefits.
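As a quick cross-check of the table, the sketch below asks the compiler for the size and alignment of the Rust types listed. The helper name `scalar_layouts` is hypothetical, and the alignment values are only expected to match the table on x86-64; other targets may report different alignments.

```rust
use core::mem::{align_of, size_of};

/// Returns (Rust type name, size in bytes, alignment in bytes) for each
/// Rust type from the table, as reported by the compiler for the current target.
fn scalar_layouts() -> [(&'static str, usize, usize); 6] {
    [
        ("i8", size_of::<i8>(), align_of::<i8>()),
        ("u8", size_of::<u8>(), align_of::<u8>()),
        ("i16", size_of::<i16>(), align_of::<i16>()),
        ("u16", size_of::<u16>(), align_of::<u16>()),
        ("i64", size_of::<i64>(), align_of::<i64>()),
        ("u64", size_of::<u64>(), align_of::<u64>()),
    ]
}

fn main() {
    for (name, size, align) in scalar_layouts() {
        println!("{name:>3}: size {size}, alignment {align}");
    }
}
```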

# The Incorrect Alignment Problem

- It is easy to imagine that if two implementations disagree on the alignment of a data
- type, they would not be able to reliably share data containing that type. Well...
+ If two implementations disagree on the alignment of a data type, they cannot reliably
+ share data containing that type. Rust had inconsistent alignment for 128-bit types:

- ```rust=
+ ```rust
println!("alignment of i128: {}", align_of::<i128>());
```

- ```text=
+ ```text
// rustc 1.76.0
alignment of i128: 8
```

- ```c=
+ ```c
printf("alignment of __int128: %zu\n", _Alignof(__int128));
```

- ```text=
+ ```text
// gcc 13.2
alignment of __int128: 16

// clang 17.0.1
alignment of __int128: 16
```

- Looks like Rust disagrees![^align-godbolt] Looking back at the [psABI], we can see that
- Rust indeed is in the wrong here:
+ ([Godbolt link][align-godbolt]) Looking back at the [psABI], we can see that Rust has
+ the wrong alignment here:

| C type              | Rust equivalent | `sizeof` | Alignment (bytes) |
| ------------------- | --------------- | -------- | ----------------- |
@@ -125,21 +126,22 @@ It turns out this isn't because of something that Rust is actively doing incorre
layout of primitives comes from the LLVM codegen backend used by both Rust and Clang,
among other languages, and it has the alignment for `i128` hardcoded to 8 bytes.

- Clang does not have this issue only because of a workaround, where the alignment is
+ Clang uses the correct alignment only because of a workaround, where the alignment is
manually set to 16 bytes before handing the type to LLVM. This fixes the layout issue
but has been the source of some other minor problems.[^f128-segfault][^va-segfault]
Rust does no such manual adjustment, hence the issue reported at
<https://github.com/rust-lang/rust/issues/54341>.

# The Calling Convention Problem

- It happens that there an additional problem: LLVM does not always do the correct thing
- when passing 128-bit integers as function arguments. This was a [known issue in LLVM],
- before its [relevance to Rust was discovered].
+ There is an additional problem: LLVM does not always do the correct thing when passing
+ 128-bit integers as function arguments. This was a [known issue in LLVM], before its
+ [relevance to Rust was discovered].

- When calling a function, the arguments get passed in registers until there are no more
- slots, then they get "spilled" to the stack. The ABI tells us what to do here as well,
- in the section _3.2.3 Parameter Passing_:
+ When calling a function, the arguments get passed in registers (special storage
+ locations within the CPU) until there are no more slots, then they get "spilled" to
+ the stack (the program's memory). The ABI tells us what to do here as well, in the
+ section _3.2.3 Parameter Passing_:

> Arguments of type `__int128` offer the same operations as INTEGERs, yet they do not
> fit into one general purpose register but require two registers. For classification
@@ -159,12 +161,11 @@ example, inline assembly is used to call `foo(0xaf, val, val, val)` with `val` a
`0x11223344556677889900aabbccddeeff`.

x86-64 uses the registers `rdi`, `rsi`, `rdx`, `rcx`, `r8`, and `r9` to pass function
- arguments, in that order (you guessed it, this is also in the ABI). Each argument
- fits a word (64 bits), and anything that doesn't fit gets `push`ed to the
- stack.
+ arguments, in that order (you guessed it, this is also in the ABI). Each register
+ fits a word (64 bits), and anything that doesn't fit gets `push`ed to the stack.

- ```c=
- /* full example at https://godbolt.org/z/zGaK1T96c */
+ ```c
+ /* full example at <https://godbolt.org/z/5c8cb5cxs> */

/* to see the issue, we need a padding value to "mess up" argument alignment */
void foo(char pad, __int128 a, __int128 b, __int128 c) {
@@ -176,7 +177,9 @@ void foo(char pad, __int128 a, __int128 b, __int128 c) {

int main() {
    asm(
-         "movl $0xaf, %edi \n\t" /* 1st slot (edi): padding char */
+         /* load arguments that fit in registers */
+         "movl $0xaf, %edi \n\t" /* 1st slot (edi): padding char (`edi` is the
+                                  * same as `rdi`, just a smaller access size) */
        "movq $0x9900aabbccddeeff, %rsi \n\t" /* 2nd slot (rsi): lower half of `a` */
        "movq $0x1122334455667788, %rdx \n\t" /* 3rd slot (rdx): upper half of `a` */
        "movq $0x9900aabbccddeeff, %rcx \n\t" /* 4th slot (rcx): lower half of `b` */
@@ -187,6 +190,7 @@ int main() {
        /* reuse our stored registers to load the stack */
        "pushq %rdx \n\t" /* upper half of `c` gets passed on the stack */
        "pushq %rsi \n\t" /* lower half of `c` gets passed on the stack */
+
        "call foo \n\t" /* call the function */
        "addq $16, %rsp \n\t" /* reset the stack */
    );
@@ -209,15 +213,17 @@ But running with Clang 17 prints:
0x11223344556677889900aabbccddeeff
0x11223344556677889900aabbccddeeff
0x9900aabbccddeeffdeadbeef4c0ffee0
+ //^^^^^^^^^^^^^^^^ this should be the lower half
+ //                ^^^^^^^^^^^^^^^^ look familiar?
```

Surprise!

This illustrates the second problem: LLVM expects an `i128` to be passed half in a
- register and half on the stack, but this is not allowed by the ABI.
+ register and half on the stack when possible, but this is not allowed by the ABI.

- Since this comes from LLVM and has no reasonable workaround, this is a problem in
- both Clang and Rust.
+ Since the behavior comes from LLVM and has no reasonable workaround, this is a
+ problem in both Clang and Rust.
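
The disagreement can be modeled without any assembly. The sketch below is a simplified illustration, not LLVM's or the ABI's actual algorithm: the names `Slot`, `REGS`, and `assign` are hypothetical, every argument is assumed to be an integer of one or two 64-bit words, and the `allow_split` flag stands in for the behavior described above.

```rust
/// Where one 64-bit half of an argument ends up.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Slot {
    Reg(&'static str),
    Stack,
}

/// The six integer argument registers on x86-64 System V, in order.
const REGS: [&str; 6] = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"];

/// Assigns each argument (given as its width in 64-bit words) to slots,
/// lower half first. With `allow_split == false`, an argument that does not
/// fit entirely in the remaining registers goes entirely to the stack. With
/// `allow_split == true`, it takes whatever registers are left and spills
/// the rest, which models the problematic behavior.
fn assign(words_per_arg: &[usize], allow_split: bool) -> Vec<Vec<Slot>> {
    let mut next = 0; // index of the next free register
    let mut out = Vec::new();
    for &words in words_per_arg {
        let free = REGS.len() - next;
        let in_regs = if words <= free {
            words
        } else if allow_split {
            free
        } else {
            0
        };
        let mut slots = Vec::new();
        for _ in 0..in_regs {
            slots.push(Slot::Reg(REGS[next]));
            next += 1;
        }
        for _ in in_regs..words {
            slots.push(Slot::Stack);
        }
        out.push(slots);
    }
    out
}

fn main() {
    // foo(char pad, __int128 a, __int128 b, __int128 c)
    let args = [1, 2, 2, 2];
    println!("no split: {:?}", assign(&args, false));
    println!("split:    {:?}", assign(&args, true));
}
```

In this model the two modes agree on `pad`, `a`, and `b`, and differ only on `c`: without splitting both halves land on the stack, while with splitting the lower half is expected in `r9`, matching the garbage lower half in the Clang 17 output.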

# Solutions

@@ -231,17 +237,28 @@ Both of these changes made it into LLVM 18, meaning all relevant ABI issues will
resolved in both Clang and Rust that use this version (Clang 18 and Rust 1.78 when using
the bundled LLVM).

- However, `rustc` can also use the version of LLVM installed in the system rather than a
- bundled version, which may be older. To mitigate the change of problems from differing
+ However, `rustc` can also use the version of LLVM installed on the system rather than a
+ bundled version, which may be older. To mitigate the chance of problems from differing
alignment with the same `rustc` version, [a proposal] was introduced to manually
- correct the alignment, like Clang has been doing. This was implemented by Matthew Maurer
+ correct the alignment like Clang has been doing. This was implemented by Matthew Maurer
in [#11672].

+ Since these changes, Rust now produces the correct alignment:
+
+ ```rust
+ println!("alignment of i128: {}", align_of::<i128>());
+ ```
+
+ ```text
+ // rustc 1.77.0
+ alignment of i128: 16
+ ```
+
As mentioned above, part of the reason for an ABI to specify the alignment of a datatype
is because it is more efficient on that architecture. We actually got to see that
firsthand: the [initial performance run] with the manual alignment change showed
nontrivial improvements to compiler performance (which relies heavily on 128-bit
- integers to store integer literals). The downside of increasing alignment is that
+ integers to work with integer literals). The downside of increasing alignment is that
composite types do not always fit together as nicely in memory, leading to an increase
in usage. Unfortunately this meant some of the performance wins needed to be sacrificed
to avoid an increased memory footprint.
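
To make the memory cost concrete, the sketch below computes the size of a `repr(C)`-style struct from its fields' sizes and alignments, once with the old 8-byte alignment for a 128-bit integer and once with the new 16-byte alignment. The helpers `align_up` and `struct_size` and the example struct are hypothetical; this is a model of the layout rule described in the Background section, not code from the compiler.

```rust
/// Rounds `offset` up to the next multiple of `align` (a power of two).
fn align_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

/// Size in bytes of a `repr(C)`-style struct whose fields are given as
/// (size, alignment) pairs in declaration order.
fn struct_size(fields: &[(usize, usize)]) -> usize {
    let mut offset = 0;
    let mut max_align = 1;
    for &(size, align) in fields {
        // Each field starts at a multiple of its own alignment.
        offset = align_up(offset, align) + size;
        max_align = max_align.max(align);
    }
    // The total size is padded to a multiple of the largest alignment.
    align_up(offset, max_align)
}

fn main() {
    // Hypothetical struct { tag: u8, value: u128 }
    let old = struct_size(&[(1, 1), (16, 8)]); // u128 aligned to 8 bytes
    let new = struct_size(&[(1, 1), (16, 16)]); // u128 aligned to 16 bytes
    println!("size with 8-byte alignment:  {old}");
    println!("size with 16-byte alignment: {new}");
}
```

For this example the model gives 24 bytes with the old alignment and 32 bytes with the new one, which is the kind of growth the paragraph above refers to.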
@@ -252,7 +269,7 @@ to avoid an increased memory footprint.
[D28990]: https://reviews.llvm.org/D28990
[D86310]: https://reviews.llvm.org/D86310

- # Compatibilty
+ # Compatibility

The most important question is how compatibility changed as a result of these fixes. In
short, `i128` and `u128` with Rust using LLVM 18 (the default version starting with
@@ -279,20 +296,21 @@ are summarized in the table below:

# Effects & Future Steps

- As mentioned in the introduction, most users will see no effects of this change
+ As mentioned in the introduction, most users will notice no effects of this change
unless you are already doing something questionable with these types.

Starting with Rust 1.77, it will be reasonably safe to start experimenting with
128-bit integers in FFI, with some more certainty coming with the LLVM update
in 1.78. There is [ongoing discussion] about lifting the lint in an upcoming
- version, but it remains to be seen when that will actually happen.
+ version, but we want to be cautious and avoid introducing silent breakage for users
+ whose Rust compiler may be built with an older LLVM.

[relevance to Rust was discovered]: https://github.com/rust-lang/rust/issues/54341#issuecomment-1064729606
[initial performance run]: https://github.com/rust-lang/rust/pull/116672/#issuecomment-1858600381
[known issue in llvm]: https://github.com/llvm/llvm-project/issues/41784
[psabi]: https://www.uclibc.org/docs/psABI-x86_64.pdf
[ongoing discussion]: https://github.com/rust-lang/lang-team/issues/255
- [^align-godbolt]: https://godbolt.org/z/h94Ge1vMW
- [^composite-playground]: https://play.rust-lang.org/?version=beta&mode=debug&edition=2021&gist=c263ae121912284d3ba553290caa6778
+ [align-godbolt]: https://godbolt.org/z/h94Ge1vMW
+ [composite-playground]: https://play.rust-lang.org/?version=beta&mode=debug&edition=2021&gist=52f349bdea92bf724bc453f37dbd32ea
[^va-segfault]: https://github.com/llvm/llvm-project/issues/20283
[^f128-segfault]: https://bugs.llvm.org/show_bug.cgi?id=50198