Commit c254258

Merge pull request #1281 from tgross35/i128-blog
Create a blog on changes to 128-bit integers
2 parents 47c4534 + 199456d commit c254258

1 file changed

+316
-0
lines changed

@@ -0,0 +1,316 @@
---
layout: post
title: "Changes to `u128`/`i128` layout in 1.77 and 1.78"
author: Trevor Gross
team: The Rust Lang Team <https://www.rust-lang.org/governance/teams/lang>
---

Rust has long had an inconsistency with C regarding the alignment of 128-bit integers
on the x86-32 and x86-64 architectures. This problem has recently been resolved, but
the fix comes with some effects that are worth being aware of.

As a user, you most likely do not need to worry about these changes unless you are:

1. Assuming the alignment of `i128`/`u128` rather than using `align_of`
1. Ignoring the `improper_ctypes*` lints and using these types in FFI

There are also no changes to architectures other than x86-32 and x86-64. If your
code makes heavy use of 128-bit integers, you may notice runtime performance increases
at a possible cost of additional memory use.

This post documents what the problem was, what changed to fix it, and what to expect
with the changes. If you are already familiar with the problem and only looking for a
compatibility matrix, jump to the [Compatibility](#compatibility) section.

# Background

Data types have two intrinsic values that relate to how they can be arranged in memory:
size and alignment. A type's size is the amount of space it takes up in memory, and its
alignment specifies which addresses it is allowed to be placed at.

The size of simple types like primitives is usually unambiguous, being the exact size of
the data they represent with no padding (unused space). For example, an `i64` always has
a size of 64 bits or 8 bytes.

Alignment, however, can vary. An 8-byte integer _could_ be stored at any memory address
(1-byte aligned), but most 64-bit computers will get the best performance if it is
instead stored at a multiple of 8 (8-byte aligned). So, like in other languages,
primitives in Rust have this most efficient alignment by default. The effects of this
can be seen when creating composite types ([playground link][composite-playground]):

```rust
use core::mem::{align_of, offset_of};

#[repr(C)]
struct Foo {
    a: u8,  // 1-byte aligned
    b: u16, // 2-byte aligned
}

#[repr(C)]
struct Bar {
    a: u8,  // 1-byte aligned
    b: u64, // 8-byte aligned
}

fn main() {
    println!("Offset of b (u16) in Foo: {}", offset_of!(Foo, b));
    println!("Alignment of Foo: {}", align_of::<Foo>());
    println!("Offset of b (u64) in Bar: {}", offset_of!(Bar, b));
    println!("Alignment of Bar: {}", align_of::<Bar>());
}
```

Output:

```text
Offset of b (u16) in Foo: 2
Alignment of Foo: 2
Offset of b (u64) in Bar: 8
Alignment of Bar: 8
```

We see that within a struct, a type will always be placed such that its offset is a
multiple of its alignment - even if this means unused space (Rust minimizes this by
default when `repr(C)` is not used).
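
To make the parenthetical about field reordering concrete, here is a minimal sketch comparing the same three fields with and without `repr(C)`. The struct names `InOrder` and `Reordered` are made up for this illustration. The default-representation layout is unspecified, so only the `repr(C)` size can be relied upon; the figures assume a target where `u64` is 8-byte aligned, such as x86-64.

```rust
use core::mem::size_of;

// Hypothetical types for illustration only.
#[repr(C)]
#[allow(dead_code)]
struct InOrder {
    a: u8,  // offset 0, followed by 7 bytes of padding
    b: u64, // offset 8
    c: u8,  // offset 16, followed by 7 bytes of trailing padding
}

// Same fields, default representation: the compiler is free to reorder them.
#[allow(dead_code)]
struct Reordered {
    a: u8,
    b: u64,
    c: u8,
}

fn main() {
    // With 8-byte aligned `u64`, the `repr(C)` struct takes 24 bytes.
    println!("repr(C) size:      {}", size_of::<InOrder>());
    // Usually smaller, because the two `u8` fields can share one padded slot.
    println!("default repr size: {}", size_of::<Reordered>());
}
```

The second number is deliberately not stated here since the default layout may change between compiler versions.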

These numbers are not arbitrary; the application binary interface (ABI) says what they
should be. In the x86-64 [psABI] (processor-specific ABI) for System V (Unix & Linux),
_Figure 3.1: Scalar Types_ tells us exactly how primitives should be represented:

| C type               | Rust equivalent | `sizeof` | Alignment (bytes) |
| -------------------- | --------------- | -------- | ----------------- |
| `char`               | `i8`            | 1        | 1                 |
| `unsigned char`      | `u8`            | 1        | 1                 |
| `short`              | `i16`           | 2        | 2                 |
| **`unsigned short`** | **`u16`**       | **2**    | **2**             |
| `long`               | `i64`           | 8        | 8                 |
| **`unsigned long`**  | **`u64`**       | **8**    | **8**             |

The ABI only specifies C types, but Rust follows the same definitions both for
compatibility and for the performance benefits.

# The Incorrect Alignment Problem

If two implementations disagree on the alignment of a data type, they cannot reliably
share data containing that type. Rust had inconsistent alignment for 128-bit types:

```rust
println!("alignment of i128: {}", align_of::<i128>());
```

```text
// rustc 1.76.0
alignment of i128: 8
```

```c
printf("alignment of __int128: %zu\n", _Alignof(__int128));
```

```text
// gcc 13.2
alignment of __int128: 16

// clang 17.0.1
alignment of __int128: 16
```

([Godbolt link][align-godbolt]) Looking back at the [psABI], we can see that Rust has
the wrong alignment here:

| C type              | Rust equivalent | `sizeof` | Alignment (bytes) |
| ------------------- | --------------- | -------- | ----------------- |
| `__int128`          | `i128`          | 16       | 16                |
| `unsigned __int128` | `u128`          | 16       | 16                |

It turns out this isn't because of something that Rust is actively doing incorrectly:
layout of primitives comes from the LLVM codegen backend used by both Rust and Clang,
among other languages, and it has the alignment for `i128` hardcoded to 8 bytes.

Clang uses the correct alignment only because of a workaround, where the alignment is
manually set to 16 bytes before handing the type to LLVM. This fixes the layout issue
but has been the source of some other minor problems.[^f128-segfault][^va-segfault]
Rust does no such manual adjustment, hence the issue reported at
<https://github.com/rust-lang/rust/issues/54341>.
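
The practical consequence for Rust code is that the offset of an `i128` field in a `repr(C)` struct depends on the compiler in use. The sketch below, with a made-up `Message` struct, queries the alignment with `align_of` instead of hard-coding 8 or 16, so it holds on either side of the change.

```rust
use core::mem::{align_of, offset_of, size_of};

// Hypothetical FFI-style struct for illustration only.
#[repr(C)]
#[allow(dead_code)]
struct Message {
    tag: u8,
    value: i128,
}

fn main() {
    let align = align_of::<i128>();
    println!("alignment of i128:  {align}");
    println!("offset of `value`:  {}", offset_of!(Message, value));
    println!("size of Message:    {}", size_of::<Message>());

    // `value` is the first field after a single byte, so it lands at the
    // first multiple of its alignment, whatever that alignment is.
    assert_eq!(offset_of!(Message, value), align);
}
```

Code that instead assumed an offset of 8 for `value` would silently read the wrong bytes once the alignment became 16.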

# The Calling Convention Problem

There is an additional problem: LLVM does not always do the correct thing when passing
128-bit integers as function arguments. This was a [known issue in LLVM] before its
[relevance to Rust was discovered].

When calling a function, the arguments get passed in registers (special storage
locations within the CPU) until there are no more slots, then they get "spilled" to
the stack (the program's memory). The ABI tells us what to do here as well, in the
section _3.2.3 Parameter Passing_:

> Arguments of type `__int128` offer the same operations as INTEGERs, yet they do not
> fit into one general purpose register but require two registers. For classification
> purposes `__int128` is treated as if it were implemented as:
>
> ```c
> typedef struct {
>     long low, high;
> } __int128;
> ```
>
> with the exception that arguments of type `__int128` that are stored in memory must be
> aligned on a 16-byte boundary.

We can try this out by implementing the calling convention manually. In the below C
example, inline assembly is used to call `foo(0xaf, val, val, val)` with `val` as
`0x11223344556677889900aabbccddeeff`.

x86-64 uses the registers `rdi`, `rsi`, `rdx`, `rcx`, `r8`, and `r9` to pass function
arguments, in that order (you guessed it, this is also in the ABI). Each register
fits a word (64 bits), and anything that doesn't fit gets `push`ed to the stack.

```c
/* full example at <https://godbolt.org/z/5c8cb5cxs> */

/* to see the issue, we need a padding value to "mess up" argument alignment */
void foo(char pad, __int128 a, __int128 b, __int128 c) {
    printf("%#x\n", pad & 0xff);
    print_i128(a);
    print_i128(b);
    print_i128(c);
}

int main() {
    asm(
        /* load arguments that fit in registers */
        "movl    $0xaf, %edi \n\t"                /* 1st slot (edi): padding char (`edi` is the
                                                   * same as `rdi`, just a smaller access size) */
        "movq    $0x9900aabbccddeeff, %rsi \n\t"  /* 2nd slot (rsi): lower half of `a` */
        "movq    $0x1122334455667788, %rdx \n\t"  /* 3rd slot (rdx): upper half of `a` */
        "movq    $0x9900aabbccddeeff, %rcx \n\t"  /* 4th slot (rcx): lower half of `b` */
        "movq    $0x1122334455667788, %r8  \n\t"  /* 5th slot (r8):  upper half of `b` */
        "movq    $0xdeadbeef4c0ffee0, %r9  \n\t"  /* 6th slot (r9):  should be unused, but
                                                   * let's trick clang! */

        /* reuse our stored registers to load the stack */
        "pushq   %rdx \n\t"                       /* upper half of `c` gets passed on the stack */
        "pushq   %rsi \n\t"                       /* lower half of `c` gets passed on the stack */

        "call    foo \n\t"                        /* call the function */
        "addq    $16, %rsp \n\t"                  /* reset the stack */
    );
}
```

Running the above with GCC prints the following expected output:

```
0xaf
0x11223344556677889900aabbccddeeff
0x11223344556677889900aabbccddeeff
0x11223344556677889900aabbccddeeff
```

But running with Clang 17 prints:

```
0xaf
0x11223344556677889900aabbccddeeff
0x11223344556677889900aabbccddeeff
0x9900aabbccddeeffdeadbeef4c0ffee0
//^^^^^^^^^^^^^^^^ this should be the lower half
//                ^^^^^^^^^^^^^^^^ look familiar?
```

Surprise!

This illustrates the second problem: LLVM expects an `i128` to be passed half in a
register and half on the stack when possible, but this is not allowed by the ABI.
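
The difference between the two behaviors can be modeled in a few lines. The following is a simplified sketch, not how any compiler implements it: the `assign` function and its `allow_split` flag are invented for this illustration, only integer-class arguments are considered, and the 16-byte stack alignment of `__int128` is ignored. Each argument is described by the number of 64-bit words it occupies.

```rust
/// General-purpose registers used for integer arguments, in order.
const REGS: [&str; 6] = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"];

/// Returns, for each argument, where each of its 64-bit words is placed.
/// `allow_split` models the behavior described above, where an argument
/// may end up half in a register and half on the stack.
fn assign(arg_words: &[usize], allow_split: bool) -> Vec<Vec<String>> {
    let mut next_reg = 0; // index of the next free register
    let mut stack_slot = 0; // next free 64-bit slot on the stack
    let mut placements = Vec::new();

    for &words in arg_words {
        let free = REGS.len() - next_reg;
        // ABI rule: an argument uses registers only if all of it fits.
        let in_regs = if words <= free {
            words
        } else if allow_split {
            free
        } else {
            0
        };

        let mut places = Vec::new();
        for i in 0..words {
            if i < in_regs {
                places.push(REGS[next_reg].to_string());
                next_reg += 1;
            } else {
                places.push(format!("stack[{stack_slot}]"));
                stack_slot += 1;
            }
        }
        placements.push(places);
    }
    placements
}

fn main() {
    // foo(char pad, __int128 a, __int128 b, __int128 c): 1 + 2 + 2 + 2 words
    let args = [1, 2, 2, 2];
    println!("per the ABI:   {:?}", assign(&args, false));
    println!("with splitting: {:?}", assign(&args, true));
}
```

Under the ABI rule, `c` goes entirely to the stack and `r9` stays unused. With splitting allowed, the low half of `c` is read from `r9`, which is where the `0xdeadbeef4c0ffee0` in the Clang output comes from.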

Since the behavior comes from LLVM and has no reasonable workaround, this is a
problem in both Clang and Rust.

# Solutions

Getting these problems resolved was a lengthy effort by many people, starting with a
patch by compiler team member Simonas Kazlauskas in 2017: [D28990]. Unfortunately,
this wound up reverted. It was later attempted again in [D86310] by LLVM contributor
Harald van Dijk, which is the version that finally landed in October 2023.

Around the same time, Nikita Popov fixed the calling convention issue with [D158169].
Both of these changes made it into LLVM 18, meaning all relevant ABI issues will be
resolved in the versions of Clang and Rust that use it (Clang 18, and Rust 1.78 when
using the bundled LLVM).

However, `rustc` can also use the version of LLVM installed on the system rather than a
bundled version, which may be older. To mitigate the chance of problems from differing
alignment with the same `rustc` version, [a proposal] was introduced to manually
correct the alignment like Clang has been doing. This was implemented by Matthew Maurer
in [#116672](https://github.com/rust-lang/rust/pull/116672/).

Since these changes, Rust now produces the correct alignment:

```rust
println!("alignment of i128: {}", align_of::<i128>());
```

```text
// rustc 1.77.0
alignment of i128: 16
```

As mentioned above, part of the reason an ABI specifies the alignment of a data type
is that it is more efficient on that architecture. We actually got to see that
firsthand: the [initial performance run] with the manual alignment change showed
nontrivial improvements to compiler performance (which relies heavily on 128-bit
integers to work with integer literals). The downside of increasing alignment is that
composite types do not always fit together as nicely in memory, leading to an increase
in usage. Unfortunately this meant some of the performance wins needed to be sacrificed
to avoid an increased memory footprint.

[a proposal]: https://github.com/rust-lang/compiler-team/issues/683
[#11672]: https://github.com/rust-lang/rust/pull/116672/
[D158169]: https://reviews.llvm.org/D158169
[D28990]: https://reviews.llvm.org/D28990
[D86310]: https://reviews.llvm.org/D86310

# Compatibility

The most important question is how compatibility changed as a result of these fixes. In
short, `i128` and `u128` with Rust using LLVM 18 (the default version starting with
1.78) will be completely compatible with any version of GCC, as well as Clang 18 and
above (released March 2024). All other combinations have some incompatible cases, which
are summarized in the table below:

| Compiler 1                         | Compiler 2          | Status                              |
| ---------------------------------- | ------------------- | ----------------------------------- |
| Rust ≥ 1.78 with bundled LLVM (18) | GCC (any version)   | Fully compatible                    |
| Rust ≥ 1.78 with bundled LLVM (18) | Clang ≥ 18          | Fully compatible                    |
| Rust ≥ 1.77 with LLVM ≥ 18         | GCC (any version)   | Fully compatible                    |
| Rust ≥ 1.77 with LLVM ≥ 18         | Clang ≥ 18          | Fully compatible                    |
| Rust ≥ 1.77 with LLVM ≥ 18         | Clang \< 18         | Storage compatible, has calling bug |
| Rust ≥ 1.77 with LLVM \< 18        | GCC (any version)   | Storage compatible, has calling bug |
| Rust ≥ 1.77 with LLVM \< 18        | Clang (any version) | Storage compatible, has calling bug |
| Rust \< 1.77[^l]                   | GCC (any version)   | Incompatible                        |
| Rust \< 1.77[^l]                   | Clang (any version) | Incompatible                        |
| GCC (any version)                  | Clang ≥ 18          | Fully compatible                    |
| GCC (any version)                  | Clang \< 18         | Storage compatible, has calling bug |

[^l]: Rust < 1.77 with LLVM 18 will have some degree of compatibility; this is just
    an uncommon combination.
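
A crate that depends on the storage-compatible rows of the table can guard against being built with an older compiler. The following is one possible sketch using a compile-time assertion; the error message text is made up for this example.

```rust
use core::mem::align_of;

// Refuse to build on x86 targets where `i128` still has the old 8-byte
// alignment (Rust < 1.77). Other architectures were not affected.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
const _: () = assert!(
    align_of::<i128>() == 16,
    "i128 is not 16-byte aligned; a newer rustc is required"
);

fn main() {
    println!("alignment of i128: {}", align_of::<i128>());
    println!("alignment of u128: {}", align_of::<u128>());
}
```

This only checks alignment. It cannot detect the calling convention bug, which depends on the LLVM version the compiler was built with.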

# Effects & Future Steps

As mentioned in the introduction, most users will notice no effects of this change
unless they are already doing something questionable with these types.

Starting with Rust 1.77, it will be reasonably safe to start experimenting with
128-bit integers in FFI, with some more certainty coming with the LLVM update
in 1.78. There is [ongoing discussion] about lifting the lint in an upcoming
version, but we want to be cautious and avoid introducing silent breakage for users
whose Rust compiler may be built with an older LLVM.
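
For those who do want to experiment, a function using these types across an `extern "C"` boundary might look like the sketch below. The function `wide_mul` is a made-up example, and the `allow` attribute is there because the `improper_ctypes_definitions` lint may flag 128-bit integers as not FFI-safe. An exported symbol would additionally need `no_mangle`, which is omitted here.

```rust
/// Hypothetical example: full 64x64 -> 128-bit multiplication exposed with
/// the C calling convention. The return type corresponds to
/// `unsigned __int128` on the C side.
#[allow(improper_ctypes_definitions)]
pub extern "C" fn wide_mul(a: u64, b: u64) -> u128 {
    (a as u128) * (b as u128)
}

fn main() {
    let product = wide_mul(u64::MAX, 2);
    println!("{product:#x}");
}
```

Whether the C side agrees on how the value is passed depends on the compiler combination listed in the Compatibility section.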

[relevance to Rust was discovered]: https://github.com/rust-lang/rust/issues/54341#issuecomment-1064729606
[initial performance run]: https://github.com/rust-lang/rust/pull/116672/#issuecomment-1858600381
[known issue in llvm]: https://github.com/llvm/llvm-project/issues/41784
[psabi]: https://www.uclibc.org/docs/psABI-x86_64.pdf
[ongoing discussion]: https://github.com/rust-lang/lang-team/issues/255
[align-godbolt]: https://godbolt.org/z/h94Ge1vMW
[composite-playground]: https://play.rust-lang.org/?version=beta&mode=debug&edition=2021&gist=52f349bdea92bf724bc453f37dbd32ea
[^va-segfault]: https://github.com/llvm/llvm-project/issues/20283
[^f128-segfault]: https://bugs.llvm.org/show_bug.cgi?id=50198
