Consider utilizing GC.AllocateUninitializedArray in the various collection types #47198

tannergooding (Collaborator) · 2021-01-19T22:33:49Z

GC.AllocateUninitializedArray was exposed in .NET 5 and provides a way to avoid zeroing the backing memory for an array. For large collections, using it could provide a small win when resizing or copying the items.

#47186 is a prototype of this. It shows that, at least when creating the list, allocation is roughly half a nanosecond slower for arrays of less than 2048 total bytes, with a minor gain at 2048 bytes. Once an array reaches 4096 bytes or more, allocation is roughly 50% faster.

The major downside is that if a user improperly accesses a collection from multiple threads, they could read the collection's backing memory and observe "uninitialized" data. This would not occur during normal, correct usage of the APIs.

colgreen · 2022-04-23T20:52:45Z

Note that the x64 JIT output for invoking GC.AllocateUninitializedArray() is substantially longer than the x64 output for invoking new[]:

https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgEYBYAKGIAYACY8gOgDkBXfGKASzFwBuGjWIBmJqQYBhBgG8aDJUwnMkDXgDsMAbQC6DAGIQIACi0YGAN2wAbDjACUi5QurKPTAOwMA4tJYAQVtbCDBsDBgAVU0tXgxeO14ALxgAE0CoKGwATwAeCwA+Uxt7J2F3ZQBfEUqlcSZydQt9IxNSc21rOwdnOvkXT28GTRgAdw1tHVKHPQqPKo9BhhoqoA

C# test source:

public class C {
    public static int[] AllocateUninitializedArray(int value)
    {
        return GC.AllocateUninitializedArray<int>(value);
    }

    public static int[] NewArray(int value)
    {
        return new int[value];
    }    
}

Resulting x64 (Core CLR 6.0.222.6406 on amd64)

C.AllocateUninitializedArray(Int32)
    L0000: push rsi
    L0001: sub rsp, 0x20
    L0005: mov esi, ecx
    L0007: cmp esi, 0x200
    L000d: jge short L0028
    L000f: movsxd rdx, esi
    L0012: mov rcx, 0x7ffa6ca18410
    L001c: call 0x00007ffacc4fb1b0
    L0021: nop
    L0022: add rsp, 0x20
    L0026: pop rsi
    L0027: ret
    L0028: mov rcx, 0x7ffa6ca18410
    L0032: call 0x00007ffacc3fd020
    L0037: test rax, rax
    L003a: je short L004f
    L003c: mov rcx, [rax+0x18]
    L0040: mov edx, esi
    L0042: mov r8d, 0x10
    L0048: call 0x00007ffacc4c44e0
    L004d: jmp short L0021
    L004f: xor ecx, ecx
    L0051: jmp short L0040

C.NewArray(Int32)
    L0000: sub rsp, 0x28
    L0004: movsxd rdx, ecx
    L0007: mov rcx, 0x7ffa6ca18410
    L0011: call 0x00007ffacc4fb1b0
    L0016: nop
    L0017: add rsp, 0x28
    L001b: ret

From:
https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgEYBYAKBuIGYACMxgYUYG8bGfmnjySRgEsAdhgDaAXUYBBADbyIYbBhgBVUWOEZh2ecIBeMACayoUbAE8AFGIyMAbvoCuMAJTdeXarz/MAdkYAcVYAOgUlFTVNbV19I1NzSysAHnsAPhtneTd3AG4vHgBfGiK+ZkERcWlGADkYAHdk6ztxJ1cPcp9/XmIg0SbqyRy3KULfXmK/GmKgA===

colgreen · 2022-04-23T21:10:28Z

More info: for non-pinned allocations, GC.AllocateUninitializedArray() currently falls back to new[] for reference types (and types that contain references), and for arrays smaller than 2048 bytes. Relevant source code:

        public static T[] AllocateUninitializedArray<T>(int length, bool pinned = false) // T[] rather than T?[] to match `new T[length]` behavior
        {
            if (!pinned)
            {
                if (RuntimeHelpers.IsReferenceOrContainsReferences<T>())
                {
                    return new T[length];
                }

                // for debug builds we always want to call AllocateNewArray to detect AllocateNewArray bugs
#if !DEBUG
                // small arrays are allocated using `new[]` as that is generally faster.
                if (length < 2048 / Unsafe.SizeOf<T>())
                {
                    return new T[length];
                }
#endif
            }

Therefore, for small arrays (which are the most common scenario) GC.AllocateUninitializedArray() simply adds some overhead around a normal new[] invocation.
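
To make the threshold concrete, here is a minimal sketch that mirrors the two checks in the excerpt above. The class and method names are hypothetical, and the sketch only reproduces the quoted non-pinned, release-build logic; it does not say what the GC does once the fallback is skipped.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class UninitializedThreshold
{
    // Hypothetical helper: the element count below which the quoted code
    // returns `new T[length]` (2048 bytes divided by the element size).
    public static int MinLengthForUninitialized<T>() => 2048 / Unsafe.SizeOf<T>();

    // Hypothetical helper: true when neither fallback in the quoted excerpt
    // applies, i.e. T contains no references and the array is large enough.
    public static bool WouldSkipFallback<T>(int length) =>
        !RuntimeHelpers.IsReferenceOrContainsReferences<T>()
        && length >= MinLengthForUninitialized<T>();
}
```

For int the threshold works out to 2048 / 4 = 512 elements, which matches the `cmp esi, 0x200` at the top of the disassembly posted earlier in this thread.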

I think the obvious scenario where GC.AllocateUninitializedArray() makes sense is when the array contents are about to be set anyway (i.e., within the method that performed the allocation), so that the lifetime of the uninitialized data is very short. Scenarios where uninitialized data has a potentially long lifetime open up new risks and therefore, on balance, might not be worth it, especially where the performance gains are minimal.
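
A minimal sketch of that short-lifetime pattern, assuming a hypothetical helper named Range: the array is allocated and every element is written before the method returns, so no caller can observe uninitialized contents.

```csharp
using System;

public static class ArrayFill
{
    // Hypothetical helper: returns { start, start + 1, ..., start + count - 1 }.
    public static int[] Range(int start, int count)
    {
        // Contents are unspecified at this point and must not be read.
        int[] result = GC.AllocateUninitializedArray<int>(count);

        // Every element is overwritten before the array escapes this method.
        for (int i = 0; i < result.Length; i++)
        {
            result[i] = start + i;
        }

        return result;
    }
}
```

Per the source quoted above, calls with fewer than 512 ints would take the ordinary new[] path anyway, so any benefit applies only to larger counts.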

brunom · 2024-11-13T03:18:35Z

#47186 feared that 'This may potentially turn race conditions into memory safety violations'. But note that clearing the unused part of the array inside the collection preserves half the speedup of GC.AllocateUninitializedArray. For example, when List<T> gets a new 8192-element array, it can copy the old array into the first 4096 elements and clear the last 4096 elements.
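
A rough sketch of that copy-then-clear growth step, assuming a hypothetical Grow helper. This is not List<T>'s actual implementation, and the "half the speedup" figure is the estimate given above, not something this code measures.

```csharp
using System;

public static class BufferGrowth
{
    // Hypothetical helper: grows a backing array without exposing
    // uninitialized memory to later readers.
    public static T[] Grow<T>(T[] oldItems, int count, int newCapacity)
    {
        if ((uint)count > (uint)oldItems.Length)
            throw new ArgumentOutOfRangeException(nameof(count));
        if (newCapacity < count)
            throw new ArgumentOutOfRangeException(nameof(newCapacity));

        T[] newItems = GC.AllocateUninitializedArray<T>(newCapacity);

        // The first `count` slots are overwritten by the copy, so they
        // never needed zeroing.
        Array.Copy(oldItems, newItems, count);

        // Only the unused tail is zeroed explicitly.
        Array.Clear(newItems, count, newCapacity - count);

        return newItems;
    }
}
```

After Grow returns, every slot holds either a copied item or the default value, so a racing reader sees the same contents it would with new T[newCapacity] followed by a copy.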
