-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathNOTES
142 lines (106 loc) · 5.67 KB
/
NOTES
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
Memory management for ICE objects
=================================
ICE instructions, operands, etc. are small objects, and hordes of them are
created during compilation. It makes sense to provide efficient memory
allocation for these objects, because we only deallocate them when a function is
done translating. So a function (Cfg in Subzero parlance) should maintain an
arena allocator which provides memory for the ICE objects contained within it.
The simplest and fastest allocation strategy can be used for this arena - a
"bump pointer" allocator that doesn't worry about deallocating and recycling,
until it's completely killed (when the function is done translating and we
have a buffer of object code instead).
From a performance point of view, it is very desirable for the objects most
often created by Subzero to be arena-allocated.
Challenges
----------
With an allocator per function, we no longer can simply allocate ICE objects
with the new operator:
IceInstPhi *IcePhi = new IceInstPhi(...)
Rather, we'll have to use placement new. Something like:
// Allocate memory
IceInstPhi *IcePhi = CfgAllocator.Allocate<IceInstPhi>();
// Use placement new to construct the object
new (IcePhi) IceInstPhi(...)
With this scheme, destructors have to be invoked manually before deallocation.
This adds bookkeeping costs. On the other hand, if we can ensure that the types
allocated in the arena are PODs or only hold other types that are allocated in
the same arena, we may forego destructors altogether and make deallocation very
fast.
This means there's a significant restriction on what arena-allocated objects
may contain. They must not hold resources other than memory (i.e. no sockets,
mutexes or other handlers requiring orderly release). They must not hold
heap-allocated objects; this includes STL containers (and most LLVM data
structures), which all use the heap by default. While STL containers support
custom allocators, this is commonly considered broken (though perhaps better to
some degree in C++11). So whatever containers are held by arena-allocated
objects must be arena-allocated themselves.
Existing LLVM tools for arena allocation
----------------------------------------
Basic arena allocation tools
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The LLVM Support/Allocator.h header provides a number of tools for arena
allocation. BumpPtrAllocator is a generic allocator that uses the simple
bump-pointer technique to provide fast allocation. BumpPtrAllocator does not
handle object destruction - it just releases the whole arena when cleared or
destroyed.
SpecificBumpPtrAllocator is wrapper around BumpPtrAllocator. It's a class
template instantiated by the handled object type. This makes allocation calls
somewhat cleaner. More importantly, since SpecificBumpPtrAllocator knows the
type of the objects it allocates, it can call destructors when releasing the
arena.
Recycling allocators
~~~~~~~~~~~~~~~~~~~~
The Recycler class template (in Support/Recycler.h) is another wrapper around
allocators. It keeps a free list of memory chunks populated upon deallocation.
Then new allocations are requested, they can be satisfied from this free list
before the memory request gets passed to the allocator. With certain allocation
and deallocation patterns, Recycler can save memory.
ArrayRecycler does the same for arrays allocated with a BumpPtrAllocator.
Allocator-supporting containers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Some of the containers provided by LLVM support can be specialized with an
allocator type (at compile time, as a template parameter). These containers then
perform all their memory management through an allocator.
At this time, the containers that support allocators are: StringMap, StringSet,
ScopedHashTable, ImmutableSet, ImmutableList.
Lessons from LLVM's MI layer
----------------------------
The LLVM MI layer uses arena allocation for all of its artifacts. The main
gatekeeper of this layer is MachineFunction, which keeps its own
BumpPtrAllocator and has special allocators for instructions, operands, basic
blocks and so on.
Example: new MIs (MachineInstr objects) are created with:
MachineInstr *
MachineFunction::CreateMachineInstr(const MCInstrDesc &MCID,
DebugLoc DL, bool NoImp) {
return new (InstructionRecycler.Allocate<MachineInstr>(Allocator))
MachineInstr(*this, MCID, DL, NoImp);
}
Arrays of MachineOperand objects are created with:
MachineOperand *allocateOperandArray(OperandCapacity Cap) {
return OperandRecycler.allocate(Cap, Allocator);
}
The allocator is used for all additional heap allocations, for example to
create a register mask - Allocator.Allocate<uint32_t>(Size).
Naked "new" is not used anywhere within MachineFunction.
Containers in the MI layer
~~~~~~~~~~~~~~~~~~~~~~~~~~
MachineInstr
************
MachineInstr doesn't use any containers or heap-allocating members. The only
container-y code related to it is being part of an ilist of MachineInstrs held
by each MachineBasicBlock (MBB). However, the ilist_traits for MachineInstr
are defined to use a sentinel member that's not heap-allocated.
MachineOperands are very POD-like and have no containers.
MachineBasicBlock
*****************
MBBs are ilist_nodes held in a MachineFunction, using the same trick as MIs to
avoid heap allocation for the sentinel.
However, MBBs also hold on to a number of std::vectors; according to the LLVM
developers, this was left behind as the transition to allocators was gradual.
Since the amount of MBBs created is not very large, this is not expected to
have a big impact.
MachineFunction
***************
MachineFunctions (MFs) also hold std::vectors for block numbering. See MBBs
for reasons and expected impact on actual performance.