
Performance compared to non-GC structs #51

Open · Jamesernator opened this issue Oct 12, 2024 · 2 comments
@Jamesernator

So over the last couple of days I wrote some JS to decode a binary format. To break things down, I decided to use a pattern of having individual functions return a `{ bytesRead: number, value: T }` object. When running this, I noticed that the majority (80%) of the runtime was spent in the garbage collector for these objects.
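Roughly, the pattern looked something like this (a simplified, illustrative sketch; the names and the specific format are not from the actual code):

```ts
// Illustrative sketch: each decode step returns a fresh
// { bytesRead, value } result object, which becomes garbage as soon
// as the caller destructures it.
interface DecodeResult<T> {
    bytesRead: number;
    value: T;
}

function decodeUint32(buffer: DataView, offset: number): DecodeResult<number> {
    return { bytesRead: 4, value: buffer.getUint32(offset, true) };
}

function decodeAll(buffer: DataView): number[] {
    const values: number[] = [];
    let offset = 0;
    while (offset + 4 <= buffer.byteLength) {
        const { bytesRead, value } = decodeUint32(buffer, offset);
        offset += bytesRead;
        values.push(value); // one short-lived result object per iteration
    }
    return values;
}
```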

I confirmed that these specific allocations were the cause by eliminating some of the loops that used these objects (GC time dropped to 40% of the runtime); however, I'm less happy with the resulting pattern the code uses to avoid allocations.
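One illustrative version of that kind of allocation-avoiding rewrite (not the exact code) reuses a single mutable scratch object instead of returning a fresh result per call:

```ts
// Illustrative allocation-avoiding rewrite: decode into a reused
// scratch object rather than returning a new one per call. Fewer
// allocations, but the data flow is harder to follow.
const scratch = { bytesRead: 0, value: 0 };

function decodeUint32Into(
    buffer: DataView,
    offset: number,
    out: { bytesRead: number; value: number },
): void {
    out.bytesRead = 4;
    out.value = buffer.getUint32(offset, true);
}

function decodeAllNoAlloc(buffer: DataView): number[] {
    const values: number[] = [];
    let offset = 0;
    while (offset + 4 <= buffer.byteLength) {
        decodeUint32Into(buffer, offset, scratch);
        offset += scratch.bytesRead;
        values.push(scratch.value);
    }
    return values;
}
```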

A question I have is: will engines be able to optimize structs to have performance at least comparable to non-GC'd structs (e.g. multi-value) in at least some cases (for example, in cases where such structs can be detected to be used linearly)?

@acutmore

I suspect if engines could optimise out the allocation for a struct being used to return multiple values that are immediately consumed by the caller, then they would also be able to do that for a plain object. Perhaps the fixed-layout aspects of structs make this optimisation slightly easier, but I would imagine the escape analysis would be similar.

@syg (Collaborator) commented Oct 18, 2024

I'm skeptical it's feasible to do the escape analysis needed for scalar replacement and stack allocation in anything lower than the optimizing JIT tier. At non-optimizing tiers, without type feedback, you'd have to do it optimistically and also compile in a slow path that can take over at runtime, which means the optimization doesn't always pay for itself, and that raises the question of how you tune it.

Concretely: your example involves a utility function returning object literals, which means that to stack-allocate the result, the utility function itself needs to be compiled differently (either to return the scalar components via registers, or to write them directly into stack slots in the caller). But to do that, we'd need to know that all possible callers of that utility function use the return object in a non-escaping way (which is hard/intractable for a VM to know, given JS's dynamism). If we don't know that, we'd need to build in a dynamic deopt path. On the caller side, every call of this function now needs to know that it can either do the stack-allocation thing or deopt and receive a heap object, so there's complexity on the caller side too.
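In source terms, the differently-compiled callee would look roughly like this (a hand-written stand-in for what the compiler would emit; in real codegen the components would travel in machine registers, not a tuple):

```ts
// Hand-written analogue of the "return scalar components" compilation
// described above. No result object is allocated; the tuple is only a
// notational stand-in for passing two values back in registers. A real
// engine would also need a deopt path that falls back to allocating
// the heap object whenever a caller lets the result escape.
function decodeUint32Scalars(
    buffer: DataView,
    offset: number,
): [bytesRead: number, value: number] {
    return [4, buffer.getUint32(offset, true)];
}
```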

At the optimizing tier, as @acutmore says, the fixed-layout aspect at least tells you up front how much stack space you need. But my hunch is that at that tier, if we've done enough analysis to be able to perform scalar replacement, the compiler can also treat non-fixed-layout objects as fixed-layout if they happen to be used locally as fixed-layout objects anyway.

Concretely: in the optimizing tier, we'd do escape analysis after inlining. So for your example with the utility function, the problem becomes more tractable: once that utility function is inlined, it's an intra-procedural optimization instead of an inter-procedural one.
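A source-level sketch of that, reusing the earlier illustrative example: once the callee is inlined, the object literal's allocation site and every use of it sit in one compilation unit, and scalar replacement can dissolve it into plain locals:

```ts
// Illustrative: what decodeAll looks like after inlining plus scalar
// replacement. The { bytesRead, value } literal never escapes the loop
// body, so it is rewritten into two locals and never allocated.
function decodeAllOptimized(buffer: DataView): number[] {
    const values: number[] = [];
    let offset = 0;
    while (offset + 4 <= buffer.byteLength) {
        // Inlined body of decodeUint32, object literal scalar-replaced:
        const bytesRead = 4;
        const value = buffer.getUint32(offset, true);
        offset += bytesRead;
        values.push(value);
    }
    return values;
}
```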

IOW, I don't think structs really help much with the escape analysis problem.
