Commit
Add benchmarks results for 9aa82c6fb90e1dcd6e7f60626255d597ef0fdea1
github-actions committed Mar 13, 2024
1 parent 6468bd6, commit 07a1e92
Showing 5 changed files with 895 additions and 5 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
{ | ||
"lastUpdate": 1710160575440, | ||
"lastUpdate": 1710360181784, | ||
"repoUrl": "https://github.com/luau-lang/luau", | ||
"entries": { | ||
"luau-analyze": [ | ||
|
@@ -11416,6 +11416,72 @@ | |
"extra": "luau-analyze" | ||
} | ||
] | ||
}, | ||
{ | ||
"commit": { | ||
"author": { | ||
"email": "[email protected]", | ||
"name": "Arseny Kapoulkine", | ||
"username": "zeux" | ||
}, | ||
"committer": { | ||
"email": "[email protected]", | ||
"name": "GitHub", | ||
"username": "web-flow" | ||
}, | ||
"distinct": true, | ||
"id": "9aa82c6fb90e1dcd6e7f60626255d597ef0fdea1", | ||
"message": "CodeGen: Improve lowering of NUM_TO_VEC on A64 for constants (#1194)\n\nWhen the input is a constant, we use a fairly inefficient sequence of\r\nfmov+fcvt+dup or, when the double isn't encodable in fmov,\r\nadr+ldr+fcvt+dup.\r\n\r\nInstead, we can use the same lowering as X64 when the input is a\r\nconstant, and load the vector from memory. However, if the constant is\r\nencodable via fmov, we can use a vector fmov instead (which is just one\r\ninstruction and doesn't need constant space).\r\n\r\nFortunately the bit encoding of fmov for 32-bit floating point numbers\r\nmatches that of 64-bit: the decoding algorithm is a little different\r\nbecause it expands into a larger exponent, but the values are\r\ncompatible, so if a double can be encoded into a scalar fmov with a\r\ngiven abcdefgh pattern, the same pattern should encode the same float;\r\ndue to the very limited number of mantissa and exponent bits, all values\r\nthat are encodable are also exact in both 32-bit and 64-bit floats.\r\n\r\nThis strategy is ~same as what gcc uses. For complex vectors, we\r\npreviously used 4 instructions and 8 bytes of constant storage, and now\r\nwe use 2 instructions and 16 bytes of constant storage, so the memory\r\nfootprint is the same; for simple vectors we just need 1 instruction (4\r\nbytes).\r\n\r\nclang lowers vector constants a little differently, opting to synthesize\r\na 64-bit integer using 4 instructions (mov/movk) and then move it to the\r\nvector register - this requires 5 instructions and 20 bytes, vs ours/gcc\r\n2 instructions and 8+16=24 bytes. 
I tried a simpler version of this that\r\nwould be more compact - synthesize a 32-bit integer constant with\r\nmov+movk, and move it to vector register via dup.4s - but this was a\r\nlittle slower on M2, so for now we prefer the slightly larger version as\r\nit's not a regression vs current implementation.\r\n\r\nOn the vector approximation benchmark we get:\r\n\r\n- Before this PR (flag=false): ~7.85 ns/op\r\n- After this PR (flag=true): ~7.74 ns/op\r\n- After this PR, with 0.125 instead of 0.123 in the benchmark code (to\r\nuse fmov): ~7.52 ns/op\r\n- Not part of this PR, but the mov/dup strategy described above: ~8.00\r\nns/op", | ||
"timestamp": "2024-03-13T12:56:11-07:00", | ||
"tree_id": "b46afdd603a2f3bd60b9cac918c2ddc0faf0d668", | ||
"url": "https://github.com/luau-lang/luau/commit/9aa82c6fb90e1dcd6e7f60626255d597ef0fdea1" | ||
}, | ||
"date": 1710360181780, | ||
"tool": "benchmarkluau", | ||
"benches": [ | ||
{ | ||
"name": "map-nonstrict", | ||
"value": 4.78128, | ||
"unit": "4ms", | ||
"range": "±0%", | ||
"extra": "luau-analyze" | ||
}, | ||
{ | ||
"name": "map-strict", | ||
"value": 5.84051, | ||
"unit": "5ms", | ||
"range": "±0%", | ||
"extra": "luau-analyze" | ||
}, | ||
{ | ||
"name": "map-dcr", | ||
"value": 51.0637, | ||
"unit": "ms", | ||
"range": "±0%", | ||
"extra": "luau-analyze" | ||
}, | ||
{ | ||
"name": "regex-nonstrict", | ||
"value": 7.7506, | ||
"unit": "7ms", | ||
"range": "±0%", | ||
"extra": "luau-analyze" | ||
}, | ||
{ | ||
"name": "regex-strict", | ||
"value": 9.96327, | ||
"unit": "9ms", | ||
"range": "±0%", | ||
"extra": "luau-analyze" | ||
}, | ||
{ | ||
"name": "regex-dcr", | ||
"value": 115.89, | ||
"unit": "ms", | ||
"range": "±0%", | ||
"extra": "luau-analyze" | ||
} | ||
] | ||
} | ||
] | ||
} | ||
|
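The commit message recorded in this entry says that a double which fits a scalar fmov immediate reuses the same abcdefgh pattern for the 32-bit float. Below is a minimal Python sketch of that encodability test. It assumes the standard A64 8-bit floating-point immediate format, where the encodable values are ±n/16 × 2^r with 16 ≤ n ≤ 31 and -3 ≤ r ≤ 4. The names `fmov_imm8` and `exact_in_float32` are hypothetical, and this is an illustration rather than Luau's actual implementation.

```python
import math
import struct


def fmov_imm8(value: float):
    """Return the 8-bit abcdefgh pattern for value, or None if not encodable.

    Assumes the A64 FP immediate format: +/- (n/16) * 2**r,
    with 16 <= n <= 31 and -3 <= r <= 4.
    """
    if value == 0.0 or math.isnan(value) or math.isinf(value):
        return None

    # abs(value) == mantissa * 2**exponent with 0.5 <= mantissa < 1
    mantissa, exponent = math.frexp(abs(value))

    # Rewrite as (n/16) * 2**r so that 16 <= n < 32
    n = mantissa * 32
    r = exponent - 1
    if not n.is_integer() or not (-3 <= r <= 4):
        return None

    sign = 1 if value < 0 else 0
    biased = r + 3                  # 3-bit exponent field, 0..7
    b = ((biased >> 2) & 1) ^ 1     # bit b stores the inverted top exponent bit
    cd = biased & 0b11              # bits c and d store the low exponent bits
    efgh = int(n) - 16              # 4 fraction bits

    return (sign << 7) | (b << 6) | (cd << 4) | efgh


def exact_in_float32(value: float) -> bool:
    """True if value survives a round trip through a 32-bit float unchanged."""
    return struct.unpack("<f", struct.pack("<f", value))[0] == value


if __name__ == "__main__":
    for v in (0.125, 0.123, 1.0, 2.0):
        print(v, fmov_imm8(v), exact_in_float32(v))
```

Under these assumptions 0.125 is encodable and 0.123 is not, which matches the two benchmark variants the commit message compares.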
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
{ | ||
"lastUpdate": 1710160575122, | ||
"lastUpdate": 1710360181462, | ||
"repoUrl": "https://github.com/luau-lang/luau", | ||
"entries": { | ||
"callgrind codegen": [ | ||
|
@@ -29668,6 +29668,254 @@ | |
"extra": "luau-codegen" | ||
} | ||
] | ||
}, | ||
{ | ||
"commit": { | ||
"author": { | ||
"email": "[email protected]", | ||
"name": "Arseny Kapoulkine", | ||
"username": "zeux" | ||
}, | ||
"committer": { | ||
"email": "[email protected]", | ||
"name": "GitHub", | ||
"username": "web-flow" | ||
}, | ||
"distinct": true, | ||
"id": "9aa82c6fb90e1dcd6e7f60626255d597ef0fdea1", | ||
"message": "CodeGen: Improve lowering of NUM_TO_VEC on A64 for constants (#1194)\n\nWhen the input is a constant, we use a fairly inefficient sequence of\r\nfmov+fcvt+dup or, when the double isn't encodable in fmov,\r\nadr+ldr+fcvt+dup.\r\n\r\nInstead, we can use the same lowering as X64 when the input is a\r\nconstant, and load the vector from memory. However, if the constant is\r\nencodable via fmov, we can use a vector fmov instead (which is just one\r\ninstruction and doesn't need constant space).\r\n\r\nFortunately the bit encoding of fmov for 32-bit floating point numbers\r\nmatches that of 64-bit: the decoding algorithm is a little different\r\nbecause it expands into a larger exponent, but the values are\r\ncompatible, so if a double can be encoded into a scalar fmov with a\r\ngiven abcdefgh pattern, the same pattern should encode the same float;\r\ndue to the very limited number of mantissa and exponent bits, all values\r\nthat are encodable are also exact in both 32-bit and 64-bit floats.\r\n\r\nThis strategy is ~same as what gcc uses. For complex vectors, we\r\npreviously used 4 instructions and 8 bytes of constant storage, and now\r\nwe use 2 instructions and 16 bytes of constant storage, so the memory\r\nfootprint is the same; for simple vectors we just need 1 instruction (4\r\nbytes).\r\n\r\nclang lowers vector constants a little differently, opting to synthesize\r\na 64-bit integer using 4 instructions (mov/movk) and then move it to the\r\nvector register - this requires 5 instructions and 20 bytes, vs ours/gcc\r\n2 instructions and 8+16=24 bytes. 
I tried a simpler version of this that\r\nwould be more compact - synthesize a 32-bit integer constant with\r\nmov+movk, and move it to vector register via dup.4s - but this was a\r\nlittle slower on M2, so for now we prefer the slightly larger version as\r\nit's not a regression vs current implementation.\r\n\r\nOn the vector approximation benchmark we get:\r\n\r\n- Before this PR (flag=false): ~7.85 ns/op\r\n- After this PR (flag=true): ~7.74 ns/op\r\n- After this PR, with 0.125 instead of 0.123 in the benchmark code (to\r\nuse fmov): ~7.52 ns/op\r\n- Not part of this PR, but the mov/dup strategy described above: ~8.00\r\nns/op", | ||
"timestamp": "2024-03-13T12:56:11-07:00", | ||
"tree_id": "b46afdd603a2f3bd60b9cac918c2ddc0faf0d668", | ||
"url": "https://github.com/luau-lang/luau/commit/9aa82c6fb90e1dcd6e7f60626255d597ef0fdea1" | ||
}, | ||
"date": 1710360181456, | ||
"tool": "benchmarkluau", | ||
"benches": [ | ||
{ | ||
"name": "base64", | ||
"value": 13.385, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "chess", | ||
"value": 52.018, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "life", | ||
"value": 23.356, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "matrixmult", | ||
"value": 9.336, | ||
"unit": "9ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "mesh-normal-scalar", | ||
"value": 13, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "pcmmix", | ||
"value": 1.38, | ||
"unit": "1ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "qsort", | ||
"value": 41.508, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "sha256", | ||
"value": 4.525, | ||
"unit": "4ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "ack", | ||
"value": 40.021, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "binary-trees", | ||
"value": 20.853, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "fannkuchen-redux", | ||
"value": 3.878, | ||
"unit": "3ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "fixpoint-fact", | ||
"value": 49.032, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "heapsort", | ||
"value": 7.701, | ||
"unit": "7ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "mandel", | ||
"value": 40.471, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "n-body", | ||
"value": 9.707, | ||
"unit": "9ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "qt", | ||
"value": 24.955, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "queen", | ||
"value": 0.805, | ||
"unit": "0ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "scimark", | ||
"value": 24.643, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "spectral-norm", | ||
"value": 2.444, | ||
"unit": "2ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "sieve", | ||
"value": 82.952, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "3d-cube", | ||
"value": 3.736, | ||
"unit": "3ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "3d-morph", | ||
"value": 3.744, | ||
"unit": "3ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "3d-raytrace", | ||
"value": 3.304, | ||
"unit": "3ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "controlflow-recursive", | ||
"value": 3.463, | ||
"unit": "3ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "crypto-aes", | ||
"value": 7.228, | ||
"unit": "7ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "fannkuch", | ||
"value": 6.068, | ||
"unit": "6ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "math-cordic", | ||
"value": 3.768, | ||
"unit": "3ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "math-partial-sums", | ||
"value": 1.872, | ||
"unit": "1ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "n-body-oop", | ||
"value": 13.714, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "tictactoe", | ||
"value": 62.961, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "trig", | ||
"value": 6.618, | ||
"unit": "6ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "voxelgen", | ||
"value": 27.559, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
} | ||
] | ||
} | ||
] | ||
} | ||
|
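The size and timing figures quoted in the commit message can be cross-checked with a little arithmetic. The sketch below uses only numbers taken from that message, plus the fact that every A64 instruction is 4 bytes. The helper names `footprint` and `relative_change` are hypothetical.

```python
INSTRUCTION_BYTES = 4  # every A64 instruction is 4 bytes wide


def footprint(instructions: int, constant_bytes: int) -> int:
    """Total bytes of code plus constant storage for one lowering."""
    return instructions * INSTRUCTION_BYTES + constant_bytes


def relative_change(before: float, after: float) -> float:
    """Percentage change from before to after (negative means faster)."""
    return (after - before) / before * 100.0


# Lowerings described in the commit message
old_complex = footprint(4, 8)    # adr+ldr+fcvt+dup, 8-byte constant: 24 bytes
new_complex = footprint(2, 16)   # 2 instructions, 16-byte vector constant: 24 bytes
new_simple = footprint(1, 0)     # single vector fmov: 4 bytes
clang_style = footprint(5, 0)    # 4x mov/movk plus a move to the vector register: 20 bytes

# Vector approximation benchmark, ns/op, as quoted in the commit message
baseline = 7.85
timings = {
    "after PR (flag=true)": 7.74,
    "after PR, 0.125 constant (fmov)": 7.52,
    "mov/dup strategy (not in PR)": 8.00,
}

if __name__ == "__main__":
    print(old_complex, new_complex, new_simple, clang_style)
    for label, ns in timings.items():
        print(f"{label}: {relative_change(baseline, ns):+.2f}%")
```

The old and new lowerings for complex vectors both come to 24 bytes, which is what the message means by the memory footprint staying the same.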