Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add benchmarks results for e6bf71871a6b9f601545dba8a42ce89c6069675c
github-actions committed Nov 9, 2024 (1 parent: e54fa07, commit 0ed9439)
Showing 5 changed files with 951 additions and 5 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
{ | ||
-"lastUpdate": 1731103069070, | ||
+"lastUpdate": 1731112783238, | ||
"repoUrl": "https://github.com/luau-lang/luau", | ||
"entries": { | ||
"luau-analyze": [ | ||
|
@@ -16234,6 +16234,72 @@ | |
"extra": "luau-analyze" | ||
} | ||
] | ||
}, | ||
{ | ||
"commit": { | ||
"author": { | ||
"email": "[email protected]", | ||
"name": "Arseny Kapoulkine", | ||
"username": "zeux" | ||
}, | ||
"committer": { | ||
"email": "[email protected]", | ||
"name": "GitHub", | ||
"username": "web-flow" | ||
}, | ||
"distinct": true, | ||
"id": "e6bf71871a6b9f601545dba8a42ce89c6069675c", | ||
"message": "CodeGen: Rewrite dot product lowering using a dedicated IR instruction (#1512)\n\nInstead of doing the dot product related math in scalar IR, we lift the\r\ncomputation into a dedicated IR instruction.\r\n\r\nOn x64, we can use VDPPS which was more or less tailor made for this\r\npurpose. This is better than manual scalar lowering that requires\r\nreloading components from memory; it's not always a strict improvement\r\nover the shuffle+add version (which we never had), but this can now be\r\nadjusted in the IR lowering in an optimal fashion (maybe even based on\r\nCPU vendor, although that'd create issues for offline compilation).\r\n\r\nOn A64, we can either use naive adds or paired adds, as there is no\r\ndedicated vector-wide horizontal instruction until SVE. Both run at\r\nabout the same performance on M2, but paired adds require fewer\r\ninstructions and temporaries.\r\n\r\nI've measured this using mesh-normal-vector benchmark, changing the\r\nbenchmark to just report the time of the second loop inside\r\n`calculate_normals`, testing master vs #1504 vs this PR, also increasing\r\nthe grid size to 400 for more stable timings.\r\n\r\nOn Zen 4 (7950X), this PR is comfortably ~8% faster vs master, while I\r\nsee neutral to negative results in #1504.\r\nOn M2 (base), this PR is ~28% faster vs master, while #1504 is only\r\nabout ~10% faster.\r\n\r\nIf I measure the second loop in `calculate_tangent_space` instead, I\r\nget:\r\n\r\nOn Zen 4 (7950X), this PR is ~12% faster vs master, while #1504 is ~3%\r\nfaster\r\nOn M2 (base), this PR is ~24% faster vs master, while #1504 is only\r\nabout ~13% faster.\r\n\r\nNote that the loops in question are not quite optimal, as they store and\r\nreload various vectors to dictionary values due to inappropriate use of\r\nlocals. The underlying gains in individual functions are thus larger\r\nthan the numbers above; for example, changing the `calculate_normals`\r\nloop to use a local variable to store the normalized vector (but still\r\nsaving the result to dictionary value), I get a ~24% performance\r\nincrease from this PR on Zen4 vs master instead of just 8% (#1504 is\r\n~15% slower in this setup).", | ||
"timestamp": "2024-11-08T16:23:09-08:00", | ||
"tree_id": "c5bf52046b7ad7c495e930780fe8ba3a95d09432", | ||
"url": "https://github.com/luau-lang/luau/commit/e6bf71871a6b9f601545dba8a42ce89c6069675c" | ||
}, | ||
"date": 1731112783233, | ||
"tool": "benchmarkluau", | ||
"benches": [ | ||
{ | ||
"name": "map-nonstrict", | ||
"value": 4.86579, | ||
"unit": "4ms", | ||
"range": "±0%", | ||
"extra": "luau-analyze" | ||
}, | ||
{ | ||
"name": "map-strict", | ||
"value": 5.92236, | ||
"unit": "5ms", | ||
"range": "±0%", | ||
"extra": "luau-analyze" | ||
}, | ||
{ | ||
"name": "map-dcr", | ||
"value": 26.9731, | ||
"unit": "ms", | ||
"range": "±0%", | ||
"extra": "luau-analyze" | ||
}, | ||
{ | ||
"name": "regex-nonstrict", | ||
"value": 8.16724, | ||
"unit": "8ms", | ||
"range": "±0%", | ||
"extra": "luau-analyze" | ||
}, | ||
{ | ||
"name": "regex-strict", | ||
"value": 10.6295, | ||
"unit": "ms", | ||
"range": "±0%", | ||
"extra": "luau-analyze" | ||
}, | ||
{ | ||
"name": "regex-dcr", | ||
"value": 7742.64, | ||
"unit": "ms", | ||
"range": "±0%", | ||
"extra": "luau-analyze" | ||
} | ||
] | ||
} | ||
] | ||
} | ||
|
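The commit message recorded in the entry above contrasts "naive adds" with "paired adds" for the A64 lowering of a dot product. As a rough illustration only, the two reduction orders can be written out in plain Python. This is not the Luau code generator: the function names are hypothetical, and padding the fourth lane with zero is an assumption made here so that a three-component vector fits a four-lane pairwise reduction.

```python
# Illustrative sketch only: models the two reduction orders named in the
# commit message. All function names here are hypothetical.

def dot_naive(a, b):
    """Multiply lane by lane, then add the products one after another."""
    p = [x * y for x, y in zip(a, b)]
    return (p[0] + p[1]) + p[2]


def pairwise_add(v):
    """Add adjacent lanes, halving the number of lanes (4 -> 2)."""
    return [v[0] + v[1], v[2] + v[3]]


def dot_paired(a, b):
    """Multiply lane by lane, then reduce with two pairwise-add steps."""
    # Assumption: the unused fourth lane is zero so it does not affect the sum.
    p = [x * y for x, y in zip(a, b)] + [0.0]
    half = pairwise_add(p)      # [p0 + p1, p2 + 0]
    return half[0] + half[1]    # second pairwise step collapses to a scalar


a = (1.0, 2.0, 3.0)
b = (4.0, 5.0, 6.0)
naive_result = dot_naive(a, b)
paired_result = dot_paired(a, b)
```

Both functions compute the same sum; they differ only in how the additions are grouped, which is the property the commit message relates to instruction and temporary counts.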
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
{ | ||
-"lastUpdate": 1731103068754, | ||
+"lastUpdate": 1731112782915, | ||
"repoUrl": "https://github.com/luau-lang/luau", | ||
"entries": { | ||
"callgrind codegen": [ | ||
|
@@ -47856,6 +47856,268 @@ | |
"extra": "luau-codegen" | ||
} | ||
] | ||
}, | ||
{ | ||
"commit": { | ||
"author": { | ||
"email": "[email protected]", | ||
"name": "Arseny Kapoulkine", | ||
"username": "zeux" | ||
}, | ||
"committer": { | ||
"email": "[email protected]", | ||
"name": "GitHub", | ||
"username": "web-flow" | ||
}, | ||
"distinct": true, | ||
"id": "e6bf71871a6b9f601545dba8a42ce89c6069675c", | ||
"message": "CodeGen: Rewrite dot product lowering using a dedicated IR instruction (#1512)\n\nInstead of doing the dot product related math in scalar IR, we lift the\r\ncomputation into a dedicated IR instruction.\r\n\r\nOn x64, we can use VDPPS which was more or less tailor made for this\r\npurpose. This is better than manual scalar lowering that requires\r\nreloading components from memory; it's not always a strict improvement\r\nover the shuffle+add version (which we never had), but this can now be\r\nadjusted in the IR lowering in an optimal fashion (maybe even based on\r\nCPU vendor, although that'd create issues for offline compilation).\r\n\r\nOn A64, we can either use naive adds or paired adds, as there is no\r\ndedicated vector-wide horizontal instruction until SVE. Both run at\r\nabout the same performance on M2, but paired adds require fewer\r\ninstructions and temporaries.\r\n\r\nI've measured this using mesh-normal-vector benchmark, changing the\r\nbenchmark to just report the time of the second loop inside\r\n`calculate_normals`, testing master vs #1504 vs this PR, also increasing\r\nthe grid size to 400 for more stable timings.\r\n\r\nOn Zen 4 (7950X), this PR is comfortably ~8% faster vs master, while I\r\nsee neutral to negative results in #1504.\r\nOn M2 (base), this PR is ~28% faster vs master, while #1504 is only\r\nabout ~10% faster.\r\n\r\nIf I measure the second loop in `calculate_tangent_space` instead, I\r\nget:\r\n\r\nOn Zen 4 (7950X), this PR is ~12% faster vs master, while #1504 is ~3%\r\nfaster\r\nOn M2 (base), this PR is ~24% faster vs master, while #1504 is only\r\nabout ~13% faster.\r\n\r\nNote that the loops in question are not quite optimal, as they store and\r\nreload various vectors to dictionary values due to inappropriate use of\r\nlocals. The underlying gains in individual functions are thus larger\r\nthan the numbers above; for example, changing the `calculate_normals`\r\nloop to use a local variable to store the normalized vector (but still\r\nsaving the result to dictionary value), I get a ~24% performance\r\nincrease from this PR on Zen4 vs master instead of just 8% (#1504 is\r\n~15% slower in this setup).", | ||
"timestamp": "2024-11-08T16:23:09-08:00", | ||
"tree_id": "c5bf52046b7ad7c495e930780fe8ba3a95d09432", | ||
"url": "https://github.com/luau-lang/luau/commit/e6bf71871a6b9f601545dba8a42ce89c6069675c" | ||
}, | ||
"date": 1731112782905, | ||
"tool": "benchmarkluau", | ||
"benches": [ | ||
{ | ||
"name": "base64", | ||
"value": 11.54, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "chess", | ||
"value": 52.008, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "life", | ||
"value": 23.355, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "matrixmult", | ||
"value": 9.335, | ||
"unit": "9ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "mesh-normal-scalar", | ||
"value": 13.055, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "mesh-normal-vector", | ||
"value": 8.109, | ||
"unit": "8ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "pcmmix", | ||
"value": 1.36, | ||
"unit": "1ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "qsort", | ||
"value": 41.507, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "sha256", | ||
"value": 4.57, | ||
"unit": "4ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "ack", | ||
"value": 40.015, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "binary-trees", | ||
"value": 20.847, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "fannkuchen-redux", | ||
"value": 3.892, | ||
"unit": "3ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "fixpoint-fact", | ||
"value": 48.87, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "heapsort", | ||
"value": 7.716, | ||
"unit": "7ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "mandel", | ||
"value": 40.418, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "n-body", | ||
"value": 9.707, | ||
"unit": "9ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "qt", | ||
"value": 24.975, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "queen", | ||
"value": 0.805, | ||
"unit": "0ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "scimark", | ||
"value": 24.636, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "spectral-norm", | ||
"value": 2.444, | ||
"unit": "2ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "sieve", | ||
"value": 84.552, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "3d-cube", | ||
"value": 3.732, | ||
"unit": "3ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "3d-morph", | ||
"value": 3.747, | ||
"unit": "3ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "3d-raytrace", | ||
"value": 3.28, | ||
"unit": "3ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "controlflow-recursive", | ||
"value": 3.464, | ||
"unit": "3ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "crypto-aes", | ||
"value": 7.182, | ||
"unit": "7ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "fannkuch", | ||
"value": 6.167, | ||
"unit": "6ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "math-cordic", | ||
"value": 3.768, | ||
"unit": "3ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "math-partial-sums", | ||
"value": 1.917, | ||
"unit": "1ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "n-body-oop", | ||
"value": 13.739, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "tictactoe", | ||
"value": 62.952, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "trig", | ||
"value": 6.65, | ||
"unit": "6ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
}, | ||
{ | ||
"name": "vector", | ||
"value": null, | ||
"unit": ":", | ||
"range": "±+/-", | ||
"extra": "on" | ||
}, | ||
{ | ||
"name": "voxelgen", | ||
"value": 27.659, | ||
"unit": "ms", | ||
"range": "±0.000%", | ||
"extra": "luau-codegen" | ||
} | ||
] | ||
} | ||
] | ||
} | ||
|
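The `benches` array above contains one record (`"name": "vector"`) whose `value` is `null` and whose `unit`, `range`, and `extra` fields do not follow the pattern of the others, so a consumer of this data probably needs to skip records without a numeric value. A minimal sketch of that filtering, assuming the record layout shown in the diff; the `SAMPLE` string is a trimmed copy of three records with the `range` field omitted, and the helper name `valid_benches` is hypothetical.

```python
import json

# Trimmed copy of three records from the entry above (range field omitted).
SAMPLE = """
{"benches": [
  {"name": "mesh-normal-vector", "value": 8.109, "unit": "8ms", "extra": "luau-codegen"},
  {"name": "vector", "value": null, "unit": ":", "extra": "on"},
  {"name": "voxelgen", "value": 27.659, "unit": "ms", "extra": "luau-codegen"}
]}
"""


def valid_benches(entry):
    """Return {name: value} for records that carry a numeric value."""
    result = {}
    for bench in entry["benches"]:
        value = bench.get("value")
        # JSON null becomes None; bool is excluded because it subclasses int.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            result[bench["name"]] = float(value)
    return result


entry = json.loads(SAMPLE)
timings = valid_benches(entry)
print(timings)
```

The sketch keys on `value` and ignores the `unit` field, since several records in the diff carry units such as `8ms` or `3ms` that would need separate handling.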