Fix #451 and a couple other fixes #452
Merged
maleadt reviewed Oct 4, 2024
tgymnich approved these changes Oct 4, 2024
Metal Benchmarks
| Benchmark suite | Current: e4ae1ca | Previous: cd9e68a | Ratio |
|---|---|---|---|
| private array/construct | 24642.416666666664 ns | 26732.583333333336 ns | 0.92 |
| private array/broadcast | 449458.5 ns | 461834 ns | 0.97 |
| private array/random/randn/Float32 | 1008584 ns | 1007750 ns | 1.00 |
| private array/random/randn!/Float32 | 640875 ns | 633312.5 ns | 1.01 |
| private array/random/rand!/Int64 | 573791 ns | 581542 ns | 0.99 |
| private array/random/rand!/Float32 | 600062.5 ns | 592916.5 ns | 1.01 |
| private array/random/rand/Int64 | 855459 ns | 870916 ns | 0.98 |
| private array/random/rand/Float32 | 868479.5 ns | 977958 ns | 0.89 |
| private array/copyto!/gpu_to_gpu | 485167 ns | 486875 ns | 1.00 |
| private array/copyto!/cpu_to_gpu | 541437.5 ns | 720666.5 ns | 0.75 |
| private array/copyto!/gpu_to_cpu | 495333.5 ns | 576000 ns | 0.86 |
| private array/accumulate/1d | 1442479 ns | 1445354 ns | 1.00 |
| private array/accumulate/2d | 1495458 ns | 1477250 ns | 1.01 |
| private array/iteration/findall/int | 2273000 ns | 2238875 ns | 1.02 |
| private array/iteration/findall/bool | 2061875 ns | 2031625 ns | 1.01 |
| private array/iteration/findfirst/int | 1670312.5 ns | 1678291.5 ns | 1.00 |
| private array/iteration/findfirst/bool | 1658395.5 ns | 1671396 ns | 0.99 |
| private array/iteration/scalar | 2383167 ns | 2361083 ns | 1.01 |
| private array/iteration/logical | 3447479 ns | 3433396 ns | 1.00 |
| private array/iteration/findmin/1d | 1766500 ns | 1770396 ns | 1.00 |
| private array/iteration/findmin/2d | 1363167 ns | 1363729 ns | 1.00 |
| private array/reductions/reduce/1d | 738000 ns | 724645.5 ns | 1.02 |
| private array/reductions/reduce/2d | 710208.5 ns | 711916.5 ns | 1.00 |
| private array/reductions/mapreduce/1d | 766583 ns | 798667 ns | 0.96 |
| private array/reductions/mapreduce/2d | 715562.5 ns | 714500 ns | 1.00 |
| private array/permutedims/4d | 943520.5 ns | 946542 ns | 1.00 |
| private array/permutedims/2d | 941333 ns | 937833 ns | 1.00 |
| private array/permutedims/3d | 1021375 ns | 1007167 ns | 1.01 |
| private array/copy | 762541.5 ns | 893375 ns | 0.85 |
| latency/precompile | 4411934458 ns | 4410766833 ns | 1.00 |
| latency/ttfp | 6903664750 ns | 6887075395.5 ns | 1.00 |
| latency/import | 726189854.5 ns | 725464062.5 ns | 1.00 |
| integration/metaldevrt | 756375 ns | 755125 ns | 1.00 |
| integration/byval/slices=1 | 1621542 ns | 1577625 ns | 1.03 |
| integration/byval/slices=3 | 8824396 ns | 8895417 ns | 0.99 |
| integration/byval/reference | 1622459 ns | 1559209 ns | 1.04 |
| integration/byval/slices=2 | 2779250 ns | 2635958 ns | 1.05 |
| kernel/indexing | 447833 ns | 462375 ns | 0.97 |
| kernel/indexing_checked | 440250 ns | 446500 ns | 0.99 |
| kernel/launch | 11875 ns | 11833 ns | 1.00 |
| metal/synchronization/stream | 19458 ns | 19542 ns | 1.00 |
| metal/synchronization/context | 19875 ns | 19833 ns | 1.00 |
| shared array/construct | 23815.916666666664 ns | 24125 ns | 0.99 |
| shared array/broadcast | 470250 ns | 467083.5 ns | 1.01 |
| shared array/random/randn/Float32 | 1009166.5 ns | 1016896 ns | 0.99 |
| shared array/random/randn!/Float32 | 633417 ns | 634250 ns | 1.00 |
| shared array/random/rand!/Int64 | 568959 ns | 573167 ns | 0.99 |
| shared array/random/rand!/Float32 | 591417 ns | 592958 ns | 1.00 |
| shared array/random/rand/Int64 | 852521 ns | 865103.5 ns | 0.99 |
| shared array/random/rand/Float32 | 835771 ns | 831271 ns | 1.01 |
| shared array/copyto!/gpu_to_gpu | 643500 ns | 678792 ns | 0.95 |
| shared array/copyto!/cpu_to_gpu | 85083.5 ns | 94625 ns | 0.90 |
| shared array/copyto!/gpu_to_cpu | 84083 ns | 84125 ns | 1.00 |
| shared array/accumulate/1d | 1436833.5 ns | 1446583 ns | 0.99 |
| shared array/accumulate/2d | 1479750 ns | 1483229.5 ns | 1.00 |
| shared array/iteration/findall/int | 1975916 ns | 1952167 ns | 1.01 |
| shared array/iteration/findall/bool | 1780479.5 ns | 1778083 ns | 1.00 |
| shared array/iteration/findfirst/int | 1408937.5 ns | 1415604 ns | 1.00 |
| shared array/iteration/findfirst/bool | 1377667 ns | 1376542 ns | 1.00 |
| shared array/iteration/scalar | 190833 ns | 192125 ns | 0.99 |
| shared array/iteration/logical | 3211562.5 ns | 3214833 ns | 1.00 |
| shared array/iteration/findmin/1d | 1472604 ns | 1477146 ns | 1.00 |
| shared array/iteration/findmin/2d | 1379479 ns | 1384958.5 ns | 1.00 |
| shared array/reductions/reduce/1d | 652542 ns | 644417 ns | 1.01 |
| shared array/reductions/reduce/2d | 711708 ns | 700500 ns | 1.02 |
| shared array/reductions/mapreduce/1d | 616583 ns | 673667 ns | 0.92 |
| shared array/reductions/mapreduce/2d | 720000 ns | 710479.5 ns | 1.01 |
| shared array/permutedims/4d | 1072041 ns | 935500 ns | 1.15 |
| shared array/permutedims/2d | 943250 ns | 944334 ns | 1.00 |
| shared array/permutedims/3d | 1021375 ns | 1023583 ns | 1.00 |
| shared array/copy | 599625 ns | 808959 ns | 0.74 |
This comment was automatically generated by workflow using github-action-benchmark.
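
The Ratio column above looks like the current timing divided by the previous one, rounded to two decimals, so values below 1.00 mean the current commit ran faster. That reading is inferred from the reported numbers, not taken from the github-action-benchmark source. A minimal Python sketch under that assumption, where the helper name `ratio` is hypothetical and the inputs are three rows copied from the table:

```python
def ratio(current_ns: float, previous_ns: float) -> float:
    """Assumed definition of the Ratio column: current / previous, 2 decimals."""
    return round(current_ns / previous_ns, 2)


# (benchmark name, current ns, previous ns) copied from the table above.
rows = [
    ("private array/copyto!/cpu_to_gpu", 541437.5, 720666.5),
    ("shared array/permutedims/4d", 1072041, 935500),
    ("shared array/copy", 599625, 808959),
]

for name, current, previous in rows:
    r = ratio(current, previous)
    # Below 1.00 means the current commit took less time than the previous one.
    trend = "faster" if r < 1 else "slower" if r > 1 else "unchanged"
    print(f"{name}: {r:.2f} ({trend})")
```

Under this reading, `shared array/permutedims/4d` at 1.15 is the only sampled row where the current commit is slower, and single-run differences of this size may be measurement noise.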
christiangnrd added a commit that referenced this pull request Oct 17, 2024
And some other small fixes.
Closes #451