Faster Khatri-Rao #41

Merged
merged 10 commits into from
Mar 1, 2024

Owner

dahong67 commented Mar 1, 2024

Fixes #34

Owner Author

dahong67 commented Mar 1, 2024

A simple fix that avoids an extra copy of the Khatri-Rao product already yields significant savings.

codecov bot commented Mar 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (8f57eef) to head (11c156f).

Additional details and impacted files
@@            Coverage Diff            @@
##            master       #41   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            7         7           
  Lines          217       217           
=========================================
  Hits           217       217           


1. use `reduce` to compute all but the last `kron`
2. use `kron!` for the last one
dahong67 added 3 commits March 1, 2024 07:46
In order to first commit some Khatri-Rao benchmarks.
This reverts commit b0a6d2b.
Add back new version now that benchmarks have been committed.
This reverts commit 0178c48.
Owner Author

dahong67 commented Mar 1, 2024

Benchmark Report for GCPDecompositions

Job Properties

  • Time of benchmarks:
    • Target: 1 Mar 2024 - 07:56
    • Baseline: 1 Mar 2024 - 07:58
  • Package commits:
    • Target: 40709e
    • Baseline: eaba07
  • Julia commits:
    • Target: bed2cd
    • Baseline: bed2cd
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: GCP_BENCHMARK_SUITES => khatrirao
    • Baseline: GCP_BENCHMARK_SUITES => khatrirao

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).
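
The marks in the tables can be read with a small helper like the one below. This is only a sketch: it assumes the bracketed percentage is a tolerance applied symmetrically around 1.0, which is consistent with the marks shown in these reports but is not stated in them, and `classify` and `speedup` are hypothetical names introduced for illustration.

```julia
# Classify a target/baseline ratio against a relative tolerance.
# Assumption: the tolerance is applied symmetrically around 1.0.
function classify(ratio::Real, tolerance::Real)
    if ratio - tolerance > 1
        return :regression     # shown as ❌ in the tables
    elseif ratio + tolerance < 1
        return :improvement    # shown as ✅ in the tables
    else
        return :invariant      # shown without a mark
    end
end

# A time ratio below 1.0 corresponds to a speed-up of 1 / ratio.
speedup(ratio::Real) = 1 / ratio
```

For instance, a time ratio of 0.35 with a 5% tolerance is classified as an improvement and corresponds to a speed-up of roughly 2.9x, while 0.97 with the same tolerance falls inside the band and gets no mark.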

ID time ratio memory ratio
["khatrirao", "size=(100, 1000, 30), rank=30"] 0.35 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(100, 1000, 30), rank=5"] 0.37 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(100, 1000, 30), rank=60"] 0.35 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(100, 1000, 30), rank=90"] 0.35 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(1000, 30, 100), rank=30"] 0.34 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(1000, 30, 100), rank=5"] 0.35 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(1000, 30, 100), rank=60"] 0.34 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(1000, 30, 100), rank=90"] 0.34 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(20, 40, 80, 500), rank=30"] 0.34 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(20, 40, 80, 500), rank=5"] 0.34 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(20, 40, 80, 500), rank=60"] 0.36 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(20, 40, 80, 500), rank=90"] 0.36 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(30, 100, 1000), rank=30"] 0.33 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(30, 100, 1000), rank=5"] 0.34 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(30, 100, 1000), rank=60"] 0.34 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(30, 100, 1000), rank=90"] 0.33 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(30, 30), rank=30"] 0.47 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(30, 30), rank=5"] 0.55 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(30, 30), rank=60"] 0.47 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(30, 30), rank=90"] 0.42 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(30, 30, 30), rank=30"] 0.42 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(30, 30, 30), rank=5"] 0.38 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(30, 30, 30), rank=60"] 0.40 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(30, 30, 30), rank=90"] 0.38 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(30, 30, 30, 30), rank=30"] 0.39 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(30, 30, 30, 30), rank=5"] 0.34 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(30, 30, 30, 30), rank=60"] 0.39 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(30, 30, 30, 30), rank=90"] 0.37 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(40, 80, 500, 20), rank=30"] 0.35 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(40, 80, 500, 20), rank=5"] 0.36 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(40, 80, 500, 20), rank=60"] 0.35 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(40, 80, 500, 20), rank=90"] 0.35 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(500, 20, 40, 80), rank=30"] 0.35 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(500, 20, 40, 80), rank=5"] 0.34 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(500, 20, 40, 80), rank=60"] 0.35 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(500, 20, 40, 80), rank=90"] 0.36 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(60, 60), rank=30"] 0.28 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(60, 60), rank=5"] 0.39 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(60, 60), rank=60"] 0.34 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(60, 60), rank=90"] 0.35 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(60, 60, 60), rank=30"] 0.36 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(60, 60, 60), rank=5"] 0.38 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(60, 60, 60), rank=60"] 0.38 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(60, 60, 60), rank=90"] 0.38 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(60, 60, 60, 60), rank=30"] 0.34 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(60, 60, 60, 60), rank=5"] 0.34 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(60, 60, 60, 60), rank=60"] 0.34 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(60, 60, 60, 60), rank=90"] 0.35 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(80, 500, 20, 40), rank=30"] 0.36 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(80, 500, 20, 40), rank=5"] 0.35 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(80, 500, 20, 40), rank=60"] 0.36 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(80, 500, 20, 40), rank=90"] 0.36 (5%) ✅ 0.51 (1%) ✅
["khatrirao", "size=(90, 90), rank=30"] 0.39 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(90, 90), rank=5"] 0.38 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(90, 90), rank=60"] 0.43 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(90, 90), rank=90"] 0.41 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(90, 90, 90), rank=30"] 0.37 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(90, 90, 90), rank=5"] 0.32 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(90, 90, 90), rank=60"] 0.39 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(90, 90, 90), rank=90"] 0.35 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(90, 90, 90, 90), rank=30"] 0.35 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(90, 90, 90, 90), rank=5"] 0.34 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(90, 90, 90, 90), rank=60"] 0.36 (5%) ✅ 0.50 (1%) ✅
["khatrirao", "size=(90, 90, 90, 90), rank=90"] 0.36 (5%) ✅ 0.50 (1%) ✅

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["khatrirao"]

Julia versioninfo

Target

Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  uname: Darwin 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:18 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6000 arm64 arm
  CPU: Apple M1 Max: 
                 speed         user         nice          sys         idle          irq
       #1-10  2400 MHz    1142917 s          0 s     430158 s   16110622 s          0 s
  Memory: 64.0 GB (45379.59375 MB free)
  Uptime: 268574.0 sec
  Load Avg:  2.03173828125  2.1787109375  2.23046875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores

Baseline

Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  uname: Darwin 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:18 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6000 arm64 arm
  CPU: Apple M1 Max: 
                 speed         user         nice          sys         idle          irq
       #1-10  2400 MHz    1143977 s          0 s     431014 s   16122435 s          0 s
  Memory: 64.0 GB (42579.40625 MB free)
  Uptime: 268712.0 sec
  Load Avg:  2.283203125  2.1982421875  2.224609375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores

Owner Author

dahong67 commented Mar 1, 2024

Benchmark Report for GCPDecompositions

Job Properties

  • Time of benchmarks:
    • Target: 1 Mar 2024 - 08:14
    • Baseline: 1 Mar 2024 - 08:15
  • Package commits:
    • Target: 21bebd
    • Baseline: 40709e
  • Julia commits:
    • Target: bed2cd
    • Baseline: bed2cd
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: GCP_BENCHMARK_SUITES => khatrirao
    • Baseline: GCP_BENCHMARK_SUITES => khatrirao

Results

ID time ratio memory ratio
["khatrirao", "size=(30, 30), rank=30"] 0.88 (5%) ✅ 0.98 (1%) ✅
["khatrirao", "size=(30, 30), rank=5"] 1.08 (5%) ❌ 0.98 (1%) ✅
["khatrirao", "size=(30, 30), rank=60"] 0.97 (5%) 0.98 (1%) ✅
["khatrirao", "size=(30, 30), rank=90"] 0.92 (5%) ✅ 0.98 (1%) ✅
["khatrirao", "size=(500, 20, 40, 80), rank=90"] 1.05 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(60, 60), rank=5"] 0.91 (5%) ✅ 1.00 (1%)
["khatrirao", "size=(60, 60), rank=60"] 0.91 (5%) ✅ 1.00 (1%)

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["khatrirao"]

Julia versioninfo

Target

Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  uname: Darwin 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:18 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6000 arm64 arm
  CPU: Apple M1 Max: 
                 speed         user         nice          sys         idle          irq
       #1-10  2400 MHz    1150896 s          0 s     434838 s   16202342 s          0 s
  Memory: 64.0 GB (44430.15625 MB free)
  Uptime: 269627.0 sec
  Load Avg:  2.48828125  2.35009765625  2.38671875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores

Baseline

Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  uname: Darwin 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:18 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6000 arm64 arm
  CPU: Apple M1 Max: 
                 speed         user         nice          sys         idle          irq
       #1-10  2400 MHz    1151487 s          0 s     435382 s   16209347 s          0 s
  Memory: 64.0 GB (44424.984375 MB free)
  Uptime: 269709.0 sec
  Load Avg:  2.64794921875  2.4150390625  2.40625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores

Owner Author

dahong67 commented Mar 1, 2024

Benchmark Report for GCPDecompositions

Job Properties

  • Time of benchmarks:
    • Target: 1 Mar 2024 - 08:20
    • Baseline: 1 Mar 2024 - 08:21
  • Package commits:
    • Target: a071fb
    • Baseline: 21bebd
  • Julia commits:
    • Target: bed2cd
    • Baseline: bed2cd
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: GCP_BENCHMARK_SUITES => khatrirao
    • Baseline: GCP_BENCHMARK_SUITES => khatrirao

Results

ID time ratio memory ratio
["khatrirao", "size=(100, 1000, 30), rank=30"] 0.95 (5%) ✅ 0.98 (1%) ✅
["khatrirao", "size=(100, 1000, 30), rank=5"] 0.98 (5%) 0.98 (1%) ✅
["khatrirao", "size=(100, 1000, 30), rank=60"] 0.97 (5%) 0.98 (1%) ✅
["khatrirao", "size=(100, 1000, 30), rank=90"] 0.97 (5%) 0.98 (1%) ✅
["khatrirao", "size=(30, 30), rank=30"] 0.80 (5%) ✅ 1.00 (1%)
["khatrirao", "size=(30, 30), rank=5"] 0.84 (5%) ✅ 1.01 (1%)
["khatrirao", "size=(30, 30), rank=60"] 0.76 (5%) ✅ 1.00 (1%)
["khatrirao", "size=(30, 30), rank=90"] 0.91 (5%) ✅ 1.00 (1%)
["khatrirao", "size=(30, 30, 30, 30), rank=30"] 0.96 (5%) 0.97 (1%) ✅
["khatrirao", "size=(30, 30, 30, 30), rank=5"] 0.94 (5%) ✅ 0.97 (1%) ✅
["khatrirao", "size=(30, 30, 30, 30), rank=60"] 0.93 (5%) ✅ 0.97 (1%) ✅
["khatrirao", "size=(30, 30, 30, 30), rank=90"] 0.96 (5%) 0.97 (1%) ✅
["khatrirao", "size=(30,), rank=60"] 0.00 (5%) ✅ 1.00 (1%)
["khatrirao", "size=(40, 80, 500, 20), rank=30"] 0.94 (5%) ✅ 0.95 (1%) ✅
["khatrirao", "size=(40, 80, 500, 20), rank=5"] 0.91 (5%) ✅ 0.95 (1%) ✅
["khatrirao", "size=(40, 80, 500, 20), rank=60"] 0.94 (5%) ✅ 0.95 (1%) ✅
["khatrirao", "size=(40, 80, 500, 20), rank=90"] 0.96 (5%) 0.95 (1%) ✅
["khatrirao", "size=(500, 20, 40, 80), rank=30"] 1.00 (5%) 0.99 (1%) ✅
["khatrirao", "size=(500, 20, 40, 80), rank=5"] 1.01 (5%) 0.99 (1%) ✅
["khatrirao", "size=(500, 20, 40, 80), rank=60"] 1.01 (5%) 0.99 (1%) ✅
["khatrirao", "size=(500, 20, 40, 80), rank=90"] 1.00 (5%) 0.99 (1%) ✅
["khatrirao", "size=(60, 60), rank=30"] 0.94 (5%) ✅ 1.00 (1%)
["khatrirao", "size=(60, 60), rank=60"] 0.94 (5%) ✅ 1.00 (1%)
["khatrirao", "size=(60, 60), rank=90"] 0.93 (5%) ✅ 1.00 (1%)
["khatrirao", "size=(60, 60, 60, 60), rank=30"] 0.97 (5%) 0.98 (1%) ✅
["khatrirao", "size=(60, 60, 60, 60), rank=5"] 0.99 (5%) 0.98 (1%) ✅
["khatrirao", "size=(60, 60, 60, 60), rank=60"] 0.94 (5%) ✅ 0.98 (1%) ✅
["khatrirao", "size=(60, 60, 60, 60), rank=90"] 0.94 (5%) ✅ 0.98 (1%) ✅
["khatrirao", "size=(80, 500, 20, 40), rank=30"] 1.01 (5%) 0.98 (1%) ✅
["khatrirao", "size=(80, 500, 20, 40), rank=5"] 1.03 (5%) 0.98 (1%) ✅
["khatrirao", "size=(80, 500, 20, 40), rank=60"] 0.95 (5%) 0.98 (1%) ✅
["khatrirao", "size=(80, 500, 20, 40), rank=90"] 0.96 (5%) 0.98 (1%) ✅
["khatrirao", "size=(90, 90), rank=30"] 1.07 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(90, 90), rank=5"] 0.93 (5%) ✅ 1.00 (1%)
["khatrirao", "size=(90, 90), rank=60"] 0.87 (5%) ✅ 1.00 (1%)
["khatrirao", "size=(90, 90, 90), rank=5"] 1.10 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(90, 90, 90, 90), rank=30"] 0.92 (5%) ✅ 0.99 (1%) ✅
["khatrirao", "size=(90, 90, 90, 90), rank=5"] 0.97 (5%) 0.99 (1%) ✅
["khatrirao", "size=(90, 90, 90, 90), rank=60"] 0.91 (5%) ✅ 0.99 (1%) ✅
["khatrirao", "size=(90, 90, 90, 90), rank=90"] 0.93 (5%) ✅ 0.99 (1%) ✅

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["khatrirao"]

Julia versioninfo

Target

Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  uname: Darwin 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:18 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6000 arm64 arm
  CPU: Apple M1 Max: 
                 speed         user         nice          sys         idle          irq
       #1-10  2400 MHz    1153686 s          0 s     436667 s   16234423 s          0 s
  Memory: 64.0 GB (44351.65625 MB free)
  Uptime: 269997.0 sec
  Load Avg:  1.9755859375  2.333984375  2.38671875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores

Baseline

Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  uname: Darwin 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:18 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6000 arm64 arm
  CPU: Apple M1 Max: 
                 speed         user         nice          sys         idle          irq
       #1-10  2400 MHz    1154209 s          0 s     437214 s   16241581 s          0 s
  Memory: 64.0 GB (44274.09375 MB free)
  Uptime: 270080.0 sec
  Load Avg:  1.7626953125  2.1796875  2.3193359375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores

Avoids allocating intermediate arrays, at a cost of additional multiplies.
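
To make the trade-off concrete, consider N factor matrices that are each d × r (the setting analysed in the summary comment later in this thread). Counting scalar multiplies for the two approaches gives the expressions below. The numbers for d = 30 are a worked illustration of those formulas, not measurements.

```math
\begin{aligned}
\text{intermediate products:}\quad & r\,(d^2 + d^3 + \cdots + d^N) = r \sum_{k=2}^{N} d^k \\
\text{broadcasting:}\quad & r\,(N-1)\,d^N \\
d = 30,\ N = 3:\quad & 27{,}900\,r \ \text{vs.}\ 54{,}000\,r \quad (\approx 1.9\times) \\
d = 30,\ N = 4:\quad & 837{,}900\,r \ \text{vs.}\ 2{,}430{,}000\,r \quad (\approx 2.9\times)
\end{aligned}
```

For N = 2 both counts equal r d^2, so any difference in that case would have to come from something other than the number of multiplies.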
Owner Author

dahong67 commented Mar 1, 2024

Benchmark Report for GCPDecompositions

Job Properties

  • Time of benchmarks:
    • Target: 1 Mar 2024 - 08:30
    • Baseline: 1 Mar 2024 - 08:31
  • Package commits:
    • Target: f9a78c
    • Baseline: a071fb
  • Julia commits:
    • Target: bed2cd
    • Baseline: bed2cd
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: GCP_BENCHMARK_SUITES => khatrirao
    • Baseline: GCP_BENCHMARK_SUITES => khatrirao

Results

ID time ratio memory ratio
["khatrirao", "size=(100, 1000, 30), rank=30"] 1.53 (5%) ❌ 0.99 (1%)
["khatrirao", "size=(100, 1000, 30), rank=5"] 1.53 (5%) ❌ 0.99 (1%)
["khatrirao", "size=(100, 1000, 30), rank=60"] 1.57 (5%) ❌ 0.99 (1%)
["khatrirao", "size=(100, 1000, 30), rank=90"] 1.61 (5%) ❌ 0.99 (1%)
["khatrirao", "size=(1000, 30, 100), rank=30"] 1.57 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(1000, 30, 100), rank=5"] 1.51 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(1000, 30, 100), rank=60"] 1.60 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(1000, 30, 100), rank=90"] 1.60 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(20, 40, 80, 500), rank=30"] 1.82 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(20, 40, 80, 500), rank=5"] 1.86 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(20, 40, 80, 500), rank=60"] 1.80 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(20, 40, 80, 500), rank=90"] 1.89 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(30, 100, 1000), rank=30"] 1.50 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(30, 100, 1000), rank=5"] 1.43 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(30, 100, 1000), rank=60"] 1.49 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(30, 100, 1000), rank=90"] 1.51 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(30, 30), rank=30"] 1.10 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(30, 30), rank=5"] 1.27 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(30, 30), rank=90"] 0.89 (5%) ✅ 1.00 (1%)
["khatrirao", "size=(30, 30, 30), rank=30"] 1.49 (5%) ❌ 0.97 (1%) ✅
["khatrirao", "size=(30, 30, 30), rank=5"] 1.54 (5%) ❌ 0.97 (1%) ✅
["khatrirao", "size=(30, 30, 30), rank=60"] 1.49 (5%) ❌ 0.97 (1%) ✅
["khatrirao", "size=(30, 30, 30), rank=90"] 1.51 (5%) ❌ 0.97 (1%) ✅
["khatrirao", "size=(30, 30, 30, 30), rank=30"] 2.03 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(30, 30, 30, 30), rank=5"] 2.14 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(30, 30, 30, 30), rank=60"] 2.06 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(30, 30, 30, 30), rank=90"] 2.08 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(40, 80, 500, 20), rank=30"] 2.20 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(40, 80, 500, 20), rank=5"] 2.18 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(40, 80, 500, 20), rank=60"] 2.12 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(40, 80, 500, 20), rank=90"] 2.13 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(500, 20, 40, 80), rank=30"] 1.83 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(500, 20, 40, 80), rank=5"] 1.93 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(500, 20, 40, 80), rank=60"] 1.89 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(500, 20, 40, 80), rank=90"] 1.79 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(60, 60, 60), rank=30"] 1.46 (5%) ❌ 0.98 (1%) ✅
["khatrirao", "size=(60, 60, 60), rank=5"] 1.48 (5%) ❌ 0.98 (1%) ✅
["khatrirao", "size=(60, 60, 60), rank=60"] 1.48 (5%) ❌ 0.98 (1%) ✅
["khatrirao", "size=(60, 60, 60), rank=90"] 1.52 (5%) ❌ 0.98 (1%) ✅
["khatrirao", "size=(60, 60, 60, 60), rank=30"] 2.03 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(60, 60, 60, 60), rank=5"] 1.97 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(60, 60, 60, 60), rank=60"] 2.00 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(60, 60, 60, 60), rank=90"] 1.99 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(80, 500, 20, 40), rank=30"] 1.96 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(80, 500, 20, 40), rank=5"] 1.90 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(80, 500, 20, 40), rank=60"] 1.95 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(80, 500, 20, 40), rank=90"] 1.93 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(90, 90), rank=5"] 1.06 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(90, 90), rank=60"] 1.19 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(90, 90), rank=90"] 1.13 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(90, 90, 90), rank=30"] 1.53 (5%) ❌ 0.99 (1%) ✅
["khatrirao", "size=(90, 90, 90), rank=5"] 1.59 (5%) ❌ 0.99 (1%) ✅
["khatrirao", "size=(90, 90, 90), rank=60"] 1.51 (5%) ❌ 0.99 (1%) ✅
["khatrirao", "size=(90, 90, 90), rank=90"] 1.52 (5%) ❌ 0.99 (1%) ✅
["khatrirao", "size=(90, 90, 90, 90), rank=30"] 1.95 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(90, 90, 90, 90), rank=5"] 1.98 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(90, 90, 90, 90), rank=60"] 1.97 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(90, 90, 90, 90), rank=90"] 1.94 (5%) ❌ 1.00 (1%)

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["khatrirao"]

Julia versioninfo

Target

Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  uname: Darwin 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:18 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6000 arm64 arm
  CPU: Apple M1 Max: 
                 speed         user         nice          sys         idle          irq
       #1-10  2400 MHz    1158196 s          0 s     439403 s   16286523 s          0 s
  Memory: 64.0 GB (44441.828125 MB free)
  Uptime: 270595.0 sec
  Load Avg:  3.8408203125  3.01953125  2.625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores

Baseline

Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  uname: Darwin 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:18 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6000 arm64 arm
  CPU: Apple M1 Max: 
                 speed         user         nice          sys         idle          irq
       #1-10  2400 MHz    1158763 s          0 s     439942 s   16293707 s          0 s
  Memory: 64.0 GB (44355.34375 MB free)
  Uptime: 270679.0 sec
  Load Avg:  2.31640625  2.7197265625  2.54541015625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores

Owner Author

dahong67 commented Mar 1, 2024

This PR adds benchmarks for Khatri-Rao and tests out some alternative implementations:

  1. 40709ec (originally from b0a6d2b) avoids making an extra copy of the output array K by computing all the intermediate products then using kron! for the last one. This has a big benefit since the output array can be large.
    temp = reduce(kron, [view(A[i], :, j) for i in 1:N-1])
    kron!(view(K, :, j), temp, view(A[N], :, j))

    also discussed in:
  2. 21bebdb swaps reduce(kron, ...) for kron(...). As one might expect, doesn't seem to make much difference.
    temp = (N == 2) ? view(A[1], :, j) : kron([view(A[i], :, j) for i in 1:N-1]...)
    kron!(view(K, :, j), temp, view(A[N], :, j))
  3. a071fbd tries a recursive version that chooses the order of the intermediate products (the idea is to always multiply the adjacent pair of matrices whose product is smallest). Doesn't seem to make much difference; it would perhaps be more noticeable if there were many small modes.
    # Base case: N = 2
    if N == 2
        r = (only ∘ unique)(size.(A, 2))
        return reshape(reshape(A[1], :, 1, r) .* reshape(A[2], 1, :, r), :, r)
    end
    # Recursive case: N > 2
    I, r = size.(A, 1), (only ∘ unique)(size.(A, 2))
    n = argmin(n -> I[n] * I[n+1], 1:N-1)
    return khatrirao(A[1:n-1]..., khatrirao(A[n], A[n+1]), A[n+2:end]...)
  4. f9a78c9 tries a version based on broadcasting. The main benefit is that it avoids allocating arrays for the intermediate products, but this generally comes at the cost of additional multiplies. In particular, for N matrices that are all d × r, the earlier approaches involve r (d^2 + ... + d^N) multiplies and the same amount of storage, whereas the broadcasting version involves r (N-1) d^N multiplies and r d^N storage. The earlier approaches may also have some cache-locality benefits. In the benchmarks here, the earlier version was better overall, so this version was reverted. It may be interesting to revisit (perhaps for cases with large N and small d); one could also try a version that gets the "best of both worlds" by carefully reusing the output memory for the intermediate products.
    # General case: N > 1
    r = (only ∘ unique)(size.(A, 2))
    R = ntuple(Val(N)) do k
        dims = (ntuple(i -> 1, Val(N - k))..., :, ntuple(i -> 1, Val(k - 1))..., r)
        return reshape(A[k], dims)
    end
    return reshape(broadcast(*, R...), :, r)

    also discussed/attempted in:
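
As a self-contained illustration of items 1 and 4, the fragments above can be wrapped into complete functions roughly as follows. This is a sketch rather than the package's code: the names `khatrirao_kron` and `khatrirao_bcast`, the argument check, and the way the output `K` is allocated are assumptions added so that the snippets run on their own.

```julia
using LinearAlgebra  # kron and the in-place kron!

# Item 1: build the first N-1 products per column, then write the last
# (largest) product straight into the output with kron!.
# `khatrirao_kron` is a hypothetical name used only in this sketch.
function khatrirao_kron(A::Vararg{AbstractMatrix,N}) where {N}
    N >= 2 || throw(ArgumentError("this sketch expects at least two matrices"))
    r = (only ∘ unique)(size.(A, 2))   # throws unless all column counts match
    T = promote_type(eltype.(A)...)
    K = Matrix{T}(undef, prod(size.(A, 1)), r)   # assumed allocation of the output
    for j in 1:r
        temp = reduce(kron, [view(A[i], :, j) for i in 1:N-1])
        kron!(view(K, :, j), temp, view(A[N], :, j))
    end
    return K
end

# Item 4: reshape each factor so that its rows lie along their own
# dimension, then let broadcasting form all the products in one pass.
# `khatrirao_bcast` is likewise a hypothetical name.
function khatrirao_bcast(A::Vararg{AbstractMatrix,N}) where {N}
    r = (only ∘ unique)(size.(A, 2))
    R = ntuple(Val(N)) do k
        dims = (ntuple(i -> 1, Val(N - k))..., :, ntuple(i -> 1, Val(k - 1))..., r)
        return reshape(A[k], dims)
    end
    return reshape(broadcast(*, R...), :, r)
end

# Small example inputs (made up for this sketch).
A = [1.0 2.0; 3.0 4.0]
B = [5.0 6.0; 7.0 8.0; 9.0 10.0]
C = [1.0 -1.0; 2.0 0.5]
K1 = khatrirao_kron(A, B, C)
K4 = khatrirao_bcast(A, B, C)
```

Both calls return a 12 × 2 matrix whose column j is the Kronecker product of the j-th columns of `A`, `B` and `C`, so `K1 ≈ K4` should hold. The difference lies in what is allocated along the way: the first version creates one temporary vector per column, while the second creates none beyond the output.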

dahong67 added 2 commits March 1, 2024 10:30
Should probably also add some Khatri-Rao specific tests in the future.
Owner Author

dahong67 commented Mar 1, 2024

Benchmark for bugfix.

Benchmark Report for GCPDecompositions

Job Properties

  • Time of benchmarks:
    • Target: 1 Mar 2024 - 14:56
    • Baseline: 1 Mar 2024 - 14:57
  • Package commits:
    • Target: 11c156
    • Baseline: 0f7773
  • Julia commits:
    • Target: bed2cd
    • Baseline: bed2cd
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: GCP_BENCHMARK_SUITES => khatrirao
    • Baseline: GCP_BENCHMARK_SUITES => khatrirao

Results

ID time ratio memory ratio
["khatrirao", "size=(100, 1000, 30), rank=30"] 1.09 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(100, 1000, 30), rank=5"] 1.06 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(100, 1000, 30), rank=60"] 1.05 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(100, 1000, 30), rank=90"] 1.07 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(20, 40, 80, 500), rank=5"] 1.08 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(30, 30), rank=5"] 1.16 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(30, 30), rank=60"] 1.12 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(30, 30, 30), rank=5"] 0.90 (5%) ✅ 1.00 (1%)
["khatrirao", "size=(30, 30, 30), rank=60"] 1.07 (5%) ❌ 1.00 (1%)
["khatrirao", "size=(90, 90), rank=5"] 0.93 (5%) ✅ 1.00 (1%)
["khatrirao", "size=(90, 90), rank=60"] 0.91 (5%) ✅ 1.00 (1%)

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["khatrirao"]

Julia versioninfo

Target

Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  uname: Darwin 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:18 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6000 arm64 arm
  CPU: Apple M1 Max: 
                 speed         user         nice          sys         idle          irq
       #1-10  2400 MHz    1299022 s          0 s     495685 s   17671955 s          0 s
  Memory: 64.0 GB (44493.9375 MB free)
  Uptime: 293749.0 sec
  Load Avg:  1.75390625  2.35302734375  2.923828125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores

Baseline

Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  uname: Darwin 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:18 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6000 arm64 arm
  CPU: Apple M1 Max: 
                 speed         user         nice          sys         idle          irq
       #1-10  2400 MHz    1299540 s          0 s     496208 s   17678983 s          0 s
  Memory: 64.0 GB (44444.671875 MB free)
  Uptime: 293830.0 sec
  Load Avg:  1.69775390625  2.193359375  2.80859375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores

Owner Author

dahong67 commented Mar 1, 2024

Testing with the MTTKRP benchmark. The greatest impact is on the mode-1 and mode-N MTTKRPs, which makes sense since the interior modes involve smaller Khatri-Rao products (or may not even require any).
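
For context, a mode-1 MTTKRP can be written as the mode-1 unfolding of the tensor times the Khatri-Rao product of all the other factor matrices, which is why the full-size Khatri-Rao product sits on the critical path for that mode. The sketch below assumes this standard unfolding-based formulation. `khatrirao_ref` and `mttkrp_mode1` are hypothetical helpers written for illustration and are not taken from the package.

```julia
using LinearAlgebra  # kron

# Plain reference Khatri-Rao product: column j is the Kronecker product
# of the j-th columns of the inputs (hypothetical helper).
khatrirao_ref(A...) =
    reduce(hcat, [reduce(kron, [a[:, j] for a in A]) for j in axes(A[1], 2)])

# Mode-1 MTTKRP: unfold X along mode 1 and multiply by the Khatri-Rao
# product of the remaining factors, taken in reverse order so that the
# row ordering matches Julia's column-major unfolding.
function mttkrp_mode1(X::AbstractArray, U::Tuple)
    X1 = reshape(X, size(X, 1), :)
    return X1 * khatrirao_ref(reverse(U[2:end])...)
end

# Small made-up example: a 2 × 3 × 4 tensor with rank-2 factors.
X = reshape(collect(1.0:24.0), 2, 3, 4)
U = (ones(2, 2),
     [1.0 2.0; 3.0 4.0; 5.0 6.0],
     [1.0 0.0; 0.0 1.0; 1.0 1.0; 2.0 -1.0])
M = mttkrp_mode1(X, U)
```

Here the Khatri-Rao product has 3 · 4 = 12 rows, one for every column of the unfolding, so its cost grows with the product of all the other mode sizes.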

Benchmark Report for GCPDecompositions

Job Properties

  • Time of benchmarks:
    • Target: 1 Mar 2024 - 14:48
    • Baseline: 1 Mar 2024 - 14:49
  • Package commits:
    • Target: 11c156
    • Baseline: 8f57ee
  • Julia commits:
    • Target: bed2cd
    • Baseline: bed2cd
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: GCP_BENCHMARK_SUITES => mttkrp
    • Baseline: GCP_BENCHMARK_SUITES => mttkrp

Results

ID time ratio memory ratio
["mttkrp", "size=(100, 100, 100), rank=10, mode=1"] 0.85 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(100, 100, 100), rank=10, mode=3"] 0.83 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(100, 100, 100), rank=100, mode=1"] 0.69 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(100, 100, 100), rank=100, mode=3"] 0.70 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(100, 100, 100), rank=150, mode=1"] 0.65 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(100, 100, 100), rank=150, mode=3"] 0.63 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(100, 100, 100), rank=200, mode=1"] 0.65 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(100, 100, 100), rank=200, mode=3"] 0.64 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(100, 100, 100), rank=250, mode=1"] 0.64 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(100, 100, 100), rank=250, mode=3"] 0.65 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(100, 100, 100), rank=300, mode=1"] 0.66 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(100, 100, 100), rank=300, mode=3"] 0.66 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(100, 100, 100), rank=50, mode=1"] 0.73 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(100, 100, 100), rank=50, mode=3"] 0.73 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(1000, 100, 30), rank=10, mode=1"] 0.95 (5%) ✅ 0.57 (1%) ✅
["mttkrp", "size=(1000, 100, 30), rank=10, mode=3"] 0.70 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(1000, 100, 30), rank=100, mode=1"] 0.90 (5%) ✅ 0.57 (1%) ✅
["mttkrp", "size=(1000, 100, 30), rank=100, mode=3"] 0.59 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(1000, 100, 30), rank=200, mode=1"] 0.87 (5%) ✅ 0.57 (1%) ✅
["mttkrp", "size=(1000, 100, 30), rank=200, mode=3"] 0.57 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(1000, 100, 30), rank=300, mode=1"] 0.88 (5%) ✅ 0.57 (1%) ✅
["mttkrp", "size=(1000, 100, 30), rank=300, mode=3"] 0.55 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(150, 150, 150), rank=10, mode=1"] 0.94 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(150, 150, 150), rank=10, mode=3"] 0.86 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(150, 150, 150), rank=100, mode=1"] 0.72 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(150, 150, 150), rank=100, mode=3"] 0.72 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(150, 150, 150), rank=150, mode=1"] 0.69 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(150, 150, 150), rank=150, mode=3"] 0.68 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(150, 150, 150), rank=200, mode=1"] 0.72 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(150, 150, 150), rank=200, mode=3"] 0.68 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(150, 150, 150), rank=250, mode=1"] 0.70 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(150, 150, 150), rank=250, mode=3"] 0.68 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(150, 150, 150), rank=300, mode=1"] 0.67 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(150, 150, 150), rank=300, mode=3"] 0.70 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(150, 150, 150), rank=50, mode=1"] 0.75 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(150, 150, 150), rank=50, mode=3"] 0.79 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(200, 200, 200), rank=10, mode=1"] 0.84 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(200, 200, 200), rank=10, mode=3"] 0.87 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(200, 200, 200), rank=100, mode=1"] 0.72 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(200, 200, 200), rank=100, mode=3"] 0.73 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(200, 200, 200), rank=150, mode=1"] 0.72 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(200, 200, 200), rank=150, mode=3"] 0.70 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(200, 200, 200), rank=200, mode=1"] 0.71 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(200, 200, 200), rank=200, mode=3"] 0.72 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(200, 200, 200), rank=250, mode=1"] 0.70 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(200, 200, 200), rank=250, mode=3"] 0.69 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(200, 200, 200), rank=300, mode=1"] 0.70 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(200, 200, 200), rank=300, mode=3"] 0.70 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(200, 200, 200), rank=50, mode=1"] 0.78 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(200, 200, 200), rank=50, mode=3"] 0.80 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(30, 100, 1000), rank=10, mode=1"] 0.75 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(30, 100, 1000), rank=10, mode=3"] 0.94 (5%) ✅ 0.57 (1%) ✅
["mttkrp", "size=(30, 100, 1000), rank=100, mode=1"] 0.61 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(30, 100, 1000), rank=100, mode=3"] 0.89 (5%) ✅ 0.57 (1%) ✅
["mttkrp", "size=(30, 100, 1000), rank=200, mode=1"] 0.62 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(30, 100, 1000), rank=200, mode=3"] 0.87 (5%) ✅ 0.57 (1%) ✅
["mttkrp", "size=(30, 100, 1000), rank=300, mode=1"] 0.58 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(30, 100, 1000), rank=300, mode=3"] 0.89 (5%) ✅ 0.57 (1%) ✅
["mttkrp", "size=(50, 50, 50), rank=10, mode=1"] 0.86 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(50, 50, 50), rank=10, mode=3"] 0.86 (5%) ✅ 0.51 (1%) ✅
["mttkrp", "size=(50, 50, 50), rank=100, mode=1"] 0.64 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(50, 50, 50), rank=100, mode=3"] 0.65 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(50, 50, 50), rank=150, mode=1"] 0.66 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(50, 50, 50), rank=150, mode=3"] 0.61 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(50, 50, 50), rank=200, mode=1"] 0.60 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(50, 50, 50), rank=200, mode=3"] 0.63 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(50, 50, 50), rank=250, mode=1"] 0.59 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(50, 50, 50), rank=250, mode=3"] 0.62 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(50, 50, 50), rank=300, mode=1"] 0.64 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(50, 50, 50), rank=300, mode=3"] 0.61 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(50, 50, 50), rank=50, mode=1"] 0.69 (5%) ✅ 0.50 (1%) ✅
["mttkrp", "size=(50, 50, 50), rank=50, mode=2"] 0.94 (5%) ✅ 1.00 (1%)
["mttkrp", "size=(50, 50, 50), rank=50, mode=3"] 0.71 (5%) ✅ 0.50 (1%) ✅

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["mttkrp"]

Julia versioninfo

Target

Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  uname: Darwin 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:18 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6000 arm64 arm
  CPU: Apple M1 Max: 
                 speed         user         nice          sys         idle          irq
       #1-10  2400 MHz    1294965 s          0 s     492258 s   17634484 s          0 s
  Memory: 64.0 GB (17078.03125 MB free)
  Uptime: 293295.0 sec
  Load Avg:  4.90576171875  3.03515625  3.359375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores

Baseline

Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  uname: Darwin 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:18 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6000 arm64 arm
  CPU: Apple M1 Max: 
                 speed         user         nice          sys         idle          irq
       #1-10  2400 MHz    1296314 s          0 s     493388 s   17635366 s          0 s
  Memory: 64.0 GB (16610.0625 MB free)
  Uptime: 293329.0 sec
  Load Avg:  6.3515625  3.5576171875  3.5361328125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores

MTTKRP benchmark plots

Runtime vs. size (for square tensors)

Below are plots showing the runtime in milliseconds of MTTKRP as a function of the size of the square tensor, for varying ranks and modes:

Target
             ndims = 3, rank = 10, mode = 1  
            ┌──────────────────────────────┐ 
          4 │                              │ 
            │                            .'│ 
            │                          .'  │ 
            │                         :    │ 
Time (ms)   │                       .'     │ 
            │                     .'       │ 
            │                   .'         │ 
            │                 .'           │ 
            │            ....'             │ 
          0 │       ..'''                  │ 
            └──────────────────────────────┘ 
             0                          200  
                          Size               
             ndims = 3, rank = 10, mode = 2  
            ┌──────────────────────────────┐ 
          2 │                              │ 
            │                            .'│ 
            │                          .'  │ 
            │                        .'    │ 
Time (ms)   │                      .'      │ 
            │                    .'        │ 
            │                  .'          │ 
            │                .'            │ 
            │             ..:              │ 
          0 │       ..''''                 │ 
            └──────────────────────────────┘ 
             0                          200  
                          Size               
             ndims = 3, rank = 10, mode = 3  
            ┌──────────────────────────────┐ 
          3 │                             .│ 
            │                            .'│ 
            │                          .'  │ 
            │                        .'    │ 
Time (ms)   │                       .'     │ 
            │                     .'       │ 
            │                   .'         │ 
            │                .''           │ 
            │           ...''              │ 
          0 │       .'''                   │ 
            └──────────────────────────────┘ 
             0                          200  
                          Size               
              ndims = 3, rank = 50, mode = 1  
             ┌──────────────────────────────┐ 
          10 │                              │ 
             │                            .'│ 
             │                           .' │ 
             │                         .'   │ 
Time (ms)    │                        :     │ 
             │                      .'      │ 
             │                   ..'        │ 
             │                 .'           │ 
             │           ...'''             │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
             ndims = 3, rank = 50, mode = 2  
            ┌──────────────────────────────┐ 
          7 │                             .│ 
            │                            .'│ 
            │                          .'  │ 
            │                         .'   │ 
Time (ms)   │                       .'     │ 
            │                      .'      │ 
            │                    .'        │ 
            │                 ..'          │ 
            │              ..'             │ 
          0 │       ..'''''                │ 
            └──────────────────────────────┘ 
             0                          200  
                          Size               
             ndims = 3, rank = 50, mode = 3  
            ┌──────────────────────────────┐ 
          9 │                             .│ 
            │                           .' │ 
            │                          .'  │ 
            │                        .'    │ 
Time (ms)   │                       .'     │ 
            │                     .'       │ 
            │                  ..'         │ 
            │                .'            │ 
            │           ...''              │ 
          0 │       .'''                   │ 
            └──────────────────────────────┘ 
             0                          200  
                          Size               
              ndims = 3, rank = 100, mode = 1 
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                              │ 
             │                              │ 
             │                             .│ 
Time (ms)    │                          .'' │ 
             │                        .'    │ 
             │                      .'      │ 
             │                  ..''        │ 
             │             ...''            │ 
           0 │       ..''''                 │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 100, mode = 2 
             ┌──────────────────────────────┐ 
          11 │                              │ 
             │                            .'│ 
             │                           .' │ 
             │                         .'   │ 
Time (ms)    │                        :     │ 
             │                      .'      │ 
             │                    .'        │ 
             │                 ..'          │ 
             │              ..'             │ 
           0 │       ..'''''                │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 100, mode = 3 
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                              │ 
             │                              │ 
             │                             .│ 
Time (ms)    │                          .'' │ 
             │                        .'    │ 
             │                      .'      │ 
             │                  ..''        │ 
             │             ....'            │ 
           0 │       ..''''                 │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 150, mode = 1 
             ┌──────────────────────────────┐ 
          20 │                             .│ 
             │                            .'│ 
             │                          .'  │ 
             │                        .'    │ 
Time (ms)    │                       .'     │ 
             │                     .:       │ 
             │                   .'         │ 
             │                .''           │ 
             │           ...''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 150, mode = 2 
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                              │ 
             │                              │ 
             │                            .'│ 
Time (ms)    │                          .'  │ 
             │                        .'    │ 
             │                     ..'      │ 
             │                  .''         │ 
             │            ....''            │ 
           0 │       ..'''                  │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 150, mode = 3 
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                            .'│ 
             │                          .'  │ 
             │                         :    │ 
Time (ms)    │                       .'     │ 
             │                      :       │ 
             │                   .''        │ 
             │                ..'           │ 
             │           ...''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 200, mode = 1 
             ┌──────────────────────────────┐ 
          30 │                              │ 
             │                              │ 
             │                             .│ 
             │                           .' │ 
Time (ms)    │                         .'   │ 
             │                       .'     │ 
             │                    ..'       │ 
             │                 ..'          │ 
             │            ....'             │ 
           0 │       ..'''                  │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 200, mode = 2 
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                             .│ 
             │                           .' │ 
             │                          :   │ 
Time (ms)    │                        .'    │ 
             │                      .'      │ 
             │                   .''        │ 
             │                .''           │ 
             │           ..'''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 200, mode = 3 
             ┌──────────────────────────────┐ 
          30 │                              │ 
             │                              │ 
             │                             .│ 
             │                           .' │ 
Time (ms)    │                         .'   │ 
             │                       .'     │ 
             │                    ..'       │ 
             │                 ..'          │ 
             │            ....'             │ 
           0 │       ..'''                  │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 250, mode = 1 
             ┌──────────────────────────────┐ 
          30 │                              │ 
             │                            .'│ 
             │                          .'  │ 
             │                         :    │ 
Time (ms)    │                       .'     │ 
             │                     .'       │ 
             │                  ..'         │ 
             │                .'            │ 
             │           ..'''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 250, mode = 2 
             ┌──────────────────────────────┐ 
          30 │                              │ 
             │                              │ 
             │                             .│ 
             │                           .' │ 
Time (ms)    │                         .'   │ 
             │              :'''...  .'     │ 
             │            .'       ''       │ 
             │          .'                  │ 
             │         :                    │ 
           0 │       .'                     │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 250, mode = 3 
             ┌──────────────────────────────┐ 
          30 │                             .│ 
             │                            .'│ 
             │                          .'  │ 
             │                        .'    │ 
Time (ms)    │                       .'     │ 
             │                     .'       │ 
             │                   .'         │ 
             │                .''           │ 
             │           ..'''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 300, mode = 1 
             ┌──────────────────────────────┐ 
          40 │                              │ 
             │                             .│ 
             │                           .' │ 
             │                         .'   │ 
Time (ms)    │                        .'    │ 
             │                      .'      │ 
             │                   ..'        │ 
             │                ..'           │ 
             │           ...''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 300, mode = 2 
             ┌──────────────────────────────┐ 
          30 │                              │ 
             │                            .'│ 
             │                          .'  │ 
             │                         :    │ 
Time (ms)    │                       .'     │ 
             │                      :       │ 
             │                   .''        │ 
             │                .''           │ 
             │           ..'''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 300, mode = 3 
             ┌──────────────────────────────┐ 
          40 │                              │ 
             │                             .│ 
             │                           .' │ 
             │                          :   │ 
Time (ms)    │                        .'    │ 
             │                      .'      │ 
             │                   .''        │ 
             │                ..'           │ 
             │           ...''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
Baseline
             ndims = 3, rank = 10, mode = 1  
            ┌──────────────────────────────┐ 
          5 │                              │ 
            │                             .│ 
            │                            : │ 
            │                          .'  │ 
Time (ms)   │                        .'    │ 
            │                       .'     │ 
            │                    ..'       │ 
            │                 ..'          │ 
            │             ...'             │ 
          0 │       ..''''                 │ 
            └──────────────────────────────┘ 
             0                          200  
                          Size               
             ndims = 3, rank = 10, mode = 2  
            ┌──────────────────────────────┐ 
          2 │                              │ 
            │                            .'│ 
            │                          .'  │ 
            │                        .'    │ 
Time (ms)   │                      .'      │ 
            │                    .'        │ 
            │                  .'          │ 
            │                .'            │ 
            │             ..:              │ 
          0 │       ..''''                 │ 
            └──────────────────────────────┘ 
             0                          200  
                          Size               
             ndims = 3, rank = 10, mode = 3  
            ┌──────────────────────────────┐ 
          4 │                              │ 
            │                              │ 
            │                            .'│ 
            │                          .'  │ 
Time (ms)   │                        .'    │ 
            │                      .'      │ 
            │                    .'        │ 
            │                 .''          │ 
            │           ...'''             │ 
          0 │       .'''                   │ 
            └──────────────────────────────┘ 
             0                          200  
                          Size               
              ndims = 3, rank = 50, mode = 1  
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                              │ 
             │                              │ 
             │                              │ 
Time (ms)    │                            .'│ 
             │                         ..'  │ 
             │                       .'     │ 
             │                   ..''       │ 
             │              ...''           │ 
           0 │       ...''''                │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
             ndims = 3, rank = 50, mode = 2  
            ┌──────────────────────────────┐ 
          7 │                             .│ 
            │                            .'│ 
            │                          .'  │ 
            │                         .'   │ 
Time (ms)   │                       .'     │ 
            │                      .'      │ 
            │                    .'        │ 
            │                 ..'          │ 
            │              ..'             │ 
          0 │       ..'''''                │ 
            └──────────────────────────────┘ 
             0                          200  
                          Size               
              ndims = 3, rank = 50, mode = 3  
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                              │ 
             │                              │ 
             │                              │ 
Time (ms)    │                            ..│ 
             │                          .'  │ 
             │                       .''    │ 
             │                   ...'       │ 
             │              ...''           │ 
           0 │       ...''''                │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 100, mode = 1 
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                            .'│ 
             │                          .'  │ 
             │                         :    │ 
Time (ms)    │                       .'     │ 
             │                     .'       │ 
             │                   .'         │ 
             │                .''           │ 
             │           ..'''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 100, mode = 2 
             ┌──────────────────────────────┐ 
          11 │                              │ 
             │                            .'│ 
             │                           :  │ 
             │                         .'   │ 
Time (ms)    │                        :     │ 
             │                      .'      │ 
             │                    .'        │ 
             │                 ..'          │ 
             │              ..'             │ 
           0 │       ..'''''                │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 100, mode = 3 
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                            .'│ 
             │                          .'  │ 
             │                         .'   │ 
Time (ms)    │                       .'     │ 
             │                     .'       │ 
             │                   .'         │ 
             │                .''           │ 
             │           ...''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 150, mode = 1 
             ┌──────────────────────────────┐ 
          30 │                              │ 
             │                              │ 
             │                            .'│ 
             │                          .'  │ 
Time (ms)    │                        .'    │ 
             │                      .'      │ 
             │                   .''        │ 
             │                ..'           │ 
             │           ...''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 150, mode = 2 
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                              │ 
             │                              │ 
             │                            .'│ 
Time (ms)    │                          .'  │ 
             │                        .'    │ 
             │                     ..'      │ 
             │                  .''         │ 
             │            ....''            │ 
           0 │       ..'''                  │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 150, mode = 3 
             ┌──────────────────────────────┐ 
          30 │                              │ 
             │                             .│ 
             │                            .'│ 
             │                          .'  │ 
Time (ms)    │                        .'    │ 
             │                      .'      │ 
             │                   ..'        │ 
             │                ..'           │ 
             │           ...''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 200, mode = 1 
             ┌──────────────────────────────┐ 
          40 │                              │ 
             │                              │ 
             │                            .'│ 
             │                          .'  │ 
Time (ms)    │                        .'    │ 
             │                      .'      │ 
             │                    .''       │ 
             │                ..''          │ 
             │           ...''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 200, mode = 2 
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                             :│ 
             │                           .' │ 
             │                         .'   │ 
Time (ms)    │                        :     │ 
             │                      .'      │ 
             │                   .''        │ 
             │                .''           │ 
             │           ..'''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 200, mode = 3 
             ┌──────────────────────────────┐ 
          40 │                              │ 
             │                             .│ 
             │                           .' │ 
             │                         .'   │ 
Time (ms)    │                        .'    │ 
             │                      .'      │ 
             │                   ..'        │ 
             │                ..'           │ 
             │           ...''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 250, mode = 1 
             ┌──────────────────────────────┐ 
          40 │                            .'│ 
             │                           .' │ 
             │                         .'   │ 
             │                        :     │ 
Time (ms)    │                      .'      │ 
             │                    .'        │ 
             │                  .'          │ 
             │               .''            │ 
             │          ..'''               │ 
           0 │       '''                    │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 250, mode = 2 
             ┌──────────────────────────────┐ 
          30 │                              │ 
             │                              │ 
             │                             .│ 
             │                           .' │ 
Time (ms)    │                         .'   │ 
             │                       .'     │ 
             │                    ..'       │ 
             │                 .''          │ 
             │           ...'''             │ 
           0 │       ..''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 250, mode = 3 
             ┌──────────────────────────────┐ 
          40 │                            .'│ 
             │                           .' │ 
             │                         .'   │ 
             │                        :     │ 
Time (ms)    │                      .'      │ 
             │                    .'        │ 
             │                  .'          │ 
             │               .''            │ 
             │          ..'''               │ 
           0 │       .''                    │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 300, mode = 1 
             ┌──────────────────────────────┐ 
          50 │                             :│ 
             │                           .' │ 
             │                         .'   │ 
             │                        :     │ 
Time (ms)    │                      .'      │ 
             │                    .''       │ 
             │                  .'          │ 
             │               ..'            │ 
             │          ...''               │ 
           0 │       .''                    │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 300, mode = 2 
             ┌──────────────────────────────┐ 
          30 │                             .│ 
             │                            .'│ 
             │                          .'  │ 
             │                        .'    │ 
Time (ms)    │                       .'     │ 
             │                     .:       │ 
             │                   .'         │ 
             │                .''           │ 
             │           ..'''              │ 
           0 │       .'''                   │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               
              ndims = 3, rank = 300, mode = 3 
             ┌──────────────────────────────┐ 
          50 │                             :│ 
             │                           .' │ 
             │                         .'   │ 
             │                        :     │ 
Time (ms)    │                      .'      │ 
             │                    ..'       │ 
             │                  .'          │ 
             │               ..'            │ 
             │          ...''               │ 
           0 │       .''                    │ 
             └──────────────────────────────┘ 
              0                          200  
                           Size               

Runtime vs. rank

Below are plots showing the runtime in milliseconds of MTTKRP as a function of the rank, for varying sizes and modes:

Target
             size = (30, 100, 1000), mode = 1 
             ┌──────────────────────────────┐ 
          50 │                             .│ 
             │                         ..'' │ 
             │                      ..'     │ 
             │                   .''        │ 
Time (ms)    │               ..''           │ 
             │            .''               │ 
             │         .''                  │ 
             │      .''                     │ 
             │   .''                        │ 
           0 │ ''                           │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
            size = (30, 100, 1000), mode = 2 
            ┌──────────────────────────────┐ 
          7 │                            .:│ 
            │                         ..'  │ 
            │                      ..'     │ 
            │                   ..'        │ 
Time (ms)   │               ..''           │ 
            │           ..''               │ 
            │        .''                   │ 
            │    ..''                      │ 
            │ ..'                          │ 
          0 │                              │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
            size = (30, 100, 1000), mode = 3 
            ┌──────────────────────────────┐ 
          8 │                             .│ 
            │                          ..' │ 
            │                        .'    │ 
            │                     .''      │ 
Time (ms)   │                ...''         │ 
            │            ..''              │ 
            │        ..''                  │ 
            │     ..'                      │ 
            │  .''                         │ 
          0 │ '                            │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
              size = (50, 50, 50), mode = 1  
            ┌──────────────────────────────┐ 
          2 │                              │ 
            │                              │ 
            │                             .│ 
            │                         ..'''│ 
Time (ms)   │                      .''     │ 
            │                ...'''        │ 
            │            ..''              │ 
            │       ...''                  │ 
            │   .'''                       │ 
          0 │ ''                           │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
              size = (50, 50, 50), mode = 2  
            ┌──────────────────────────────┐ 
          1 │                             :│ 
            │                          .'' │ 
            │                       .''    │ 
            │                   ..''       │ 
Time (ms)   │                ..'           │ 
            │             .''              │ 
            │         ..''                 │ 
            │     ..''                     │ 
            │  ..'                         │ 
          0 │ '                            │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
              size = (50, 50, 50), mode = 3  
            ┌──────────────────────────────┐ 
          2 │                              │ 
            │                              │ 
            │                              │ 
            │                          ..''│ 
Time (ms)   │                     ...''    │ 
            │                 ..''         │ 
            │            ...''             │ 
            │      ...'''                  │ 
            │   .''                        │ 
          0 │ ''                           │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
            size = (100, 100, 100), mode = 1 
            ┌──────────────────────────────┐ 
          7 │                             .│ 
            │                          ..' │ 
            │                       ..'    │ 
            │                   ..''       │ 
Time (ms)   │                .''           │ 
            │            ..''              │ 
            │        ...'                  │ 
            │     .''                      │ 
            │  .''                         │ 
          0 │ '                            │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
             size = (100, 100, 100), mode = 2 
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                              │ 
             │                         .    │ 
             │                        :'.   │ 
Time (ms)    │                       :  '.  │ 
             │                      :    '. │ 
             │                     :      '.│ 
             │                    :         │ 
             │             ....''''         │ 
           0 │ .......'''''                 │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
            size = (100, 100, 100), mode = 3 
            ┌──────────────────────────────┐ 
          7 │                             :│ 
            │                          .'' │ 
            │                       .''    │ 
            │                    .''       │ 
Time (ms)   │                ..''          │ 
            │             .''              │ 
            │         ..''                 │ 
            │     ..''                     │ 
            │  ..'                         │ 
          0 │ '                            │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
             size = (150, 150, 150), mode = 1 
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                            .:│ 
             │                        ..''  │ 
             │                     .''      │ 
Time (ms)    │                 ..''         │ 
             │             ..''             │ 
             │         ..''                 │ 
             │     ..''                     │ 
             │  .''                         │ 
           0 │ '                            │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
             size = (150, 150, 150), mode = 2 
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                              │ 
             │                              │ 
             │                           ..'│ 
Time (ms)    │                       ..''   │ 
             │                  ...''       │ 
             │             ...''            │ 
             │        ...''                 │ 
             │   ..'''                      │ 
           0 │ ''                           │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
             size = (150, 150, 150), mode = 3 
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                            .:│ 
             │                         ..'  │ 
             │                     ..''     │ 
Time (ms)    │                 ..''         │ 
             │             ..''             │ 
             │         ..''                 │ 
             │     ..''                     │ 
             │  ..'                         │ 
           0 │ '                            │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
             size = (200, 200, 200), mode = 1 
             ┌──────────────────────────────┐ 
          40 │                              │ 
             │                             :│ 
             │                         ..'' │ 
             │                      .''     │ 
Time (ms)    │                 ...''        │ 
             │             ..''             │ 
             │         ..''                 │ 
             │     ..''                     │ 
             │  .''                         │ 
           0 │ '                            │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
             size = (200, 200, 200), mode = 2 
             ┌──────────────────────────────┐ 
          30 │                              │ 
             │                            .'│ 
             │                         .''  │ 
             │                     ..''     │ 
Time (ms)    │                 ..''         │ 
             │             ..''             │ 
             │         ..''                 │ 
             │     ..''                     │ 
             │  ..'                         │ 
           0 │ '                            │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
             size = (200, 200, 200), mode = 3 
             ┌──────────────────────────────┐ 
          40 │                              │ 
             │                             :│ 
             │                          .'' │ 
             │                     ...''    │ 
Time (ms)    │                  .''         │ 
             │              ..''            │ 
             │         ..'''                │ 
             │     ..''                     │ 
             │  ..'                         │ 
           0 │ '                            │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
            size = (1000, 100, 30), mode = 1 
            ┌──────────────────────────────┐ 
          8 │                             .│ 
            │                           .''│ 
            │                        .''   │ 
            │                     .''      │ 
Time (ms)   │                ..'''         │ 
            │            ..''              │ 
            │        ..''                  │ 
            │     .''                      │ 
            │ ..''                         │ 
          0 │                              │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
            size = (1000, 100, 30), mode = 2 
            ┌──────────────────────────────┐ 
          7 │                            .'│ 
            │                         ..'  │ 
            │                       .'     │ 
            │                   ..''       │ 
Time (ms)   │               ..''           │ 
            │          ...''               │ 
            │       ..'                    │ 
            │    ..'                       │ 
            │ ..'                          │ 
          0 │                              │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
             size = (1000, 100, 30), mode = 3 
             ┌──────────────────────────────┐ 
          50 │                             :│ 
             │                          .'' │ 
             │                      ..''    │ 
             │                   ..'        │ 
Time (ms)    │                .''           │ 
             │            ..''              │ 
             │         ..'                  │ 
             │      .''                     │ 
             │   .''                        │ 
           0 │ ''                           │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
Baseline
             size = (30, 100, 1000), mode = 1 
             ┌──────────────────────────────┐ 
          90 │                             .│ 
             │                           .''│ 
             │                        .''   │ 
             │                     .''      │ 
Time (ms)    │                 ..''         │ 
             │             ..''             │ 
             │          ..'                 │ 
             │      ..''                    │ 
             │   ..'                        │ 
           0 │ .'                           │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
            size = (30, 100, 1000), mode = 2 
            ┌──────────────────────────────┐ 
          8 │                             .│ 
            │                           .''│ 
            │                        ..'   │ 
            │                      .'      │ 
Time (ms)   │                  ..''        │ 
            │             ...''            │ 
            │         ..''                 │ 
            │     ..''                     │ 
            │  .''                         │ 
          0 │ '                            │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
            size = (30, 100, 1000), mode = 3 
            ┌──────────────────────────────┐ 
          9 │                             .│ 
            │                          ..' │ 
            │                       ..'    │ 
            │                    ..'       │ 
Time (ms)   │                ..''          │ 
            │            ..''              │ 
            │        ..''                  │ 
            │     ..'                      │ 
            │  ..'                         │ 
          0 │ '                            │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
              size = (50, 50, 50), mode = 1  
            ┌──────────────────────────────┐ 
          3 │                              │ 
            │                              │ 
            │                           ...│ 
            │                       ..''   │ 
Time (ms)   │                    ..'       │ 
            │                ..''          │ 
            │           ..'''              │ 
            │       ..''                   │ 
            │   ..''                       │ 
          0 │ ''                           │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
              size = (50, 50, 50), mode = 2  
            ┌──────────────────────────────┐ 
          1 │                            .:│ 
            │                         ..'  │ 
            │                      ..'     │ 
            │                   .''        │ 
Time (ms)   │               ..''           │ 
            │            ..'               │ 
            │         ..'                  │ 
            │     ..''                     │ 
            │  ..'                         │ 
          0 │ '                            │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
              size = (50, 50, 50), mode = 3  
            ┌──────────────────────────────┐ 
          3 │                              │ 
            │                              │ 
            │                            .:│ 
            │                        ..''  │ 
Time (ms)   │                    ..''      │ 
            │               ...''          │ 
            │           ..''               │ 
            │       ..''                   │ 
            │   ..''                       │ 
          0 │ ''                           │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
             size = (100, 100, 100), mode = 1 
             ┌──────────────────────────────┐ 
          11 │                             .│ 
             │                           .''│ 
             │                       ..''   │ 
             │                    .''       │ 
Time (ms)    │                ..''          │ 
             │             .''              │ 
             │          .''                 │ 
             │      ..''                    │ 
             │   .''                        │ 
           0 │ ''                           │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
            size = (100, 100, 100), mode = 2 
            ┌──────────────────────────────┐ 
          6 │                             .│ 
            │                         .''' │ 
            │                      .''     │ 
            │                    .'        │ 
Time (ms)   │                ..''          │ 
            │             .''              │ 
            │           .'                 │ 
            │          :                   │ 
            │    ..''''                    │ 
          0 │ .''                          │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
             size = (100, 100, 100), mode = 3 
             ┌──────────────────────────────┐ 
          11 │                             .│ 
             │                           .''│ 
             │                        .''   │ 
             │                    ..''      │ 
Time (ms)    │                ..''          │ 
             │             ..'              │ 
             │          .''                 │ 
             │      ..''                    │ 
             │   .''                        │ 
           0 │ ''                           │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
             size = (150, 150, 150), mode = 1 
             ┌──────────────────────────────┐ 
          30 │                              │ 
             │                            .:│ 
             │                         ..'  │ 
             │                      ..'     │ 
Time (ms)    │                  ...'        │ 
             │              ..''            │ 
             │          ..''                │ 
             │      ..''                    │ 
             │   .''                        │ 
           0 │ ''                           │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
             size = (150, 150, 150), mode = 2 
             ┌──────────────────────────────┐ 
          20 │                              │ 
             │                              │ 
             │                              │ 
             │                          ...'│ 
Time (ms)    │                       .''    │ 
             │                  ...''       │ 
             │             ...''            │ 
             │        ...''                 │ 
             │   ..'''                      │ 
           0 │ ''                           │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
             size = (150, 150, 150), mode = 3 
             ┌──────────────────────────────┐ 
          30 │                              │ 
             │                             :│ 
             │                          .'' │ 
             │                      ..''    │ 
Time (ms)    │                  ..''        │ 
             │              ..''            │ 
             │          ..''                │ 
             │      ..''                    │ 
             │   .''                        │ 
           0 │ ''                           │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
             size = (200, 200, 200), mode = 1 
             ┌──────────────────────────────┐ 
          50 │                            .:│ 
             │                          .'  │ 
             │                       .''    │ 
             │                   ..''       │ 
Time (ms)    │               ..''           │ 
             │            .''               │ 
             │        ..''                  │ 
             │     .''                      │ 
             │  .''                         │ 
           0 │ '                            │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
             size = (200, 200, 200), mode = 2 
             ┌──────────────────────────────┐ 
          30 │                             .│ 
             │                           ..'│ 
             │                        ..'   │ 
             │                     .''      │ 
Time (ms)    │                 ..''         │ 
             │             ..''             │ 
             │         ..''                 │ 
             │     ..''                     │ 
             │   .'                         │ 
           0 │ ''                           │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
             size = (200, 200, 200), mode = 3 
             ┌──────────────────────────────┐ 
          50 │                             :│ 
             │                          .'' │ 
             │                     ..'''    │ 
             │                  .''         │ 
Time (ms)    │               ..'            │ 
             │            ..'               │ 
             │        ..''                  │ 
             │     ..'                      │ 
             │  ..'                         │ 
           0 │ '                            │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               
            size = (1000, 100, 30), mode = 1 
            ┌──────────────────────────────┐ 
          9 │                             :│ 
            │                          ..' │ 
            │                       ..'    │ 
            │                    ..'       │ 
Time (ms)   │                ..''          │ 
            │            ..''              │ 
            │        ..''                  │ 
            │     .''                      │ 
            │ ..''                         │ 
          0 │                              │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
            size = (1000, 100, 30), mode = 2 
            ┌──────────────────────────────┐ 
          7 │                            .'│ 
            │                         ..'  │ 
            │                      ..'     │ 
            │                   ..'        │ 
Time (ms)   │               ..''           │ 
            │           ..''               │ 
            │       ..''                   │ 
            │    ..'                       │ 
            │ ..'                          │ 
          0 │                              │ 
            └──────────────────────────────┘ 
             0                          300  
                          Rank               
             size = (1000, 100, 30), mode = 3 
             ┌──────────────────────────────┐ 
          90 │                             .│ 
             │                          ..''│ 
             │                       ..'    │ 
             │                    .''       │ 
Time (ms)    │                ..''          │ 
             │             ..'              │ 
             │          .''                 │ 
             │      ..''                    │ 
             │   ..'                        │ 
           0 │ .'                           │ 
             └──────────────────────────────┘ 
              0                          300  
                           Rank               

Runtime vs. mode

Below are plots showing the runtime in milliseconds of MTTKRP as a function of the mode, for varying sizes and ranks:

Target
       size = (30, 100, 1000), rank = 10 
       ┌                              ┐ 
                               ┐╷       
mode 1                         ├┤       
                               ┘╵       
             ┐╷                         
mode 2       ├┤                         
             ┘╵                         
            ┐╷                          
mode 3      ├┤                          
            ┘╵                          
       └                              ┘ 
        0             2              4  
                   Time (ms)            
       size = (30, 100, 1000), rank = 100 
       ┌                              ┐ 
             ┐                     ╷    
mode 1       ├─────────────────────┤    
             ┘                     ╵    
        ╷                               
mode 2  ┤                               
        ╵                               
        ╷                               
mode 3  ┤                               
        ╵                               
       └                              ┘ 
        0            45             90  
                   Time (ms)            
       size = (30, 100, 1000), rank = 200 
       ┌                              ┐ 
                               ┌┬╷      
mode 1                         ┤│┤      
                               └┴╵      
          ╷                             
mode 2    ┤                             
          ╵                             
           ╷                            
mode 3     ┤                            
           ╵                            
       └                              ┘ 
        0            20             40  
                   Time (ms)            
       size = (30, 100, 1000), rank = 300 
       ┌                              ┐ 
                                   ┐╷   
mode 1                             ├┤   
                                   ┘╵   
           ╷                            
mode 2     ┤                            
           ╵                            
           ┌╷                           
mode 3     ┤┤                           
           └╵                           
       └                              ┘ 
        0            25             50  
                   Time (ms)            
        size = (50, 50, 50), rank = 10  
       ┌                              ┐ 
                           ┐        ╷   
mode 1                     ├────────┤   
                           ┘        ╵   
         ┬┐             ╷               
mode 2   │├─────────────┤               
         ┴┘             ╵               
                          ┌─┬─┐    ╷    
mode 3                    ┤ │ ├────┤    
                          └─┴─┘    ╵    
       └                              ┘ 
        0.06        0.115         0.17  
                   Time (ms)            
        size = (50, 50, 50), rank = 50  
       ┌                              ┐ 
                          ┌┐ ╷          
mode 1                    ┤├─┤          
                          └┘ ╵          
              ┌┐   ╷                    
mode 2        ┤├───┤                    
              └┘   ╵                    
                           ┬┐ ╷         
mode 3                     │├─┤         
                           ┴┘ ╵         
       └                              ┘ 
        0.1          0.3           0.5  
                   Time (ms)            
        size = (50, 50, 50), rank = 100 
       ┌                              ┐ 
                              ┐     ╷   
mode 1                        ├─────┤   
                              ┘     ╵   
           ┐     ╷                      
mode 2     ├─────┤                      
           ┘     ╵                      
                              ┌┐   ╷    
mode 3                        ┤├───┤    
                              └┘   ╵    
       └                              ┘ 
        0.3         0.45           0.6  
                   Time (ms)            
        size = (50, 50, 50), rank = 150 
       ┌                              ┐ 
                  ┬╷                    
mode 1            │┤                    
                  ┴╵                    
              ┬┐                  ╷     
mode 2        │├──────────────────┤     
              ┴┘                  ╵     
                  ┐╷                    
mode 3            ├┤                    
                  ┘╵                    
       └                              ┘ 
        0             1              2  
                   Time (ms)            
        size = (50, 50, 50), rank = 200 
       ┌                              ┐ 
                          ╷┬┐   ╷       
mode 1                    ├│├───┤       
                          ╵┴┘   ╵       
        ╷┬┐  ╷                          
mode 2  ├│├──┤                          
        ╵┴┘  ╵                          
                           ╷┌┬┐   ╷     
mode 3                     ├┤│├───┤     
                           ╵└┴┘   ╵     
       └                              ┘ 
        0.6         0.85           1.1  
                   Time (ms)            
        size = (50, 50, 50), rank = 250 
       ┌                              ┐ 
                          ╷  ┌┬─┐╷      
mode 1                    ├──┤│ ├┤      
                          ╵  └┴─┘╵      
          ┌┐ ╷                          
mode 2    ┤├─┤                          
          └┘ ╵                          
                          ┌─┐╷          
mode 3                    ┤ ├┤          
                          └─┘╵          
       └                              ┘ 
        0.7         1.05           1.4  
                   Time (ms)            
        size = (50, 50, 50), rank = 300 
       ┌                              ┐ 
                             ╷┐  ╷      
mode 1                       ├├──┤      
                             ╵┘  ╵      
         ┐  ╷                           
mode 2   ├──┤                           
         ┘  ╵                           
                            ┬─┐  ╷      
mode 3                      │ ├──┤      
                            ┴─┘  ╵      
       └                              ┘ 
        0.9         1.25           1.6  
                   Time (ms)            
       size = (100, 100, 100), rank = 10 
       ┌                              ┐ 
                        ┌┐       ╷      
mode 1                  ┤├───────┤      
                        └┘       ╵      
         ┌┐    ╷                        
mode 2   ┤├────┤                        
         └┘    ╵                        
                      ╷┬┐  ╷            
mode 3                ├│├──┤            
                      ╵┴┘  ╵            
       └                              ┘ 
        0.2          0.5           0.8  
                   Time (ms)            
       size = (100, 100, 100), rank = 50 
       ┌                              ┐ 
                                 ╷╷     
mode 1                           ├┤     
                                 ╵╵     
          ┐  ╷                          
mode 2    ├──┤                          
          ┘  ╵                          
                                ╷┐  ╷   
mode 3                          ├├──┤   
                                ╵┘  ╵   
       └                              ┘ 
        0.7          1.2           1.7  
                   Time (ms)            
       size = (100, 100, 100), rank = 100 
       ┌                              ┐ 
                             ┐ ╷        
mode 1                       ├─┤        
                             ┘ ╵        
          ┌┐╷                           
mode 2    ┤├┤                           
          └┘╵                           
                            ┐ ╷         
mode 3                      ├─┤         
                            ┘ ╵         
       └                              ┘ 
        1             2              3  
                   Time (ms)            
       size = (100, 100, 100), rank = 150 
       ┌                              ┐ 
                               ╷┌─┬╷    
mode 1                         ├┤ │┤    
                               ╵└─┴╵    
        ╷ ┌┬┐ ╷                         
mode 2  ├─┤│├─┤                         
        ╵ └┴┘ ╵                         
                          ╷ ┌┬┐     ╷   
mode 3                    ├─┤│├─────┤   
                          ╵ └┴┘     ╵   
       └                              ┘ 
        2.7         3.15           3.6  
                   Time (ms)            
       size = (100, 100, 100), rank = 200 
       ┌                              ┐ 
                           ╷┌┐     ╷    
mode 1                     ├┤├─────┤    
                           ╵└┘     ╵    
                ┌┐╷                     
mode 2          ┤├┤                     
                └┘╵                     
                          ╷┐ ╷          
mode 3                    ├├─┤          
                          ╵┘ ╵          
       └                              ┘ 
        3             4              5  
                   Time (ms)            
       size = (100, 100, 100), rank = 250 
       ┌                              ┐ 
           ╷                            
mode 1     ┤                            
           ╵                            
         ╷                         ┐╷   
mode 2   ├─────────────────────────├┤   
         ╵                         ┘╵   
           ╷                            
mode 3     ┤                            
           ╵                            
       └                              ┘ 
        4            9.5            15  
                   Time (ms)            
       size = (100, 100, 100), rank = 300 
       ┌                              ┐ 
                     ╷┌┬───╷            
mode 1               ├┤│   ┤            
                     ╵└┴───╵            
            ┌┬┐            ╷            
mode 2      ┤│├────────────┤            
            └┴┘            ╵            
                       ┌┬┐ ╷            
mode 3                 ┤│├─┤            
                       └┴┘ ╵            
       └                              ┘ 
        5            6.5             8  
                   Time (ms)            
       size = (150, 150, 150), rank = 10 
       ┌                              ┐ 
        ╷                               
mode 1  ┤                               
        ╵                               
        ┐                        ╷      
mode 2  ├────────────────────────┤      
        ┘                        ╵      
        ╷                               
mode 3  ┤                               
        ╵                               
       └                              ┘ 
        0            250           500  
                   Time (ms)            
       size = (150, 150, 150), rank = 50 
       ┌                              ┐ 
                           ╷┬┐╷         
mode 1                     ├│├┤         
                           ╵┴┘╵         
        ┌╷                              
mode 2  ┤┤                              
        └╵                              
                            ┌┬─┐ ╷      
mode 3                      ┤│ ├─┤      
                            └┴─┘ ╵      
       └                              ┘ 
        3             4              5  
                   Time (ms)            
       size = (150, 150, 150), rank = 100 
       ┌                              ┐ 
                                  ┌─╷   
mode 1                            ┤ ┤   
                                  └─╵   
               ┌┬╷                      
mode 2         ┤│┤                      
               └┴╵                      
                                 ╷┬┐╷   
mode 3                           ├│├┤   
                                 ╵┴┘╵   
       └                              ┘ 
        4            5.5             7  
                   Time (ms)            
       size = (150, 150, 150), rank = 150 
       ┌                              ┐ 
                               ┌┐ ╷     
mode 1                         ┤├─┤     
                               └┘ ╵     
             ╷┬╷                        
mode 2       ├│┤                        
             ╵┴╵                        
                            ╷┌┐ ╷       
mode 3                      ├┤├─┤       
                            ╵└┘ ╵       
       └                              ┘ 
        6             8             10  
                   Time (ms)            
       size = (150, 150, 150), rank = 200 
       ┌                              ┐ 
                        ┌┐╷             
mode 1                  ┤├┤             
                        └┘╵             
                    ┬╷                  
mode 2              │┤                  
                    ┴╵                  
                        ┬┐         ╷    
mode 3                  │├─────────┤    
                        ┴┘         ╵    
       └                              ┘ 
        0            10             20  
                   Time (ms)            
       size = (150, 150, 150), rank = 250 
       ┌                              ┐ 
                                 ╷ ┌┬╷  
mode 1                           ├─┤│┤  
                                 ╵ └┴╵  
        ┬╷                              
mode 2  │┤                              
        ┴╵                              
                              ╷┐    ╷   
mode 3                        ├├────┤   
                              ╵┘    ╵   
       └                              ┘ 
        11           13             15  
                   Time (ms)            
       size = (150, 150, 150), rank = 300 
       ┌                              ┐ 
                           ┌┬┐  ╷       
mode 1                     ┤│├──┤       
                           └┴┘  ╵       
         ┌┐╷                            
mode 2   ┤├┤                            
         └┘╵                            
                             ┌┬┐   ╷    
mode 3                       ┤│├───┤    
                             └┴┘   ╵    
       └                              ┘ 
        13           16             19  
                   Time (ms)            
       size = (200, 200, 200), rank = 10 
       ┌                              ┐ 
                                 ┐ ╷    
mode 1                           ├─┤    
                                 ┘ ╵    
               ┐╷                       
mode 2         ├┤                       
               ┘╵                       
                         ┐╷             
mode 3                   ├┤             
                         ┘╵             
       └                              ┘ 
        1            2.5             4  
                   Time (ms)            
       size = (200, 200, 200), rank = 50 
       ┌                              ┐ 
                             ╷┐  ╷      
mode 1                       ├├──┤      
                             ╵┘  ╵      
           ╷┬╷                          
mode 2     ├│┤                          
           ╵┴╵                          
                            ┬┐╷         
mode 3                      │├┤         
                            ┴┘╵         
       └                              ┘ 
        6             8             10  
                   Time (ms)            
       size = (200, 200, 200), rank = 100 
       ┌                              ┐ 
                               ┌─╷      
mode 1                         ┤ ┤      
                               └─╵      
             ┌┐╷                        
mode 2       ┤├┤                        
             └┘╵                        
                              ╷ ┬─┐╷    
mode 3                        ├─│ ├┤    
                              ╵ ┴─┘╵    
       └                              ┘ 
        9           11.5            14  
                   Time (ms)            
       size = (200, 200, 200), rank = 150 
       ┌                              ┐ 
                            ┌─┬──┐╷     
mode 1                      ┤ │  ├┤     
                            └─┴──┘╵     
           ┬╷                           
mode 2     │┤                           
           ┴╵                           
                          ┌┬─┐╷         
mode 3                    ┤│ ├┤         
                          └┴─┘╵         
       └                              ┘ 
        13          16.5            20  
                   Time (ms)            
       size = (200, 200, 200), rank = 200 
       ┌                              ┐ 
                ╷╷                      
mode 1          ├┤                      
                ╵╵                      
            ╷╷                          
mode 2      ├┤                          
            ╵╵                          
                ┌┬──┐          ╷        
mode 3          ┤│  ├──────────┤        
                └┴──┘          ╵        
       └                              ┘ 
        10           30             50  
                   Time (ms)            
       size = (200, 200, 200), rank = 250 
       ┌                              ┐ 
                             ╷┌┐  ╷     
mode 1                       ├┤├──┤     
                             ╵└┘  ╵     
         ┌┐╷                            
mode 2   ┤├┤                            
         └┘╵                            
                          ┌───┬─┐ ╷     
mode 3                    ┤   │ ├─┤     
                          └───┴─┘ ╵     
       └                              ┘ 
        21          25.5            30  
                   Time (ms)            
       size = (200, 200, 200), rank = 300 
       ┌                              ┐ 
                            ┌┐╷         
mode 1                      ┤├┤         
                            └┘╵         
                 ┬╷                     
mode 2           │┤                     
                 ┴╵                     
                           ╷┌┬─┐╷       
mode 3                     ├┤│ ├┤       
                           ╵└┴─┘╵       
       └                              ┘ 
        20           30             40  
                   Time (ms)            
       size = (1000, 100, 30), rank = 10 
       ┌                              ┐ 
             ╷╷                         
mode 1       ├┤                         
             ╵╵                         
             ┐╷                         
mode 2       ├┤                         
             ┘╵                         
                             ┬╷         
mode 3                       │┤         
                             ┴╵         
       └                              ┘ 
        0             2              4  
                   Time (ms)            
       size = (1000, 100, 30), rank = 100 
       ┌                              ┐ 
        ┐╷                              
mode 1  ├┤                              
        ┘╵                              
        ┬────┐                     ╷    
mode 2  │    ├─────────────────────┤    
        ┴────┘                     ╵    
                ╷                       
mode 3          ┤                       
                ╵                       
       └                              ┘ 
        0            30             60  
                   Time (ms)            
       size = (1000, 100, 30), rank = 200 
       ┌                              ┐ 
           ╷                            
mode 1     ┤                            
           ╵                            
          ╷                             
mode 2    ┤                             
          ╵                             
                              ┌┬╷       
mode 3                        ┤│┤       
                              └┴╵       
       └                              ┘ 
        0            20             40  
                   Time (ms)            
       size = (1000, 100, 30), rank = 300 
       ┌                              ┐ 
           ┐╷                           
mode 1     ├┤                           
           ┘╵                           
           ╷                            
mode 2     ┤                            
           ╵                            
                                  ╷┌╷   
mode 3                            ├┤┤   
                                  ╵└╵   
       └                              ┘ 
        0            25             50  
                   Time (ms)            
Baseline
       size = (30, 100, 1000), rank = 10 
       ┌                              ┐ 
                                ┐ ╷     
mode 1                          ├─┤     
                                ┘ ╵     
            ╷                           
mode 2      ┤                           
            ╵                           
            ╷                           
mode 3      ┤                           
            ╵                           
       └                              ┘ 
        0            2.5             5  
                   Time (ms)            
       size = (30, 100, 1000), rank = 100 
       ┌                              ┐ 
           ┐                ╷           
mode 1     ├────────────────┤           
           ┘                ╵           
        ╷                               
mode 2  ┤                               
        ╵                               
        ╷                               
mode 3  ┤                               
        ╵                               
       └                              ┘ 
        0            100           200  
                   Time (ms)            
       size = (30, 100, 1000), rank = 200 
       ┌                              ┐ 
                          ┌┐        ╷   
mode 1                    ┤├────────┤   
                          └┘        ╵   
         ╷                              
mode 2   ┤                              
         ╵                              
         ╷                              
mode 3   ┤                              
         ╵                              
       └                              ┘ 
        0            40             80  
                   Time (ms)            
       size = (30, 100, 1000), rank = 300 
       ┌                              ┐ 
                   ┌┐  ╷                
mode 1             ┤├──┤                
                   └┘  ╵                
        ┐╷                              
mode 2  ├┤                              
        ┘╵                              
        ╷                               
mode 3  ┤                               
        ╵                               
       └                              ┘ 
        0            100           200  
                   Time (ms)            
        size = (50, 50, 50), rank = 10  
       ┌                              ┐ 
                              ┌┐    ╷   
mode 1                        ┤├────┤   
                              └┘    ╵   
                 ┬┐      ╷              
mode 2           │├──────┤              
                 ┴┘      ╵              
                              ┬┐   ╷    
mode 3                        │├───┤    
                              ┴┘   ╵    
       └                              ┘ 
        0            0.1           0.2  
                   Time (ms)            
        size = (50, 50, 50), rank = 50  
       ┌                              ┐ 
                               ┬─┐╷     
mode 1                         │ ├┤     
                               ┴─┘╵     
        ┬┐   ╷                          
mode 2  │├───┤                          
        ┴┘   ╵                          
                               ┬┐╷      
mode 3                         │├┤      
                               ┴┘╵      
       └                              ┘ 
        0.2          0.4           0.6  
                   Time (ms)            
        size = (50, 50, 50), rank = 100 
       ┌                              ┐ 
                             ╷┬┐  ╷     
mode 1                       ├│├──┤     
                             ╵┴┘  ╵     
         ┐  ╷                           
mode 2   ├──┤                           
         ┘  ╵                           
                             ┌┐ ╷       
mode 3                       ┤├─┤       
                             └┘ ╵       
       └                              ┘ 
        0.3         0.65             1  
                   Time (ms)            
        size = (50, 50, 50), rank = 150 
       ┌                              ┐ 
                                ┌┬┐╷    
mode 1                          ┤│├┤    
                                └┴┘╵    
          ┬┐╷                           
mode 2    │├┤                           
          ┴┘╵                           
                                 ╷┬┐╷   
mode 3                           ├│├┤   
                                 ╵┴┘╵   
       └                              ┘ 
        0.4         0.85           1.3  
                   Time (ms)            
        size = (50, 50, 50), rank = 200 
       ┌                              ┐ 
                              ╷         
mode 1                        ┤         
                              ╵         
                 ┐╷                     
mode 2           ├┤                     
                 ┘╵                     
                             ╷┬┐  ╷     
mode 3                       ├│├──┤     
                             ╵┴┘  ╵     
       └                              ┘ 
        0             1              2  
                   Time (ms)            
        size = (50, 50, 50), rank = 250 
       ┌                              ┐ 
                          ╷┬╷           
mode 1                    ├│┤           
                          ╵┴╵           
               ┐╷                       
mode 2         ├┤                       
               ┘╵                       
                          ┐ ╷           
mode 3                    ├─┤           
                          ┘ ╵           
       └                              ┘ 
        0            1.5             3  
                   Time (ms)            
        size = (50, 50, 50), rank = 300 
       ┌                              ┐ 
                             ┬╷         
mode 1                       │┤         
                             ┴╵         
                 ╷                      
mode 2           ┤                      
                 ╵                      
                              ┐ ╷       
mode 3                        ├─┤       
                              ┘ ╵       
       └                              ┘ 
        0            1.5             3  
                   Time (ms)            
       size = (100, 100, 100), rank = 10 
       ┌                              ┐ 
                         ╷┐       ╷     
mode 1                   ├├───────┤     
                         ╵┘       ╵     
         ┬┐   ╷                         
mode 2   │├───┤                         
         ┴┘   ╵                         
                         ┌┐  ╷          
mode 3                   ┤├──┤          
                         └┘  ╵          
       └                              ┘ 
        0.2         0.55           0.9  
                   Time (ms)            
       size = (100, 100, 100), rank = 50 
       ┌                              ┐ 
                            ┌╷          
mode 1                      ┤┤          
                            └╵          
               ┐╷                       
mode 2         ├┤                       
               ┘╵                       
                            ┬┐╷         
mode 3                      │├┤         
                            ┴┘╵         
       └                              ┘ 
        0            1.5             3  
                   Time (ms)            
       size = (100, 100, 100), rank = 100 
       ┌                              ┐ 
                                ╷┐ ╷    
mode 1                          ├├─┤    
                                ╵┘ ╵    
         ┐╷                             
mode 2   ├┤                             
         ┘╵                             
                               ┌┬╷      
mode 3                         ┤│┤      
                               └┴╵      
       └                              ┘ 
        1            2.5             4  
                   Time (ms)            
       size = (100, 100, 100), rank = 150 
       ┌                              ┐ 
                               ╷┐╷      
mode 1                         ├├┤      
                               ╵┘╵      
             ┐╷                         
mode 2       ├┤                         
             ┘╵                         
                               ┬┐ ╷     
mode 3                         │├─┤     
                               ┴┘ ╵     
       └                              ┘ 
        2             4              6  
                   Time (ms)            
       size = (100, 100, 100), rank = 200 
       ┌                              ┐ 
                             ┌┬╷        
mode 1                       ┤│┤        
                             └┴╵        
           ┐╷                           
mode 2     ├┤                           
           ┘╵                           
                             ┬┐ ╷       
mode 3                       │├─┤       
                             ┴┘ ╵       
       └                              ┘ 
        3            5.5             8  
                   Time (ms)            
       size = (100, 100, 100), rank = 250 
       ┌                              ┐ 
                                 ┬┐ ╷   
mode 1                           │├─┤   
                                 ┴┘ ╵   
            ╷┬──────────┐           ╷   
mode 2      ├│          ├───────────┤   
            ╵┴──────────┘           ╵   
                                 ┐  ╷   
mode 3                           ├──┤   
                                 ┘  ╵   
       └                              ┘ 
        4            6.5             9  
                   Time (ms)            
       size = (100, 100, 100), rank = 300 
       ┌                              ┐ 
                               ╷┌┐╷     
mode 1                         ├┤├┤     
                               ╵└┘╵     
         ┌╷                             
mode 2   ┤┤                             
         └╵                             
                                ┐╷      
mode 3                          ├┤      
                                ┘╵      
       └                              ┘ 
        5             8             11  
                   Time (ms)            
       size = (150, 150, 150), rank = 10 
       ┌                              ┐ 
        ╷                               
mode 1  ┤                               
        ╵                               
        ┐                         ╷     
mode 2  ├─────────────────────────┤     
        ┘                         ╵     
        ╷                               
mode 3  ┤                               
        ╵                               
       └                              ┘ 
        0            250           500  
                   Time (ms)            
       size = (150, 150, 150), rank = 50 
       ┌                              ┐ 
                              ┬─╷       
mode 1                        │ ┤       
                              ┴─╵       
             ┐╷                         
mode 2       ├┤                         
             ┘╵                         
                            ┌┐╷         
mode 3                      ┤├┤         
                            └┘╵         
       └                              ┘ 
        2            4.5             7  
                   Time (ms)            
       size = (150, 150, 150), rank = 100 
       ┌                              ┐ 
                                 ╷┌╷    
mode 1                           ├┤┤    
                                 ╵└╵    
           ┬╷                           
mode 2     │┤                           
           ┴╵                           
                                 ┌┐ ╷   
mode 3                           ┤├─┤   
                                 └┘ ╵   
       └                              ┘ 
        4             7             10  
                   Time (ms)            
       size = (150, 150, 150), rank = 150 
       ┌                              ┐ 
                                  ┬─╷   
mode 1                            │ ┤   
                                  ┴─╵   
          ╷╷                            
mode 2    ├┤                            
          ╵╵                            
                                 ┬┐ ╷   
mode 3                           │├─┤   
                                 ┴┘ ╵   
       └                              ┘ 
        6            10             14  
                   Time (ms)            
       size = (150, 150, 150), rank = 200 
       ┌                              ┐ 
                                ┬┐ ╷    
mode 1                          │├─┤    
                                ┴┘ ╵    
          ╷                             
mode 2    ┤                             
          ╵                             
                                 ╷┐ ╷   
mode 3                           ├├─┤   
                                 ╵┘ ╵   
       └                              ┘ 
        8            13             18  
                   Time (ms)            
       size = (150, 150, 150), rank = 250 
       ┌                              ┐ 
                       ╷┐╷              
mode 1                 ├├┤              
                       ╵┘╵              
         ┬╷                             
mode 2   │┤                             
         ┴╵                             
                       ┬╷               
mode 3                 │┤               
                       ┴╵               
       └                              ┘ 
        10           20             30  
                   Time (ms)            
       size = (150, 150, 150), rank = 300 
       ┌                              ┐ 
                              ╷╷        
mode 1                        ├┤        
                              ╵╵        
            ┌╷                          
mode 2      ┤┤                          
            └╵                          
                             ╷┬╷        
mode 3                       ├│┤        
                             ╵┴╵        
       └                              ┘ 
        10           20             30  
                   Time (ms)            
       size = (200, 200, 200), rank = 10 
       ┌                              ┐ 
                               ╷╷       
mode 1                         ├┤       
                               ╵╵       
             ┐╷                         
mode 2       ├┤                         
             ┘╵                         
                        ╷               
mode 3                  ┤               
                        ╵               
       └                              ┘ 
        1             3              5  
                   Time (ms)            
       size = (200, 200, 200), rank = 50 
       ┌                              ┐ 
                                  ┬╷    
mode 1                            │┤    
                                  ┴╵    
          ┐╷                            
mode 2    ├┤                            
          ┘╵                            
                                ┬╷      
mode 3                          │┤      
                                ┴╵      
       └                              ┘ 
        6             9             12  
                   Time (ms)            
       size = (200, 200, 200), rank = 100 
       ┌                              ┐ 
                               ╷┌╷      
mode 1                         ├┤┤      
                               ╵└╵      
        ┬───┐                       ╷   
mode 2  │   ├───────────────────────┤   
        ┴───┘                       ╵   
                              ╷╷        
mode 3                        ├┤        
                              ╵╵        
       └                              ┘ 
        10           15             20  
                   Time (ms)            
       size = (200, 200, 200), rank = 150 
       ┌                              ┐ 
                              ┐╷        
mode 1                        ├┤        
                              ┘╵        
             ╷                          
mode 2       ┤                          
             ╵                          
                             ┌╷         
mode 3                       ┤┤         
                             └╵         
       └                              ┘ 
        10           20             30  
                   Time (ms)            
       size = (200, 200, 200), rank = 200 
       ┌                              ┐ 
                    ┬╷                  
mode 1              │┤                  
                    ┴╵                  
            ┐     ╷                     
mode 2      ├─────┤                     
            ┘     ╵                     
                    ┌─┬─┐       ╷       
mode 3              ┤ │ ├───────┤       
                    └─┴─┘       ╵       
       └                              ┘ 
        10           35             60  
                   Time (ms)            
       size = (200, 200, 200), rank = 250 
       ┌                              ┐ 
                          ┌┐╷           
mode 1                    ┤├┤           
                          └┘╵           
         ┐╷                             
mode 2   ├┤                             
         ┘╵                             
                          ╷╷            
mode 3                    ├┤            
                          ╵╵            
       └                              ┘ 
        20           35             50  
                   Time (ms)            
       size = (200, 200, 200), rank = 300 
       ┌                              ┐ 
                                   ╷╷   
mode 1                             ├┤   
                                   ╵╵   
              ┐╷                        
mode 2        ├┤                        
              ┘╵                        
                                   ╷    
mode 3                             ┤    
                                   ╵    
       └                              ┘ 
        20           35             50  
                   Time (ms)            
       size = (1000, 100, 30), rank = 10 
       ┌                              ┐ 
            ╷╷                          
mode 1      ├┤                          
            ╵╵                          
            ╷                           
mode 2      ┤                           
            ╵                           
                                ┬┐╷     
mode 3                          │├┤     
                                ┴┘╵     
       └                              ┘ 
        0            2.5             5  
                   Time (ms)            
       size = (1000, 100, 30), rank = 100 
       ┌                              ┐ 
         ╷                              
mode 1   ┤                              
         ╵                              
        ┐                          ╷    
mode 2  ├──────────────────────────┤    
        ┘                          ╵    
                     ╷╷                 
mode 3               ├┤                 
                     ╵╵                 
       └                              ┘ 
        0            30             60  
                   Time (ms)            
       size = (1000, 100, 30), rank = 200 
       ┌                              ┐ 
         ╷                              
mode 1   ┤                              
         ╵                              
         ╷                              
mode 2   ┤                              
         ╵                              
                           ╷┐        ╷  
mode 3                     ├├────────┤  
                           ╵┘        ╵  
       └                              ┘ 
        0            40             80  
                   Time (ms)            
       size = (1000, 100, 30), rank = 300 
       ┌                              ┐ 
         ╷                              
mode 1   ┤                              
         ╵                              
         ╷                              
mode 2   ┤                              
         ╵                              
                             ┌┬┐    ╷   
mode 3                       ┤│├────┤   
                             └┴┘    ╵   
       └                              ┘ 
        0            55            110  
                   Time (ms)            

@dahong67
Owner Author

dahong67 commented Mar 1, 2024

All done. The test failures seem to be related to TestItemRunner on Julia nightly. Merging!

@dahong67 dahong67 merged commit 20b2bfb into master Mar 1, 2024
9 of 12 checks passed
@dahong67 dahong67 deleted the dahong67/issue34 branch March 1, 2024 22:18
@dahong67 dahong67 restored the dahong67/issue34 branch March 1, 2024 22:21
@dahong67 dahong67 deleted the dahong67/issue34 branch March 1, 2024 22:22
This was referenced Mar 4, 2024
Linked issue: Faster Khatri-Rao product