Faster MTTKRPs Algorithm #40
base: master
Codecov Report

Coverage diff, master vs. #40:

| | master | #40 | +/- |
|---|---|---|---|
| Coverage | 100.00% | 99.34% | -0.66% |
| Files | 12 | 12 | |
| Lines | 258 | 306 | +48 |
| Hits | 258 | 304 | +46 |
| Misses | 0 | 2 | +2 |
For some reason, although running one iteration of MTTKRPs is much faster with the new algorithm than with the old implementation, the gcp benchmarks show no improvement:
Benchmark Report for

| ID | time ratio | memory ratio |
|---|---|---|
| ["gcp", "bernoulliOdds-size(X)=(15, 20, 25), rank(X)=1"] | 0.85 (5%) ✅ | 0.94 (1%) ✅ |
| ["gcp", "bernoulliOdds-size(X)=(15, 20, 25), rank(X)=2"] | 0.86 (5%) ✅ | 1.07 (1%) ❌ |
| ["gcp", "bernoulliOdds-size(X)=(30, 40, 50), rank(X)=1"] | 0.80 (5%) ✅ | 1.04 (1%) ❌ |
| ["gcp", "bernoulliOdds-size(X)=(30, 40, 50), rank(X)=2"] | 0.86 (5%) ✅ | 0.97 (1%) ✅ |
| ["gcp", "gamma-size(X)=(15, 20, 25), rank(X)=1"] | 1.21 (5%) ❌ | 0.97 (1%) ✅ |
| ["gcp", "gamma-size(X)=(15, 20, 25), rank(X)=2"] | 0.91 (5%) ✅ | 0.90 (1%) ✅ |
| ["gcp", "gamma-size(X)=(30, 40, 50), rank(X)=1"] | 1.21 (5%) ❌ | 1.22 (1%) ❌ |
| ["gcp", "gamma-size(X)=(30, 40, 50), rank(X)=2"] | 0.82 (5%) ✅ | 1.02 (1%) ❌ |
| ["gcp", "least-squares-size(X)=(15, 20, 25), rank(X)=2"] | 0.89 (5%) ✅ | 1.00 (1%) |
| ["gcp", "least-squares-size(X)=(30, 40, 50), rank(X)=1"] | 1.22 (5%) ❌ | 1.00 (1%) |
| ["gcp", "poisson-size(X)=(15, 20, 25), rank(X)=1"] | 1.08 (5%) ❌ | 1.00 (1%) |
| ["gcp", "poisson-size(X)=(15, 20, 25), rank(X)=2"] | 0.99 (5%) | 1.25 (1%) ❌ |
| ["gcp", "poisson-size(X)=(30, 40, 50), rank(X)=1"] | 0.82 (5%) ✅ | 0.90 (1%) ✅ |
| ["gcp", "poisson-size(X)=(30, 40, 50), rank(X)=2"] | 1.31 (5%) ❌ | 1.29 (1%) ❌ |
Benchmark Group List
Here's a list of all the benchmark groups executed by this job:
["gcp"]
Julia versioninfo
Target
Julia Version 1.10.0
Commit 3120989f39 (2023-12-25 18:01 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Windows (x86_64-w64-mingw32)
Microsoft Windows [Version 10.0.22621.3155]
CPU: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz:
speed user nice sys idle irq
#1-16 2304 MHz 44819604 0 44240962 7935611915 1134929 ticks
Memory: 31.726390838623047 GB (13396.0078125 MB free)
Uptime: 1.04030714e6 sec
Load Avg: 0.0 0.0 0.0
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, tigerlake)
Threads: 1 on 16 virtual cores
Baseline
Julia Version 1.10.0
Commit 3120989f39 (2023-12-25 18:01 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Windows (x86_64-w64-mingw32)
Microsoft Windows [Version 10.0.22621.3155]
CPU: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz:
speed user nice sys idle irq
#1-16 2304 MHz 44924368 0 44263742 7937208507 1135085 ticks
Memory: 31.726390838623047 GB (13700.35546875 MB free)
Uptime: 1.04041489e6 sec
Load Avg: 0.0 0.0 0.0
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, tigerlake)
Threads: 1 on 16 virtual cores
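For reference, the operation being timed in the report above can be written down directly from its definition. The sketch below is a naive mode-`n` MTTKRP for a 3-way array, checked against the matricized form. It is the textbook definition, not the faster algorithm from this PR; the function name `mttkrp_naive` and the tuple-of-factor-matrices layout are assumptions made for illustration.

```julia
# Naive mode-n MTTKRP for a 3-way array (illustrative only, not this PR's algorithm).
# G[i_n, r] = sum over all entries of X[i1, i2, i3] * prod_{m != n} U[m][i_m, r]
function mttkrp_naive(X::AbstractArray{T,3}, U::NTuple{3,AbstractMatrix}, n::Int) where {T}
    R = size(U[1], 2)                 # number of components (columns of each factor)
    G = zeros(T, size(X, n), R)
    for I in CartesianIndices(X), r in 1:R
        w = X[I]
        for m in 1:3
            m == n && continue        # skip the mode being computed
            w *= U[m][I[m], r]
        end
        G[I[n], r] += w
    end
    return G
end

# Small deterministic example
X = reshape(collect(1.0:24.0), 2, 3, 4)
A = ones(2, 2)
B = reshape(collect(1.0:6.0), 3, 2)
C = reshape(collect(1.0:8.0), 4, 2)

# Mode-1 reference: unfold X and multiply by the column-wise Kronecker product of C and B
KR = hcat((kron(C[:, r], B[:, r]) for r in 1:2)...)
G_ref = reshape(X, 2, :) * KR

G = mttkrp_naive(X, (A, B, C), 1)
println(G ≈ G_ref)
```

A loop like this touches every entry of `X` once per component, which makes it a useful correctness baseline when comparing faster implementations.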
Changing the gcp_func method from `sum(value(loss, X[I], M[I]) for I in CartesianIndices(X) if !ismissing(X[I]))` to `mapreduce(I -> !ismissing(X[I]) ? value(loss, X[I], M[I]) : 0, +, CartesianIndices(X))` provides a decent memory benefit and some speed-up (both target and baseline here use the old MTTKRPs):
Benchmark Report for

| ID | time ratio | memory ratio |
|---|---|---|
| ["gcp", "bernoulliOdds-size(X)=(15, 20, 25), rank(X)=1"] | 0.90 (5%) ✅ | 0.35 (1%) ✅ |
| ["gcp", "bernoulliOdds-size(X)=(15, 20, 25), rank(X)=2"] | 0.82 (5%) ✅ | 0.33 (1%) ✅ |
| ["gcp", "bernoulliOdds-size(X)=(30, 40, 50), rank(X)=1"] | 0.88 (5%) ✅ | 0.34 (1%) ✅ |
| ["gcp", "bernoulliOdds-size(X)=(30, 40, 50), rank(X)=2"] | 0.87 (5%) ✅ | 0.34 (1%) ✅ |
| ["gcp", "gamma-size(X)=(15, 20, 25), rank(X)=1"] | 1.01 (5%) | 0.40 (1%) ✅ |
| ["gcp", "gamma-size(X)=(15, 20, 25), rank(X)=2"] | 0.90 (5%) ✅ | 0.41 (1%) ✅ |
| ["gcp", "gamma-size(X)=(30, 40, 50), rank(X)=1"] | 1.01 (5%) | 0.46 (1%) ✅ |
| ["gcp", "gamma-size(X)=(30, 40, 50), rank(X)=2"] | 0.78 (5%) ✅ | 0.34 (1%) ✅ |
| ["gcp", "least-squares-size(X)=(15, 20, 25), rank(X)=2"] | 1.10 (5%) ❌ | 1.00 (1%) |
| ["gcp", "least-squares-size(X)=(30, 40, 50), rank(X)=1"] | 1.26 (5%) ❌ | 1.00 (1%) |
| ["gcp", "least-squares-size(X)=(30, 40, 50), rank(X)=2"] | 1.35 (5%) ❌ | 1.00 (1%) |
| ["gcp", "poisson-size(X)=(15, 20, 25), rank(X)=1"] | 0.84 (5%) ✅ | 0.56 (1%) ✅ |
| ["gcp", "poisson-size(X)=(15, 20, 25), rank(X)=2"] | 1.05 (5%) ❌ | 0.83 (1%) ✅ |
| ["gcp", "poisson-size(X)=(30, 40, 50), rank(X)=1"] | 0.97 (5%) | 0.56 (1%) ✅ |
| ["gcp", "poisson-size(X)=(30, 40, 50), rank(X)=2"] | 0.84 (5%) ✅ | 0.55 (1%) ✅ |
Benchmark Group List
Here's a list of all the benchmark groups executed by this job:
["gcp"]
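The two forms of the objective compared in this report can be reproduced on a toy problem. The sketch below copies the `sum` and `mapreduce` expressions from the comment above; the loss type `ToyLeastSquares`, its `value` method, and the wrapper names `objective_sum` and `objective_mapreduce` are stand-ins invented for this example, not the package's definitions.

```julia
# Hypothetical stand-in for a loss type and its `value` method
struct ToyLeastSquares end
value(::ToyLeastSquares, x, m) = (x - m)^2

# Generator with a filter, as in the original gcp_func expression
objective_sum(M, X, loss) =
    sum(value(loss, X[I], M[I]) for I in CartesianIndices(X) if !ismissing(X[I]))

# mapreduce over all indices, contributing 0 for missing entries
objective_mapreduce(M, X, loss) =
    mapreduce(I -> !ismissing(X[I]) ? value(loss, X[I], M[I]) : 0, +, CartesianIndices(X))

X = [1.0 missing; 3.0 4.0]   # data with one missing entry
M = [0.5 2.0; 2.0 4.5]       # model values at the same indices
loss = ToyLeastSquares()

println(objective_sum(M, X, loss))
println(objective_mapreduce(M, X, loss))
```

Both functions skip the missing entry and return the same value here, so the difference measured in the report is in time and allocations only.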
Benchmark comparison of the old getindex for CPD versus the new getindex that doesn't use sum:
Benchmark Report for

| ID | time ratio | memory ratio |
|---|---|---|
| ["gcp", "bernoulliOdds-size(X)=(15, 20, 25), rank(X)=1"] | 0.33 (5%) ✅ | 0.08 (1%) ✅ |
| ["gcp", "bernoulliOdds-size(X)=(15, 20, 25), rank(X)=2"] | 0.50 (5%) ✅ | 0.09 (1%) ✅ |
| ["gcp", "bernoulliOdds-size(X)=(30, 40, 50), rank(X)=1"] | 0.29 (5%) ✅ | 0.08 (1%) ✅ |
| ["gcp", "bernoulliOdds-size(X)=(30, 40, 50), rank(X)=2"] | 0.48 (5%) ✅ | 0.07 (1%) ✅ |
| ["gcp", "gamma-size(X)=(15, 20, 25), rank(X)=1"] | 0.47 (5%) ✅ | 0.06 (1%) ✅ |
| ["gcp", "gamma-size(X)=(15, 20, 25), rank(X)=2"] | 0.48 (5%) ✅ | 0.07 (1%) ✅ |
| ["gcp", "gamma-size(X)=(30, 40, 50), rank(X)=1"] | 0.44 (5%) ✅ | 0.06 (1%) ✅ |
| ["gcp", "gamma-size(X)=(30, 40, 50), rank(X)=2"] | 0.45 (5%) ✅ | 0.05 (1%) ✅ |
| ["gcp", "poisson-size(X)=(15, 20, 25), rank(X)=1"] | 0.42 (5%) ✅ | 0.02 (1%) ✅ |
| ["gcp", "poisson-size(X)=(15, 20, 25), rank(X)=2"] | 0.97 (5%) | 0.83 (1%) ✅ |
| ["gcp", "poisson-size(X)=(30, 40, 50), rank(X)=1"] | 0.39 (5%) ✅ | 0.03 (1%) ✅ |
| ["gcp", "poisson-size(X)=(30, 40, 50), rank(X)=2"] | 0.46 (5%) ✅ | 0.03 (1%) ✅ |
Benchmark Group List
Here's a list of all the benchmark groups executed by this job:
["gcp"]
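To make the getindex comparison above concrete, here is a minimal sketch of computing one entry of a CP decomposition in two ways: with `sum` over a generator, and with an explicit accumulation loop. The struct `ToyCPD`, its fields `λ` and `U`, and the function names are hypothetical and only mimic the general shape of such a type; the package's actual getindex may differ.

```julia
# Hypothetical minimal CP decomposition: weights λ and one factor matrix per mode
struct ToyCPD{T,N}
    λ::Vector{T}
    U::NTuple{N,Matrix{T}}
end

# Entry via `sum` over a generator (the style being replaced)
entry_sum(M::ToyCPD, I::Vararg{Int}) =
    sum(M.λ[r] * prod(M.U[n][I[n], r] for n in 1:length(I)) for r in 1:length(M.λ))

# Entry via an explicit loop with a typed accumulator (no `sum`)
function entry_loop(M::ToyCPD{T,N}, I::Vararg{Int,N}) where {T,N}
    total = zero(T)
    for r in 1:length(M.λ)
        term = M.λ[r]
        for n in 1:N
            term *= M.U[n][I[n], r]
        end
        total += term
    end
    return total
end

U1 = [1.0 2.0; 3.0 4.0]
U2 = [1.0 0.0; 0.0 1.0]
U3 = [1.0 1.0]
M = ToyCPD([2.0, 1.0], (U1, U2, U3))

println(entry_sum(M, 2, 1, 1))
println(entry_loop(M, 2, 1, 1))
```

Since the objective evaluates a model entry for every index of `X`, any per-entry overhead in this function is multiplied by the number of tensor entries, which is consistent with the large ratios in the table.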
…er mttkrps in grad_U
Benchmark report using the new getindex for CPD, comparing MTTKRPs implementations:
Benchmark Report for

| ID | time ratio | memory ratio |
|---|---|---|
| ["gcp", "bernoulliOdds-size(X)=(15, 20, 25), rank(X)=1"] | 1.11 (5%) ❌ | 1.07 (1%) ❌ |
| ["gcp", "bernoulliOdds-size(X)=(15, 20, 25), rank(X)=2"] | 0.87 (5%) ✅ | 0.88 (1%) ✅ |
| ["gcp", "bernoulliOdds-size(X)=(30, 40, 50), rank(X)=1"] | 0.97 (5%) | 0.96 (1%) ✅ |
| ["gcp", "gamma-size(X)=(15, 20, 25), rank(X)=1"] | 1.02 (5%) | 1.02 (1%) ❌ |
| ["gcp", "gamma-size(X)=(15, 20, 25), rank(X)=2"] | 1.02 (5%) | 1.11 (1%) ❌ |
| ["gcp", "gamma-size(X)=(30, 40, 50), rank(X)=1"] | 0.94 (5%) ✅ | 0.92 (1%) ✅ |
| ["gcp", "gamma-size(X)=(30, 40, 50), rank(X)=2"] | 0.94 (5%) ✅ | 0.95 (1%) ✅ |
| ["gcp", "poisson-size(X)=(15, 20, 25), rank(X)=1"] | 1.04 (5%) | 1.08 (1%) ❌ |
| ["gcp", "poisson-size(X)=(15, 20, 25), rank(X)=2"] | 0.63 (5%) ✅ | 0.46 (1%) ✅ |
| ["gcp", "poisson-size(X)=(30, 40, 50), rank(X)=1"] | 0.96 (5%) | 0.97 (1%) ✅ |
| ["gcp", "poisson-size(X)=(30, 40, 50), rank(X)=2"] | 0.87 (5%) ✅ | 0.96 (1%) ✅ |
Benchmark Group List
Here's a list of all the benchmark groups executed by this job:
["gcp"]
Addresses #17.