Backend-specific utilities (derivative, multiderivative, gradient, jacobian) #24

Merged
merged 11 commits into from
Mar 8, 2024

@gdalle gdalle commented Mar 6, 2024

Doc changes

  • Put more info in the README, use it as docs home page
  • Introduce the distinction between "primitives" (pushforward, pullback) and "utilities" (derivative, multiderivative, gradient, jacobian)
  • Improve documentation, especially for backends
  • Avoid exposing the internals of structs (like type parameters), only document constructors and accessors

Structural changes

  • Add a custom type parameter to AbstractBackend controlling whether backend-specific (custom) utilities are used instead of our fallbacks (which call pushforward and pullback). Implement convenience constructors with the default custom = true
  • Define a few backend manipulation functions like handles_input_type, handles_output_type, autodiff_mode and is_custom
  • Add tests for utilities, make sure that they call both custom and fallback versions
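
To make the custom/fallback distinction above concrete, here is a minimal sketch in plain Python rather than the package's Julia code. Every name in it (the dictionary-based "backend", `gradient_fallback`, `toy_pushforward`, and so on) is hypothetical and only illustrates the idea: a generic gradient can be assembled from a pushforward primitive, a backend may also supply its own gradient, and a flag selects which path runs so that tests can exercise both.

```python
def basis_vector(n, i):
    """Return the i-th standard basis vector of length n."""
    return [1.0 if j == i else 0.0 for j in range(n)]


def gradient_fallback(pushforward, x):
    """Generic gradient of a scalar-valued function: one pushforward per input coordinate."""
    n = len(x)
    return [pushforward(x, basis_vector(n, i)) for i in range(n)]


def gradient(backend, x):
    """Use the backend's own gradient when it is flagged custom, else the generic fallback."""
    if backend["custom"] and "gradient" in backend:
        return backend["gradient"](x)
    return gradient_fallback(backend["pushforward"], x)


# Toy "backend" for f(x) = sum(x_i^2), with hand-written derivatives (illustration only).
def toy_pushforward(x, dx):
    return sum(2.0 * xi * di for xi, di in zip(x, dx))


def toy_gradient(x):
    return [2.0 * xi for xi in x]


custom_backend = {"custom": True, "pushforward": toy_pushforward, "gradient": toy_gradient}
fallback_backend = {"custom": False, "pushforward": toy_pushforward, "gradient": toy_gradient}

x = [1.0, -2.0, 3.0]
g_custom = gradient(custom_backend, x)      # goes through toy_gradient
g_fallback = gradient(fallback_backend, x)  # goes through three pushforwards
```

Both paths should return the same vector; the fallback simply pays one pushforward call per input coordinate, which is why a dedicated implementation can be cheaper.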

Backend-specific changes

  • Add extensions for PolyesterForwardDiff.jl and Zygote.jl backends
  • Implement custom utilities for each backend (lots of code there)
  • Activate type stability tests for the backends where they pass
  • Get the right pushforward complexity for the FiniteDiff.jl backend by differentiating the function of a scalar variable $t \mapsto f(x + t \delta x)$
  • Add finite difference type information to FiniteDiffBackend type
  • Add ZygoteBackend() shortcut with nicer printing
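
The FiniteDiff.jl item above relies on the identity that the pushforward $J_f(x)\,\delta x$ is the derivative of $t \mapsto f(x + t \delta x)$ at $t = 0$. A central difference of that one-variable function needs only two evaluations of $f$, whatever the input dimension, instead of a full Jacobian. The sketch below shows this in plain Python; it is not the FiniteDiff.jl or DifferentiationInterface.jl implementation, and the function name and the step size `h` are assumptions chosen for illustration.

```python
import math


def pushforward_central(f, x, dx, h=1e-6):
    """Approximate J_f(x) @ dx with a central difference of g(t) = f(x + t*dx) at t = 0."""
    x_plus = [xi + h * di for xi, di in zip(x, dx)]
    x_minus = [xi - h * di for xi, di in zip(x, dx)]
    y_plus, y_minus = f(x_plus), f(x_minus)  # only two evaluations of f
    return [(p - m) / (2 * h) for p, m in zip(y_plus, y_minus)]


def f(x):
    # Example map from R^2 to R^2 with a known Jacobian:
    # J = [[x[1], x[0]], [cos(x[0]), 0]]
    return [x[0] * x[1], math.sin(x[0])]


x = [1.0, 2.0]
dx = [0.5, -1.0]
jvp = pushforward_central(f, x, dx)
# Exact value for comparison: [2*0.5 + 1*(-1), cos(1)*0.5] = [0, 0.5*cos(1)]
```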

Benchmark changes

  • Flesh out benchmarks to compare primitives, custom utilities and fallback utilities across all backends
  • Benchmark everything on variants of a neural network layer $y = \sigma(wx + b)$
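
For readers who want a picture of what "variants of a layer" could look like, here is a hedged Python sketch. It assumes that the variants differ only in input and output dimension, matching the (input, output) size pairs such as (1, 1), (1, 10), (10, 1) and (10, 10) that appear in the benchmark IDs later in this thread. The weights, the helper names and the choice of a logistic $\sigma$ are all assumptions for illustration, not the benchmark code from this PR.

```python
import math


def sigmoid(z):
    """Logistic function, used here as the activation sigma (an assumption)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(w, b, x):
    """Compute y = sigmoid(w x + b), with w a list of rows, b and x lists."""
    return [
        sigmoid(sum(wij * xj for wij, xj in zip(row, x)) + bi)
        for row, bi in zip(w, b)
    ]


def make_layer(n_in, n_out):
    """Build a layer with fixed, made-up weights for the given dimensions."""
    w = [[0.1 * (i + 1) - 0.05 * j for j in range(n_in)] for i in range(n_out)]
    b = [0.01 * i for i in range(n_out)]
    return lambda x: layer(w, b, x)


# One layer per (input size, output size) pair.
shapes = [(1, 1), (1, 10), (10, 1), (10, 10)]
outputs = {s: make_layer(*s)([0.5] * s[0]) for s in shapes}
```

With this reading, (10, 1) is the natural test case for a gradient, (1, 10) for a multiderivative and (10, 10) for a Jacobian.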

codecov-commenter commented Mar 6, 2024

Codecov Report

Attention: Patch coverage is 87.75510%, with 24 lines in your changes missing coverage. Please review.

Project coverage is 89.96%. Comparing base (58748e6) to head (671540a).

Files                                               Patch %   Lines
src/backends.jl                                     36.00%    16 Missing ⚠️
ext/DifferentiationInterfaceChainRulesCoreExt.jl    50.00%    6 Missing ⚠️
ext/DifferentiationInterfaceZygoteExt.jl            88.88%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #24      +/-   ##
==========================================
- Coverage   92.48%   89.96%   -2.52%     
==========================================
  Files          14       19       +5     
  Lines         173      319     +146     
==========================================
+ Hits          160      287     +127     
- Misses         13       32      +19     


adrhill commented Mar 6, 2024

Would be nice if we had benchmarks before blindly merging this.

adrhill commented Mar 6, 2024

Question: should we make these implementations optional with a backend type parameter?
Like AbstractBackend{default} which decides whether to use the package fallbacks or the backend implems

Worry: if we use these specific implems all the time, we never test our fallbacks

Adding a type parameter to all backends sounds good.

Maybe this could also be dealt with through the extras argument we planned on adding?

github-actions bot commented Mar 7, 2024

Benchmark result

Judge result

Benchmark Report for /home/runner/work/DifferentiationInterface.jl/DifferentiationInterface.jl

Job Properties

  • Time of benchmarks:
    • Target: 7 Mar 2024 - 11:45
    • Baseline: 7 Mar 2024 - 11:46
  • Package commits:
    • Target: fafc59
    • Baseline: 58748e
  • Julia commits:
    • Target: bd47ec
    • Baseline: bd47ec
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: None
    • Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).
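
As a rough illustration of how the ratios described above turn into the ❌ and ✅ marks, here is a minimal Python sketch. The 5% default mirrors the noise tolerance printed next to the timings in these reports, but the function name, the threshold rule and the example numbers are assumptions for illustration, not the benchmarking tool's actual code.

```python
def judge_ratio(target, baseline, tolerance=0.05):
    """Classify a target/baseline ratio as a regression, an improvement, or invariant."""
    ratio = target / baseline
    if ratio > 1.0 + tolerance:
        return ratio, "regression"   # would be marked with a cross
    if ratio < 1.0 - tolerance:
        return ratio, "improvement"  # would be marked with a check
    return ratio, "invariant"        # within tolerance, so not listed


# Hypothetical timings in nanoseconds (not taken from the tables in this thread).
slower = judge_ratio(120.0, 100.0)
faster = judge_ratio(80.0, 100.0)
same = judge_ratio(102.0, 100.0)
```

A ratio can only be computed for benchmark IDs that exist in both the target and the baseline run.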

ID time ratio memory ratio

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

Julia versioninfo

Target

Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 6.5.0-1015-azure #15~22.04.1-Ubuntu SMP Tue Feb 13 01:15:12 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 7763 64-Core Processor: 
              speed         user         nice          sys         idle          irq
       #1  2445 MHz       1520 s          0 s        119 s      21101 s          0 s
       #2  3236 MHz       1408 s          0 s        125 s      21217 s          0 s
       #3  3242 MHz       1336 s          0 s        145 s      21256 s          0 s
       #4  3234 MHz       1316 s          0 s        143 s      21272 s          0 s
  Memory: 15.606487274169922 GB (14003.53125 MB free)
  Uptime: 2279.02 sec
  Load Avg:  1.8  1.55  0.68
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Baseline

Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 6.5.0-1015-azure #15~22.04.1-Ubuntu SMP Tue Feb 13 01:15:12 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 7763 64-Core Processor: 
              speed         user         nice          sys         idle          irq
       #1  2445 MHz       1630 s          0 s        128 s      21669 s          0 s
       #2  3243 MHz       1512 s          0 s        136 s      21789 s          0 s
       #3  2747 MHz       1499 s          0 s        151 s      21774 s          0 s
       #4  2445 MHz       1621 s          0 s        152 s      21645 s          0 s
  Memory: 15.606487274169922 GB (13894.36328125 MB free)
  Uptime: 2347.87 sec
  Load Avg:  1.25  1.43  0.71
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Target result

Benchmark Report for /home/runner/work/DifferentiationInterface.jl/DifferentiationInterface.jl

Job Properties

  • Time of benchmark: 7 Mar 2024 - 11:45
  • Package commit: fafc59
  • Julia commit: bd47ec
  • Julia command flags: None
  • Environment variables: None

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

ID time GC time memory allocations
["derivative", (1, 1), "EnzymeForwardBackend{false}()"] 29.000 ns (5%)
["derivative", (1, 1), "EnzymeForwardBackend{true}()"] 29.000 ns (5%)
["derivative", (1, 1), "EnzymeReverseBackend{false}()"] 29.000 ns (5%)
["derivative", (1, 1), "EnzymeReverseBackend{true}()"] 29.000 ns (5%)
["derivative", (1, 1), "FiniteDiffBackend{false, Val{:central}}()"] 20.000 ns (5%)
["derivative", (1, 1), "FiniteDiffBackend{true, Val{:central}}()"] 20.000 ns (5%)
["derivative", (1, 1), "ForwardDiffBackend{false}()"] 20.000 ns (5%)
["derivative", (1, 1), "ForwardDiffBackend{true}()"] 20.000 ns (5%)
["derivative", (1, 1), "ZygoteBackend{false}()"] 29.000 ns (5%)
["derivative", (1, 1), "ZygoteBackend{true}()"] 29.000 ns (5%)
["gradient", (10, 1), "EnzymeForwardBackend{false}()"] 430.000 ns (5%) 1.55 KiB (1%) 11
["gradient", (10, 1), "EnzymeForwardBackend{true}()"] 400.000 ns (5%) 1.55 KiB (1%) 11
["gradient", (10, 1), "EnzymeReverseBackend{false}()"] 831.000 ns (5%) 336 bytes (1%) 10
["gradient", (10, 1), "EnzymeReverseBackend{true}()"] 230.000 ns (5%) 144 bytes (1%) 1
["gradient", (10, 1), "FiniteDiffBackend{false, Val{:central}}()"] 13.365 μs (5%) 4.20 KiB (1%) 81
["gradient", (10, 1), "FiniteDiffBackend{true, Val{:central}}()"] 1.393 μs (5%) 416 bytes (1%) 8
["gradient", (10, 1), "ForwardDiffBackend{false}()"] 561.000 ns (5%) 2.33 KiB (1%) 11
["gradient", (10, 1), "ForwardDiffBackend{true}()"] 912.000 ns (5%) 2.77 KiB (1%) 6
["gradient", (10, 1), "ReverseDiffBackend{false}()"] 651.000 ns (5%) 1.08 KiB (1%) 14
["gradient", (10, 1), "ReverseDiffBackend{true}()"] 661.000 ns (5%) 1.05 KiB (1%) 13
["gradient", (10, 1), "ZygoteBackend{false}()"] 1.713 μs (5%) 1.75 KiB (1%) 36
["gradient", (10, 1), "ZygoteBackend{true}()"] 931.000 ns (5%) 1.00 KiB (1%) 20
["jacobian", (10, 10), "EnzymeForwardBackend{false}()"] 6.492 μs (5%) 8.81 KiB (1%) 123
["jacobian", (10, 10), "EnzymeForwardBackend{true}()"] 21.470 μs (5%) 8.73 KiB (1%) 40
["jacobian", (10, 10), "FiniteDiffBackend{false, Val{:central}}()"] 53.591 μs (5%) 159.91 KiB (1%) 1653
["jacobian", (10, 10), "FiniteDiffBackend{true, Val{:central}}()"] 5.350 μs (5%) 15.64 KiB (1%) 160
["jacobian", (10, 10), "ForwardDiffBackend{false}()"] 1.563 μs (5%) 8.34 KiB (1%) 63
["jacobian", (10, 10), "ForwardDiffBackend{true}()"] 1.182 μs (5%) 4.61 KiB (1%) 7
["jacobian", (10, 10), "ReverseDiffBackend{false}()"] 11.231 μs (5%) 20.53 KiB (1%) 153
["jacobian", (10, 10), "ReverseDiffBackend{true}()"] 1.142 μs (5%) 1.91 KiB (1%) 14
["jacobian", (10, 10), "ZygoteBackend{false}()"] 18.394 μs (5%) 15.84 KiB (1%) 343
["jacobian", (10, 10), "ZygoteBackend{true}()"] 89.927 μs (5%) 15.84 KiB (1%) 409
["multiderivative", (1, 10), "FiniteDiffBackend{false, Val{:central}}()"] 991.000 ns (5%) 1.03 KiB (1%) 9
["multiderivative", (1, 10), "FiniteDiffBackend{true, Val{:central}}()"] 1.443 μs (5%) 1.11 KiB (1%) 13
["multiderivative", (1, 10), "ForwardDiffBackend{false}()"] 250.000 ns (5%) 656 bytes (1%) 4
["multiderivative", (1, 10), "ForwardDiffBackend{true}()"] 220.000 ns (5%) 512 bytes (1%) 3
["multiderivative", (1, 10), "ZygoteBackend{false}()"] 1.734 μs (5%) 1.83 KiB (1%) 13
["multiderivative", (1, 10), "ZygoteBackend{true}()"] 1.773 μs (5%) 1.83 KiB (1%) 13

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["derivative", (1, 1)]
  • ["gradient", (10, 1)]
  • ["jacobian", (10, 10)]
  • ["multiderivative", (1, 10)]

Baseline result

Benchmark Report for /home/runner/work/DifferentiationInterface.jl/DifferentiationInterface.jl

Job Properties

  • Time of benchmark: 7 Mar 2024 - 11:46
  • Package commit: 58748e
  • Julia commit: bd47ec
  • Julia command flags: None
  • Environment variables: None

Results

ID time GC time memory allocations
["forward", "scalar_to_scalar", "10", "EnzymeForwardBackend()"] 29.000 ns (5%)
["forward", "scalar_to_scalar", "10", "FiniteDiffBackend()"] 20.000 ns (5%)
["forward", "scalar_to_scalar", "10", "ForwardDiffBackend()"] 20.000 ns (5%)
["forward", "scalar_to_vector", "10", "FiniteDiffBackend()"] 831.000 ns (5%) 768 bytes (1%) 7
["forward", "scalar_to_vector", "10", "ForwardDiffBackend()"] 100.000 ns (5%) 368 bytes (1%) 2
["forward", "vector_to_vector", "10", "EnzymeForwardBackend()"] 661.000 ns (5%) 464 bytes (1%) 8
["forward", "vector_to_vector", "10", "FiniteDiffBackend()"] 5.511 μs (5%) 15.61 KiB (1%) 159
["forward", "vector_to_vector", "10", "ForwardDiffBackend()"] 140.000 ns (5%) 592 bytes (1%) 3
["reverse", "scalar_to_scalar", "10", "ChainRulesReverseBackend(Zygote.ZygoteRuleConfig{Zygote.Context{false}}(Zygote.Context{false}(nothing)))"] 29.000 ns (5%)
["reverse", "scalar_to_scalar", "10", "EnzymeReverseBackend()"] 29.000 ns (5%)
["reverse", "vector_to_scalar", "10", "ChainRulesReverseBackend(Zygote.ZygoteRuleConfig{Zygote.Context{false}}(Zygote.Context{false}(nothing)))"] 1.643 μs (5%) 1.78 KiB (1%) 35
["reverse", "vector_to_scalar", "10", "EnzymeReverseBackend()"] 6.472 μs (5%) 192 bytes (1%) 9
["reverse", "vector_to_scalar", "10", "ReverseDiffBackend()"] 591.000 ns (5%) 976 bytes (1%) 13
["reverse", "vector_to_vector", "10", "ChainRulesReverseBackend(Zygote.ZygoteRuleConfig{Zygote.Context{false}}(Zygote.Context{false}(nothing)))"] 140.000 ns (5%) 432 bytes (1%) 3
["reverse", "vector_to_vector", "10", "ReverseDiffBackend()"] 1.312 μs (5%) 1.95 KiB (1%) 15

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["forward", "scalar_to_scalar", "10"]
  • ["forward", "scalar_to_vector", "10"]
  • ["forward", "vector_to_vector", "10"]
  • ["reverse", "scalar_to_scalar", "10"]
  • ["reverse", "vector_to_scalar", "10"]
  • ["reverse", "vector_to_vector", "10"]

Runtime information

Runtime Info
BLAS #threads:      2
BLAS.vendor():      lbt
Sys.CPU_THREADS:    4

lscpu output:

Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Address sizes:                      48 bits physical, 48 bits virtual
Byte Order:                         Little Endian
CPU(s):                             4
On-line CPU(s) list:                0-3
Vendor ID:                          AuthenticAMD
Model name:                         AMD EPYC 7763 64-Core Processor
CPU family:                         25
Model:                              1
Thread(s) per core:                 2
Core(s) per socket:                 2
Socket(s):                          1
Stepping:                           1
BogoMIPS:                           4890.85
Flags:                              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext invpcid_single vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr rdpru arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload umip vaes vpclmulqdq rdpid fsrm
Virtualization:                     AMD-V
Hypervisor vendor:                  Microsoft
Virtualization type:                full
L1d cache:                          64 KiB (2 instances)
L1i cache:                          64 KiB (2 instances)
L2 cache:                           1 MiB (2 instances)
L3 cache:                           32 MiB (1 instance)
NUMA node(s):                       1
NUMA node0 CPU(s):                  0-3
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit:        Not affected
Vulnerability L1tf:                 Not affected
Vulnerability Mds:                  Not affected
Vulnerability Meltdown:             Not affected
Vulnerability Mmio stale data:      Not affected
Vulnerability Retbleed:             Not affected
Vulnerability Spec rstack overflow: Mitigation; safe RET, no microcode
Vulnerability Spec store bypass:    Vulnerable
Vulnerability Spectre v1:           Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:           Mitigation; Retpolines, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:                Not affected
Vulnerability Tsx async abort:      Not affected

Cpu Property        Value
Brand               AMD EPYC 7763 64-Core Processor
Vendor              :AMD
Architecture        :Unknown
Model               Family: 0xaf, Model: 0x01, Stepping: 0x01, Type: 0x00
Cores               16 physical cores, 16 logical cores (on executing CPU)
                    No Hyperthreading hardware capability detected
Clock Frequencies   Not supported by CPU
Data Cache          Level 1:3 : (32, 512, 32768) kbytes
                    64 byte cache line size
Address Size        48 bits virtual, 48 bits physical
SIMD                256 bit = 32 byte max. SIMD vector size
Time Stamp Counter  TSC is accessible via rdtsc
                    TSC runs at constant rate (invariant from clock frequency)
Perf. Monitoring    Performance Monitoring Counters (PMC) are not supported
Hypervisor          Yes, Microsoft

github-actions bot commented Mar 7, 2024

Benchmark result

Judge result

Benchmark Report for /home/runner/work/DifferentiationInterface.jl/DifferentiationInterface.jl

Job Properties

  • Time of benchmarks:
    • Target: 7 Mar 2024 - 14:02
    • Baseline: 7 Mar 2024 - 14:03
  • Package commits:
    • Target: 17fad9
    • Baseline: 58748e
  • Julia commits:
    • Target: bd47ec
    • Baseline: bd47ec
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: None
    • Baseline: None

Results

ID time ratio memory ratio

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

Julia versioninfo

Target

Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 6.5.0-1015-azure #15~22.04.1-Ubuntu SMP Tue Feb 13 01:15:12 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 7763 64-Core Processor: 
              speed         user         nice          sys         idle          irq
       #1  2445 MHz       1828 s          0 s        142 s       4279 s          0 s
       #2  3210 MHz       2127 s          0 s        117 s       4013 s          0 s
       #3  3243 MHz       1708 s          0 s        140 s       4394 s          0 s
       #4  3275 MHz       1759 s          0 s        144 s       4332 s          0 s
  Memory: 15.606491088867188 GB (14044.80859375 MB free)
  Uptime: 627.84 sec
  Load Avg:  1.04  1.3  0.75
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Baseline

Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 6.5.0-1015-azure #15~22.04.1-Ubuntu SMP Tue Feb 13 01:15:12 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 7763 64-Core Processor: 
              speed         user         nice          sys         idle          irq
       #1  3242 MHz       1878 s          0 s        150 s       4912 s          0 s
       #2  3218 MHz       2145 s          0 s        129 s       4674 s          0 s
       #3  2445 MHz       1732 s          0 s        151 s       5050 s          0 s
       #4  3240 MHz       2352 s          0 s        154 s       4421 s          0 s
  Memory: 15.606491088867188 GB (13946.3671875 MB free)
  Uptime: 697.14 sec
  Load Avg:  1.1  1.26  0.78
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Target result

Benchmark Report for /home/runner/work/DifferentiationInterface.jl/DifferentiationInterface.jl

Job Properties

  • Time of benchmark: 7 Mar 2024 - 14:02
  • Package commit: 17fad9
  • Julia commit: bd47ec
  • Julia command flags: None
  • Environment variables: None

Results

ID time GC time memory allocations
["derivative", (1, 1), "EnzymeForwardBackend{custom}()"] 4.328 ns (5%)
["derivative", (1, 1), "EnzymeForwardBackend{fallback}()"] 4.328 ns (5%)
["derivative", (1, 1), "EnzymeReverseBackend{custom}()"] 4.328 ns (5%)
["derivative", (1, 1), "EnzymeReverseBackend{fallback}()"] 4.328 ns (5%)
["derivative", (1, 1), "FiniteDiffBackend{custom,Val{:central}}()"] 3.406 ns (5%)
["derivative", (1, 1), "FiniteDiffBackend{fallback,Val{:central}}()"] 3.095 ns (5%)
["derivative", (1, 1), "ForwardDiffBackend{custom}()"] 3.095 ns (5%)
["derivative", (1, 1), "ForwardDiffBackend{fallback}()"] 2.785 ns (5%)
["derivative", (1, 1), "PolyesterForwardDiffBackend{custom,4}()"] 3.095 ns (5%)
["derivative", (1, 1), "ZygoteBackend{custom}()"] 3.657 μs (5%) 1.28 KiB (1%) 55
["derivative", (1, 1), "ZygoteBackend{fallback}()"] 3.641 μs (5%) 1.28 KiB (1%) 55
["gradient!", (10, 1), "EnzymeForwardBackend{custom}()"] 407.460 ns (5%) 1.41 KiB (1%) 10
["gradient!", (10, 1), "EnzymeForwardBackend{fallback}()"] 402.139 ns (5%) 1.41 KiB (1%) 10
["gradient!", (10, 1), "EnzymeReverseBackend{custom}()"] 25.610 ns (5%)
["gradient!", (10, 1), "EnzymeReverseBackend{fallback}()"] 845.631 ns (5%) 192 bytes (1%) 9
["gradient!", (10, 1), "FiniteDiffBackend{custom,Val{:central}}()"] 232.047 ns (5%) 144 bytes (1%) 1
["gradient!", (10, 1), "FiniteDiffBackend{fallback,Val{:central}}()"] 995.692 ns (5%) 2.81 KiB (1%) 20
["gradient!", (10, 1), "ForwardDiffBackend{custom}()"] 713.916 ns (5%) 1.78 KiB (1%) 2
["gradient!", (10, 1), "ForwardDiffBackend{fallback}()"] 579.724 ns (5%) 2.19 KiB (1%) 10
["gradient!", (10, 1), "PolyesterForwardDiffBackend{custom,4}()"] 164.970 ns (5%) 512 bytes (1%) 2
["gradient!", (10, 1), "ReverseDiffBackend{custom}()"] 583.607 ns (5%) 928 bytes (1%) 12
["gradient!", (10, 1), "ReverseDiffBackend{fallback}()"] 605.602 ns (5%) 960 bytes (1%) 13
["gradient!", (10, 1), "ZygoteBackend{custom}()"] 3.424 μs (5%) 1.64 KiB (1%) 39
["gradient!", (10, 1), "ZygoteBackend{fallback}()"] 4.288 μs (5%) 1.95 KiB (1%) 52
["gradient", (10, 1), "EnzymeForwardBackend{custom}()"] 434.327 ns (5%) 1.55 KiB (1%) 11
["gradient", (10, 1), "EnzymeForwardBackend{fallback}()"] 435.688 ns (5%) 1.55 KiB (1%) 11
["gradient", (10, 1), "EnzymeReverseBackend{custom}()"] 54.844 ns (5%) 144 bytes (1%) 1
["gradient", (10, 1), "EnzymeReverseBackend{fallback}()"] 877.389 ns (5%) 336 bytes (1%) 10
["gradient", (10, 1), "FiniteDiffBackend{custom,Val{:central}}()"] 1.347 μs (5%) 416 bytes (1%) 8
["gradient", (10, 1), "FiniteDiffBackend{fallback,Val{:central}}()"] 1.007 μs (5%) 2.95 KiB (1%) 21
["gradient", (10, 1), "ForwardDiffBackend{custom}()"] 1.167 μs (5%) 2.77 KiB (1%) 6
["gradient", (10, 1), "ForwardDiffBackend{fallback}()"] 616.651 ns (5%) 2.33 KiB (1%) 11
["gradient", (10, 1), "PolyesterForwardDiffBackend{custom,4}()"] 194.740 ns (5%) 656 bytes (1%) 3
["gradient", (10, 1), "ReverseDiffBackend{custom}()"] 618.902 ns (5%) 1.05 KiB (1%) 13
["gradient", (10, 1), "ReverseDiffBackend{fallback}()"] 630.075 ns (5%) 1.08 KiB (1%) 14
["gradient", (10, 1), "ZygoteBackend{custom}()"] 3.150 μs (5%) 1.41 KiB (1%) 34
["gradient", (10, 1), "ZygoteBackend{fallback}()"] 4.321 μs (5%) 2.09 KiB (1%) 53
["jacobian!", (10, 10), "EnzymeForwardBackend{custom}()"] 7.524 μs (5%) 7.80 KiB (1%) 121
["jacobian!", (10, 10), "EnzymeForwardBackend{fallback}()"] 7.669 μs (5%) 7.80 KiB (1%) 121
["jacobian!", (10, 10), "FiniteDiffBackend{custom,Val{:central}}()"] 6.402 μs (5%) 15.69 KiB (1%) 162
["jacobian!", (10, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 4.684 μs (5%) 11.86 KiB (1%) 121
["jacobian!", (10, 10), "ForwardDiffBackend{custom}()"] 1.413 μs (5%) 3.70 KiB (1%) 5
["jacobian!", (10, 10), "ForwardDiffBackend{fallback}()"] 1.505 μs (5%) 7.33 KiB (1%) 61
["jacobian!", (10, 10), "PolyesterForwardDiffBackend{custom,4}()"] 703.193 ns (5%) 2.08 KiB (1%) 5
["jacobian!", (10, 10), "ReverseDiffBackend{custom}()"] 1.040 μs (5%) 1.03 KiB (1%) 13
["jacobian!", (10, 10), "ReverseDiffBackend{fallback}()"] 11.311 μs (5%) 19.52 KiB (1%) 151
["jacobian!", (10, 10), "ZygoteBackend{custom}()"] 93.094 μs (5%) 19.02 KiB (1%) 535
["jacobian!", (10, 10), "ZygoteBackend{fallback}()"] 44.954 μs (5%) 18.27 KiB (1%) 511
["jacobian", (10, 10), "EnzymeForwardBackend{custom}()"] 2.465 μs (5%) 8.73 KiB (1%) 40
["jacobian", (10, 10), "EnzymeForwardBackend{fallback}()"] 7.747 μs (5%) 8.81 KiB (1%) 123
["jacobian", (10, 10), "FiniteDiffBackend{custom,Val{:central}}()"] 5.836 μs (5%) 15.64 KiB (1%) 160
["jacobian", (10, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 4.810 μs (5%) 12.88 KiB (1%) 123
["jacobian", (10, 10), "ForwardDiffBackend{custom}()"] 1.454 μs (5%) 4.61 KiB (1%) 7
["jacobian", (10, 10), "ForwardDiffBackend{fallback}()"] 1.624 μs (5%) 8.34 KiB (1%) 63
["jacobian", (10, 10), "PolyesterForwardDiffBackend{custom,4}()"] 475.939 ns (5%) 3.09 KiB (1%) 7
["jacobian", (10, 10), "ReverseDiffBackend{custom}()"] 1.095 μs (5%) 1.91 KiB (1%) 14
["jacobian", (10, 10), "ReverseDiffBackend{fallback}()"] 11.271 μs (5%) 20.53 KiB (1%) 153
["jacobian", (10, 10), "ZygoteBackend{custom}()"] 92.903 μs (5%) 18.95 KiB (1%) 532
["jacobian", (10, 10), "ZygoteBackend{fallback}()"] 45.095 μs (5%) 19.28 KiB (1%) 513
["multiderivative!", (1, 10), "EnzymeForwardBackend{custom}()"] 69.146 ns (5%) 288 bytes (1%) 2
["multiderivative!", (1, 10), "EnzymeForwardBackend{fallback}()"] 69.886 ns (5%) 288 bytes (1%) 2
["multiderivative!", (1, 10), "FiniteDiffBackend{custom,Val{:central}}()"] 357.967 ns (5%) 768 bytes (1%) 7
["multiderivative!", (1, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 360.718 ns (5%) 768 bytes (1%) 7
["multiderivative!", (1, 10), "ForwardDiffBackend{custom}()"] 68.434 ns (5%) 368 bytes (1%) 2
["multiderivative!", (1, 10), "ForwardDiffBackend{fallback}()"] 68.942 ns (5%) 368 bytes (1%) 2
["multiderivative!", (1, 10), "PolyesterForwardDiffBackend{custom,4}()"] 68.658 ns (5%) 368 bytes (1%) 2
["multiderivative!", (1, 10), "ZygoteBackend{custom}()"] 26.269 μs (5%) 7.64 KiB (1%) 231
["multiderivative!", (1, 10), "ZygoteBackend{fallback}()"] 26.269 μs (5%) 7.64 KiB (1%) 231
["multiderivative", (1, 10), "EnzymeForwardBackend{custom}()"] 125.782 ns (5%) 576 bytes (1%) 4
["multiderivative", (1, 10), "EnzymeForwardBackend{fallback}()"] 127.104 ns (5%) 576 bytes (1%) 4
["multiderivative", (1, 10), "FiniteDiffBackend{custom,Val{:central}}()"] 855.730 ns (5%) 1.11 KiB (1%) 13
["multiderivative", (1, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 421.342 ns (5%) 1.03 KiB (1%) 9
["multiderivative", (1, 10), "ForwardDiffBackend{custom}()"] 93.049 ns (5%) 512 bytes (1%) 3
["multiderivative", (1, 10), "ForwardDiffBackend{fallback}()"] 123.260 ns (5%) 656 bytes (1%) 4
["multiderivative", (1, 10), "PolyesterForwardDiffBackend{custom,4}()"] 121.749 ns (5%) 656 bytes (1%) 4
["multiderivative", (1, 10), "ZygoteBackend{custom}()"] 26.319 μs (5%) 7.92 KiB (1%) 233
["multiderivative", (1, 10), "ZygoteBackend{fallback}()"] 26.399 μs (5%) 7.92 KiB (1%) 233
["pullback!", (1, 10), "ZygoteBackend{fallback}()"] 2.588 μs (5%) 720 bytes (1%) 21
["pullback!", (10, 1), "EnzymeReverseBackend{fallback}()"] 827.716 ns (5%) 192 bytes (1%) 9
["pullback!", (10, 1), "ReverseDiffBackend{fallback}()"] 591.322 ns (5%) 960 bytes (1%) 13
["pullback!", (10, 1), "ZygoteBackend{fallback}()"] 4.362 μs (5%) 1.95 KiB (1%) 52
["pullback!", (10, 10), "ReverseDiffBackend{fallback}()"] 1.185 μs (5%) 1.94 KiB (1%) 15
["pullback!", (10, 10), "ZygoteBackend{fallback}()"] 4.385 μs (5%) 1.56 KiB (1%) 46
["pullback", (1, 1), "EnzymeReverseBackend{fallback}()"] 4.328 ns (5%)
["pullback", (1, 1), "ZygoteBackend{fallback}()"] 1.678 μs (5%) 448 bytes (1%) 16
["pullback", (1, 10), "ZygoteBackend{fallback}()"] 2.550 μs (5%) 704 bytes (1%) 20
["pullback", (10, 1), "EnzymeReverseBackend{fallback}()"] 866.175 ns (5%) 336 bytes (1%) 10
["pullback", (10, 1), "ReverseDiffBackend{fallback}()"] 620.989 ns (5%) 1.08 KiB (1%) 14
["pullback", (10, 1), "ZygoteBackend{fallback}()"] 2.470 μs (5%) 1008 bytes (1%) 19
["pullback", (10, 10), "ReverseDiffBackend{fallback}()"] 1.363 μs (5%) 2.08 KiB (1%) 16
["pullback", (10, 10), "ZygoteBackend{fallback}()"] 2.634 μs (5%) 944 bytes (1%) 20
["pushforward!", (1, 10), "EnzymeForwardBackend{fallback}()"] 70.161 ns (5%) 288 bytes (1%) 2
["pushforward!", (1, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 356.958 ns (5%) 768 bytes (1%) 7
["pushforward!", (1, 10), "ForwardDiffBackend{fallback}()"] 72.770 ns (5%) 368 bytes (1%) 2
["pushforward!", (10, 1), "EnzymeForwardBackend{fallback}()"] 12.034 ns (5%)
["pushforward!", (10, 1), "FiniteDiffBackend{fallback,Val{:central}}()"] 99.871 ns (5%) 288 bytes (1%) 2
["pushforward!", (10, 1), "ForwardDiffBackend{fallback}()"] 55.037 ns (5%) 224 bytes (1%) 1
["pushforward!", (10, 10), "EnzymeForwardBackend{fallback}()"] 660.012 ns (5%) 464 bytes (1%) 8
["pushforward!", (10, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 433.472 ns (5%) 1.03 KiB (1%) 9
["pushforward!", (10, 10), "ForwardDiffBackend{fallback}()"] 110.772 ns (5%) 592 bytes (1%) 3
["pushforward", (1, 1), "EnzymeForwardBackend{fallback}()"] 4.328 ns (5%)
["pushforward", (1, 1), "FiniteDiffBackend{fallback,Val{:central}}()"] 3.095 ns (5%)
["pushforward", (1, 1), "ForwardDiffBackend{fallback}()"] 2.785 ns (5%)
["pushforward", (1, 10), "EnzymeForwardBackend{fallback}()"] 127.831 ns (5%) 576 bytes (1%) 4
["pushforward", (1, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 419.824 ns (5%) 1.03 KiB (1%) 9
["pushforward", (1, 10), "ForwardDiffBackend{fallback}()"] 120.574 ns (5%) 656 bytes (1%) 4
["pushforward", (10, 1), "EnzymeForwardBackend{fallback}()"] 19.142 ns (5%)
["pushforward", (10, 1), "FiniteDiffBackend{fallback,Val{:central}}()"] 110.238 ns (5%) 288 bytes (1%) 2
["pushforward", (10, 1), "ForwardDiffBackend{fallback}()"] 61.456 ns (5%) 224 bytes (1%) 1
["pushforward", (10, 10), "EnzymeForwardBackend{fallback}()"] 732.898 ns (5%) 752 bytes (1%) 10
["pushforward", (10, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 504.082 ns (5%) 1.31 KiB (1%) 11
["pushforward", (10, 10), "ForwardDiffBackend{fallback}()"] 171.242 ns (5%) 880 bytes (1%) 5

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["derivative", (1, 1)]
  • ["gradient!", (10, 1)]
  • ["gradient", (10, 1)]
  • ["jacobian!", (10, 10)]
  • ["jacobian", (10, 10)]
  • ["multiderivative!", (1, 10)]
  • ["multiderivative", (1, 10)]
  • ["pullback!", (1, 10)]
  • ["pullback!", (10, 1)]
  • ["pullback!", (10, 10)]
  • ["pullback", (1, 1)]
  • ["pullback", (1, 10)]
  • ["pullback", (10, 1)]
  • ["pullback", (10, 10)]
  • ["pushforward!", (1, 10)]
  • ["pushforward!", (10, 1)]
  • ["pushforward!", (10, 10)]
  • ["pushforward", (1, 1)]
  • ["pushforward", (1, 10)]
  • ["pushforward", (10, 1)]
  • ["pushforward", (10, 10)]

Julia versioninfo

Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 6.5.0-1015-azure #15~22.04.1-Ubuntu SMP Tue Feb 13 01:15:12 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 7763 64-Core Processor: 
              speed         user         nice          sys         idle          irq
       #1  2445 MHz       1828 s          0 s        142 s       4279 s          0 s
       #2  3210 MHz       2127 s          0 s        117 s       4013 s          0 s
       #3  3243 MHz       1708 s          0 s        140 s       4394 s          0 s
       #4  3275 MHz       1759 s          0 s        144 s       4332 s          0 s
  Memory: 15.606491088867188 GB (14044.80859375 MB free)
  Uptime: 627.84 sec
  Load Avg:  1.04  1.3  0.75
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Baseline result

Benchmark Report for /home/runner/work/DifferentiationInterface.jl/DifferentiationInterface.jl

Job Properties

  • Time of benchmark: 7 Mar 2024 - 14:03
  • Package commit: 58748e
  • Julia commit: bd47ec
  • Julia command flags: None
  • Environment variables: None

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.
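
The ID structure described above can be read as a path through nested groups. The following is a minimal Python sketch of that lookup, not the BenchmarkTools.jl API: the `suite` dictionary and the `lookup` helper are hypothetical, and the two times (in nanoseconds) are copied from the scalar_to_scalar rows of the table below.

```python
from functools import reduce

# Hypothetical nested suite mirroring two rows of the baseline table (times in ns).
suite = {
    "forward": {
        "scalar_to_scalar": {
            "10": {
                "EnzymeForwardBackend()": 29.0,
                "ForwardDiffBackend()": 20.0,
            }
        }
    }
}


def lookup(tree, benchmark_id):
    """Follow [parent_group, child_group, ..., key] down the nested groups."""
    return reduce(lambda node, key: node[key], benchmark_id, tree)


# A full ID reaches a single benchmark; a shorter prefix reaches a whole group.
enzyme_ns = lookup(suite, ["forward", "scalar_to_scalar", "10", "EnzymeForwardBackend()"])
group = lookup(suite, ["forward", "scalar_to_scalar", "10"])
```

Stopping the path early returns the enclosing group, which is how the entries of the Benchmark Group List relate to the individual rows.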

ID time GC time memory allocations
["forward", "scalar_to_scalar", "10", "EnzymeForwardBackend()"] 29.000 ns (5%)
["forward", "scalar_to_scalar", "10", "FiniteDiffBackend()"] 20.000 ns (5%)
["forward", "scalar_to_scalar", "10", "ForwardDiffBackend()"] 20.000 ns (5%)
["forward", "scalar_to_vector", "10", "FiniteDiffBackend()"] 821.000 ns (5%) 768 bytes (1%) 7
["forward", "scalar_to_vector", "10", "ForwardDiffBackend()"] 100.000 ns (5%) 368 bytes (1%) 2
["forward", "vector_to_vector", "10", "EnzymeForwardBackend()"] 692.000 ns (5%) 464 bytes (1%) 8
["forward", "vector_to_vector", "10", "FiniteDiffBackend()"] 5.560 μs (5%) 15.61 KiB (1%) 159
["forward", "vector_to_vector", "10", "ForwardDiffBackend()"] 140.000 ns (5%) 592 bytes (1%) 3
["reverse", "scalar_to_scalar", "10", "ChainRulesReverseBackend(Zygote.ZygoteRuleConfig{Zygote.Context{false}}(Zygote.Context{false}(nothing)))"] 29.000 ns (5%)
["reverse", "scalar_to_scalar", "10", "EnzymeReverseBackend()"] 29.000 ns (5%)
["reverse", "vector_to_scalar", "10", "ChainRulesReverseBackend(Zygote.ZygoteRuleConfig{Zygote.Context{false}}(Zygote.Context{false}(nothing)))"] 1.723 μs (5%) 1.78 KiB (1%) 35
["reverse", "vector_to_scalar", "10", "EnzymeReverseBackend()"] 10.800 μs (5%) 192 bytes (1%) 9
["reverse", "vector_to_scalar", "10", "ReverseDiffBackend()"] 610.000 ns (5%) 976 bytes (1%) 13
["reverse", "vector_to_vector", "10", "ChainRulesReverseBackend(Zygote.ZygoteRuleConfig{Zygote.Context{false}}(Zygote.Context{false}(nothing)))"] 150.000 ns (5%) 432 bytes (1%) 3
["reverse", "vector_to_vector", "10", "ReverseDiffBackend()"] 1.272 μs (5%) 1.95 KiB (1%) 15

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["forward", "scalar_to_scalar", "10"]
  • ["forward", "scalar_to_vector", "10"]
  • ["forward", "vector_to_vector", "10"]
  • ["reverse", "scalar_to_scalar", "10"]
  • ["reverse", "vector_to_scalar", "10"]
  • ["reverse", "vector_to_vector", "10"]

Julia versioninfo

Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 6.5.0-1015-azure #15~22.04.1-Ubuntu SMP Tue Feb 13 01:15:12 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 7763 64-Core Processor: 
              speed         user         nice          sys         idle          irq
       #1  3242 MHz       1878 s          0 s        150 s       4912 s          0 s
       #2  3218 MHz       2145 s          0 s        129 s       4674 s          0 s
       #3  2445 MHz       1732 s          0 s        151 s       5050 s          0 s
       #4  3240 MHz       2352 s          0 s        154 s       4421 s          0 s
  Memory: 15.606491088867188 GB (13946.3671875 MB free)
  Uptime: 697.14 sec
  Load Avg:  1.1  1.26  0.78
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Runtime information

Runtime Info
BLAS #threads 2
BLAS.vendor() lbt
Sys.CPU_THREADS 4

lscpu output:

Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Address sizes:                      48 bits physical, 48 bits virtual
Byte Order:                         Little Endian
CPU(s):                             4
On-line CPU(s) list:                0-3
Vendor ID:                          AuthenticAMD
Model name:                         AMD EPYC 7763 64-Core Processor
CPU family:                         25
Model:                              1
Thread(s) per core:                 2
Core(s) per socket:                 2
Socket(s):                          1
Stepping:                           1
BogoMIPS:                           4890.85
Flags:                              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext invpcid_single vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr rdpru arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload umip vaes vpclmulqdq rdpid fsrm
Virtualization:                     AMD-V
Hypervisor vendor:                  Microsoft
Virtualization type:                full
L1d cache:                          64 KiB (2 instances)
L1i cache:                          64 KiB (2 instances)
L2 cache:                           1 MiB (2 instances)
L3 cache:                           32 MiB (1 instance)
NUMA node(s):                       1
NUMA node0 CPU(s):                  0-3
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit:        Not affected
Vulnerability L1tf:                 Not affected
Vulnerability Mds:                  Not affected
Vulnerability Meltdown:             Not affected
Vulnerability Mmio stale data:      Not affected
Vulnerability Retbleed:             Not affected
Vulnerability Spec rstack overflow: Mitigation; safe RET, no microcode
Vulnerability Spec store bypass:    Vulnerable
Vulnerability Spectre v1:           Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:           Mitigation; Retpolines, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:                Not affected
Vulnerability Tsx async abort:      Not affected

Cpu Property        Value
Brand               AMD EPYC 7763 64-Core Processor
Vendor              :AMD
Architecture        :Unknown
Model               Family: 0xaf, Model: 0x01, Stepping: 0x01, Type: 0x00
Cores               16 physical cores, 16 logical cores (on executing CPU)
                    No Hyperthreading hardware capability detected
Clock Frequencies   Not supported by CPU
Data Cache          Level 1:3 : (32, 512, 32768) kbytes
                    64 byte cache line size
Address Size        48 bits virtual, 48 bits physical
SIMD                256 bit = 32 byte max. SIMD vector size
Time Stamp Counter  TSC is accessible via rdtsc
                    TSC runs at constant rate (invariant from clock frequency)
Perf. Monitoring    Performance Monitoring Counters (PMC) are not supported
Hypervisor          Yes, Microsoft

@gdalle gdalle changed the title Add special cases Backend-specific utilities (derivative, multiderivative, gradient, jacobian) Mar 7, 2024

github-actions bot commented Mar 8, 2024

Benchmark result

Judge result

Benchmark Report for /home/runner/work/DifferentiationInterface.jl/DifferentiationInterface.jl

Job Properties

  • Time of benchmarks:
    • Target: 8 Mar 2024 - 09:44
    • Baseline: 8 Mar 2024 - 09:45
  • Package commits:
    • Target: 635fd7
    • Baseline: 58748e
  • Julia commits:
    • Target: bd47ec
    • Baseline: bd47ec
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: None
    • Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).
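
The rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the BenchmarkTools.jl implementation: the `judge` helper is hypothetical, it assumes that only IDs present in both runs are compared, and it assumes the 5% noise tolerance is applied symmetrically around a ratio of 1.0. The two sample IDs and times are taken from the target and baseline tables of this report.

```python
def judge(target, baseline, tolerance=0.05):
    """Classify each ID present in both runs by its target/baseline ratio."""
    verdicts = {}
    for bench_id in target.keys() & baseline.keys():
        ratio = target[bench_id] / baseline[bench_id]
        if ratio > 1.0 + tolerance:
            verdicts[bench_id] = ("regression", ratio)
        elif ratio < 1.0 - tolerance:
            verdicts[bench_id] = ("improvement", ratio)
        else:
            verdicts[bench_id] = ("invariant", ratio)
    return verdicts


# One ID from each run (times in ns). The benchmark IDs were renamed in this PR,
# so the two runs have no ID in common.
target = {("value_and_derivative", (1, 1), "ForwardDiffBackend{custom}()"): 11.753}
baseline = {("forward", "scalar_to_scalar", "10", "ForwardDiffBackend()"): 20.0}

shared = judge(target, baseline)  # empty: nothing to compare
```

Under these assumptions, the empty table below would be consistent with the target and baseline suites sharing no IDs, rather than with every benchmark being invariant.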

ID time ratio memory ratio

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

Julia versioninfo

Target

Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 6.5.0-1015-azure #15~22.04.1-Ubuntu SMP Tue Feb 13 01:15:12 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 7763 64-Core Processor: 
              speed         user         nice          sys         idle          irq
       #1  2730 MHz       1966 s          0 s        136 s       6160 s          0 s
       #2  2445 MHz       1846 s          0 s        130 s       6275 s          0 s
       #3  2445 MHz       1992 s          0 s        138 s       6144 s          0 s
       #4  3243 MHz       1593 s          0 s        146 s       6532 s          0 s
  Memory: 15.606491088867188 GB (14010.04296875 MB free)
  Uptime: 829.91 sec
  Load Avg:  1.09  1.32  0.76
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Baseline

Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      Ubuntu 22.04.4 LTS
  uname: Linux 6.5.0-1015-azure #15~22.04.1-Ubuntu SMP Tue Feb 13 01:15:12 UTC 2024 x86_64 x86_64
  CPU: AMD EPYC 7763 64-Core Processor: 
              speed         user         nice          sys         idle          irq
       #1  2445 MHz       2060 s          0 s        142 s       6736 s          0 s
       #2  3243 MHz       2238 s          0 s        143 s       6548 s          0 s
       #3  3174 MHz       2132 s          0 s        149 s       6669 s          0 s
       #4  2577 MHz       1637 s          0 s        157 s       7155 s          0 s
  Memory: 15.606491088867188 GB (13887.1796875 MB free)
  Uptime: 897.8 sec
  Load Avg:  1.08  1.26  0.78
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Target result

Benchmark Report for /home/runner/work/DifferentiationInterface.jl/DifferentiationInterface.jl

Job Properties

  • Time of benchmark: 8 Mar 2024 - 9:44
  • Package commit: 635fd7
  • Julia commit: bd47ec
  • Julia command flags: None
  • Environment variables: None

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.
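
One way to use the noise tolerances when reading the table below is to check whether two reported values could describe the same "true" value. The Python sketch below assumes the tolerance defines a symmetric interval around the reported value; the helper names are hypothetical, and the four times (in nanoseconds) are the `value_and_gradient!` rows for Enzyme forward mode and Zygote on the (10, 1) case.

```python
def tolerance_interval(value, tolerance=0.05):
    """Range in which the 'true' value is expected to lie (assumed symmetric)."""
    return value * (1.0 - tolerance), value * (1.0 + tolerance)


def overlap(a, b):
    """True when two (low, high) intervals intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


# Times in nanoseconds, copied from the value_and_gradient! rows of the table.
enzyme_custom, enzyme_fallback = 827.506, 827.012
zygote_custom, zygote_fallback = 180.097, 4016.0

# Enzyme forward: custom and fallback are indistinguishable within 5%.
enzyme_same = overlap(tolerance_interval(enzyme_custom), tolerance_interval(enzyme_fallback))
# Zygote: the custom gradient is clearly faster than the pullback-based fallback.
zygote_same = overlap(tolerance_interval(zygote_custom), tolerance_interval(zygote_fallback))
```

With these rows, the Enzyme forward intervals overlap while the Zygote ones do not, which matches the roughly 22x gap between its custom and fallback timings.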

ID time GC time memory allocations
["value_and_derivative", (1, 1), "EnzymeForwardBackend{custom}()"] 25.358 ns (5%)
["value_and_derivative", (1, 1), "EnzymeForwardBackend{fallback}()"] 25.328 ns (5%)
["value_and_derivative", (1, 1), "EnzymeReverseBackend{custom}()"] 25.952 ns (5%)
["value_and_derivative", (1, 1), "EnzymeReverseBackend{fallback}()"] 26.003 ns (5%)
["value_and_derivative", (1, 1), "FiniteDiffBackend{custom,Val{:central}}()"] 31.225 ns (5%)
["value_and_derivative", (1, 1), "FiniteDiffBackend{fallback,Val{:central}}()"] 23.135 ns (5%)
["value_and_derivative", (1, 1), "ForwardDiffBackend{custom}()"] 11.753 ns (5%)
["value_and_derivative", (1, 1), "ForwardDiffBackend{fallback}()"] 7.200 ns (5%)
["value_and_derivative", (1, 1), "PolyesterForwardDiffBackend{custom,4}()"] 7.049 ns (5%)
["value_and_derivative", (1, 1), "ZygoteBackend{custom}()"] 5.717 μs (5%) 1.14 KiB (1%) 40
["value_and_derivative", (1, 1), "ZygoteBackend{fallback}()"] 5.736 μs (5%) 1.14 KiB (1%) 40
["value_and_gradient!", (10, 1), "EnzymeForwardBackend{custom}()"] 827.506 ns (5%) 1.41 KiB (1%) 10
["value_and_gradient!", (10, 1), "EnzymeForwardBackend{fallback}()"] 827.012 ns (5%) 1.41 KiB (1%) 10
["value_and_gradient!", (10, 1), "EnzymeReverseBackend{custom}()"] 92.097 ns (5%)
["value_and_gradient!", (10, 1), "EnzymeReverseBackend{fallback}()"] 93.774 ns (5%)
["value_and_gradient!", (10, 1), "FiniteDiffBackend{custom,Val{:central}}()"] 413.784 ns (5%) 144 bytes (1%) 1
["value_and_gradient!", (10, 1), "FiniteDiffBackend{fallback,Val{:central}}()"] 1.404 μs (5%) 2.81 KiB (1%) 20
["value_and_gradient!", (10, 1), "ForwardDiffBackend{custom}()"] 663.622 ns (5%) 1.84 KiB (1%) 4
["value_and_gradient!", (10, 1), "ForwardDiffBackend{fallback}()"] 792.329 ns (5%) 2.19 KiB (1%) 10
["value_and_gradient!", (10, 1), "PolyesterForwardDiffBackend{custom,4}()"] 292.941 ns (5%) 512 bytes (1%) 2
["value_and_gradient!", (10, 1), "ReverseDiffBackend{custom}()"] 500.621 ns (5%) 1.22 KiB (1%) 18
["value_and_gradient!", (10, 1), "ReverseDiffBackend{fallback}()"] 519.409 ns (5%) 1.25 KiB (1%) 19
["value_and_gradient!", (10, 1), "ZygoteBackend{custom}()"] 180.097 ns (5%) 576 bytes (1%) 6
["value_and_gradient!", (10, 1), "ZygoteBackend{fallback}()"] 4.016 μs (5%) 1.39 KiB (1%) 38
["value_and_gradient", (10, 1), "EnzymeForwardBackend{custom}()"] 848.714 ns (5%) 1.55 KiB (1%) 11
["value_and_gradient", (10, 1), "EnzymeForwardBackend{fallback}()"] 851.718 ns (5%) 1.55 KiB (1%) 11
["value_and_gradient", (10, 1), "EnzymeReverseBackend{custom}()"] 120.845 ns (5%) 144 bytes (1%) 1
["value_and_gradient", (10, 1), "EnzymeReverseBackend{fallback}()"] 125.401 ns (5%) 144 bytes (1%) 1
["value_and_gradient", (10, 1), "FiniteDiffBackend{custom,Val{:central}}()"] 1.255 μs (5%) 400 bytes (1%) 6
["value_and_gradient", (10, 1), "FiniteDiffBackend{fallback,Val{:central}}()"] 1.433 μs (5%) 2.95 KiB (1%) 21
["value_and_gradient", (10, 1), "ForwardDiffBackend{custom}()"] 743.000 ns (5%) 2.03 KiB (1%) 7
["value_and_gradient", (10, 1), "ForwardDiffBackend{fallback}()"] 790.992 ns (5%) 2.33 KiB (1%) 11
["value_and_gradient", (10, 1), "PolyesterForwardDiffBackend{custom,4}()"] 278.396 ns (5%) 656 bytes (1%) 3
["value_and_gradient", (10, 1), "ReverseDiffBackend{custom}()"] 535.916 ns (5%) 1.36 KiB (1%) 19
["value_and_gradient", (10, 1), "ReverseDiffBackend{fallback}()"] 552.393 ns (5%) 1.39 KiB (1%) 20
["value_and_gradient", (10, 1), "ZygoteBackend{custom}()"] 167.735 ns (5%) 576 bytes (1%) 6
["value_and_gradient", (10, 1), "ZygoteBackend{fallback}()"] 4.050 μs (5%) 1.53 KiB (1%) 39
["value_and_jacobian!", (10, 10), "EnzymeForwardBackend{custom}()"] 5.721 μs (5%) 7.31 KiB (1%) 52
["value_and_jacobian!", (10, 10), "EnzymeForwardBackend{fallback}()"] 5.772 μs (5%) 7.31 KiB (1%) 52
["value_and_jacobian!", (10, 10), "FiniteDiffBackend{custom,Val{:central}}()"] 9.908 μs (5%) 19.34 KiB (1%) 202
["value_and_jacobian!", (10, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 9.868 μs (5%) 14.81 KiB (1%) 122
["value_and_jacobian!", (10, 10), "ForwardDiffBackend{custom}()"] 1.934 μs (5%) 4.09 KiB (1%) 8
["value_and_jacobian!", (10, 10), "ForwardDiffBackend{fallback}()"] 3.862 μs (5%) 8.25 KiB (1%) 42
["value_and_jacobian!", (10, 10), "PolyesterForwardDiffBackend{custom,4}()"] 1.592 μs (5%) 3.67 KiB (1%) 9
["value_and_jacobian!", (10, 10), "ReverseDiffBackend{custom}()"] 14.988 μs (5%) 6.95 KiB (1%) 91
["value_and_jacobian!", (10, 10), "ReverseDiffBackend{fallback}()"] 147.164 μs (5%) 77.47 KiB (1%) 922
["value_and_jacobian!", (10, 10), "ZygoteBackend{custom}()"] 146.262 μs (5%) 42.03 KiB (1%) 951
["value_and_jacobian!", (10, 10), "ZygoteBackend{fallback}()"] 54.131 μs (5%) 29.03 KiB (1%) 462
["value_and_jacobian", (10, 10), "EnzymeForwardBackend{custom}()"] 3.444 μs (5%) 9.89 KiB (1%) 47
["value_and_jacobian", (10, 10), "EnzymeForwardBackend{fallback}()"] 6.041 μs (5%) 8.47 KiB (1%) 55
["value_and_jacobian", (10, 10), "FiniteDiffBackend{custom,Val{:central}}()"] 9.868 μs (5%) 19.36 KiB (1%) 202
["value_and_jacobian", (10, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 10.249 μs (5%) 15.97 KiB (1%) 125
["value_and_jacobian", (10, 10), "ForwardDiffBackend{custom}()"] 3.057 μs (5%) 5.00 KiB (1%) 10
["value_and_jacobian", (10, 10), "ForwardDiffBackend{fallback}()"] 4.110 μs (5%) 9.41 KiB (1%) 45
["value_and_jacobian", (10, 10), "PolyesterForwardDiffBackend{custom,4}()"] 1.905 μs (5%) 4.83 KiB (1%) 12
["value_and_jacobian", (10, 10), "ReverseDiffBackend{custom}()"] 15.529 μs (5%) 7.86 KiB (1%) 93
["value_and_jacobian", (10, 10), "ReverseDiffBackend{fallback}()"] 148.195 μs (5%) 78.62 KiB (1%) 925
["value_and_jacobian", (10, 10), "ZygoteBackend{custom}()"] 145.861 μs (5%) 42.05 KiB (1%) 951
["value_and_jacobian", (10, 10), "ZygoteBackend{fallback}()"] 54.321 μs (5%) 30.19 KiB (1%) 465
["value_and_multiderivative!", (1, 10), "EnzymeForwardBackend{custom}()"] 359.667 ns (5%) 288 bytes (1%) 2
["value_and_multiderivative!", (1, 10), "EnzymeForwardBackend{fallback}()"] 360.814 ns (5%) 288 bytes (1%) 2
["value_and_multiderivative!", (1, 10), "FiniteDiffBackend{custom,Val{:central}}()"] 627.869 ns (5%) 768 bytes (1%) 7
["value_and_multiderivative!", (1, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 608.217 ns (5%) 768 bytes (1%) 7
["value_and_multiderivative!", (1, 10), "ForwardDiffBackend{custom}()"] 267.404 ns (5%) 368 bytes (1%) 2
["value_and_multiderivative!", (1, 10), "ForwardDiffBackend{fallback}()"] 176.907 ns (5%) 368 bytes (1%) 2
["value_and_multiderivative!", (1, 10), "PolyesterForwardDiffBackend{custom,4}()"] 164.967 ns (5%) 368 bytes (1%) 2
["value_and_multiderivative!", (1, 10), "ZygoteBackend{custom}()"] 49.312 μs (5%) 19.98 KiB (1%) 481
["value_and_multiderivative!", (1, 10), "ZygoteBackend{fallback}()"] 49.282 μs (5%) 19.98 KiB (1%) 481
["value_and_multiderivative", (1, 10), "EnzymeForwardBackend{custom}()"] 492.656 ns (5%) 576 bytes (1%) 4
["value_and_multiderivative", (1, 10), "EnzymeForwardBackend{fallback}()"] 492.149 ns (5%) 576 bytes (1%) 4
["value_and_multiderivative", (1, 10), "FiniteDiffBackend{custom,Val{:central}}()"] 1.416 μs (5%) 1.12 KiB (1%) 13
["value_and_multiderivative", (1, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 773.835 ns (5%) 1.03 KiB (1%) 9
["value_and_multiderivative", (1, 10), "ForwardDiffBackend{custom}()"] 276.674 ns (5%) 512 bytes (1%) 3
["value_and_multiderivative", (1, 10), "ForwardDiffBackend{fallback}()"] 298.232 ns (5%) 656 bytes (1%) 4
["value_and_multiderivative", (1, 10), "PolyesterForwardDiffBackend{custom,4}()"] 278.253 ns (5%) 656 bytes (1%) 4
["value_and_multiderivative", (1, 10), "ZygoteBackend{custom}()"] 49.462 μs (5%) 20.27 KiB (1%) 483
["value_and_multiderivative", (1, 10), "ZygoteBackend{fallback}()"] 49.402 μs (5%) 20.27 KiB (1%) 483
["value_and_pullback!", (1, 10), "ZygoteBackend{fallback}()"] 4.981 μs (5%) 1.88 KiB (1%) 45
["value_and_pullback!", (10, 1), "EnzymeReverseBackend{fallback}()"] 84.612 ns (5%)
["value_and_pullback!", (10, 1), "ReverseDiffBackend{fallback}()"] 499.381 ns (5%) 1.25 KiB (1%) 19
["value_and_pullback!", (10, 1), "ZygoteBackend{fallback}()"] 3.975 μs (5%) 1.39 KiB (1%) 38
["value_and_pullback!", (10, 10), "ReverseDiffBackend{fallback}()"] 15.128 μs (5%) 7.72 KiB (1%) 92
["value_and_pullback!", (10, 10), "ZygoteBackend{fallback}()"] 5.211 μs (5%) 2.66 KiB (1%) 42
["value_and_pullback", (1, 1), "EnzymeReverseBackend{fallback}()"] 26.270 ns (5%)
["value_and_pullback", (1, 1), "ZygoteBackend{fallback}()"] 5.672 μs (5%) 1.12 KiB (1%) 39
["value_and_pullback", (1, 10), "ZygoteBackend{fallback}()"] 4.959 μs (5%) 1.86 KiB (1%) 44
["value_and_pullback", (10, 1), "EnzymeReverseBackend{fallback}()"] 110.172 ns (5%) 144 bytes (1%) 1
["value_and_pullback", (10, 1), "ReverseDiffBackend{fallback}()"] 533.234 ns (5%) 1.39 KiB (1%) 20
["value_and_pullback", (10, 1), "ZygoteBackend{fallback}()"] 3.962 μs (5%) 1.39 KiB (1%) 38
["value_and_pullback", (10, 10), "ReverseDiffBackend{fallback}()"] 15.008 μs (5%) 7.86 KiB (1%) 93
["value_and_pullback", (10, 10), "ZygoteBackend{fallback}()"] 5.190 μs (5%) 2.66 KiB (1%) 42
["value_and_pushforward!", (1, 10), "EnzymeForwardBackend{fallback}()"] 365.110 ns (5%) 288 bytes (1%) 2
["value_and_pushforward!", (1, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 609.312 ns (5%) 768 bytes (1%) 7
["value_and_pushforward!", (1, 10), "ForwardDiffBackend{fallback}()"] 183.489 ns (5%) 368 bytes (1%) 2
["value_and_pushforward!", (10, 1), "EnzymeForwardBackend{fallback}()"] 38.913 ns (5%)
["value_and_pushforward!", (10, 1), "FiniteDiffBackend{fallback,Val{:central}}()"] 162.902 ns (5%) 288 bytes (1%) 2
["value_and_pushforward!", (10, 1), "ForwardDiffBackend{fallback}()"] 83.595 ns (5%) 224 bytes (1%) 1
["value_and_pushforward!", (10, 10), "EnzymeForwardBackend{fallback}()"] 526.656 ns (5%) 576 bytes (1%) 4
["value_and_pushforward!", (10, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 941.774 ns (5%) 1.45 KiB (1%) 12
["value_and_pushforward!", (10, 10), "ForwardDiffBackend{fallback}()"] 357.462 ns (5%) 816 bytes (1%) 4
["value_and_pushforward", (1, 1), "EnzymeForwardBackend{fallback}()"] 25.348 ns (5%)
["value_and_pushforward", (1, 1), "FiniteDiffBackend{fallback,Val{:central}}()"] 50.641 ns (5%) 80 bytes (1%) 3
["value_and_pushforward", (1, 1), "ForwardDiffBackend{fallback}()"] 7.341 ns (5%)
["value_and_pushforward", (1, 10), "EnzymeForwardBackend{fallback}()"] 490.041 ns (5%) 576 bytes (1%) 4
["value_and_pushforward", (1, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 753.697 ns (5%) 1.03 KiB (1%) 9
["value_and_pushforward", (1, 10), "ForwardDiffBackend{fallback}()"] 306.899 ns (5%) 656 bytes (1%) 4
["value_and_pushforward", (10, 1), "EnzymeForwardBackend{fallback}()"] 59.020 ns (5%)
["value_and_pushforward", (10, 1), "FiniteDiffBackend{fallback,Val{:central}}()"] 165.256 ns (5%) 288 bytes (1%) 2
["value_and_pushforward", (10, 1), "ForwardDiffBackend{fallback}()"] 92.410 ns (5%) 224 bytes (1%) 1
["value_and_pushforward", (10, 10), "EnzymeForwardBackend{fallback}()"] 760.031 ns (5%) 1008 bytes (1%) 7
["value_and_pushforward", (10, 10), "FiniteDiffBackend{fallback,Val{:central}}()"] 1.198 μs (5%) 1.88 KiB (1%) 15
["value_and_pushforward", (10, 10), "ForwardDiffBackend{fallback}()"] 593.589 ns (5%) 1.22 KiB (1%) 7

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["value_and_derivative", (1, 1)]
  • ["value_and_gradient!", (10, 1)]
  • ["value_and_gradient", (10, 1)]
  • ["value_and_jacobian!", (10, 10)]
  • ["value_and_jacobian", (10, 10)]
  • ["value_and_multiderivative!", (1, 10)]
  • ["value_and_multiderivative", (1, 10)]
  • ["value_and_pullback!", (1, 10)]
  • ["value_and_pullback!", (10, 1)]
  • ["value_and_pullback!", (10, 10)]
  • ["value_and_pullback", (1, 1)]
  • ["value_and_pullback", (1, 10)]
  • ["value_and_pullback", (10, 1)]
  • ["value_and_pullback", (10, 10)]
  • ["value_and_pushforward!", (1, 10)]
  • ["value_and_pushforward!", (10, 1)]
  • ["value_and_pushforward!", (10, 10)]
  • ["value_and_pushforward", (1, 1)]
  • ["value_and_pushforward", (1, 10)]
  • ["value_and_pushforward", (10, 1)]
  • ["value_and_pushforward", (10, 10)]

Baseline result

Benchmark Report for /home/runner/work/DifferentiationInterface.jl/DifferentiationInterface.jl

Job Properties

  • Time of benchmark: 8 Mar 2024 - 9:45
  • Package commit: 58748e
  • Julia commit: bd47ec
  • Julia command flags: None
  • Environment variables: None

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

ID time GC time memory allocations
["forward", "scalar_to_scalar", "10", "EnzymeForwardBackend()"] 29.000 ns (5%)
["forward", "scalar_to_scalar", "10", "FiniteDiffBackend()"] 20.000 ns (5%)
["forward", "scalar_to_scalar", "10", "ForwardDiffBackend()"] 20.000 ns (5%)
["forward", "scalar_to_vector", "10", "FiniteDiffBackend()"] 831.000 ns (5%) 768 bytes (1%) 7
["forward", "scalar_to_vector", "10", "ForwardDiffBackend()"] 100.000 ns (5%) 368 bytes (1%) 2
["forward", "vector_to_vector", "10", "EnzymeForwardBackend()"] 701.000 ns (5%) 464 bytes (1%) 8
["forward", "vector_to_vector", "10", "FiniteDiffBackend()"] 5.410 μs (5%) 15.61 KiB (1%) 159
["forward", "vector_to_vector", "10", "ForwardDiffBackend()"] 140.000 ns (5%) 592 bytes (1%) 3
["reverse", "scalar_to_scalar", "10", "ChainRulesReverseBackend(Zygote.ZygoteRuleConfig{Zygote.Context{false}}(Zygote.Context{false}(nothing)))"] 29.000 ns (5%)
["reverse", "scalar_to_scalar", "10", "EnzymeReverseBackend()"] 29.000 ns (5%)
["reverse", "vector_to_scalar", "10", "ChainRulesReverseBackend(Zygote.ZygoteRuleConfig{Zygote.Context{false}}(Zygote.Context{false}(nothing)))"] 1.763 μs (5%) 1.78 KiB (1%) 35
["reverse", "vector_to_scalar", "10", "EnzymeReverseBackend()"] 6.823 μs (5%) 192 bytes (1%) 9
["reverse", "vector_to_scalar", "10", "ReverseDiffBackend()"] 621.000 ns (5%) 976 bytes (1%) 13
["reverse", "vector_to_vector", "10", "ChainRulesReverseBackend(Zygote.ZygoteRuleConfig{Zygote.Context{false}}(Zygote.Context{false}(nothing)))"] 150.000 ns (5%) 432 bytes (1%) 3
["reverse", "vector_to_vector", "10", "ReverseDiffBackend()"] 1.322 μs (5%) 1.95 KiB (1%) 15

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["forward", "scalar_to_scalar", "10"]
  • ["forward", "scalar_to_vector", "10"]
  • ["forward", "vector_to_vector", "10"]
  • ["reverse", "scalar_to_scalar", "10"]
  • ["reverse", "vector_to_scalar", "10"]
  • ["reverse", "vector_to_vector", "10"]

Runtime information

Runtime Info
BLAS #threads 2
BLAS.vendor() lbt
Sys.CPU_THREADS 4

lscpu output:

Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Address sizes:                      48 bits physical, 48 bits virtual
Byte Order:                         Little Endian
CPU(s):                             4
On-line CPU(s) list:                0-3
Vendor ID:                          AuthenticAMD
Model name:                         AMD EPYC 7763 64-Core Processor
CPU family:                         25
Model:                              1
Thread(s) per core:                 2
Core(s) per socket:                 2
Socket(s):                          1
Stepping:                           1
BogoMIPS:                           4890.85
Flags:                              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext invpcid_single vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr rdpru arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload umip vaes vpclmulqdq rdpid fsrm
Virtualization:                     AMD-V
Hypervisor vendor:                  Microsoft
Virtualization type:                full
L1d cache:                          64 KiB (2 instances)
L1i cache:                          64 KiB (2 instances)
L2 cache:                           1 MiB (2 instances)
L3 cache:                           32 MiB (1 instance)
NUMA node(s):                       1
NUMA node0 CPU(s):                  0-3
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit:        Not affected
Vulnerability L1tf:                 Not affected
Vulnerability Mds:                  Not affected
Vulnerability Meltdown:             Not affected
Vulnerability Mmio stale data:      Not affected
Vulnerability Retbleed:             Not affected
Vulnerability Spec rstack overflow: Mitigation; safe RET, no microcode
Vulnerability Spec store bypass:    Vulnerable
Vulnerability Spectre v1:           Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:           Mitigation; Retpolines, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:                Not affected
Vulnerability Tsx async abort:      Not affected

| Cpu Property       | Value                                                      |
|:------------------ |:---------------------------------------------------------- |
| Brand              | AMD EPYC 7763 64-Core Processor                            |
| Vendor             | :AMD                                                       |
| Architecture       | :Unknown                                                   |
| Model              | Family: 0xaf, Model: 0x01, Stepping: 0x01, Type: 0x00      |
| Cores              | 16 physical cores, 16 logical cores (on executing CPU)     |
|                    | No Hyperthreading hardware capability detected             |
| Clock Frequencies  | Not supported by CPU                                       |
| Data Cache         | Level 1:3 : (32, 512, 32768) kbytes                        |
|                    | 64 byte cache line size                                    |
| Address Size       | 48 bits virtual, 48 bits physical                          |
| SIMD               | 256 bit = 32 byte max. SIMD vector size                    |
| Time Stamp Counter | TSC is accessible via `rdtsc`                              |
|                    | TSC runs at constant rate (invariant from clock frequency) |
| Perf. Monitoring   | Performance Monitoring Counters (PMC) are not supported    |
| Hypervisor         | Yes, Microsoft                                             |

@gdalle merged commit 47e26ec into main on Mar 8, 2024
3 checks passed
@gdalle deleted the gd/specialcases branch on March 8, 2024 11:54
@gdalle mentioned this pull request on Mar 8, 2024
3 participants