Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better equivalence-checker error messages #1193

Merged
merged 6 commits into from
Apr 21, 2022

Conversation

JasonGross
Copy link
Collaborator

@JasonGross JasonGross commented Apr 14, 2022

The dag now prints out where some indices came from, and assembly lines
generating values that don't exist in PHOAS are annotated with the dag
indices they generate.

We also pave the way to display information about indices that appear
only in the assembly and not in the PHOAS. I'm not sure what the right strategy here is.
One idea:
When we transition from an inequality that contains indices appearing only in assembly to an inequality containing indices that appear in both, recursively expand all indices appearing only in asm

Edit: performance is pretty terrible:

Timing Diff since 831d6114a2c4b13b51c5985a48185c7bd2d5a3a8

     After |  Peak Mem | File Name                                |     Before |  Peak Mem ||     Change || Change (mem) | % Change | % Change (mem)
----------------------------------------------------------------------------------------------------------------------------------------------------
122m52.12s | 302392 ko | Total Time / Peak Mem                    | 103m15.27s | 327788 ko || +19m36.84s ||    -25396 ko |  +18.99% |         -7.74%
----------------------------------------------------------------------------------------------------------------------------------------------------
 52m14.52s | 302392 ko | fiat-amd64/p434-mul.only-status          |  44m39.92s | 327788 ko ||  +7m34.59s ||    -25396 ko |  +16.96% |         -7.74%
 49m00.50s | 289860 ko | fiat-amd64/p434-square.only-status       |  41m46.12s | 280424 ko ||  +7m14.38s ||      9436 ko |  +17.33% |         +3.36%
  7m03.23s |  93856 ko | fiat-amd64/p384-mul.only-status          |   5m42.95s |  93804 ko ||  +1m20.28s ||        52 ko |  +23.40% |         +0.05%
  6m17.51s |  93864 ko | fiat-amd64/p384-square.only-status       |   5m09.63s |  84120 ko ||  +1m07.87s ||      9744 ko |  +21.92% |        +11.58%
  2m30.23s |  54948 ko | fiat-amd64/p448-mul.only-status          |   1m46.05s |  50276 ko ||  +0m44.17s ||      4672 ko |  +41.65% |         +9.29%
  1m47.63s |  50300 ko | fiat-amd64/p521-mul.only-status          |   1m10.69s |  45816 ko ||  +0m36.93s ||      4484 ko |  +52.25% |         +9.78%
  0m57.09s |  42144 ko | fiat-amd64/p448-square.only-status       |   0m41.84s |  42168 ko ||  +0m15.25s ||       -24 ko |  +36.44% |         -0.05%
  0m33.92s |  38808 ko | fiat-amd64/p224-square.only-status       |   0m21.42s |  38620 ko ||  +0m12.50s ||       188 ko |  +58.35% |         +0.48%
  0m33.25s |  36436 ko | fiat-amd64/p224-mul.only-status          |   0m22.94s |  36476 ko ||  +0m10.30s ||       -40 ko |  +44.94% |         -0.10%
  0m17.85s |  36252 ko | fiat-amd64/p256-mul.only-status          |   0m11.08s |  36188 ko ||  +0m06.77s ||        64 ko |  +61.10% |         +0.17%
  0m14.26s |  38104 ko | fiat-amd64/p256-square.only-status       |   0m09.72s |  37620 ko ||  +0m04.53s ||       484 ko |  +46.70% |         +1.28%
  0m24.48s |  38740 ko | fiat-amd64/secp256k1-mul.only-status     |   0m20.92s |  38780 ko ||  +0m03.55s ||       -40 ko |  +17.01% |         -0.10%
  0m23.26s |  36288 ko | fiat-amd64/p521-square.only-status       |   0m20.08s |  36152 ko ||  +0m03.18s ||       136 ko |  +15.83% |         +0.37%
  0m22.36s |  38900 ko | fiat-amd64/secp256k1-square.only-status  |   0m19.29s |  39028 ko ||  +0m03.07s ||      -128 ko |  +15.91% |         -0.32%
  0m04.70s |  26684 ko | fiat-amd64/curve25519-mul.only-status    |   0m05.02s |  25720 ko ||  -0m00.31s ||       964 ko |   -6.37% |         +3.74%
  0m03.60s |  26752 ko | fiat-amd64/curve25519-square.only-status |   0m03.85s |  25700 ko ||  -0m00.25s ||      1052 ko |   -6.49% |         +4.09%
  0m01.88s |  22584 ko | fiat-amd64/poly1305-mul.only-status      |   0m01.93s |  22620 ko ||  -0m00.05s ||       -36 ko |   -2.59% |         -0.15%
  0m01.66s |  22168 ko | fiat-amd64/poly1305-square.only-status   |   0m01.70s |  22296 ko ||  -0m00.04s ||      -128 ko |   -2.35% |         -0.57%
  0m00.19s |  21616 ko | fiat-amd64/p224-sub.only-status          |   0m00.13s |  21284 ko ||  +0m00.06s ||       332 ko |  +46.15% |         +1.55%

@JasonGross JasonGross force-pushed the asm-better-errors branch 6 times, most recently from 195a9b6 to 9b4b03a Compare April 15, 2022 10:51
@JasonGross
Copy link
Collaborator Author

It seems that we increase the time of assembly checking too much. Maybe thunking the error info will help?

@JasonGross JasonGross force-pushed the asm-better-errors branch 6 times, most recently from c6652fc to e02b6bf Compare April 19, 2022 22:23
@JasonGross
Copy link
Collaborator Author

Is it worth the 20% performance hit to get better error messages? Maybe we can make up the time by removing all the nats?

The dag now prints out where some indices came from, and assembly lines
generating values that don't exist in PHOAS are annotated with the dag
indices they generate.

We also pave the way to display information about indices that appear
only in the assembly and not in the PHOAS.

We now get errors like:

```
check_args
/* Autogenerated: 'src/ExtractionOCaml/word_by_word_montgomery' p256 64 '2^256 - 2^224 + 2^192 + 2^96 - 1' mul --no-wide-int --shiftr-avoid-uint1 --hints-file 'fiat-amd64/boringssl_intel_manual_mul_p256.asm' */
/* curve description: p256 */
/* machine_wordsize = 64 (from "64") */
/* requested operations: mul */
/* m = 0xffffffff00000001000000000000000000000000ffffffffffffffffffffffff (from "2^256 - 2^224 + 2^192 + 2^96 - 1") */
/*                                                                    */
/* NOTE: In addition to the bounds specified above each function, all */
/*   functions synthesized for this Montgomery arithmetic require the */
/*   input to be strictly less than the prime modulus (m), and also   */
/*   require the input to be in the unique saturated representation.  */
/*   All functions also ensure that these two properties are true of  */
/*   return values.                                                   */
/*  */
/* Computed values: */
/*   eval z = z[0] + (z[1] << 64) + (z[2] << 128) + (z[3] << 192) */
/*   bytes_eval z = z[0] + (z[1] << 8) + (z[2] << 16) + (z[3] << 24) + (z[4] << 32) + (z[5] << 40) + (z[6] << 48) + (z[7] << 56) + (z[8] << 64) + (z[9] << 72) + (z[10] << 80) + (z[11] << 88) + (z[12] << 96) + (z[13] << 104) + (z[14] << 112) + (z[15] << 120) + (z[16] << 128) + (z[17] << 136) + (z[18] << 144) + (z[19] << 152) + (z[20] << 160) + (z[21] << 168) + (z[22] << 176) + (z[23] << 184) + (z[24] << 192) + (z[25] << 200) + (z[26] << 208) + (z[27] << 216) + (z[28] << 224) + (z[29] << 232) + (z[30] << 240) + (z[31] << 248) */
/*   twos_complement_eval z = let x1 := z[0] + (z[1] << 64) + (z[2] << 128) + (z[3] << 192) in */
/*                            if x1 & (2^256-1) < 2^255 then x1 & (2^256-1) else (x1 & (2^256-1)) - 2^256 */

In fiat_p256_mul:
Error while checking for equivalence of syntax tree and assembly:
The syntax tree:
(λ x1 x2,
  let x3 := x1[1] (* : uint64_t *) in
  let x4 := x1[2] (* : uint64_t *) in
  let x5 := x1[3] (* : uint64_t *) in
  let x6 := x1[0] (* : uint64_t *) in
  let x7 := Z.mul_split(2^64, None, (x6, Some [0x0 ~> 0xffffffffffffffff], (x2[3], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x8 := Z.mul_split(2^64, None, (x6, Some [0x0 ~> 0xffffffffffffffff], (x2[2], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x9 := Z.mul_split(2^64, None, (x6, Some [0x0 ~> 0xffffffffffffffff], (x2[1], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x10 := Z.mul_split(2^64, None, (x6, Some [0x0 ~> 0xffffffffffffffff], (x2[0], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x11 := Z.add_get_carry(2^64, None, (x10₂, Some [0x0 ~> 0xffffffffffffffff], (x9₁, Some [0x0 ~> 0xffffffffffffffff]))) in
  let x12 := Z.add_with_get_carry(2^64, None, (x11₂, Some [0x0 ~> 0x1], (x9₂, Some [0x0 ~> 0xffffffffffffffff], (x8₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x13 := Z.add_with_get_carry(2^64, None, (x12₂, Some [0x0 ~> 0x1], (x8₂, Some [0x0 ~> 0xffffffffffffffff], (x7₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x14 := x13₂ + x7₂ (* : uint64_t *) in
  let x15 := Z.mul_split(2^64, None, (x10₁, Some [0x0 ~> 0xffffffffffffffff], (0xffffffff00000001, None))) in
  let x16 := Z.mul_split(2^64, None, (x10₁, Some [0x0 ~> 0xffffffffffffffff], (2^32-1, None))) in
  let x17 := Z.mul_split(2^64, None, (x10₁, Some [0x0 ~> 0xffffffffffffffff], (2^64-1, None))) in
  let x18 := Z.add_get_carry(2^64, None, (x17₂, Some [0x0 ~> 0xffffffffffffffff], (x16₁, Some [0x0 ~> 0xffffffffffffffff]))) in
  let x19 := x18₂ + x16₂ (* : uint64_t *) in
  let x20 := Z.add_get_carry(2^64, None, (x10₁, Some [0x0 ~> 0xffffffffffffffff], (x17₁, Some [0x0 ~> 0xffffffffffffffff]))) in
  let x21 := Z.add_with_get_carry(2^64, None, (x20₂, Some [0x0 ~> 0x1], (x11₁, Some [0x0 ~> 0xffffffffffffffff], (x18₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x22 := Z.add_with_get_carry(2^64, None, (x21₂, Some [0x0 ~> 0x1], (x12₁, Some [0x0 ~> 0xffffffffffffffff], (x19, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x23 := Z.add_with_get_carry(2^64, None, (x22₂, Some [0x0 ~> 0x1], (x13₁, Some [0x0 ~> 0xffffffffffffffff], (x15₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x24 := Z.add_with_get_carry(2^64, None, (x23₂, Some [0x0 ~> 0x1], (x14, Some [0x0 ~> 0xffffffffffffffff], (x15₂, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x25 := Z.mul_split(2^64, None, (x3, Some [0x0 ~> 0xffffffffffffffff], (x2[3], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x26 := Z.mul_split(2^64, None, (x3, Some [0x0 ~> 0xffffffffffffffff], (x2[2], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x27 := Z.mul_split(2^64, None, (x3, Some [0x0 ~> 0xffffffffffffffff], (x2[1], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x28 := Z.mul_split(2^64, None, (x3, Some [0x0 ~> 0xffffffffffffffff], (x2[0], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x29 := Z.add_get_carry(2^64, None, (x28₂, Some [0x0 ~> 0xffffffffffffffff], (x27₁, Some [0x0 ~> 0xffffffffffffffff]))) in
  let x30 := Z.add_with_get_carry(2^64, None, (x29₂, Some [0x0 ~> 0x1], (x27₂, Some [0x0 ~> 0xffffffffffffffff], (x26₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x31 := Z.add_with_get_carry(2^64, None, (x30₂, Some [0x0 ~> 0x1], (x26₂, Some [0x0 ~> 0xffffffffffffffff], (x25₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x32 := x31₂ + x25₂ (* : uint64_t *) in
  let x33 := Z.add_get_carry(2^64, None, (x21₁, Some [0x0 ~> 0xffffffffffffffff], (x28₁, Some [0x0 ~> 0xffffffffffffffff]))) in
  let x34 := Z.add_with_get_carry(2^64, None, (x33₂, Some [0x0 ~> 0x1], (x22₁, Some [0x0 ~> 0xffffffffffffffff], (x29₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x35 := Z.add_with_get_carry(2^64, None, (x34₂, Some [0x0 ~> 0x1], (x23₁, Some [0x0 ~> 0xffffffffffffffff], (x30₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x36 := Z.add_with_get_carry(2^64, None, (x35₂, Some [0x0 ~> 0x1], (x24₁, Some [0x0 ~> 0xffffffffffffffff], (x31₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x37 := Z.add_with_get_carry(2^64, None, (x36₂, Some [0x0 ~> 0x1], (x24₂, Some [0x0 ~> 0x1], (x32, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x38 := Z.mul_split(2^64, None, (x33₁, Some [0x0 ~> 0xffffffffffffffff], (0xffffffff00000001, None))) in
  let x39 := Z.mul_split(2^64, None, (x33₁, Some [0x0 ~> 0xffffffffffffffff], (2^32-1, None))) in
  let x40 := Z.mul_split(2^64, None, (x33₁, Some [0x0 ~> 0xffffffffffffffff], (2^64-1, None))) in
  let x41 := Z.add_get_carry(2^64, None, (x40₂, Some [0x0 ~> 0xffffffffffffffff], (x39₁, Some [0x0 ~> 0xffffffffffffffff]))) in
  let x42 := x41₂ + x39₂ (* : uint64_t *) in
  let x43 := Z.add_get_carry(2^64, None, (x33₁, Some [0x0 ~> 0xffffffffffffffff], (x40₁, Some [0x0 ~> 0xffffffffffffffff]))) in
  let x44 := Z.add_with_get_carry(2^64, None, (x43₂, Some [0x0 ~> 0x1], (x34₁, Some [0x0 ~> 0xffffffffffffffff], (x41₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x45 := Z.add_with_get_carry(2^64, None, (x44₂, Some [0x0 ~> 0x1], (x35₁, Some [0x0 ~> 0xffffffffffffffff], (x42, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x46 := Z.add_with_get_carry(2^64, None, (x45₂, Some [0x0 ~> 0x1], (x36₁, Some [0x0 ~> 0xffffffffffffffff], (x38₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x47 := Z.add_with_get_carry(2^64, None, (x46₂, Some [0x0 ~> 0x1], (x37₁, Some [0x0 ~> 0xffffffffffffffff], (x38₂, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x48 := x47₂ + x37₂ (* : uint64_t *) in
  let x49 := Z.mul_split(2^64, None, (x4, Some [0x0 ~> 0xffffffffffffffff], (x2[3], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x50 := Z.mul_split(2^64, None, (x4, Some [0x0 ~> 0xffffffffffffffff], (x2[2], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x51 := Z.mul_split(2^64, None, (x4, Some [0x0 ~> 0xffffffffffffffff], (x2[1], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x52 := Z.mul_split(2^64, None, (x4, Some [0x0 ~> 0xffffffffffffffff], (x2[0], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x53 := Z.add_get_carry(2^64, None, (x52₂, Some [0x0 ~> 0xffffffffffffffff], (x51₁, Some [0x0 ~> 0xffffffffffffffff]))) in
  let x54 := Z.add_with_get_carry(2^64, None, (x53₂, Some [0x0 ~> 0x1], (x51₂, Some [0x0 ~> 0xffffffffffffffff], (x50₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x55 := Z.add_with_get_carry(2^64, None, (x54₂, Some [0x0 ~> 0x1], (x50₂, Some [0x0 ~> 0xffffffffffffffff], (x49₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x56 := x55₂ + x49₂ (* : uint64_t *) in
  let x57 := Z.add_get_carry(2^64, None, (x44₁, Some [0x0 ~> 0xffffffffffffffff], (x52₁, Some [0x0 ~> 0xffffffffffffffff]))) in
  let x58 := Z.add_with_get_carry(2^64, None, (x57₂, Some [0x0 ~> 0x1], (x45₁, Some [0x0 ~> 0xffffffffffffffff], (x53₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x59 := Z.add_with_get_carry(2^64, None, (x58₂, Some [0x0 ~> 0x1], (x46₁, Some [0x0 ~> 0xffffffffffffffff], (x54₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x60 := Z.add_with_get_carry(2^64, None, (x59₂, Some [0x0 ~> 0x1], (x47₁, Some [0x0 ~> 0xffffffffffffffff], (x55₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x61 := Z.add_with_get_carry(2^64, None, (x60₂, Some [0x0 ~> 0x1], (x48, Some [0x0 ~> 0xffffffffffffffff], (x56, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x62 := Z.mul_split(2^64, None, (x57₁, Some [0x0 ~> 0xffffffffffffffff], (0xffffffff00000001, None))) in
  let x63 := Z.mul_split(2^64, None, (x57₁, Some [0x0 ~> 0xffffffffffffffff], (2^32-1, None))) in
  let x64 := Z.mul_split(2^64, None, (x57₁, Some [0x0 ~> 0xffffffffffffffff], (2^64-1, None))) in
  let x65 := Z.add_get_carry(2^64, None, (x64₂, Some [0x0 ~> 0xffffffffffffffff], (x63₁, Some [0x0 ~> 0xffffffffffffffff]))) in
  let x66 := x65₂ + x63₂ (* : uint64_t *) in
  let x67 := Z.add_get_carry(2^64, None, (x57₁, Some [0x0 ~> 0xffffffffffffffff], (x64₁, Some [0x0 ~> 0xffffffffffffffff]))) in
  let x68 := Z.add_with_get_carry(2^64, None, (x67₂, Some [0x0 ~> 0x1], (x58₁, Some [0x0 ~> 0xffffffffffffffff], (x65₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x69 := Z.add_with_get_carry(2^64, None, (x68₂, Some [0x0 ~> 0x1], (x59₁, Some [0x0 ~> 0xffffffffffffffff], (x66, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x70 := Z.add_with_get_carry(2^64, None, (x69₂, Some [0x0 ~> 0x1], (x60₁, Some [0x0 ~> 0xffffffffffffffff], (x62₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x71 := Z.add_with_get_carry(2^64, None, (x70₂, Some [0x0 ~> 0x1], (x61₁, Some [0x0 ~> 0xffffffffffffffff], (x62₂, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x72 := x71₂ + x61₂ (* : uint64_t *) in
  let x73 := Z.mul_split(2^64, None, (x5, Some [0x0 ~> 0xffffffffffffffff], (x2[3], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x74 := Z.mul_split(2^64, None, (x5, Some [0x0 ~> 0xffffffffffffffff], (x2[2], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x75 := Z.mul_split(2^64, None, (x5, Some [0x0 ~> 0xffffffffffffffff], (x2[1], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x76 := Z.mul_split(2^64, None, (x5, Some [0x0 ~> 0xffffffffffffffff], (x2[0], Some [0x0 ~> 0xffffffffffffffff]))) in
  let x77 := Z.add_get_carry(2^64, None, (x76₂, Some [0x0 ~> 0xffffffffffffffff], (x75₁, Some [0x0 ~> 0xffffffffffffffff]))) in
  let x78 := Z.add_with_get_carry(2^64, None, (x77₂, Some [0x0 ~> 0x1], (x75₂, Some [0x0 ~> 0xffffffffffffffff], (x74₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x79 := Z.add_with_get_carry(2^64, None, (x78₂, Some [0x0 ~> 0x1], (x74₂, Some [0x0 ~> 0xffffffffffffffff], (x73₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x80 := x79₂ + x73₂ (* : uint64_t *) in
  let x81 := Z.add_get_carry(2^64, None, (x68₁, Some [0x0 ~> 0xffffffffffffffff], (x76₁, Some [0x0 ~> 0xffffffffffffffff]))) in
  let x82 := Z.add_with_get_carry(2^64, None, (x81₂, Some [0x0 ~> 0x1], (x69₁, Some [0x0 ~> 0xffffffffffffffff], (x77₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x83 := Z.add_with_get_carry(2^64, None, (x82₂, Some [0x0 ~> 0x1], (x70₁, Some [0x0 ~> 0xffffffffffffffff], (x78₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x84 := Z.add_with_get_carry(2^64, None, (x83₂, Some [0x0 ~> 0x1], (x71₁, Some [0x0 ~> 0xffffffffffffffff], (x79₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x85 := Z.add_with_get_carry(2^64, None, (x84₂, Some [0x0 ~> 0x1], (x72, Some [0x0 ~> 0xffffffffffffffff], (x80, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x86 := Z.mul_split(2^64, None, (x81₁, Some [0x0 ~> 0xffffffffffffffff], (0xffffffff00000001, None))) in
  let x87 := Z.mul_split(2^64, None, (x81₁, Some [0x0 ~> 0xffffffffffffffff], (2^32-1, None))) in
  let x88 := Z.mul_split(2^64, None, (x81₁, Some [0x0 ~> 0xffffffffffffffff], (2^64-1, None))) in
  let x89 := Z.add_get_carry(2^64, None, (x88₂, Some [0x0 ~> 0xffffffffffffffff], (x87₁, Some [0x0 ~> 0xffffffffffffffff]))) in
  let x90 := x89₂ + x87₂ (* : uint64_t *) in
  let x91 := Z.add_get_carry(2^64, None, (x81₁, Some [0x0 ~> 0xffffffffffffffff], (x88₁, Some [0x0 ~> 0xffffffffffffffff]))) in
  let x92 := Z.add_with_get_carry(2^64, None, (x91₂, Some [0x0 ~> 0x1], (x82₁, Some [0x0 ~> 0xffffffffffffffff], (x89₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x93 := Z.add_with_get_carry(2^64, None, (x92₂, Some [0x0 ~> 0x1], (x83₁, Some [0x0 ~> 0xffffffffffffffff], (x90, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x94 := Z.add_with_get_carry(2^64, None, (x93₂, Some [0x0 ~> 0x1], (x84₁, Some [0x0 ~> 0xffffffffffffffff], (x86₁, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x95 := Z.add_with_get_carry(2^64, None, (x94₂, Some [0x0 ~> 0x1], (x85₁, Some [0x0 ~> 0xffffffffffffffff], (x86₂, Some [0x0 ~> 0xffffffffffffffff])))) in
  let x96 := x95₂ + x85₂ (* : uint64_t *) in
  let x97 := Z.sub_with_get_borrow(2^64, None, (0, None, (x92₁, Some [0x0 ~> 0xffffffffffffffff], (2^64-1, None)))) in
  let x98 := Z.sub_with_get_borrow(2^64, None, (x97₂, Some [0x0 ~> 0x1], (x93₁, Some [0x0 ~> 0xffffffffffffffff], (2^32-1, None)))) in
  let x99 := Z.sub_with_get_borrow(2^64, None, (x98₂, Some [0x0 ~> 0x1], (x94₁, Some [0x0 ~> 0xffffffffffffffff], (0, None)))) in
  let x100 := Z.sub_with_get_borrow(2^64, None, (x99₂, Some [0x0 ~> 0x1], (x95₁, Some [0x0 ~> 0xffffffffffffffff], (0xffffffff00000001, None)))) in
  let x101 := Z.sub_with_get_borrow(2^64, None, (x100₂, Some [0x0 ~> 0x1], (x96, Some [0x0 ~> 0xffffffffffffffff], (0, None)))) in
  let x102 := Z.zselect(x101₂, Some [0x0 ~> 0x1], (x97₁, Some [0x0 ~> 0xffffffffffffffff], (x92₁, Some [0x0 ~> 0xffffffffffffffff]))) (* : uint64_t *) in
  let x103 := Z.zselect(x101₂, Some [0x0 ~> 0x1], (x98₁, Some [0x0 ~> 0xffffffffffffffff], (x93₁, Some [0x0 ~> 0xffffffffffffffff]))) (* : uint64_t *) in
  let x104 := Z.zselect(x101₂, Some [0x0 ~> 0x1], (x99₁, Some [0x0 ~> 0xffffffffffffffff], (x94₁, Some [0x0 ~> 0xffffffffffffffff]))) (* : uint64_t *) in
  let x105 := Z.zselect(x101₂, Some [0x0 ~> 0x1], (x100₁, Some [0x0 ~> 0xffffffffffffffff], (x95₁, Some [0x0 ~> 0xffffffffffffffff]))) (* : uint64_t *) in
  x102 :: x103 :: x104 :: x105 :: []
)

which can be pretty-printed as:
/*
 * Input Bounds:
 *   arg1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]]
 *   arg2: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]]
 * Output Bounds:
 *   out1: None
 */
void f(uint64_t out1[4], const uint64_t arg1[4], const uint64_t arg2[4]) {
  uint64_t x1;
  uint64_t x2;
  uint64_t x3;
  uint64_t x4;
  uint64_t x5;
  uint64_t x6;
  uint64_t x7;
  uint64_t x8;
  uint64_t x9;
  uint64_t x10;
  uint64_t x11;
  uint64_t x12;
  uint64_t x13;
  uint1 x14;
  uint64_t x15;
  uint1 x16;
  uint64_t x17;
  uint1 x18;
  uint64_t x19;
  uint64_t x20;
  uint64_t x21;
  uint64_t x22;
  uint64_t x23;
  uint64_t x24;
  uint64_t x25;
  uint64_t x26;
  uint1 x27;
  uint64_t x28;
  uint64_t x29;
  uint1 x30;
  uint64_t x31;
  uint1 x32;
  uint64_t x33;
  uint1 x34;
  uint64_t x35;
  uint1 x36;
  uint64_t x37;
  uint1 x38;
  uint64_t x39;
  uint64_t x40;
  uint64_t x41;
  uint64_t x42;
  uint64_t x43;
  uint64_t x44;
  uint64_t x45;
  uint64_t x46;
  uint64_t x47;
  uint1 x48;
  uint64_t x49;
  uint1 x50;
  uint64_t x51;
  uint1 x52;
  uint64_t x53;
  uint64_t x54;
  uint1 x55;
  uint64_t x56;
  uint1 x57;
  uint64_t x58;
  uint1 x59;
  uint64_t x60;
  uint1 x61;
  uint64_t x62;
  uint1 x63;
  uint64_t x64;
  uint64_t x65;
  uint64_t x66;
  uint64_t x67;
  uint64_t x68;
  uint64_t x69;
  uint64_t x70;
  uint1 x71;
  uint64_t x72;
  uint64_t x73;
  uint1 x74;
  uint64_t x75;
  uint1 x76;
  uint64_t x77;
  uint1 x78;
  uint64_t x79;
  uint1 x80;
  uint64_t x81;
  uint1 x82;
  uint64_t x83;
  uint64_t x84;
  uint64_t x85;
  uint64_t x86;
  uint64_t x87;
  uint64_t x88;
  uint64_t x89;
  uint64_t x90;
  uint64_t x91;
  uint64_t x92;
  uint1 x93;
  uint64_t x94;
  uint1 x95;
  uint64_t x96;
  uint1 x97;
  uint64_t x98;
  uint64_t x99;
  uint1 x100;
  uint64_t x101;
  uint1 x102;
  uint64_t x103;
  uint1 x104;
  uint64_t x105;
  uint1 x106;
  uint64_t x107;
  uint1 x108;
  uint64_t x109;
  uint64_t x110;
  uint64_t x111;
  uint64_t x112;
  uint64_t x113;
  uint64_t x114;
  uint64_t x115;
  uint1 x116;
  uint64_t x117;
  uint64_t x118;
  uint1 x119;
  uint64_t x120;
  uint1 x121;
  uint64_t x122;
  uint1 x123;
  uint64_t x124;
  uint1 x125;
  uint64_t x126;
  uint1 x127;
  uint64_t x128;
  uint64_t x129;
  uint64_t x130;
  uint64_t x131;
  uint64_t x132;
  uint64_t x133;
  uint64_t x134;
  uint64_t x135;
  uint64_t x136;
  uint64_t x137;
  uint1 x138;
  uint64_t x139;
  uint1 x140;
  uint64_t x141;
  uint1 x142;
  uint64_t x143;
  uint64_t x144;
  uint1 x145;
  uint64_t x146;
  uint1 x147;
  uint64_t x148;
  uint1 x149;
  uint64_t x150;
  uint1 x151;
  uint64_t x152;
  uint1 x153;
  uint64_t x154;
  uint64_t x155;
  uint64_t x156;
  uint64_t x157;
  uint64_t x158;
  uint64_t x159;
  uint64_t x160;
  uint1 x161;
  uint64_t x162;
  uint64_t x163;
  uint1 x164;
  uint64_t x165;
  uint1 x166;
  uint64_t x167;
  uint1 x168;
  uint64_t x169;
  uint1 x170;
  uint64_t x171;
  uint1 x172;
  uint64_t x173;
  uint64_t x174;
  uint1 x175;
  uint64_t x176;
  uint1 x177;
  uint64_t x178;
  uint1 x179;
  uint64_t x180;
  uint1 x181;
  uint64_t x182;
  uint1 x183;
  uint64_t x184;
  uint64_t x185;
  uint64_t x186;
  uint64_t x187;
  x1 = (arg1[1]);
  x2 = (arg1[2]);
  x3 = (arg1[3]);
  x4 = (arg1[0]);
  mulx_u64(&x5, &x6, x4, (arg2[3]));
  mulx_u64(&x7, &x8, x4, (arg2[2]));
  mulx_u64(&x9, &x10, x4, (arg2[1]));
  mulx_u64(&x11, &x12, x4, (arg2[0]));
  addcarryx_u64(&x13, &x14, 0x0, x12, x9);
  addcarryx_u64(&x15, &x16, x14, x10, x7);
  addcarryx_u64(&x17, &x18, x16, x8, x5);
  x19 = (x18 + x6);
  mulx_u64(&x20, &x21, x11, UINT64_C(0xffffffff00000001));
  mulx_u64(&x22, &x23, x11, UINT32_C(0xffffffff));
  mulx_u64(&x24, &x25, x11, UINT64_C(0xffffffffffffffff));
  addcarryx_u64(&x26, &x27, 0x0, x25, x22);
  x28 = (x27 + x23);
  addcarryx_u64(&x29, &x30, 0x0, x11, x24);
  addcarryx_u64(&x31, &x32, x30, x13, x26);
  addcarryx_u64(&x33, &x34, x32, x15, x28);
  addcarryx_u64(&x35, &x36, x34, x17, x20);
  addcarryx_u64(&x37, &x38, x36, x19, x21);
  mulx_u64(&x39, &x40, x1, (arg2[3]));
  mulx_u64(&x41, &x42, x1, (arg2[2]));
  mulx_u64(&x43, &x44, x1, (arg2[1]));
  mulx_u64(&x45, &x46, x1, (arg2[0]));
  addcarryx_u64(&x47, &x48, 0x0, x46, x43);
  addcarryx_u64(&x49, &x50, x48, x44, x41);
  addcarryx_u64(&x51, &x52, x50, x42, x39);
  x53 = (x52 + x40);
  addcarryx_u64(&x54, &x55, 0x0, x31, x45);
  addcarryx_u64(&x56, &x57, x55, x33, x47);
  addcarryx_u64(&x58, &x59, x57, x35, x49);
  addcarryx_u64(&x60, &x61, x59, x37, x51);
  addcarryx_u64(&x62, &x63, x61, x38, x53);
  mulx_u64(&x64, &x65, x54, UINT64_C(0xffffffff00000001));
  mulx_u64(&x66, &x67, x54, UINT32_C(0xffffffff));
  mulx_u64(&x68, &x69, x54, UINT64_C(0xffffffffffffffff));
  addcarryx_u64(&x70, &x71, 0x0, x69, x66);
  x72 = (x71 + x67);
  addcarryx_u64(&x73, &x74, 0x0, x54, x68);
  addcarryx_u64(&x75, &x76, x74, x56, x70);
  addcarryx_u64(&x77, &x78, x76, x58, x72);
  addcarryx_u64(&x79, &x80, x78, x60, x64);
  addcarryx_u64(&x81, &x82, x80, x62, x65);
  x83 = ((uint64_t)x82 + x63);
  mulx_u64(&x84, &x85, x2, (arg2[3]));
  mulx_u64(&x86, &x87, x2, (arg2[2]));
  mulx_u64(&x88, &x89, x2, (arg2[1]));
  mulx_u64(&x90, &x91, x2, (arg2[0]));
  addcarryx_u64(&x92, &x93, 0x0, x91, x88);
  addcarryx_u64(&x94, &x95, x93, x89, x86);
  addcarryx_u64(&x96, &x97, x95, x87, x84);
  x98 = (x97 + x85);
  addcarryx_u64(&x99, &x100, 0x0, x75, x90);
  addcarryx_u64(&x101, &x102, x100, x77, x92);
  addcarryx_u64(&x103, &x104, x102, x79, x94);
  addcarryx_u64(&x105, &x106, x104, x81, x96);
  addcarryx_u64(&x107, &x108, x106, x83, x98);
  mulx_u64(&x109, &x110, x99, UINT64_C(0xffffffff00000001));
  mulx_u64(&x111, &x112, x99, UINT32_C(0xffffffff));
  mulx_u64(&x113, &x114, x99, UINT64_C(0xffffffffffffffff));
  addcarryx_u64(&x115, &x116, 0x0, x114, x111);
  x117 = (x116 + x112);
  addcarryx_u64(&x118, &x119, 0x0, x99, x113);
  addcarryx_u64(&x120, &x121, x119, x101, x115);
  addcarryx_u64(&x122, &x123, x121, x103, x117);
  addcarryx_u64(&x124, &x125, x123, x105, x109);
  addcarryx_u64(&x126, &x127, x125, x107, x110);
  x128 = ((uint64_t)x127 + x108);
  mulx_u64(&x129, &x130, x3, (arg2[3]));
  mulx_u64(&x131, &x132, x3, (arg2[2]));
  mulx_u64(&x133, &x134, x3, (arg2[1]));
  mulx_u64(&x135, &x136, x3, (arg2[0]));
  addcarryx_u64(&x137, &x138, 0x0, x136, x133);
  addcarryx_u64(&x139, &x140, x138, x134, x131);
  addcarryx_u64(&x141, &x142, x140, x132, x129);
  x143 = (x142 + x130);
  addcarryx_u64(&x144, &x145, 0x0, x120, x135);
  addcarryx_u64(&x146, &x147, x145, x122, x137);
  addcarryx_u64(&x148, &x149, x147, x124, x139);
  addcarryx_u64(&x150, &x151, x149, x126, x141);
  addcarryx_u64(&x152, &x153, x151, x128, x143);
  mulx_u64(&x154, &x155, x144, UINT64_C(0xffffffff00000001));
  mulx_u64(&x156, &x157, x144, UINT32_C(0xffffffff));
  mulx_u64(&x158, &x159, x144, UINT64_C(0xffffffffffffffff));
  addcarryx_u64(&x160, &x161, 0x0, x159, x156);
  x162 = (x161 + x157);
  addcarryx_u64(&x163, &x164, 0x0, x144, x158);
  addcarryx_u64(&x165, &x166, x164, x146, x160);
  addcarryx_u64(&x167, &x168, x166, x148, x162);
  addcarryx_u64(&x169, &x170, x168, x150, x154);
  addcarryx_u64(&x171, &x172, x170, x152, x155);
  x173 = ((uint64_t)x172 + x153);
  subborrowx_u64(&x174, &x175, 0x0, x165, UINT64_C(0xffffffffffffffff));
  subborrowx_u64(&x176, &x177, x175, x167, UINT32_C(0xffffffff));
  subborrowx_u64(&x178, &x179, x177, x169, 0x0);
  subborrowx_u64(&x180, &x181, x179, x171, UINT64_C(0xffffffff00000001));
  subborrowx_u64(&x182, &x183, x181, x173, 0x0);
  cmovznz_u64(&x184, x183, x174, x165);
  cmovznz_u64(&x185, x183, x176, x167);
  cmovznz_u64(&x186, x183, x178, x169);
  cmovznz_u64(&x187, x183, x180, x171);
  out1[0] = x184;
  out1[1] = x185;
  out1[2] = x186;
  out1[3] = x187;
}

Assembly:
ecp_nistz256_mul_mont:
push rbp
push rbx
push r12
push r13
push r14
push r15
xchg rdx, rsi                  ; hack: swap args
mov rbx, rdx
mov rax, QWORD PTR [rbx]
mov r9, QWORD PTR [rsi]
mov r10, QWORD PTR [rsi + 0x08 * 1]
mov r11, QWORD PTR [rsi + 0x08 * 2]
mov r12, QWORD PTR [rsi + 0x08 * 3]
mov rbp, rax
mul r9
mov r14, 4294967295
mov r8, rax
mov rax, rbp
mov r9, rdx
mul r10
mov r15, 18446744069414584321
add r9, rax	; #549, #582, #583, #584
mov rax, rbp
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
mov r10, rdx
mul r11
add r10, rax	; #553, #554, #594, #595, #596, #644, #645, #646
mov rax, rbp
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
mov r11, rdx
mul r12
add r11, rax	; #558, #559, #560, #606, #607, #608, #656, #657, #658, #706, #707, #708
mov rax, r8
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
xor r13, r13	; #564
mov r12, rdx
mov rbp, r8
shl r8, 32	; #565, #566, #567
mul r15	; #625, #626, #627, #628, #687, #688, #689, #690, #749, #750, #751, #752
shr rbp, 32	; #568, #629, #691, #753
add r9, r8	; #569, #570, #571
adc r10, rbp	; #572, #573, #574
adc r11, rax	; #575, #576, #577
mov rax, QWORD PTR [rbx + 0x08 * 1]
adc r12, rdx	; #578, #579, #580
adc r13, 0
xor r8, r8	; #581
mov rbp, rax
mul QWORD PTR [rsi]
add r9, rax	; #549, #582, #583, #584
mov rax, rbp
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
mov rcx, rdx
mul QWORD PTR [rsi + 0x08 * 1]
add r10, rcx	; #588, #589, #590
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
add r10, rax	; #553, #554, #594, #595, #596, #644, #645, #646
mov rax, rbp
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
mov rcx, rdx
mul QWORD PTR [rsi + 0x08 * 2]
add r11, rcx	; #600, #601, #602, #650, #651, #652
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
add r11, rax	; #558, #559, #560, #606, #607, #608, #656, #657, #658, #706, #707, #708
mov rax, rbp
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
mov rcx, rdx
mul QWORD PTR [rsi + 0x08 * 3]
add r12, rcx	; #612, #613, #614, #662, #663, #664, #712, #713, #714
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
add r12, rax	; #618, #619, #620, #668, #669, #670, #718, #719, #720
mov rax, r9
adc r13, rdx	; #621, #622, #623, #639, #640, #641
adc r8, 0	; #642
mov rbp, r9
shl r9, 32	; #624
mul r15	; #625, #626, #627, #628, #687, #688, #689, #690, #749, #750, #751, #752
shr rbp, 32	; #568, #629, #691, #753
add r10, r9	; #630, #631, #632
adc r11, rbp	; #633, #634, #635
adc r12, rax	; #636, #637, #638
mov rax, QWORD PTR [rbx + 0x08 * 2]
adc r13, rdx	; #621, #622, #623, #639, #640, #641
adc r8, 0	; #642
xor r9, r9	; #643
mov rbp, rax
mul QWORD PTR [rsi]
add r10, rax	; #553, #554, #594, #595, #596, #644, #645, #646
mov rax, rbp
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
mov rcx, rdx
mul QWORD PTR [rsi + 0x08 * 1]
add r11, rcx	; #600, #601, #602, #650, #651, #652
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
add r11, rax	; #558, #559, #560, #606, #607, #608, #656, #657, #658, #706, #707, #708
mov rax, rbp
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
mov rcx, rdx
mul QWORD PTR [rsi + 0x08 * 2]
add r12, rcx	; #612, #613, #614, #662, #663, #664, #712, #713, #714
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
add r12, rax	; #618, #619, #620, #668, #669, #670, #718, #719, #720
mov rax, rbp
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
mov rcx, rdx
mul QWORD PTR [rsi + 0x08 * 3]
add r13, rcx	; #674, #675, #676, #724, #725, #726
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
add r13, rax	; #680, #681, #682, #730, #731, #732
mov rax, r10
adc r8, rdx	; #683, #684, #685, #701, #702, #703
adc r9, 0	; #704
mov rbp, r10
shl r10, 32	; #686
mul r15	; #625, #626, #627, #628, #687, #688, #689, #690, #749, #750, #751, #752
shr rbp, 32	; #568, #629, #691, #753
add r11, r10	; #692, #693, #694
adc r12, rbp	; #695, #696, #697
adc r13, rax	; #698, #699, #700
mov rax, QWORD PTR [rbx + 0x08 * 3]
adc r8, rdx	; #683, #684, #685, #701, #702, #703
adc r9, 0	; #704
xor r10, r10	; #705
mov rbp, rax
mul QWORD PTR [rsi]
add r11, rax	; #558, #559, #560, #606, #607, #608, #656, #657, #658, #706, #707, #708
mov rax, rbp
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
mov rcx, rdx
mul QWORD PTR [rsi + 0x08 * 1]
add r12, rcx	; #612, #613, #614, #662, #663, #664, #712, #713, #714
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
add r12, rax	; #618, #619, #620, #668, #669, #670, #718, #719, #720
mov rax, rbp
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
mov rcx, rdx
mul QWORD PTR [rsi + 0x08 * 2]
add r13, rcx	; #674, #675, #676, #724, #725, #726
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
add r13, rax	; #680, #681, #682, #730, #731, #732
mov rax, rbp
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
mov rcx, rdx
mul QWORD PTR [rsi + 0x08 * 3]
add r8, rcx	; #736, #737, #738
adc rdx, 0	; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741
add r8, rax	; #742, #743, #744
mov rax, r11
adc r9, rdx	; #745, #746, #747, #763, #764, #765
adc r10, 0	; #766
mov rbp, r11
shl r11, 32	; #748
mul r15	; #625, #626, #627, #628, #687, #688, #689, #690, #749, #750, #751, #752
shr rbp, 32	; #568, #629, #691, #753
add r12, r11	; #754, #755, #756
adc r13, rbp	; #757, #758, #759
mov rcx, r12
adc r8, rax	; #760, #761, #762
adc r9, rdx	; #745, #746, #747, #763, #764, #765
mov rbp, r13
adc r10, 0	; #766
sub r12, 18446744073709551615	; #767, #768
mov rbx, r8
sbb r13, r14	; #769, #770, #771
sbb r8, 0	; #772, #773, #774
mov rdx, r9
sbb r9, r15	; #775, #776, #777
sbb r10, 0	; #778, #779, #780
cmovb r12, rcx	; #781
cmovb r13, rbp	; #782
mov QWORD PTR [rdi], r12
cmovb r8, rbx	; #783
mov QWORD PTR [rdi + 0x08 * 1], r13
cmovb r9, rdx	; #784
mov QWORD PTR [rdi + 0x08 * 2], r8
mov QWORD PTR [rdi + 0x08 * 3], r9
mov r15, QWORD PTR [rsp]
mov r14, QWORD PTR [rsp + 0x08 * 1]
mov r13, QWORD PTR [rsp + 0x08 * 2]
mov r12, QWORD PTR [rsp + 0x08 * 3]
mov rbx, QWORD PTR [rsp + 0x08 * 4]
mov rbp, QWORD PTR [rsp + 0x08 * 5]
;lea    rsp,[rsp+0x30]
ret

Equivalence checking error:
Unable to unify:
In environment:
(*symbolic_state*) {|
  dag_state :=
(*dag*)[
(*0*) (old 64 0, []); 	(*build_inputs*)
(*1*) (old 64 1, []); 	(*build_inputs*)
(*2*) (old 64 2, []); 	(*build_inputs*)
(*3*) (old 64 3, []); 	(*build_inputs*)
(*4*) (old 64 4, []); 	(*build_inputs*)
(*5*) (old 64 5, []); 	(*build_inputs*)
(*6*) (old 64 6, []); 	(*build_inputs*)
(*7*) (old 64 7, []); 	(*build_inputs*)
(*8*) (const 0, []);
(*9*) (const 18446744073709551616, []);
(*10*) (const 64, []);
(*11*) (mulZ, [3, 4]);
(*12*) (mul 64, [3, 4]);
(*13*) (shrZ, [11, 10]);
(*14*) (shr 64, [11, 10]);
(*15*) (mulZ, [3, 5]);
(*16*) (mul 64, [3, 5]);
(*17*) (shrZ, [15, 10]);
(*18*) (shr 64, [15, 10]);
(*19*) (mulZ, [3, 6]);
(*20*) (mul 64, [3, 6]);
(*21*) (shrZ, [19, 10]);
(*22*) (shr 64, [19, 10]);
(*23*) (mulZ, [3, 7]);
(*24*) (mul 64, [3, 7]);
(*25*) (shrZ, [23, 10]);
(*26*) (shr 64, [23, 10]);
(*27*) (add 64, [20, 26]);
(*28*) (addcarryZ 64, [26, 20]);
(*29*) (addcarry 64, [20, 26]);
(*30*) (add 64, [16, 22, 29]);
(*31*) (addcarryZ 64, [29, 22, 16]);
(*32*) (addcarry 64, [16, 22, 29]);
(*33*) (add 64, [12, 18, 32]);
(*34*) (addcarryZ 64, [32, 18, 12]);
(*35*) (addcarry 64, [12, 18, 32]);
(*36*) (addZ, [35, 14]);
(*37*) (add 64, [14, 35]);
(*38*) (const 18446744069414584321, []);
(*39*) (mulZ, [24, 38]);
(*40*) (mul 64, [3, 7, 38]);
(*41*) (shrZ, [39, 10]);
(*42*) (shr 64, [39, 10]);
(*43*) (const 4294967295, []);
(*44*) (mulZ, [24, 43]);
(*45*) (mul 64, [3, 7, 43]);
(*46*) (shrZ, [44, 10]);
(*47*) (shr 64, [44, 10]);
(*48*) (const 18446744073709551615, []);
(*49*) (mulZ, [24, 48]);
(*50*) (mul 64, [3, 7, 48]);
(*51*) (shrZ, [49, 10]);
(*52*) (shr 64, [49, 10]);
(*53*) (add 64, [45, 52]);
(*54*) (addcarryZ 64, [52, 45]);
(*55*) (addcarry 64, [45, 52]);
(*56*) (addZ, [55, 47]);
(*57*) (add 64, [47, 55]);
(*58*) (mul 64, [3, 7, 8]);
(*59*) (add 64, [58]);
(*60*) (addcarryZ 64, [24, 50]);
(*61*) (addcarry 64, [24, 50]);
(*62*) (add 64, [20, 26, 45, 52, 61]);
(*63*) (addcarryZ 64, [61, 27, 53]);
(*64*) (addcarry 64, [27, 53, 61]);
(*65*) (add 64, [16, 22, 29, 47, 55, 64]);
(*66*) (addcarryZ 64, [64, 30, 57]);
(*67*) (addcarry 64, [30, 57, 64]);
(*68*) (add 64, [12, 18, 32, 40, 67]);
(*69*) (addcarryZ 64, [67, 33, 40]);
(*70*) (addcarry 64, [33, 40, 67]);
(*71*) (add 64, [14, 35, 42, 70]);
(*72*) (addcarryZ 64, [70, 37, 42]);
(*73*) (addcarry 64, [37, 42, 70]);
(*74*) (mulZ, [2, 4]);
(*75*) (mul 64, [2, 4]);
(*76*) (shrZ, [74, 10]);
(*77*) (shr 64, [74, 10]);
(*78*) (mulZ, [2, 5]);
(*79*) (mul 64, [2, 5]);
(*80*) (shrZ, [78, 10]);
(*81*) (shr 64, [78, 10]);
(*82*) (mulZ, [2, 6]);
(*83*) (mul 64, [2, 6]);
(*84*) (shrZ, [82, 10]);
(*85*) (shr 64, [82, 10]);
(*86*) (mulZ, [2, 7]);
(*87*) (mul 64, [2, 7]);
(*88*) (shrZ, [86, 10]);
(*89*) (shr 64, [86, 10]);
(*90*) (add 64, [83, 89]);
(*91*) (addcarryZ 64, [89, 83]);
(*92*) (addcarry 64, [83, 89]);
(*93*) (add 64, [79, 85, 92]);
(*94*) (addcarryZ 64, [92, 85, 79]);
(*95*) (addcarry 64, [79, 85, 92]);
(*96*) (add 64, [75, 81, 95]);
(*97*) (addcarryZ 64, [95, 81, 75]);
(*98*) (addcarry 64, [75, 81, 95]);
(*99*) (addZ, [98, 77]);
(*100*) (add 64, [77, 98]);
(*101*) (add 64, [20, 26, 45, 52, 61, 87]);
(*102*) (addcarryZ 64, [62, 87]);
(*103*) (addcarry 64, [62, 87]);
(*104*) (add 64, [16, 22, 29, 47, 55, 64, 83, 89, 103]);
(*105*) (addcarryZ 64, [103, 65, 90]);
(*106*) (addcarry 64, [65, 90, 103]);
(*107*) (add 64, [12, 18, 32, 40, 67, 79, 85, 92, 106]);
(*108*) (addcarryZ 64, [106, 68, 93]);
(*109*) (addcarry 64, [68, 93, 106]);
(*110*) (add 64, [14, 35, 42, 70, 75, 81, 95, 109]);
(*111*) (addcarryZ 64, [109, 71, 96]);
(*112*) (addcarry 64, [71, 96, 109]);
(*113*) (add 64, [73, 77, 98, 112]);
(*114*) (addcarryZ 64, [112, 73, 100]);
(*115*) (addcarry 64, [73, 100, 112]);
(*116*) (mulZ, [38, 101]);
(*117*) (mul 64, [38, 101]);
(*118*) (shrZ, [116, 10]);
(*119*) (shr 64, [116, 10]);
(*120*) (mulZ, [43, 101]);
(*121*) (mul 64, [43, 101]);
(*122*) (shrZ, [120, 10]);
(*123*) (shr 64, [120, 10]);
(*124*) (mulZ, [48, 101]);
(*125*) (mul 64, [48, 101]);
(*126*) (shrZ, [124, 10]);
(*127*) (shr 64, [124, 10]);
(*128*) (add 64, [121, 127]);
(*129*) (addcarryZ 64, [127, 121]);
(*130*) (addcarry 64, [121, 127]);
(*131*) (addZ, [130, 123]);
(*132*) (add 64, [123, 130]);
(*133*) (add 64, [20, 26, 45, 52, 61, 87, 125]);
(*134*) (addcarryZ 64, [101, 125]);
(*135*) (addcarry 64, [101, 125]);
(*136*) (add 64, [16, 22, 29, 47, 55, 64, 83, 89, 103, 121, 127, 135]);
(*137*) (addcarryZ 64, [135, 104, 128]);
(*138*) (addcarry 64, [104, 128, 135]);
(*139*) (add 64, [12, 18, 32, 40, 67, 79, 85, 92, 106, 123, 130, 138]);
(*140*) (addcarryZ 64, [138, 107, 132]);
(*141*) (addcarry 64, [107, 132, 138]);
(*142*) (add 64, [14, 35, 42, 70, 75, 81, 95, 109, 117, 141]);
(*143*) (addcarryZ 64, [141, 110, 117]);
(*144*) (addcarry 64, [110, 117, 141]);
(*145*) (add 64, [73, 77, 98, 112, 119, 144]);
(*146*) (addcarryZ 64, [144, 113, 119]);
(*147*) (addcarry 64, [113, 119, 144]);
(*148*) (addZ, [147, 115]);
(*149*) (add 64, [115, 147]);
(*150*) (mulZ, [1, 4]);
(*151*) (mul 64, [1, 4]);
(*152*) (shrZ, [150, 10]);
(*153*) (shr 64, [150, 10]);
(*154*) (mulZ, [1, 5]);
(*155*) (mul 64, [1, 5]);
(*156*) (shrZ, [154, 10]);
(*157*) (shr 64, [154, 10]);
(*158*) (mulZ, [1, 6]);
(*159*) (mul 64, [1, 6]);
(*160*) (shrZ, [158, 10]);
(*161*) (shr 64, [158, 10]);
(*162*) (mulZ, [1, 7]);
(*163*) (mul 64, [1, 7]);
(*164*) (shrZ, [162, 10]);
(*165*) (shr 64, [162, 10]);
(*166*) (add 64, [159, 165]);
(*167*) (addcarryZ 64, [165, 159]);
(*168*) (addcarry 64, [159, 165]);
(*169*) (add 64, [155, 161, 168]);
(*170*) (addcarryZ 64, [168, 161, 155]);
(*171*) (addcarry 64, [155, 161, 168]);
(*172*) (add 64, [151, 157, 171]);
(*173*) (addcarryZ 64, [171, 157, 151]);
(*174*) (addcarry 64, [151, 157, 171]);
(*175*) (addZ, [174, 153]);
(*176*) (add 64, [153, 174]);
(*177*) (add 64, [16, 22, 29, 47, 55, 64, 83, 89, 103, 121, 127, 135, 163]);
(*178*) (addcarryZ 64, [136, 163]);
(*179*) (addcarry 64, [136, 163]);
(*180*) (add 64, [12, 18, 32, 40, 67, 79, 85, 92, 106, 123, 130, 138, 159, 165, 179]);
(*181*) (addcarryZ 64, [179, 139, 166]);
(*182*) (addcarry 64, [139, 166, 179]);
(*183*) (add 64, [14, 35, 42, 70, 75, 81, 95, 109, 117, 141, 155, 161, 168, 182]);
(*184*) (addcarryZ 64, [182, 142, 169]);
(*185*) (addcarry 64, [142, 169, 182]);
(*186*) (add 64, [73, 77, 98, 112, 119, 144, 151, 157, 171, 185]);
(*187*) (addcarryZ 64, [185, 145, 172]);
(*188*) (addcarry 64, [145, 172, 185]);
(*189*) (add 64, [115, 147, 153, 174, 188]);
(*190*) (addcarryZ 64, [188, 149, 176]);
(*191*) (addcarry 64, [149, 176, 188]);
(*192*) (mulZ, [38, 177]);
(*193*) (mul 64, [38, 177]);
(*194*) (shrZ, [192, 10]);
(*195*) (shr 64, [192, 10]);
(*196*) (mulZ, [43, 177]);
(*197*) (mul 64, [43, 177]);
(*198*) (shrZ, [196, 10]);
(*199*) (shr 64, [196, 10]);
(*200*) (mulZ, [48, 177]);
(*201*) (mul 64, [48, 177]);
(*202*) (shrZ, [200, 10]);
(*203*) (shr 64, [200, 10]);
(*204*) (add 64, [197, 203]);
(*205*) (addcarryZ 64, [203, 197]);
(*206*) (addcarry 64, [197, 203]);
(*207*) (addZ, [206, 199]);
(*208*) (add 64, [199, 206]);
(*209*) (add 64, [16, 22, 29, 47, 55, 64, 83, 89, 103, 121, 127, 135, 163, 201]);
(*210*) (addcarryZ 64, [177, 201]);
(*211*) (addcarry 64, [177, 201]);
(*212*) (add 64, [12, 18, 32, 40, 67, 79, 85, 92, 106, 123, 130, 138, 159, 165, 179, 197, 203, 211]);
(*213*) (addcarryZ 64, [211, 180, 204]);
(*214*) (addcarry 64, [180, 204, 211]);
(*215*) (add 64, [14, 35, 42, 70, 75, 81, 95, 109, 117, 141, 155, 161, 168, 182, 199, 206, 214]);
(*216*) (addcarryZ 64, [214, 183, 208]);
(*217*) (addcarry 64, [183, 208, 214]);
(*218*) (add 64, [73, 77, 98, 112, 119, 144, 151, 157, 171, 185, 193, 217]);
(*219*) (addcarryZ 64, [217, 186, 193]);
(*220*) (addcarry 64, [186, 193, 217]);
(*221*) (add 64, [115, 147, 153, 174, 188, 195, 220]);
(*222*) (addcarryZ 64, [220, 189, 195]);
(*223*) (addcarry 64, [189, 195, 220]);
(*224*) (addZ, [223, 191]);
(*225*) (add 64, [191, 223]);
(*226*) (mulZ, [0, 4]);
(*227*) (mul 64, [0, 4]);
(*228*) (shrZ, [226, 10]);
(*229*) (shr 64, [226, 10]);
(*230*) (mulZ, [0, 5]);
(*231*) (mul 64, [0, 5]);
(*232*) (shrZ, [230, 10]);
(*233*) (shr 64, [230, 10]);
(*234*) (mulZ, [0, 6]);
(*235*) (mul 64, [0, 6]);
(*236*) (shrZ, [234, 10]);
(*237*) (shr 64, [234, 10]);
(*238*) (mulZ, [0, 7]);
(*239*) (mul 64, [0, 7]);
(*240*) (shrZ, [238, 10]);
(*241*) (shr 64, [238, 10]);
(*242*) (add 64, [235, 241]);
(*243*) (addcarryZ 64, [241, 235]);
(*244*) (addcarry 64, [235, 241]);
(*245*) (add 64, [231, 237, 244]);
(*246*) (addcarryZ 64, [244, 237, 231]);
(*247*) (addcarry 64, [231, 237, 244]);
(*248*) (add 64, [227, 233, 247]);
(*249*) (addcarryZ 64, [247, 233, 227]);
(*250*) (addcarry 64, [227, 233, 247]);
(*251*) (addZ, [250, 229]);
(*252*) (add 64, [229, 250]);
(*253*) (add 64, [12, 18, 32, 40, 67, 79, 85, 92, 106, 123, 130, 138, 159, 165, 179, 197, 203, 211, 239]);
(*254*) (addcarryZ 64, [212, 239]);
(*255*) (addcarry 64, [212, 239]);
(*256*) (add 64, [14, 35, 42, 70, 75, 81, 95, 109, 117, 141, 155, 161, 168, 182, 199, 206, 214, 235, 241, 255]);
(*257*) (addcarryZ 64, [255, 215, 242]);
(*258*) (addcarry 64, [215, 242, 255]);
(*259*) (add 64, [73, 77, 98, 112, 119, 144, 151, 157, 171, 185, 193, 217, 231, 237, 244, 258]);
(*260*) (addcarryZ 64, [258, 218, 245]);
(*261*) (addcarry 64, [218, 245, 258]);
(*262*) (add 64, [115, 147, 153, 174, 188, 195, 220, 227, 233, 247, 261]);
(*263*) (addcarryZ 64, [261, 221, 248]);
(*264*) (addcarry 64, [221, 248, 261]);
(*265*) (add 64, [191, 223, 229, 250, 264]);
(*266*) (addcarryZ 64, [264, 225, 252]);
(*267*) (addcarry 64, [225, 252, 264]);
(*268*) (mulZ, [38, 253]);
(*269*) (mul 64, [38, 253]);
(*270*) (shrZ, [268, 10]);
(*271*) (shr 64, [268, 10]);
(*272*) (mulZ, [43, 253]);
(*273*) (mul 64, [43, 253]);
(*274*) (shrZ, [272, 10]);
(*275*) (shr 64, [272, 10]);
(*276*) (mulZ, [48, 253]);
(*277*) (mul 64, [48, 253]);
(*278*) (shrZ, [276, 10]);
(*279*) (shr 64, [276, 10]);
(*280*) (add 64, [273, 279]);
(*281*) (addcarryZ 64, [279, 273]);
(*282*) (addcarry 64, [273, 279]);
(*283*) (addZ, [282, 275]);
(*284*) (add 64, [275, 282]);
(*285*) (add 64, [12, 18, 32, 40, 67, 79, 85, 92, 106, 123, 130, 138, 159, 165, 179, 197, 203, 211, 239, 277]);
(*286*) (addcarryZ 64, [253, 277]);
(*287*) (addcarry 64, [253, 277]);
(*288*) (add 64, [14, 35, 42, 70, 75, 81, 95, 109, 117, 141, 155, 161, 168, 182, 199, 206, 214, 235, 241, 255, 273, 279, 287]);
(*289*) (addcarryZ 64, [287, 256, 280]);
(*290*) (addcarry 64, [256, 280, 287]);
(*291*) (add 64, [73, 77, 98, 112, 119, 144, 151, 157, 171, 185, 193, 217, 231, 237, 244, 258, 275, 282, 290]);
(*292*) (addcarryZ 64, [290, 259, 284]);
(*293*) (addcarry 64, [259, 284, 290]);
(*294*) (add 64, [115, 147, 153, 174, 188, 195, 220, 227, 233, 247, 261, 269, 293]);
(*295*) (addcarryZ 64, [293, 262, 269]);
(*296*) (addcarry 64, [262, 269, 293]);
(*297*) (add 64, [191, 223, 229, 250, 264, 271, 296]);
(*298*) (addcarryZ 64, [296, 265, 271]);
(*299*) (addcarry 64, [265, 271, 296]);
(*300*) (addZ, [299, 267]);
(*301*) (add 64, [267, 299]);
(*302*) (const 1, []);
(*303*) (add 64, [14, 35, 42, 70, 75, 81, 95, 109, 117, 141, 155, 161, 168, 182, 199, 206, 214, 235, 241, 255, 273, 279, 287, 302]);
(*304*) (subborrowZ 64, [288, 48]);
(*305*) (subborrow 64, [288, 48]);
(*306*) (neg 64, [305]);
(*307*) (add 64, [38, 73, 77, 98, 112, 119, 144, 151, 157, 171, 185, 193, 217, 231, 237, 244, 258, 275, 282, 290, 306]);
(*308*) (subborrowZ 64, [291, 43, 305]);
(*309*) (subborrow 64, [291, 43, 305]);
(*310*) (neg 64, [309]);
(*311*) (add 64, [115, 147, 153, 174, 188, 195, 220, 227, 233, 247, 261, 269, 293, 310]);
(*312*) (subborrowZ 64, [294, 309]);
(*313*) (subborrow 64, [294, 309]);
(*314*) (neg 64, [313]);
(*315*) (add 64, [43, 191, 223, 229, 250, 264, 271, 296, 314]);
(*316*) (subborrowZ 64, [297, 38, 313]);
(*317*) (subborrow 64, [297, 38, 313]);
(*318*) (neg 64, [317]);
(*319*) (add 64, [267, 299, 318]);
(*320*) (subborrowZ 64, [301, 317]);
(*321*) (subborrow 64, [301, 317]);
(*322*) (selectznz, [321, 303, 288]);
(*323*) (selectznz, [321, 307, 291]);
(*324*) (selectznz, [321, 311, 294]);
(*325*) (selectznz, [321, 315, 297]);
(*326*) (old 64 326, []); 	(*init_symbolic_state*)
(*327*) (old 64 327, []); 	(*init_symbolic_state*)
(*328*) (old 64 328, []); 	(*init_symbolic_state*)
(*329*) (old 64 329, []); 	(*init_symbolic_state*)
(*330*) (old 64 330, []); 	(*init_symbolic_state*)
(*331*) (old 64 331, []); 	(*init_symbolic_state*)
(*332*) (old 64 332, []); 	(*init_symbolic_state*)
(*333*) (old 64 333, []); 	(*init_symbolic_state*)
(*334*) (old 64 334, []); 	(*init_symbolic_state*)
(*335*) (old 64 335, []); 	(*init_symbolic_state*)
(*336*) (old 64 336, []); 	(*init_symbolic_state*)
(*337*) (old 64 337, []); 	(*init_symbolic_state*)
(*338*) (old 64 338, []); 	(*init_symbolic_state*)
(*339*) (old 64 339, []); 	(*init_symbolic_state*)
(*340*) (old 64 340, []); 	(*init_symbolic_state*)
(*341*) (old 64 341, []); 	(*init_symbolic_state*)
(*342*) (old 64 342, []); 	(*output_placeholders*)
(*343*) (old 64 343, []); 	(*output_placeholders*)
(*344*) (old 64 344, []); 	(*output_placeholders*)
(*345*) (old 64 345, []); 	(*output_placeholders*)
(*346*) (old 64 346, []); 	(*inputaddrs*)
(*347*) (const 8, []); 	(*inputaddrs*)
(*348*) (add 64, [346, 347]); 	(*inputaddrs*)
(*349*) (const 16, []); 	(*inputaddrs*)
(*350*) (add 64, [346, 349]); 	(*inputaddrs*)
(*351*) (const 24, []); 	(*inputaddrs*)
(*352*) (add 64, [346, 351]); 	(*inputaddrs*)
(*353*) (old 64 353, []); 	(*inputaddrs*)
(*354*) (add 64, [347, 353]); 	(*inputaddrs*)
(*355*) (add 64, [349, 353]); 	(*inputaddrs*)
(*356*) (add 64, [351, 353]); 	(*inputaddrs*)
(*357*) (old 64 357, []); 	(*outputaddrs*)
(*358*) (add 64, [347, 357]); 	(*outputaddrs*)
(*359*) (add 64, [349, 357]); 	(*outputaddrs*)
(*360*) (add 64, [351, 357]); 	(*outputaddrs*)
(*361*) (old 64 361, []); 	(*stack_base*)
(*362*) (old 64 362, []); 	(*stack_base*)
(*363*) (old 64 363, []); 	(*stack_base*)
(*364*) (old 64 364, []); 	(*stack_base*)
(*365*) (old 64 365, []); 	(*stack_base*)
(*366*) (old 64 366, []); 	(*stack_base*)
(*367*) (old 64 367, []); 	(*stack_base*)
(*368*) (old 64 368, []); 	(*stack_base*)
(*369*) (old 64 369, []); 	(*stack_base*)
(*370*) (old 64 370, []); 	(*stack_base*)
(*371*) (old 64 371, []); 	(*stack_base*)
(*372*) (old 64 372, []); 	(*stack_base*)
(*373*) (old 64 373, []); 	(*stack_base*)
(*374*) (old 64 374, []); 	(*stack_base*)
(*375*) (old 64 375, []); 	(*stack_base*)
(*376*) (old 64 376, []); 	(*stack_base*)
(*377*) (old 64 377, []); 	(*stack_base*)
(*378*) (old 64 378, []); 	(*stack_base*)
(*379*) (old 64 379, []); 	(*stack_base*)
(*380*) (old 64 380, []); 	(*stack_base*)
(*381*) (old 64 381, []); 	(*stack_base*)
(*382*) (old 64 382, []); 	(*stack_base*)
(*383*) (old 64 383, []); 	(*stack_base*)
(*384*) (old 64 384, []); 	(*stack_base*)
(*385*) (old 64 385, []); 	(*stack_base*)
(*386*) (old 64 386, []); 	(*stack_base*)
(*387*) (old 64 387, []); 	(*stack_base*)
(*388*) (old 64 388, []); 	(*stack_base*)
(*389*) (old 64 389, []); 	(*stack_base*)
(*390*) (old 64 390, []); 	(*stack_base*)
(*391*) (old 64 391, []); 	(*stack_base*)
(*392*) (old 64 392, []); 	(*stack_base*)
(*393*) (old 64 393, []); 	(*stack_base*)
(*394*) (old 64 394, []); 	(*stack_base*)
(*395*) (old 64 395, []); 	(*stack_base*)
(*396*) (old 64 396, []); 	(*stack_base*)
(*397*) (old 64 397, []); 	(*stack_base*)
(*398*) (old 64 398, []); 	(*stack_base*)
(*399*) (old 64 399, []); 	(*stack_base*)
(*400*) (old 64 400, []); 	(*stack_base*)
(*401*) (old 64 401, []); 	(*stack_base*)
(*402*) (old 64 402, []); 	(*stack_base*)
(*403*) (old 64 403, []); 	(*stack_base*)
(*404*) (old 64 404, []); 	(*stack_base*)
(*405*) (old 64 405, []); 	(*stack_base*)
(*406*) (old 64 406, []); 	(*stack_base*)
(*407*) (old 64 407, []); 	(*stack_base*)
(*408*) (old 64 408, []); 	(*stack_base*)
(*409*) (old 64 409, []); 	(*stack_base*)
(*410*) (const 18446744073709551232, []); 	(*stack_base*)
(*411*) (add 64, [409, 410]); 	(*stack_base*)
(*412*) (const 18446744073709551240, []); 	(*stack_base*)
(*413*) (add 64, [409, 412]); 	(*stack_base*)
(*414*) (const 18446744073709551248, []); 	(*stack_base*)
(*415*) (add 64, [409, 414]); 	(*stack_base*)
(*416*) (const 18446744073709551256, []); 	(*stack_base*)
(*417*) (add 64, [409, 416]); 	(*stack_base*)
(*418*) (const 32, []); 	(*stack_base*)
(*419*) (const 18446744073709551264, []); 	(*stack_base*)
(*420*) (add 64, [409, 419]); 	(*stack_base*)
(*421*) (const 40, []); 	(*stack_base*)
(*422*) (const 18446744073709551272, []); 	(*stack_base*)
(*423*) (add 64, [409, 422]); 	(*stack_base*)
(*424*) (const 48, []); 	(*stack_base*)
(*425*) (const 18446744073709551280, []); 	(*stack_base*)
(*426*) (add 64, [409, 425]); 	(*stack_base*)
(*427*) (const 56, []); 	(*stack_base*)
(*428*) (const 18446744073709551288, []); 	(*stack_base*)
(*429*) (add 64, [409, 428]); 	(*stack_base*)
(*430*) (const 18446744073709551296, []); 	(*stack_base*)
(*431*) (add 64, [409, 430]); 	(*stack_base*)
(*432*) (const 72, []); 	(*stack_base*)
(*433*) (const 18446744073709551304, []); 	(*stack_base*)
(*434*) (add 64, [409, 433]); 	(*stack_base*)
(*435*) (const 80, []); 	(*stack_base*)
(*436*) (const 18446744073709551312, []); 	(*stack_base*)
(*437*) (add 64, [409, 436]); 	(*stack_base*)
(*438*) (const 88, []); 	(*stack_base*)
(*439*) (const 18446744073709551320, []); 	(*stack_base*)
(*440*) (add 64, [409, 439]); 	(*stack_base*)
(*441*) (const 96, []); 	(*stack_base*)
(*442*) (const 18446744073709551328, []); 	(*stack_base*)
(*443*) (add 64, [409, 442]); 	(*stack_base*)
(*444*) (const 104, []); 	(*stack_base*)
(*445*) (const 18446744073709551336, []); 	(*stack_base*)
(*446*) (add 64, [409, 445]); 	(*stack_base*)
(*447*) (const 112, []); 	(*stack_base*)
(*448*) (const 18446744073709551344, []); 	(*stack_base*)
(*449*) (add 64, [409, 448]); 	(*stack_base*)
(*450*) (const 120, []); 	(*stack_base*)
(*451*) (const 18446744073709551352, []); 	(*stack_base*)
(*452*) (add 64, [409, 451]); 	(*stack_base*)
(*453*) (const 128, []); 	(*stack_base*)
(*454*) (const 18446744073709551360, []); 	(*stack_base*)
(*455*) (add 64, [409, 454]); 	(*stack_base*)
(*456*) (const 136, []); 	(*stack_base*)
(*457*) (const 18446744073709551368, []); 	(*stack_base*)
(*458*) (add 64, [409, 457]); 	(*stack_base*)
(*459*) (const 144, []); 	(*stack_base*)
(*460*) (const 18446744073709551376, []); 	(*stack_base*)
(*461*) (add 64, [409, 460]); 	(*stack_base*)
(*462*) (const 152, []); 	(*stack_base*)
(*463*) (const 18446744073709551384, []); 	(*stack_base*)
(*464*) (add 64, [409, 463]); 	(*stack_base*)
(*465*) (const 160, []); 	(*stack_base*)
(*466*) (const 18446744073709551392, []); 	(*stack_base*)
(*467*) (add 64, [409, 466]); 	(*stack_base*)
(*468*) (const 168, []); 	(*stack_base*)
(*469*) (const 18446744073709551400, []); 	(*stack_base*)
(*470*) (add 64, [409, 469]); 	(*stack_base*)
(*471*) (const 176, []); 	(*stack_base*)
(*472*) (const 18446744073709551408, []); 	(*stack_base*)
(*473*) (add 64, [409, 472]); 	(*stack_base*)
(*474*) (const 184, []); 	(*stack_base*)
(*475*) (const 18446744073709551416, []); 	(*stack_base*)
(*476*) (add 64, [409, 475]); 	(*stack_base*)
(*477*) (const 192, []); 	(*stack_base*)
(*478*) (const 18446744073709551424, []); 	(*stack_base*)
(*479*) (add 64, [409, 478]); 	(*stack_base*)
(*480*) (const 200, []); 	(*stack_base*)
(*481*) (const 18446744073709551432, []); 	(*stack_base*)
(*482*) (add 64, [409, 481]); 	(*stack_base*)
(*483*) (const 208, []); 	(*stack_base*)
(*484*) (const 18446744073709551440, []); 	(*stack_base*)
(*485*) (add 64, [409, 484]); 	(*stack_base*)
(*486*) (const 216, []); 	(*stack_base*)
(*487*) (const 18446744073709551448, []); 	(*stack_base*)
(*488*) (add 64, [409, 487]); 	(*stack_base*)
(*489*) (const 224, []); 	(*stack_base*)
(*490*) (const 18446744073709551456, []); 	(*stack_base*)
(*491*) (add 64, [409, 490]); 	(*stack_base*)
(*492*) (const 232, []); 	(*stack_base*)
(*493*) (const 18446744073709551464, []); 	(*stack_base*)
(*494*) (add 64, [409, 493]); 	(*stack_base*)
(*495*) (const 240, []); 	(*stack_base*)
(*496*) (const 18446744073709551472, []); 	(*stack_base*)
(*497*) (add 64, [409, 496]); 	(*stack_base*)
(*498*) (const 248, []); 	(*stack_base*)
(*499*) (const 18446744073709551480, []); 	(*stack_base*)
(*500*) (add 64, [409, 499]); 	(*stack_base*)
(*501*) (const 256, []); 	(*stack_base*)
(*502*) (const 18446744073709551488, []); 	(*stack_base*)
(*503*) (add 64, [409, 502]); 	(*stack_base*)
(*504*) (const 264, []); 	(*stack_base*)
(*505*) (const 18446744073709551496, []); 	(*stack_base*)
(*506*) (add 64, [409, 505]); 	(*stack_base*)
(*507*) (const 272, []); 	(*stack_base*)
(*508*) (const 18446744073709551504, []); 	(*stack_base*)
(*509*) (add 64, [409, 508]); 	(*stack_base*)
(*510*) (const 280, []); 	(*stack_base*)
(*511*) (const 18446744073709551512, []); 	(*stack_base*)
(*512*) (add 64, [409, 511]); 	(*stack_base*)
(*513*) (const 288, []); 	(*stack_base*)
(*514*) (const 18446744073709551520, []); 	(*stack_base*)
(*515*) (add 64, [409, 514]); 	(*stack_base*)
(*516*) (const 296, []); 	(*stack_base*)
(*517*) (const 18446744073709551528, []); 	(*stack_base*)
(*518*) (add 64, [409, 517]); 	(*stack_base*)
(*519*) (const 304, []); 	(*stack_base*)
(*520*) (const 18446744073709551536, []); 	(*stack_base*)
(*521*) (add 64, [409, 520]); 	(*stack_base*)
(*522*) (const 312, []); 	(*stack_base*)
(*523*) (const 18446744073709551544, []); 	(*stack_base*)
(*524*) (add 64, [409, 523]); 	(*stack_base*)
(*525*) (const 320, []); 	(*stack_base*)
(*526*) (const 18446744073709551552, []); 	(*stack_base*)
(*527*) (add 64, [409, 526]); 	(*stack_base*)
(*528*) (const 328, []); 	(*stack_base*)
(*529*) (const 18446744073709551560, []); 	(*stack_base*)
(*530*) (add 64, [409, 529]); 	(*stack_base*)
(*531*) (const 336, []); 	(*stack_base*)
(*532*) (const 18446744073709551568, []); 	(*stack_base*)
(*533*) (add 64, [409, 532]); 	(*stack_base*)
(*534*) (const 344, []); 	(*stack_base*)
(*535*) (const 18446744073709551576, []); 	(*stack_base*)
(*536*) (add 64, [409, 535]); 	(*stack_base*)
(*537*) (const 352, []); 	(*stack_base*)
(*538*) (const 18446744073709551584, []); 	(*stack_base*)
(*539*) (add 64, [409, 538]); 	(*stack_base*)
(*540*) (const 360, []); 	(*stack_base*)
(*541*) (const 18446744073709551592, []); 	(*stack_base*)
(*542*) (add 64, [409, 541]); 	(*stack_base*)
(*543*) (const 368, []); 	(*stack_base*)
(*544*) (const 18446744073709551600, []); 	(*stack_base*)
(*545*) (add 64, [409, 544]); 	(*stack_base*)
(*546*) (const 376, []); 	(*stack_base*)
(*547*) (const 18446744073709551608, []); 	(*stack_base*)
(*548*) (add 64, [409, 547]); 	(*stack_base*)
(*549*) (addoverflow 64, [20, 26]);
(*550*) (add 64, [22, 29]);
(*551*) (addcarry 64, [22, 29]);
(*552*) (addoverflow 64, [22, 29]);
(*553*) (addcarry 64, [16, 550]);
(*554*) (addoverflow 64, [16, 550]);
(*555*) (add 64, [18, 553]);
(*556*) (addcarry 64, [18, 553]);
(*557*) (addoverflow 64, [18, 553]);
(*558*) (add 64, [12, 18, 553]);
(*559*) (addcarry 64, [12, 555]);
(*560*) (addoverflow 64, [12, 555]);
(*561*) (add 64, [14, 559]);
(*562*) (addcarry 64, [14, 559]);
(*563*) (addoverflow 64, [14, 559]);
(*564*) (xorZ, [339, 339]);
(*565*) (const 63, []);
(*566*) (const 4294967296, []);
(*567*) (mul 64, [3, 7, 566]);
(*568*) (shr 64, [24, 418]);
(*569*) (add 64, [20, 26, 567]);
(*570*) (addcarry 64, [27, 567]);
(*571*) (addoverflow 64, [27, 567]);
(*572*) (add 64, [16, 22, 29, 568, 570]);
(*573*) (addcarry 64, [30, 568, 570]);
(*574*) (addoverflow 64, [30, 568, 570]);
(*575*) (add 64, [12, 18, 40, 553, 573]);
(*576*) (addcarry 64, [40, 558, 573]);
(*577*) (addoverflow 64, [40, 558, 573]);
(*578*) (add 64, [14, 42, 559, 576]);
(*579*) (addcarry 64, [42, 561, 576]);
(*580*) (addoverflow 64, [42, 561, 576]);
(*581*) (xorZ, [567, 567]);
(*582*) (add 64, [20, 26, 87, 567]);
(*583*) (addcarry 64, [87, 569]);
(*584*) (addoverflow 64, [87, 569]);
(*585*) (add 64, [89, 583]);
(*586*) (addcarry 64, [89, 583]);
(*587*) (addoverflow 64, [89, 583]);
(*588*) (add 64, [16, 22, 29, 89, 568, 570, 583]);
(*589*) (addcarry 64, [572, 585]);
(*590*) (addoverflow 64, [572, 585]);
(*591*) (add 64, [85, 589]);
(*592*) (addcarry 64, [85, 589]);
(*593*) (addoverflow 64, [85, 589]);
(*594*) (add 64, [16, 22, 29, 83, 89, 568, 570, 583]);
(*595*) (addcarry 64, [83, 588]);
(*596*) (addoverflow 64, [83, 588]);
(*597*) (add 64, [85, 589, 595]);
(*598*) (addcarry 64, [591, 595]);
(*599*) (addoverflow 64, [591, 595]);
(*600*) (add 64, [12, 18, 40, 85, 553, 573, 589, 595]);
(*601*) (addcarry 64, [575, 597]);
(*602*) (addoverflow 64, [575, 597]);
(*603*) (add 64, [81, 601]);
(*604*) (addcarry 64, [81, 601]);
(*605*) (addoverflow 64, [81, 601]);
(*606*) (add 64, [12, 18, 40, 79, 85, 553, 573, 589, 595]);
(*607*) (addcarry 64, [79, 600]);
(*608*) (addoverflow 64, [79, 600]);
(*609*) (add 64, [81, 601, 607]);
(*610*) (addcarry 64, [603, 607]);
(*611*) (addoverflow 64, [603, 607]);
(*612*) (add 64, [14, 42, 81, 559, 576, 601, 607]);
(*613*) (addcarry 64, [578, 609]);
(*614*) (addoverflow 64, [578, 609]);
(*615*) (add 64, [77, 613]);
(*616*) (addcarry 64, [77, 613]);
(*617*) (addoverflow 64, [77, 613]);
(*618*) (add 64, [14, 42, 75, 81, 559, 576, 601, 607]);
(*619*) (addcarry 64, [75, 612]);
(*620*) (addoverflow 64, [75, 612]);
(*621*) (add 64, [77, 579, 613, 619]);
(*622*) (addcarry 64, [579, 615, 619]);
(*623*) (addoverflow 64, [579, 615, 619]);
(*624*) (mul 64, [566, 582]);
(*625*) (mulZ, [38, 582]);
(*626*) (shrZ, [625, 10]);
(*627*) (mul 64, [38, 582]);
(*628*) (shr 64, [625, 10]);
(*629*) (shr 64, [582, 418]);
(*630*) (add 64, [16, 22, 29, 83, 89, 568, 570, 583, 624]);
(*631*) (addcarry 64, [594, 624]);
(*632*) (addoverflow 64, [594, 624]);
(*633*) (add 64, [12, 18, 40, 79, 85, 553, 573, 589, 595, 629, 631]);
(*634*) (addcarry 64, [606, 629, 631]);
(*635*) (addoverflow 64, [606, 629, 631]);
(*636*) (add 64, [14, 42, 75, 81, 559, 576, 601, 607, 627, 634]);
(*637*) (addcarry 64, [618, 627, 634]);
(*638*) (addoverflow 64, [618, 627, 634]);
(*639*) (add 64, [77, 579, 613, 619, 628, 637]);
(*640*) (addcarry 64, [621, 628, 637]);
(*641*) (addoverflow 64, [621, 628, 637]);
(*642*) (add 64, [622, 640]);
(*643*) (xorZ, [624, 624]);
(*644*) (add 64, [16, 22, 29, 83, 89, 163, 568, 570, 583, 624]);
(*645*) (addcarry 64, [163, 630]);
(*646*) (addoverflow 64, [163, 630]);
(*647*) (add 64, [165, 645]);
(*648*) (addcarry 64, [165, 645]);
(*649*) (addoverflow 64, [165, 645]);
(*650*) (add 64, [12, 18, 40, 79, 85, 165, 553, 573, 589, 595, 629, 631, 645]);
(*651*) (addcarry 64, [633, 647]);
(*652*) (addoverflow 64, [633, 647]);
(*653*) (add 64, [161, 651]);
(*654*) (addcarry 64, [161, 651]);
(*655*) (addoverflow 64, [161, 651]);
(*656*) (add 64, [12, 18, 40, 79, 85, 159, 165, 553, 573, 589, 595, 629, 631, 645]);
(*657*) (addcarry 64, [159, 650]);
(*658*) (addoverflow 64, [159, 650]);
(*659*) (add 64, [161, 651, 657]);
(*660*) (addcarry 64, [653, 657]);
(*661*) (addoverflow 64, [653, 657]);
(*662*) (add 64, [14, 42, 75, 81, 161, 559, 576, 601, 607, 627, 634, 651, 657]);
(*663*) (addcarry 64, [636, 659]);
(*664*) (addoverflow 64, [636, 659]);
(*665*) (add 64, [157, 663]);
(*666*) (addcarry 64, [157, 663]);
(*667*) (addoverflow 64, [157, 663]);
(*668*) (add 64, [14, 42, 75, 81, 155, 161, 559, 576, 601, 607, 627, 634, 651, 657]);
(*669*) (addcarry 64, [155, 662]);
(*670*) (addoverflow 64, [155, 662]);
(*671*) (add 64, [157, 663, 669]);
(*672*) (addcarry 64, [665, 669]);
(*673*) (addoverflow 64, [665, 669]);
(*674*) (add 64, [77, 157, 579, 613, 619, 628, 637, 663, 669]);
(*675*) (addcarry 64, [639, 671]);
(*676*) (addoverflow 64, [639, 671]);
(*677*) (add 64, [153, 675]);
(*678*) (addcarry 64, [153, 675]);
(*679*) (addoverflow 64, [153, 675]);
(*680*) (add 64, [77, 151, 157, 579, 613, 619, 628, 637, 663, 669]);
(*681*) (addcarry 64, [151, 674]);
(*682*) (addoverflow 64, [151, 674]);
(*683*) (add 64, [153, 622, 640, 675, 681]);
(*684*) (addcarry 64, [642, 677, 681]);
(*685*) (addoverflow 64, [642, 677, 681]);
(*686*) (mul 64, [566, 644]);
(*687*) (mulZ, [38, 644]);
(*688*) (shrZ, [687, 10]);
(*689*) (mul 64, [38, 644]);
(*690*) (shr 64, [687, 10]);
(*691*) (shr 64, [644, 418]);
(*692*) (add 64, [12, 18, 40, 79, 85, 159, 165, 553, 573, 589, 595, 629, 631, 645, 686]);
(*693*) (addcarry 64, [656, 686]);
(*694*) (addoverflow 64, [656, 686]);
(*695*) (add 64, [14, 42, 75, 81, 155, 161, 559, 576, 601, 607, 627, 634, 651, 657, 691, 693]);
(*696*) (addcarry 64, [668, 691, 693]);
(*697*) (addoverflow 64, [668, 691, 693]);
(*698*) (add 64, [77, 151, 157, 579, 613, 619, 628, 637, 663, 669, 689, 696]);
(*699*) (addcarry 64, [680, 689, 696]);
(*700*) (addoverflow 64, [680, 689, 696]);
(*701*) (add 64, [153, 622, 640, 675, 681, 690, 699]);
(*702*) (addcarry 64, [683, 690, 699]);
(*703*) (addoverflow 64, [683, 690, 699]);
(*704*) (add 64, [684, 702]);
(*705*) (xorZ, [686, 686]);
(*706*) (add 64, [12, 18, 40, 79, 85, 159, 165, 239, 553, 573, 589, 595, 629, 631, 645, 686]);
(*707*) (addcarry 64, [239, 692]);
(*708*) (addoverflow 64, [239, 692]);
(*709*) (add 64, [241, 707]);
(*710*) (addcarry 64, [241, 707]);
(*711*) (addoverflow 64, [241, 707]);
(*712*) (add 64, [14, 42, 75, 81, 155, 161, 241, 559, 576, 601, 607, 627, 634, 651, 657, 691, 693, 707]);
(*713*) (addcarry 64, [695, 709]);
(*714*) (addoverflow 64, [695, 709]);
(*715*) (add 64, [237, 713]);
(*716*) (addcarry 64, [237, 713]);
(*717*) (addoverflow 64, [237, 713]);
(*718*) (add 64, [14, 42, 75, 81, 155, 161, 235, 241, 559, 576, 601, 607, 627, 634, 651, 657, 691, 693, 707]);
(*719*) (addcarry 64, [235, …
Hopefully this is fast enough to merge the other changes, and we can
revert this at some future point when I figure out how to annotate dag
indices with actual descriptions without making everything insanely
slow.
<details><summary>Timing Diff since 831d611</summary>
<p>

```
     After |  Peak Mem | File Name                                |     Before |  Peak Mem ||     Change || Change (mem) | % Change | % Change (mem)
----------------------------------------------------------------------------------------------------------------------------------------------------
122m52.12s | 302392 ko | Total Time / Peak Mem                    | 103m15.27s | 327788 ko || +19m36.84s ||    -25396 ko |  +18.99% |         -7.74%
----------------------------------------------------------------------------------------------------------------------------------------------------
 52m14.52s | 302392 ko | fiat-amd64/p434-mul.only-status          |  44m39.92s | 327788 ko ||  +7m34.59s ||    -25396 ko |  +16.96% |         -7.74%
 49m00.50s | 289860 ko | fiat-amd64/p434-square.only-status       |  41m46.12s | 280424 ko ||  +7m14.38s ||      9436 ko |  +17.33% |         +3.36%
  7m03.23s |  93856 ko | fiat-amd64/p384-mul.only-status          |   5m42.95s |  93804 ko ||  +1m20.28s ||        52 ko |  +23.40% |         +0.05%
  6m17.51s |  93864 ko | fiat-amd64/p384-square.only-status       |   5m09.63s |  84120 ko ||  +1m07.87s ||      9744 ko |  +21.92% |        +11.58%
  2m30.23s |  54948 ko | fiat-amd64/p448-mul.only-status          |   1m46.05s |  50276 ko ||  +0m44.17s ||      4672 ko |  +41.65% |         +9.29%
  1m47.63s |  50300 ko | fiat-amd64/p521-mul.only-status          |   1m10.69s |  45816 ko ||  +0m36.93s ||      4484 ko |  +52.25% |         +9.78%
  0m57.09s |  42144 ko | fiat-amd64/p448-square.only-status       |   0m41.84s |  42168 ko ||  +0m15.25s ||       -24 ko |  +36.44% |         -0.05%
  0m33.92s |  38808 ko | fiat-amd64/p224-square.only-status       |   0m21.42s |  38620 ko ||  +0m12.50s ||       188 ko |  +58.35% |         +0.48%
  0m33.25s |  36436 ko | fiat-amd64/p224-mul.only-status          |   0m22.94s |  36476 ko ||  +0m10.30s ||       -40 ko |  +44.94% |         -0.10%
  0m17.85s |  36252 ko | fiat-amd64/p256-mul.only-status          |   0m11.08s |  36188 ko ||  +0m06.77s ||        64 ko |  +61.10% |         +0.17%
  0m14.26s |  38104 ko | fiat-amd64/p256-square.only-status       |   0m09.72s |  37620 ko ||  +0m04.53s ||       484 ko |  +46.70% |         +1.28%
  0m24.48s |  38740 ko | fiat-amd64/secp256k1-mul.only-status     |   0m20.92s |  38780 ko ||  +0m03.55s ||       -40 ko |  +17.01% |         -0.10%
  0m23.26s |  36288 ko | fiat-amd64/p521-square.only-status       |   0m20.08s |  36152 ko ||  +0m03.18s ||       136 ko |  +15.83% |         +0.37%
  0m22.36s |  38900 ko | fiat-amd64/secp256k1-square.only-status  |   0m19.29s |  39028 ko ||  +0m03.07s ||      -128 ko |  +15.91% |         -0.32%
  0m04.70s |  26684 ko | fiat-amd64/curve25519-mul.only-status    |   0m05.02s |  25720 ko ||  -0m00.31s ||       964 ko |   -6.37% |         +3.74%
  0m03.60s |  26752 ko | fiat-amd64/curve25519-square.only-status |   0m03.85s |  25700 ko ||  -0m00.25s ||      1052 ko |   -6.49% |         +4.09%
  0m01.88s |  22584 ko | fiat-amd64/poly1305-mul.only-status      |   0m01.93s |  22620 ko ||  -0m00.05s ||       -36 ko |   -2.59% |         -0.15%
  0m01.66s |  22168 ko | fiat-amd64/poly1305-square.only-status   |   0m01.70s |  22296 ko ||  -0m00.04s ||      -128 ko |   -2.35% |         -0.57%
  0m00.19s |  21616 ko | fiat-amd64/p224-sub.only-status          |   0m00.13s |  21284 ko ||  +0m00.06s ||       332 ko |  +46.15% |         +1.55%

```
</p>
</details>
@JasonGross
Copy link
Collaborator Author

Here's some performance data for p256 mul (compare with #1207 (comment))
p256-mul-1193.perf.data.tar.gz
p256-mul-1193-perf-super-folded

@JasonGross
Copy link
Collaborator Author

Some of the overhead seems to come from the extra detupling in the function passed to List.indexof in dag_reverse_lookup.

@JasonGross
Copy link
Collaborator Author

Will try to performance optimize later, got OOB ack from Andres on merging

@JasonGross JasonGross merged commit c10d53f into mit-plv:master Apr 21, 2022
@JasonGross JasonGross deleted the asm-better-errors branch April 21, 2022 07:05
@JasonGross
Copy link
Collaborator Author

It seems like another significant chunk of the overhead is caused by increase GC (which would explain the high variance in overheads), and the a significant chunk of the time the GC is invoked by node_beq / merge_node / list_beq, which points again to the importance of optimizing dag_reverse_lookup

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant