-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better equivalence-checker error messages #1193
Conversation
195a9b6
to
9b4b03a
Compare
It seems that we increase the time of assembly checking too much. Maybe thunking the error info will help? |
c6652fc
to
e02b6bf
Compare
Is it worth the 20% performance hit to get better error messages? Maybe we can make up the time by removing all the nats? |
The dag now prints out where some indices came from, and assembly lines generating values that don't exist in PHOAS are annotated with the dag indices they generate. We also pave the way to display information about indices that appear only in the assembly and not in the PHOAS. We now get errors like: ``` check_args /* Autogenerated: 'src/ExtractionOCaml/word_by_word_montgomery' p256 64 '2^256 - 2^224 + 2^192 + 2^96 - 1' mul --no-wide-int --shiftr-avoid-uint1 --hints-file 'fiat-amd64/boringssl_intel_manual_mul_p256.asm' */ /* curve description: p256 */ /* machine_wordsize = 64 (from "64") */ /* requested operations: mul */ /* m = 0xffffffff00000001000000000000000000000000ffffffffffffffffffffffff (from "2^256 - 2^224 + 2^192 + 2^96 - 1") */ /* */ /* NOTE: In addition to the bounds specified above each function, all */ /* functions synthesized for this Montgomery arithmetic require the */ /* input to be strictly less than the prime modulus (m), and also */ /* require the input to be in the unique saturated representation. */ /* All functions also ensure that these two properties are true of */ /* return values. */ /* */ /* Computed values: */ /* eval z = z[0] + (z[1] << 64) + (z[2] << 128) + (z[3] << 192) */ /* bytes_eval z = z[0] + (z[1] << 8) + (z[2] << 16) + (z[3] << 24) + (z[4] << 32) + (z[5] << 40) + (z[6] << 48) + (z[7] << 56) + (z[8] << 64) + (z[9] << 72) + (z[10] << 80) + (z[11] << 88) + (z[12] << 96) + (z[13] << 104) + (z[14] << 112) + (z[15] << 120) + (z[16] << 128) + (z[17] << 136) + (z[18] << 144) + (z[19] << 152) + (z[20] << 160) + (z[21] << 168) + (z[22] << 176) + (z[23] << 184) + (z[24] << 192) + (z[25] << 200) + (z[26] << 208) + (z[27] << 216) + (z[28] << 224) + (z[29] << 232) + (z[30] << 240) + (z[31] << 248) */ /* twos_complement_eval z = let x1 := z[0] + (z[1] << 64) + (z[2] << 128) + (z[3] << 192) in */ /* if x1 & (2^256-1) < 2^255 then x1 & (2^256-1) else (x1 & (2^256-1)) - 2^256 */ In fiat_p256_mul: Error while checking for equivalence of syntax tree and assembly: The syntax tree: (λ x1 x2, let x3 := x1[1] (* : uint64_t *) in let x4 := x1[2] (* : uint64_t *) in let x5 := x1[3] (* : uint64_t *) in let x6 := x1[0] (* : uint64_t *) in let x7 := Z.mul_split(2^64, None, (x6, Some [0x0 ~> 0xffffffffffffffff], (x2[3], Some [0x0 ~> 0xffffffffffffffff]))) in let x8 := Z.mul_split(2^64, None, (x6, Some [0x0 ~> 0xffffffffffffffff], (x2[2], Some [0x0 ~> 0xffffffffffffffff]))) in let x9 := Z.mul_split(2^64, None, (x6, Some [0x0 ~> 0xffffffffffffffff], (x2[1], Some [0x0 ~> 0xffffffffffffffff]))) in let x10 := Z.mul_split(2^64, None, (x6, Some [0x0 ~> 0xffffffffffffffff], (x2[0], Some [0x0 ~> 0xffffffffffffffff]))) in let x11 := Z.add_get_carry(2^64, None, (x10₂, Some [0x0 ~> 0xffffffffffffffff], (x9₁, Some [0x0 ~> 0xffffffffffffffff]))) in let x12 := Z.add_with_get_carry(2^64, None, (x11₂, Some [0x0 ~> 0x1], (x9₂, Some [0x0 ~> 0xffffffffffffffff], (x8₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x13 := Z.add_with_get_carry(2^64, None, (x12₂, Some [0x0 ~> 0x1], (x8₂, Some [0x0 ~> 0xffffffffffffffff], (x7₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x14 := x13₂ + x7₂ (* : uint64_t *) in let x15 := Z.mul_split(2^64, None, (x10₁, Some [0x0 ~> 0xffffffffffffffff], (0xffffffff00000001, None))) in let x16 := Z.mul_split(2^64, None, (x10₁, Some [0x0 ~> 0xffffffffffffffff], (2^32-1, None))) in let x17 := Z.mul_split(2^64, None, (x10₁, Some [0x0 ~> 0xffffffffffffffff], (2^64-1, None))) in let x18 := Z.add_get_carry(2^64, None, (x17₂, Some [0x0 ~> 0xffffffffffffffff], (x16₁, Some [0x0 ~> 0xffffffffffffffff]))) in let x19 := x18₂ + x16₂ (* : uint64_t *) in let x20 := Z.add_get_carry(2^64, None, (x10₁, Some [0x0 ~> 0xffffffffffffffff], (x17₁, Some [0x0 ~> 0xffffffffffffffff]))) in let x21 := Z.add_with_get_carry(2^64, None, (x20₂, Some [0x0 ~> 0x1], (x11₁, Some [0x0 ~> 0xffffffffffffffff], (x18₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x22 := Z.add_with_get_carry(2^64, None, (x21₂, Some [0x0 ~> 0x1], (x12₁, Some [0x0 ~> 0xffffffffffffffff], (x19, Some [0x0 ~> 0xffffffffffffffff])))) in let x23 := Z.add_with_get_carry(2^64, None, (x22₂, Some [0x0 ~> 0x1], (x13₁, Some [0x0 ~> 0xffffffffffffffff], (x15₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x24 := Z.add_with_get_carry(2^64, None, (x23₂, Some [0x0 ~> 0x1], (x14, Some [0x0 ~> 0xffffffffffffffff], (x15₂, Some [0x0 ~> 0xffffffffffffffff])))) in let x25 := Z.mul_split(2^64, None, (x3, Some [0x0 ~> 0xffffffffffffffff], (x2[3], Some [0x0 ~> 0xffffffffffffffff]))) in let x26 := Z.mul_split(2^64, None, (x3, Some [0x0 ~> 0xffffffffffffffff], (x2[2], Some [0x0 ~> 0xffffffffffffffff]))) in let x27 := Z.mul_split(2^64, None, (x3, Some [0x0 ~> 0xffffffffffffffff], (x2[1], Some [0x0 ~> 0xffffffffffffffff]))) in let x28 := Z.mul_split(2^64, None, (x3, Some [0x0 ~> 0xffffffffffffffff], (x2[0], Some [0x0 ~> 0xffffffffffffffff]))) in let x29 := Z.add_get_carry(2^64, None, (x28₂, Some [0x0 ~> 0xffffffffffffffff], (x27₁, Some [0x0 ~> 0xffffffffffffffff]))) in let x30 := Z.add_with_get_carry(2^64, None, (x29₂, Some [0x0 ~> 0x1], (x27₂, Some [0x0 ~> 0xffffffffffffffff], (x26₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x31 := Z.add_with_get_carry(2^64, None, (x30₂, Some [0x0 ~> 0x1], (x26₂, Some [0x0 ~> 0xffffffffffffffff], (x25₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x32 := x31₂ + x25₂ (* : uint64_t *) in let x33 := Z.add_get_carry(2^64, None, (x21₁, Some [0x0 ~> 0xffffffffffffffff], (x28₁, Some [0x0 ~> 0xffffffffffffffff]))) in let x34 := Z.add_with_get_carry(2^64, None, (x33₂, Some [0x0 ~> 0x1], (x22₁, Some [0x0 ~> 0xffffffffffffffff], (x29₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x35 := Z.add_with_get_carry(2^64, None, (x34₂, Some [0x0 ~> 0x1], (x23₁, Some [0x0 ~> 0xffffffffffffffff], (x30₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x36 := Z.add_with_get_carry(2^64, None, (x35₂, Some [0x0 ~> 0x1], (x24₁, Some [0x0 ~> 0xffffffffffffffff], (x31₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x37 := Z.add_with_get_carry(2^64, None, (x36₂, Some [0x0 ~> 0x1], (x24₂, Some [0x0 ~> 0x1], (x32, Some [0x0 ~> 0xffffffffffffffff])))) in let x38 := Z.mul_split(2^64, None, (x33₁, Some [0x0 ~> 0xffffffffffffffff], (0xffffffff00000001, None))) in let x39 := Z.mul_split(2^64, None, (x33₁, Some [0x0 ~> 0xffffffffffffffff], (2^32-1, None))) in let x40 := Z.mul_split(2^64, None, (x33₁, Some [0x0 ~> 0xffffffffffffffff], (2^64-1, None))) in let x41 := Z.add_get_carry(2^64, None, (x40₂, Some [0x0 ~> 0xffffffffffffffff], (x39₁, Some [0x0 ~> 0xffffffffffffffff]))) in let x42 := x41₂ + x39₂ (* : uint64_t *) in let x43 := Z.add_get_carry(2^64, None, (x33₁, Some [0x0 ~> 0xffffffffffffffff], (x40₁, Some [0x0 ~> 0xffffffffffffffff]))) in let x44 := Z.add_with_get_carry(2^64, None, (x43₂, Some [0x0 ~> 0x1], (x34₁, Some [0x0 ~> 0xffffffffffffffff], (x41₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x45 := Z.add_with_get_carry(2^64, None, (x44₂, Some [0x0 ~> 0x1], (x35₁, Some [0x0 ~> 0xffffffffffffffff], (x42, Some [0x0 ~> 0xffffffffffffffff])))) in let x46 := Z.add_with_get_carry(2^64, None, (x45₂, Some [0x0 ~> 0x1], (x36₁, Some [0x0 ~> 0xffffffffffffffff], (x38₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x47 := Z.add_with_get_carry(2^64, None, (x46₂, Some [0x0 ~> 0x1], (x37₁, Some [0x0 ~> 0xffffffffffffffff], (x38₂, Some [0x0 ~> 0xffffffffffffffff])))) in let x48 := x47₂ + x37₂ (* : uint64_t *) in let x49 := Z.mul_split(2^64, None, (x4, Some [0x0 ~> 0xffffffffffffffff], (x2[3], Some [0x0 ~> 0xffffffffffffffff]))) in let x50 := Z.mul_split(2^64, None, (x4, Some [0x0 ~> 0xffffffffffffffff], (x2[2], Some [0x0 ~> 0xffffffffffffffff]))) in let x51 := Z.mul_split(2^64, None, (x4, Some [0x0 ~> 0xffffffffffffffff], (x2[1], Some [0x0 ~> 0xffffffffffffffff]))) in let x52 := Z.mul_split(2^64, None, (x4, Some [0x0 ~> 0xffffffffffffffff], (x2[0], Some [0x0 ~> 0xffffffffffffffff]))) in let x53 := Z.add_get_carry(2^64, None, (x52₂, Some [0x0 ~> 0xffffffffffffffff], (x51₁, Some [0x0 ~> 0xffffffffffffffff]))) in let x54 := Z.add_with_get_carry(2^64, None, (x53₂, Some [0x0 ~> 0x1], (x51₂, Some [0x0 ~> 0xffffffffffffffff], (x50₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x55 := Z.add_with_get_carry(2^64, None, (x54₂, Some [0x0 ~> 0x1], (x50₂, Some [0x0 ~> 0xffffffffffffffff], (x49₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x56 := x55₂ + x49₂ (* : uint64_t *) in let x57 := Z.add_get_carry(2^64, None, (x44₁, Some [0x0 ~> 0xffffffffffffffff], (x52₁, Some [0x0 ~> 0xffffffffffffffff]))) in let x58 := Z.add_with_get_carry(2^64, None, (x57₂, Some [0x0 ~> 0x1], (x45₁, Some [0x0 ~> 0xffffffffffffffff], (x53₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x59 := Z.add_with_get_carry(2^64, None, (x58₂, Some [0x0 ~> 0x1], (x46₁, Some [0x0 ~> 0xffffffffffffffff], (x54₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x60 := Z.add_with_get_carry(2^64, None, (x59₂, Some [0x0 ~> 0x1], (x47₁, Some [0x0 ~> 0xffffffffffffffff], (x55₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x61 := Z.add_with_get_carry(2^64, None, (x60₂, Some [0x0 ~> 0x1], (x48, Some [0x0 ~> 0xffffffffffffffff], (x56, Some [0x0 ~> 0xffffffffffffffff])))) in let x62 := Z.mul_split(2^64, None, (x57₁, Some [0x0 ~> 0xffffffffffffffff], (0xffffffff00000001, None))) in let x63 := Z.mul_split(2^64, None, (x57₁, Some [0x0 ~> 0xffffffffffffffff], (2^32-1, None))) in let x64 := Z.mul_split(2^64, None, (x57₁, Some [0x0 ~> 0xffffffffffffffff], (2^64-1, None))) in let x65 := Z.add_get_carry(2^64, None, (x64₂, Some [0x0 ~> 0xffffffffffffffff], (x63₁, Some [0x0 ~> 0xffffffffffffffff]))) in let x66 := x65₂ + x63₂ (* : uint64_t *) in let x67 := Z.add_get_carry(2^64, None, (x57₁, Some [0x0 ~> 0xffffffffffffffff], (x64₁, Some [0x0 ~> 0xffffffffffffffff]))) in let x68 := Z.add_with_get_carry(2^64, None, (x67₂, Some [0x0 ~> 0x1], (x58₁, Some [0x0 ~> 0xffffffffffffffff], (x65₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x69 := Z.add_with_get_carry(2^64, None, (x68₂, Some [0x0 ~> 0x1], (x59₁, Some [0x0 ~> 0xffffffffffffffff], (x66, Some [0x0 ~> 0xffffffffffffffff])))) in let x70 := Z.add_with_get_carry(2^64, None, (x69₂, Some [0x0 ~> 0x1], (x60₁, Some [0x0 ~> 0xffffffffffffffff], (x62₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x71 := Z.add_with_get_carry(2^64, None, (x70₂, Some [0x0 ~> 0x1], (x61₁, Some [0x0 ~> 0xffffffffffffffff], (x62₂, Some [0x0 ~> 0xffffffffffffffff])))) in let x72 := x71₂ + x61₂ (* : uint64_t *) in let x73 := Z.mul_split(2^64, None, (x5, Some [0x0 ~> 0xffffffffffffffff], (x2[3], Some [0x0 ~> 0xffffffffffffffff]))) in let x74 := Z.mul_split(2^64, None, (x5, Some [0x0 ~> 0xffffffffffffffff], (x2[2], Some [0x0 ~> 0xffffffffffffffff]))) in let x75 := Z.mul_split(2^64, None, (x5, Some [0x0 ~> 0xffffffffffffffff], (x2[1], Some [0x0 ~> 0xffffffffffffffff]))) in let x76 := Z.mul_split(2^64, None, (x5, Some [0x0 ~> 0xffffffffffffffff], (x2[0], Some [0x0 ~> 0xffffffffffffffff]))) in let x77 := Z.add_get_carry(2^64, None, (x76₂, Some [0x0 ~> 0xffffffffffffffff], (x75₁, Some [0x0 ~> 0xffffffffffffffff]))) in let x78 := Z.add_with_get_carry(2^64, None, (x77₂, Some [0x0 ~> 0x1], (x75₂, Some [0x0 ~> 0xffffffffffffffff], (x74₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x79 := Z.add_with_get_carry(2^64, None, (x78₂, Some [0x0 ~> 0x1], (x74₂, Some [0x0 ~> 0xffffffffffffffff], (x73₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x80 := x79₂ + x73₂ (* : uint64_t *) in let x81 := Z.add_get_carry(2^64, None, (x68₁, Some [0x0 ~> 0xffffffffffffffff], (x76₁, Some [0x0 ~> 0xffffffffffffffff]))) in let x82 := Z.add_with_get_carry(2^64, None, (x81₂, Some [0x0 ~> 0x1], (x69₁, Some [0x0 ~> 0xffffffffffffffff], (x77₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x83 := Z.add_with_get_carry(2^64, None, (x82₂, Some [0x0 ~> 0x1], (x70₁, Some [0x0 ~> 0xffffffffffffffff], (x78₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x84 := Z.add_with_get_carry(2^64, None, (x83₂, Some [0x0 ~> 0x1], (x71₁, Some [0x0 ~> 0xffffffffffffffff], (x79₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x85 := Z.add_with_get_carry(2^64, None, (x84₂, Some [0x0 ~> 0x1], (x72, Some [0x0 ~> 0xffffffffffffffff], (x80, Some [0x0 ~> 0xffffffffffffffff])))) in let x86 := Z.mul_split(2^64, None, (x81₁, Some [0x0 ~> 0xffffffffffffffff], (0xffffffff00000001, None))) in let x87 := Z.mul_split(2^64, None, (x81₁, Some [0x0 ~> 0xffffffffffffffff], (2^32-1, None))) in let x88 := Z.mul_split(2^64, None, (x81₁, Some [0x0 ~> 0xffffffffffffffff], (2^64-1, None))) in let x89 := Z.add_get_carry(2^64, None, (x88₂, Some [0x0 ~> 0xffffffffffffffff], (x87₁, Some [0x0 ~> 0xffffffffffffffff]))) in let x90 := x89₂ + x87₂ (* : uint64_t *) in let x91 := Z.add_get_carry(2^64, None, (x81₁, Some [0x0 ~> 0xffffffffffffffff], (x88₁, Some [0x0 ~> 0xffffffffffffffff]))) in let x92 := Z.add_with_get_carry(2^64, None, (x91₂, Some [0x0 ~> 0x1], (x82₁, Some [0x0 ~> 0xffffffffffffffff], (x89₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x93 := Z.add_with_get_carry(2^64, None, (x92₂, Some [0x0 ~> 0x1], (x83₁, Some [0x0 ~> 0xffffffffffffffff], (x90, Some [0x0 ~> 0xffffffffffffffff])))) in let x94 := Z.add_with_get_carry(2^64, None, (x93₂, Some [0x0 ~> 0x1], (x84₁, Some [0x0 ~> 0xffffffffffffffff], (x86₁, Some [0x0 ~> 0xffffffffffffffff])))) in let x95 := Z.add_with_get_carry(2^64, None, (x94₂, Some [0x0 ~> 0x1], (x85₁, Some [0x0 ~> 0xffffffffffffffff], (x86₂, Some [0x0 ~> 0xffffffffffffffff])))) in let x96 := x95₂ + x85₂ (* : uint64_t *) in let x97 := Z.sub_with_get_borrow(2^64, None, (0, None, (x92₁, Some [0x0 ~> 0xffffffffffffffff], (2^64-1, None)))) in let x98 := Z.sub_with_get_borrow(2^64, None, (x97₂, Some [0x0 ~> 0x1], (x93₁, Some [0x0 ~> 0xffffffffffffffff], (2^32-1, None)))) in let x99 := Z.sub_with_get_borrow(2^64, None, (x98₂, Some [0x0 ~> 0x1], (x94₁, Some [0x0 ~> 0xffffffffffffffff], (0, None)))) in let x100 := Z.sub_with_get_borrow(2^64, None, (x99₂, Some [0x0 ~> 0x1], (x95₁, Some [0x0 ~> 0xffffffffffffffff], (0xffffffff00000001, None)))) in let x101 := Z.sub_with_get_borrow(2^64, None, (x100₂, Some [0x0 ~> 0x1], (x96, Some [0x0 ~> 0xffffffffffffffff], (0, None)))) in let x102 := Z.zselect(x101₂, Some [0x0 ~> 0x1], (x97₁, Some [0x0 ~> 0xffffffffffffffff], (x92₁, Some [0x0 ~> 0xffffffffffffffff]))) (* : uint64_t *) in let x103 := Z.zselect(x101₂, Some [0x0 ~> 0x1], (x98₁, Some [0x0 ~> 0xffffffffffffffff], (x93₁, Some [0x0 ~> 0xffffffffffffffff]))) (* : uint64_t *) in let x104 := Z.zselect(x101₂, Some [0x0 ~> 0x1], (x99₁, Some [0x0 ~> 0xffffffffffffffff], (x94₁, Some [0x0 ~> 0xffffffffffffffff]))) (* : uint64_t *) in let x105 := Z.zselect(x101₂, Some [0x0 ~> 0x1], (x100₁, Some [0x0 ~> 0xffffffffffffffff], (x95₁, Some [0x0 ~> 0xffffffffffffffff]))) (* : uint64_t *) in x102 :: x103 :: x104 :: x105 :: [] ) which can be pretty-printed as: /* * Input Bounds: * arg1: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] * arg2: [[0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff], [0x0 ~> 0xffffffffffffffff]] * Output Bounds: * out1: None */ void f(uint64_t out1[4], const uint64_t arg1[4], const uint64_t arg2[4]) { uint64_t x1; uint64_t x2; uint64_t x3; uint64_t x4; uint64_t x5; uint64_t x6; uint64_t x7; uint64_t x8; uint64_t x9; uint64_t x10; uint64_t x11; uint64_t x12; uint64_t x13; uint1 x14; uint64_t x15; uint1 x16; uint64_t x17; uint1 x18; uint64_t x19; uint64_t x20; uint64_t x21; uint64_t x22; uint64_t x23; uint64_t x24; uint64_t x25; uint64_t x26; uint1 x27; uint64_t x28; uint64_t x29; uint1 x30; uint64_t x31; uint1 x32; uint64_t x33; uint1 x34; uint64_t x35; uint1 x36; uint64_t x37; uint1 x38; uint64_t x39; uint64_t x40; uint64_t x41; uint64_t x42; uint64_t x43; uint64_t x44; uint64_t x45; uint64_t x46; uint64_t x47; uint1 x48; uint64_t x49; uint1 x50; uint64_t x51; uint1 x52; uint64_t x53; uint64_t x54; uint1 x55; uint64_t x56; uint1 x57; uint64_t x58; uint1 x59; uint64_t x60; uint1 x61; uint64_t x62; uint1 x63; uint64_t x64; uint64_t x65; uint64_t x66; uint64_t x67; uint64_t x68; uint64_t x69; uint64_t x70; uint1 x71; uint64_t x72; uint64_t x73; uint1 x74; uint64_t x75; uint1 x76; uint64_t x77; uint1 x78; uint64_t x79; uint1 x80; uint64_t x81; uint1 x82; uint64_t x83; uint64_t x84; uint64_t x85; uint64_t x86; uint64_t x87; uint64_t x88; uint64_t x89; uint64_t x90; uint64_t x91; uint64_t x92; uint1 x93; uint64_t x94; uint1 x95; uint64_t x96; uint1 x97; uint64_t x98; uint64_t x99; uint1 x100; uint64_t x101; uint1 x102; uint64_t x103; uint1 x104; uint64_t x105; uint1 x106; uint64_t x107; uint1 x108; uint64_t x109; uint64_t x110; uint64_t x111; uint64_t x112; uint64_t x113; uint64_t x114; uint64_t x115; uint1 x116; uint64_t x117; uint64_t x118; uint1 x119; uint64_t x120; uint1 x121; uint64_t x122; uint1 x123; uint64_t x124; uint1 x125; uint64_t x126; uint1 x127; uint64_t x128; uint64_t x129; uint64_t x130; uint64_t x131; uint64_t x132; uint64_t x133; uint64_t x134; uint64_t x135; uint64_t x136; uint64_t x137; uint1 x138; uint64_t x139; uint1 x140; uint64_t x141; uint1 x142; uint64_t x143; uint64_t x144; uint1 x145; uint64_t x146; uint1 x147; uint64_t x148; uint1 x149; uint64_t x150; uint1 x151; uint64_t x152; uint1 x153; uint64_t x154; uint64_t x155; uint64_t x156; uint64_t x157; uint64_t x158; uint64_t x159; uint64_t x160; uint1 x161; uint64_t x162; uint64_t x163; uint1 x164; uint64_t x165; uint1 x166; uint64_t x167; uint1 x168; uint64_t x169; uint1 x170; uint64_t x171; uint1 x172; uint64_t x173; uint64_t x174; uint1 x175; uint64_t x176; uint1 x177; uint64_t x178; uint1 x179; uint64_t x180; uint1 x181; uint64_t x182; uint1 x183; uint64_t x184; uint64_t x185; uint64_t x186; uint64_t x187; x1 = (arg1[1]); x2 = (arg1[2]); x3 = (arg1[3]); x4 = (arg1[0]); mulx_u64(&x5, &x6, x4, (arg2[3])); mulx_u64(&x7, &x8, x4, (arg2[2])); mulx_u64(&x9, &x10, x4, (arg2[1])); mulx_u64(&x11, &x12, x4, (arg2[0])); addcarryx_u64(&x13, &x14, 0x0, x12, x9); addcarryx_u64(&x15, &x16, x14, x10, x7); addcarryx_u64(&x17, &x18, x16, x8, x5); x19 = (x18 + x6); mulx_u64(&x20, &x21, x11, UINT64_C(0xffffffff00000001)); mulx_u64(&x22, &x23, x11, UINT32_C(0xffffffff)); mulx_u64(&x24, &x25, x11, UINT64_C(0xffffffffffffffff)); addcarryx_u64(&x26, &x27, 0x0, x25, x22); x28 = (x27 + x23); addcarryx_u64(&x29, &x30, 0x0, x11, x24); addcarryx_u64(&x31, &x32, x30, x13, x26); addcarryx_u64(&x33, &x34, x32, x15, x28); addcarryx_u64(&x35, &x36, x34, x17, x20); addcarryx_u64(&x37, &x38, x36, x19, x21); mulx_u64(&x39, &x40, x1, (arg2[3])); mulx_u64(&x41, &x42, x1, (arg2[2])); mulx_u64(&x43, &x44, x1, (arg2[1])); mulx_u64(&x45, &x46, x1, (arg2[0])); addcarryx_u64(&x47, &x48, 0x0, x46, x43); addcarryx_u64(&x49, &x50, x48, x44, x41); addcarryx_u64(&x51, &x52, x50, x42, x39); x53 = (x52 + x40); addcarryx_u64(&x54, &x55, 0x0, x31, x45); addcarryx_u64(&x56, &x57, x55, x33, x47); addcarryx_u64(&x58, &x59, x57, x35, x49); addcarryx_u64(&x60, &x61, x59, x37, x51); addcarryx_u64(&x62, &x63, x61, x38, x53); mulx_u64(&x64, &x65, x54, UINT64_C(0xffffffff00000001)); mulx_u64(&x66, &x67, x54, UINT32_C(0xffffffff)); mulx_u64(&x68, &x69, x54, UINT64_C(0xffffffffffffffff)); addcarryx_u64(&x70, &x71, 0x0, x69, x66); x72 = (x71 + x67); addcarryx_u64(&x73, &x74, 0x0, x54, x68); addcarryx_u64(&x75, &x76, x74, x56, x70); addcarryx_u64(&x77, &x78, x76, x58, x72); addcarryx_u64(&x79, &x80, x78, x60, x64); addcarryx_u64(&x81, &x82, x80, x62, x65); x83 = ((uint64_t)x82 + x63); mulx_u64(&x84, &x85, x2, (arg2[3])); mulx_u64(&x86, &x87, x2, (arg2[2])); mulx_u64(&x88, &x89, x2, (arg2[1])); mulx_u64(&x90, &x91, x2, (arg2[0])); addcarryx_u64(&x92, &x93, 0x0, x91, x88); addcarryx_u64(&x94, &x95, x93, x89, x86); addcarryx_u64(&x96, &x97, x95, x87, x84); x98 = (x97 + x85); addcarryx_u64(&x99, &x100, 0x0, x75, x90); addcarryx_u64(&x101, &x102, x100, x77, x92); addcarryx_u64(&x103, &x104, x102, x79, x94); addcarryx_u64(&x105, &x106, x104, x81, x96); addcarryx_u64(&x107, &x108, x106, x83, x98); mulx_u64(&x109, &x110, x99, UINT64_C(0xffffffff00000001)); mulx_u64(&x111, &x112, x99, UINT32_C(0xffffffff)); mulx_u64(&x113, &x114, x99, UINT64_C(0xffffffffffffffff)); addcarryx_u64(&x115, &x116, 0x0, x114, x111); x117 = (x116 + x112); addcarryx_u64(&x118, &x119, 0x0, x99, x113); addcarryx_u64(&x120, &x121, x119, x101, x115); addcarryx_u64(&x122, &x123, x121, x103, x117); addcarryx_u64(&x124, &x125, x123, x105, x109); addcarryx_u64(&x126, &x127, x125, x107, x110); x128 = ((uint64_t)x127 + x108); mulx_u64(&x129, &x130, x3, (arg2[3])); mulx_u64(&x131, &x132, x3, (arg2[2])); mulx_u64(&x133, &x134, x3, (arg2[1])); mulx_u64(&x135, &x136, x3, (arg2[0])); addcarryx_u64(&x137, &x138, 0x0, x136, x133); addcarryx_u64(&x139, &x140, x138, x134, x131); addcarryx_u64(&x141, &x142, x140, x132, x129); x143 = (x142 + x130); addcarryx_u64(&x144, &x145, 0x0, x120, x135); addcarryx_u64(&x146, &x147, x145, x122, x137); addcarryx_u64(&x148, &x149, x147, x124, x139); addcarryx_u64(&x150, &x151, x149, x126, x141); addcarryx_u64(&x152, &x153, x151, x128, x143); mulx_u64(&x154, &x155, x144, UINT64_C(0xffffffff00000001)); mulx_u64(&x156, &x157, x144, UINT32_C(0xffffffff)); mulx_u64(&x158, &x159, x144, UINT64_C(0xffffffffffffffff)); addcarryx_u64(&x160, &x161, 0x0, x159, x156); x162 = (x161 + x157); addcarryx_u64(&x163, &x164, 0x0, x144, x158); addcarryx_u64(&x165, &x166, x164, x146, x160); addcarryx_u64(&x167, &x168, x166, x148, x162); addcarryx_u64(&x169, &x170, x168, x150, x154); addcarryx_u64(&x171, &x172, x170, x152, x155); x173 = ((uint64_t)x172 + x153); subborrowx_u64(&x174, &x175, 0x0, x165, UINT64_C(0xffffffffffffffff)); subborrowx_u64(&x176, &x177, x175, x167, UINT32_C(0xffffffff)); subborrowx_u64(&x178, &x179, x177, x169, 0x0); subborrowx_u64(&x180, &x181, x179, x171, UINT64_C(0xffffffff00000001)); subborrowx_u64(&x182, &x183, x181, x173, 0x0); cmovznz_u64(&x184, x183, x174, x165); cmovznz_u64(&x185, x183, x176, x167); cmovznz_u64(&x186, x183, x178, x169); cmovznz_u64(&x187, x183, x180, x171); out1[0] = x184; out1[1] = x185; out1[2] = x186; out1[3] = x187; } Assembly: ecp_nistz256_mul_mont: push rbp push rbx push r12 push r13 push r14 push r15 xchg rdx, rsi ; hack: swap args mov rbx, rdx mov rax, QWORD PTR [rbx] mov r9, QWORD PTR [rsi] mov r10, QWORD PTR [rsi + 0x08 * 1] mov r11, QWORD PTR [rsi + 0x08 * 2] mov r12, QWORD PTR [rsi + 0x08 * 3] mov rbp, rax mul r9 mov r14, 4294967295 mov r8, rax mov rax, rbp mov r9, rdx mul r10 mov r15, 18446744069414584321 add r9, rax ; #549, #582, #583, #584 mov rax, rbp adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 mov r10, rdx mul r11 add r10, rax ; #553, #554, #594, #595, #596, #644, #645, #646 mov rax, rbp adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 mov r11, rdx mul r12 add r11, rax ; #558, #559, #560, #606, #607, #608, #656, #657, #658, #706, #707, #708 mov rax, r8 adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 xor r13, r13 ; #564 mov r12, rdx mov rbp, r8 shl r8, 32 ; #565, #566, #567 mul r15 ; #625, #626, #627, #628, #687, #688, #689, #690, #749, #750, #751, #752 shr rbp, 32 ; #568, #629, #691, #753 add r9, r8 ; #569, #570, #571 adc r10, rbp ; #572, #573, #574 adc r11, rax ; #575, #576, #577 mov rax, QWORD PTR [rbx + 0x08 * 1] adc r12, rdx ; #578, #579, #580 adc r13, 0 xor r8, r8 ; #581 mov rbp, rax mul QWORD PTR [rsi] add r9, rax ; #549, #582, #583, #584 mov rax, rbp adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 mov rcx, rdx mul QWORD PTR [rsi + 0x08 * 1] add r10, rcx ; #588, #589, #590 adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 add r10, rax ; #553, #554, #594, #595, #596, #644, #645, #646 mov rax, rbp adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 mov rcx, rdx mul QWORD PTR [rsi + 0x08 * 2] add r11, rcx ; #600, #601, #602, #650, #651, #652 adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 add r11, rax ; #558, #559, #560, #606, #607, #608, #656, #657, #658, #706, #707, #708 mov rax, rbp adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 mov rcx, rdx mul QWORD PTR [rsi + 0x08 * 3] add r12, rcx ; #612, #613, #614, #662, #663, #664, #712, #713, #714 adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 add r12, rax ; #618, #619, #620, #668, #669, #670, #718, #719, #720 mov rax, r9 adc r13, rdx ; #621, #622, #623, #639, #640, #641 adc r8, 0 ; #642 mov rbp, r9 shl r9, 32 ; #624 mul r15 ; #625, #626, #627, #628, #687, #688, #689, #690, #749, #750, #751, #752 shr rbp, 32 ; #568, #629, #691, #753 add r10, r9 ; #630, #631, #632 adc r11, rbp ; #633, #634, #635 adc r12, rax ; #636, #637, #638 mov rax, QWORD PTR [rbx + 0x08 * 2] adc r13, rdx ; #621, #622, #623, #639, #640, #641 adc r8, 0 ; #642 xor r9, r9 ; #643 mov rbp, rax mul QWORD PTR [rsi] add r10, rax ; #553, #554, #594, #595, #596, #644, #645, #646 mov rax, rbp adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 mov rcx, rdx mul QWORD PTR [rsi + 0x08 * 1] add r11, rcx ; #600, #601, #602, #650, #651, #652 adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 add r11, rax ; #558, #559, #560, #606, #607, #608, #656, #657, #658, #706, #707, #708 mov rax, rbp adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 mov rcx, rdx mul QWORD PTR [rsi + 0x08 * 2] add r12, rcx ; #612, #613, #614, #662, #663, #664, #712, #713, #714 adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 add r12, rax ; #618, #619, #620, #668, #669, #670, #718, #719, #720 mov rax, rbp adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 mov rcx, rdx mul QWORD PTR [rsi + 0x08 * 3] add r13, rcx ; #674, #675, #676, #724, #725, #726 adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 add r13, rax ; #680, #681, #682, #730, #731, #732 mov rax, r10 adc r8, rdx ; #683, #684, #685, #701, #702, #703 adc r9, 0 ; #704 mov rbp, r10 shl r10, 32 ; #686 mul r15 ; #625, #626, #627, #628, #687, #688, #689, #690, #749, #750, #751, #752 shr rbp, 32 ; #568, #629, #691, #753 add r11, r10 ; #692, #693, #694 adc r12, rbp ; #695, #696, #697 adc r13, rax ; #698, #699, #700 mov rax, QWORD PTR [rbx + 0x08 * 3] adc r8, rdx ; #683, #684, #685, #701, #702, #703 adc r9, 0 ; #704 xor r10, r10 ; #705 mov rbp, rax mul QWORD PTR [rsi] add r11, rax ; #558, #559, #560, #606, #607, #608, #656, #657, #658, #706, #707, #708 mov rax, rbp adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 mov rcx, rdx mul QWORD PTR [rsi + 0x08 * 1] add r12, rcx ; #612, #613, #614, #662, #663, #664, #712, #713, #714 adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 add r12, rax ; #618, #619, #620, #668, #669, #670, #718, #719, #720 mov rax, rbp adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 mov rcx, rdx mul QWORD PTR [rsi + 0x08 * 2] add r13, rcx ; #674, #675, #676, #724, #725, #726 adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 add r13, rax ; #680, #681, #682, #730, #731, #732 mov rax, rbp adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 mov rcx, rdx mul QWORD PTR [rsi + 0x08 * 3] add r8, rcx ; #736, #737, #738 adc rdx, 0 ; #550, #551, #552, #555, #556, #557, #561, #562, #563, #585, #586, #587, #591, #592, #593, #597, #598, #599, #603, #604, #605, #609, #610, #611, #615, #616, #617, #647, #648, #649, #653, #654, #655, #659, #660, #661, #665, #666, #667, #671, #672, #673, #677, #678, #679, #709, #710, #711, #715, #716, #717, #721, #722, #723, #727, #728, #729, #733, #734, #735, #739, #740, #741 add r8, rax ; #742, #743, #744 mov rax, r11 adc r9, rdx ; #745, #746, #747, #763, #764, #765 adc r10, 0 ; #766 mov rbp, r11 shl r11, 32 ; #748 mul r15 ; #625, #626, #627, #628, #687, #688, #689, #690, #749, #750, #751, #752 shr rbp, 32 ; #568, #629, #691, #753 add r12, r11 ; #754, #755, #756 adc r13, rbp ; #757, #758, #759 mov rcx, r12 adc r8, rax ; #760, #761, #762 adc r9, rdx ; #745, #746, #747, #763, #764, #765 mov rbp, r13 adc r10, 0 ; #766 sub r12, 18446744073709551615 ; #767, #768 mov rbx, r8 sbb r13, r14 ; #769, #770, #771 sbb r8, 0 ; #772, #773, #774 mov rdx, r9 sbb r9, r15 ; #775, #776, #777 sbb r10, 0 ; #778, #779, #780 cmovb r12, rcx ; #781 cmovb r13, rbp ; #782 mov QWORD PTR [rdi], r12 cmovb r8, rbx ; #783 mov QWORD PTR [rdi + 0x08 * 1], r13 cmovb r9, rdx ; #784 mov QWORD PTR [rdi + 0x08 * 2], r8 mov QWORD PTR [rdi + 0x08 * 3], r9 mov r15, QWORD PTR [rsp] mov r14, QWORD PTR [rsp + 0x08 * 1] mov r13, QWORD PTR [rsp + 0x08 * 2] mov r12, QWORD PTR [rsp + 0x08 * 3] mov rbx, QWORD PTR [rsp + 0x08 * 4] mov rbp, QWORD PTR [rsp + 0x08 * 5] ;lea rsp,[rsp+0x30] ret Equivalence checking error: Unable to unify: In environment: (*symbolic_state*) {| dag_state := (*dag*)[ (*0*) (old 64 0, []); (*build_inputs*) (*1*) (old 64 1, []); (*build_inputs*) (*2*) (old 64 2, []); (*build_inputs*) (*3*) (old 64 3, []); (*build_inputs*) (*4*) (old 64 4, []); (*build_inputs*) (*5*) (old 64 5, []); (*build_inputs*) (*6*) (old 64 6, []); (*build_inputs*) (*7*) (old 64 7, []); (*build_inputs*) (*8*) (const 0, []); (*9*) (const 18446744073709551616, []); (*10*) (const 64, []); (*11*) (mulZ, [3, 4]); (*12*) (mul 64, [3, 4]); (*13*) (shrZ, [11, 10]); (*14*) (shr 64, [11, 10]); (*15*) (mulZ, [3, 5]); (*16*) (mul 64, [3, 5]); (*17*) (shrZ, [15, 10]); (*18*) (shr 64, [15, 10]); (*19*) (mulZ, [3, 6]); (*20*) (mul 64, [3, 6]); (*21*) (shrZ, [19, 10]); (*22*) (shr 64, [19, 10]); (*23*) (mulZ, [3, 7]); (*24*) (mul 64, [3, 7]); (*25*) (shrZ, [23, 10]); (*26*) (shr 64, [23, 10]); (*27*) (add 64, [20, 26]); (*28*) (addcarryZ 64, [26, 20]); (*29*) (addcarry 64, [20, 26]); (*30*) (add 64, [16, 22, 29]); (*31*) (addcarryZ 64, [29, 22, 16]); (*32*) (addcarry 64, [16, 22, 29]); (*33*) (add 64, [12, 18, 32]); (*34*) (addcarryZ 64, [32, 18, 12]); (*35*) (addcarry 64, [12, 18, 32]); (*36*) (addZ, [35, 14]); (*37*) (add 64, [14, 35]); (*38*) (const 18446744069414584321, []); (*39*) (mulZ, [24, 38]); (*40*) (mul 64, [3, 7, 38]); (*41*) (shrZ, [39, 10]); (*42*) (shr 64, [39, 10]); (*43*) (const 4294967295, []); (*44*) (mulZ, [24, 43]); (*45*) (mul 64, [3, 7, 43]); (*46*) (shrZ, [44, 10]); (*47*) (shr 64, [44, 10]); (*48*) (const 18446744073709551615, []); (*49*) (mulZ, [24, 48]); (*50*) (mul 64, [3, 7, 48]); (*51*) (shrZ, [49, 10]); (*52*) (shr 64, [49, 10]); (*53*) (add 64, [45, 52]); (*54*) (addcarryZ 64, [52, 45]); (*55*) (addcarry 64, [45, 52]); (*56*) (addZ, [55, 47]); (*57*) (add 64, [47, 55]); (*58*) (mul 64, [3, 7, 8]); (*59*) (add 64, [58]); (*60*) (addcarryZ 64, [24, 50]); (*61*) (addcarry 64, [24, 50]); (*62*) (add 64, [20, 26, 45, 52, 61]); (*63*) (addcarryZ 64, [61, 27, 53]); (*64*) (addcarry 64, [27, 53, 61]); (*65*) (add 64, [16, 22, 29, 47, 55, 64]); (*66*) (addcarryZ 64, [64, 30, 57]); (*67*) (addcarry 64, [30, 57, 64]); (*68*) (add 64, [12, 18, 32, 40, 67]); (*69*) (addcarryZ 64, [67, 33, 40]); (*70*) (addcarry 64, [33, 40, 67]); (*71*) (add 64, [14, 35, 42, 70]); (*72*) (addcarryZ 64, [70, 37, 42]); (*73*) (addcarry 64, [37, 42, 70]); (*74*) (mulZ, [2, 4]); (*75*) (mul 64, [2, 4]); (*76*) (shrZ, [74, 10]); (*77*) (shr 64, [74, 10]); (*78*) (mulZ, [2, 5]); (*79*) (mul 64, [2, 5]); (*80*) (shrZ, [78, 10]); (*81*) (shr 64, [78, 10]); (*82*) (mulZ, [2, 6]); (*83*) (mul 64, [2, 6]); (*84*) (shrZ, [82, 10]); (*85*) (shr 64, [82, 10]); (*86*) (mulZ, [2, 7]); (*87*) (mul 64, [2, 7]); (*88*) (shrZ, [86, 10]); (*89*) (shr 64, [86, 10]); (*90*) (add 64, [83, 89]); (*91*) (addcarryZ 64, [89, 83]); (*92*) (addcarry 64, [83, 89]); (*93*) (add 64, [79, 85, 92]); (*94*) (addcarryZ 64, [92, 85, 79]); (*95*) (addcarry 64, [79, 85, 92]); (*96*) (add 64, [75, 81, 95]); (*97*) (addcarryZ 64, [95, 81, 75]); (*98*) (addcarry 64, [75, 81, 95]); (*99*) (addZ, [98, 77]); (*100*) (add 64, [77, 98]); (*101*) (add 64, [20, 26, 45, 52, 61, 87]); (*102*) (addcarryZ 64, [62, 87]); (*103*) (addcarry 64, [62, 87]); (*104*) (add 64, [16, 22, 29, 47, 55, 64, 83, 89, 103]); (*105*) (addcarryZ 64, [103, 65, 90]); (*106*) (addcarry 64, [65, 90, 103]); (*107*) (add 64, [12, 18, 32, 40, 67, 79, 85, 92, 106]); (*108*) (addcarryZ 64, [106, 68, 93]); (*109*) (addcarry 64, [68, 93, 106]); (*110*) (add 64, [14, 35, 42, 70, 75, 81, 95, 109]); (*111*) (addcarryZ 64, [109, 71, 96]); (*112*) (addcarry 64, [71, 96, 109]); (*113*) (add 64, [73, 77, 98, 112]); (*114*) (addcarryZ 64, [112, 73, 100]); (*115*) (addcarry 64, [73, 100, 112]); (*116*) (mulZ, [38, 101]); (*117*) (mul 64, [38, 101]); (*118*) (shrZ, [116, 10]); (*119*) (shr 64, [116, 10]); (*120*) (mulZ, [43, 101]); (*121*) (mul 64, [43, 101]); (*122*) (shrZ, [120, 10]); (*123*) (shr 64, [120, 10]); (*124*) (mulZ, [48, 101]); (*125*) (mul 64, [48, 101]); (*126*) (shrZ, [124, 10]); (*127*) (shr 64, [124, 10]); (*128*) (add 64, [121, 127]); (*129*) (addcarryZ 64, [127, 121]); (*130*) (addcarry 64, [121, 127]); (*131*) (addZ, [130, 123]); (*132*) (add 64, [123, 130]); (*133*) (add 64, [20, 26, 45, 52, 61, 87, 125]); (*134*) (addcarryZ 64, [101, 125]); (*135*) (addcarry 64, [101, 125]); (*136*) (add 64, [16, 22, 29, 47, 55, 64, 83, 89, 103, 121, 127, 135]); (*137*) (addcarryZ 64, [135, 104, 128]); (*138*) (addcarry 64, [104, 128, 135]); (*139*) (add 64, [12, 18, 32, 40, 67, 79, 85, 92, 106, 123, 130, 138]); (*140*) (addcarryZ 64, [138, 107, 132]); (*141*) (addcarry 64, [107, 132, 138]); (*142*) (add 64, [14, 35, 42, 70, 75, 81, 95, 109, 117, 141]); (*143*) (addcarryZ 64, [141, 110, 117]); (*144*) (addcarry 64, [110, 117, 141]); (*145*) (add 64, [73, 77, 98, 112, 119, 144]); (*146*) (addcarryZ 64, [144, 113, 119]); (*147*) (addcarry 64, [113, 119, 144]); (*148*) (addZ, [147, 115]); (*149*) (add 64, [115, 147]); (*150*) (mulZ, [1, 4]); (*151*) (mul 64, [1, 4]); (*152*) (shrZ, [150, 10]); (*153*) (shr 64, [150, 10]); (*154*) (mulZ, [1, 5]); (*155*) (mul 64, [1, 5]); (*156*) (shrZ, [154, 10]); (*157*) (shr 64, [154, 10]); (*158*) (mulZ, [1, 6]); (*159*) (mul 64, [1, 6]); (*160*) (shrZ, [158, 10]); (*161*) (shr 64, [158, 10]); (*162*) (mulZ, [1, 7]); (*163*) (mul 64, [1, 7]); (*164*) (shrZ, [162, 10]); (*165*) (shr 64, [162, 10]); (*166*) (add 64, [159, 165]); (*167*) (addcarryZ 64, [165, 159]); (*168*) (addcarry 64, [159, 165]); (*169*) (add 64, [155, 161, 168]); (*170*) (addcarryZ 64, [168, 161, 155]); (*171*) (addcarry 64, [155, 161, 168]); (*172*) (add 64, [151, 157, 171]); (*173*) (addcarryZ 64, [171, 157, 151]); (*174*) (addcarry 64, [151, 157, 171]); (*175*) (addZ, [174, 153]); (*176*) (add 64, [153, 174]); (*177*) (add 64, [16, 22, 29, 47, 55, 64, 83, 89, 103, 121, 127, 135, 163]); (*178*) (addcarryZ 64, [136, 163]); (*179*) (addcarry 64, [136, 163]); (*180*) (add 64, [12, 18, 32, 40, 67, 79, 85, 92, 106, 123, 130, 138, 159, 165, 179]); (*181*) (addcarryZ 64, [179, 139, 166]); (*182*) (addcarry 64, [139, 166, 179]); (*183*) (add 64, [14, 35, 42, 70, 75, 81, 95, 109, 117, 141, 155, 161, 168, 182]); (*184*) (addcarryZ 64, [182, 142, 169]); (*185*) (addcarry 64, [142, 169, 182]); (*186*) (add 64, [73, 77, 98, 112, 119, 144, 151, 157, 171, 185]); (*187*) (addcarryZ 64, [185, 145, 172]); (*188*) (addcarry 64, [145, 172, 185]); (*189*) (add 64, [115, 147, 153, 174, 188]); (*190*) (addcarryZ 64, [188, 149, 176]); (*191*) (addcarry 64, [149, 176, 188]); (*192*) (mulZ, [38, 177]); (*193*) (mul 64, [38, 177]); (*194*) (shrZ, [192, 10]); (*195*) (shr 64, [192, 10]); (*196*) (mulZ, [43, 177]); (*197*) (mul 64, [43, 177]); (*198*) (shrZ, [196, 10]); (*199*) (shr 64, [196, 10]); (*200*) (mulZ, [48, 177]); (*201*) (mul 64, [48, 177]); (*202*) (shrZ, [200, 10]); (*203*) (shr 64, [200, 10]); (*204*) (add 64, [197, 203]); (*205*) (addcarryZ 64, [203, 197]); (*206*) (addcarry 64, [197, 203]); (*207*) (addZ, [206, 199]); (*208*) (add 64, [199, 206]); (*209*) (add 64, [16, 22, 29, 47, 55, 64, 83, 89, 103, 121, 127, 135, 163, 201]); (*210*) (addcarryZ 64, [177, 201]); (*211*) (addcarry 64, [177, 201]); (*212*) (add 64, [12, 18, 32, 40, 67, 79, 85, 92, 106, 123, 130, 138, 159, 165, 179, 197, 203, 211]); (*213*) (addcarryZ 64, [211, 180, 204]); (*214*) (addcarry 64, [180, 204, 211]); (*215*) (add 64, [14, 35, 42, 70, 75, 81, 95, 109, 117, 141, 155, 161, 168, 182, 199, 206, 214]); (*216*) (addcarryZ 64, [214, 183, 208]); (*217*) (addcarry 64, [183, 208, 214]); (*218*) (add 64, [73, 77, 98, 112, 119, 144, 151, 157, 171, 185, 193, 217]); (*219*) (addcarryZ 64, [217, 186, 193]); (*220*) (addcarry 64, [186, 193, 217]); (*221*) (add 64, [115, 147, 153, 174, 188, 195, 220]); (*222*) (addcarryZ 64, [220, 189, 195]); (*223*) (addcarry 64, [189, 195, 220]); (*224*) (addZ, [223, 191]); (*225*) (add 64, [191, 223]); (*226*) (mulZ, [0, 4]); (*227*) (mul 64, [0, 4]); (*228*) (shrZ, [226, 10]); (*229*) (shr 64, [226, 10]); (*230*) (mulZ, [0, 5]); (*231*) (mul 64, [0, 5]); (*232*) (shrZ, [230, 10]); (*233*) (shr 64, [230, 10]); (*234*) (mulZ, [0, 6]); (*235*) (mul 64, [0, 6]); (*236*) (shrZ, [234, 10]); (*237*) (shr 64, [234, 10]); (*238*) (mulZ, [0, 7]); (*239*) (mul 64, [0, 7]); (*240*) (shrZ, [238, 10]); (*241*) (shr 64, [238, 10]); (*242*) (add 64, [235, 241]); (*243*) (addcarryZ 64, [241, 235]); (*244*) (addcarry 64, [235, 241]); (*245*) (add 64, [231, 237, 244]); (*246*) (addcarryZ 64, [244, 237, 231]); (*247*) (addcarry 64, [231, 237, 244]); (*248*) (add 64, [227, 233, 247]); (*249*) (addcarryZ 64, [247, 233, 227]); (*250*) (addcarry 64, [227, 233, 247]); (*251*) (addZ, [250, 229]); (*252*) (add 64, [229, 250]); (*253*) (add 64, [12, 18, 32, 40, 67, 79, 85, 92, 106, 123, 130, 138, 159, 165, 179, 197, 203, 211, 239]); (*254*) (addcarryZ 64, [212, 239]); (*255*) (addcarry 64, [212, 239]); (*256*) (add 64, [14, 35, 42, 70, 75, 81, 95, 109, 117, 141, 155, 161, 168, 182, 199, 206, 214, 235, 241, 255]); (*257*) (addcarryZ 64, [255, 215, 242]); (*258*) (addcarry 64, [215, 242, 255]); (*259*) (add 64, [73, 77, 98, 112, 119, 144, 151, 157, 171, 185, 193, 217, 231, 237, 244, 258]); (*260*) (addcarryZ 64, [258, 218, 245]); (*261*) (addcarry 64, [218, 245, 258]); (*262*) (add 64, [115, 147, 153, 174, 188, 195, 220, 227, 233, 247, 261]); (*263*) (addcarryZ 64, [261, 221, 248]); (*264*) (addcarry 64, [221, 248, 261]); (*265*) (add 64, [191, 223, 229, 250, 264]); (*266*) (addcarryZ 64, [264, 225, 252]); (*267*) (addcarry 64, [225, 252, 264]); (*268*) (mulZ, [38, 253]); (*269*) (mul 64, [38, 253]); (*270*) (shrZ, [268, 10]); (*271*) (shr 64, [268, 10]); (*272*) (mulZ, [43, 253]); (*273*) (mul 64, [43, 253]); (*274*) (shrZ, [272, 10]); (*275*) (shr 64, [272, 10]); (*276*) (mulZ, [48, 253]); (*277*) (mul 64, [48, 253]); (*278*) (shrZ, [276, 10]); (*279*) (shr 64, [276, 10]); (*280*) (add 64, [273, 279]); (*281*) (addcarryZ 64, [279, 273]); (*282*) (addcarry 64, [273, 279]); (*283*) (addZ, [282, 275]); (*284*) (add 64, [275, 282]); (*285*) (add 64, [12, 18, 32, 40, 67, 79, 85, 92, 106, 123, 130, 138, 159, 165, 179, 197, 203, 211, 239, 277]); (*286*) (addcarryZ 64, [253, 277]); (*287*) (addcarry 64, [253, 277]); (*288*) (add 64, [14, 35, 42, 70, 75, 81, 95, 109, 117, 141, 155, 161, 168, 182, 199, 206, 214, 235, 241, 255, 273, 279, 287]); (*289*) (addcarryZ 64, [287, 256, 280]); (*290*) (addcarry 64, [256, 280, 287]); (*291*) (add 64, [73, 77, 98, 112, 119, 144, 151, 157, 171, 185, 193, 217, 231, 237, 244, 258, 275, 282, 290]); (*292*) (addcarryZ 64, [290, 259, 284]); (*293*) (addcarry 64, [259, 284, 290]); (*294*) (add 64, [115, 147, 153, 174, 188, 195, 220, 227, 233, 247, 261, 269, 293]); (*295*) (addcarryZ 64, [293, 262, 269]); (*296*) (addcarry 64, [262, 269, 293]); (*297*) (add 64, [191, 223, 229, 250, 264, 271, 296]); (*298*) (addcarryZ 64, [296, 265, 271]); (*299*) (addcarry 64, [265, 271, 296]); (*300*) (addZ, [299, 267]); (*301*) (add 64, [267, 299]); (*302*) (const 1, []); (*303*) (add 64, [14, 35, 42, 70, 75, 81, 95, 109, 117, 141, 155, 161, 168, 182, 199, 206, 214, 235, 241, 255, 273, 279, 287, 302]); (*304*) (subborrowZ 64, [288, 48]); (*305*) (subborrow 64, [288, 48]); (*306*) (neg 64, [305]); (*307*) (add 64, [38, 73, 77, 98, 112, 119, 144, 151, 157, 171, 185, 193, 217, 231, 237, 244, 258, 275, 282, 290, 306]); (*308*) (subborrowZ 64, [291, 43, 305]); (*309*) (subborrow 64, [291, 43, 305]); (*310*) (neg 64, [309]); (*311*) (add 64, [115, 147, 153, 174, 188, 195, 220, 227, 233, 247, 261, 269, 293, 310]); (*312*) (subborrowZ 64, [294, 309]); (*313*) (subborrow 64, [294, 309]); (*314*) (neg 64, [313]); (*315*) (add 64, [43, 191, 223, 229, 250, 264, 271, 296, 314]); (*316*) (subborrowZ 64, [297, 38, 313]); (*317*) (subborrow 64, [297, 38, 313]); (*318*) (neg 64, [317]); (*319*) (add 64, [267, 299, 318]); (*320*) (subborrowZ 64, [301, 317]); (*321*) (subborrow 64, [301, 317]); (*322*) (selectznz, [321, 303, 288]); (*323*) (selectznz, [321, 307, 291]); (*324*) (selectznz, [321, 311, 294]); (*325*) (selectznz, [321, 315, 297]); (*326*) (old 64 326, []); (*init_symbolic_state*) (*327*) (old 64 327, []); (*init_symbolic_state*) (*328*) (old 64 328, []); (*init_symbolic_state*) (*329*) (old 64 329, []); (*init_symbolic_state*) (*330*) (old 64 330, []); (*init_symbolic_state*) (*331*) (old 64 331, []); (*init_symbolic_state*) (*332*) (old 64 332, []); (*init_symbolic_state*) (*333*) (old 64 333, []); (*init_symbolic_state*) (*334*) (old 64 334, []); (*init_symbolic_state*) (*335*) (old 64 335, []); (*init_symbolic_state*) (*336*) (old 64 336, []); (*init_symbolic_state*) (*337*) (old 64 337, []); (*init_symbolic_state*) (*338*) (old 64 338, []); (*init_symbolic_state*) (*339*) (old 64 339, []); (*init_symbolic_state*) (*340*) (old 64 340, []); (*init_symbolic_state*) (*341*) (old 64 341, []); (*init_symbolic_state*) (*342*) (old 64 342, []); (*output_placeholders*) (*343*) (old 64 343, []); (*output_placeholders*) (*344*) (old 64 344, []); (*output_placeholders*) (*345*) (old 64 345, []); (*output_placeholders*) (*346*) (old 64 346, []); (*inputaddrs*) (*347*) (const 8, []); (*inputaddrs*) (*348*) (add 64, [346, 347]); (*inputaddrs*) (*349*) (const 16, []); (*inputaddrs*) (*350*) (add 64, [346, 349]); (*inputaddrs*) (*351*) (const 24, []); (*inputaddrs*) (*352*) (add 64, [346, 351]); (*inputaddrs*) (*353*) (old 64 353, []); (*inputaddrs*) (*354*) (add 64, [347, 353]); (*inputaddrs*) (*355*) (add 64, [349, 353]); (*inputaddrs*) (*356*) (add 64, [351, 353]); (*inputaddrs*) (*357*) (old 64 357, []); (*outputaddrs*) (*358*) (add 64, [347, 357]); (*outputaddrs*) (*359*) (add 64, [349, 357]); (*outputaddrs*) (*360*) (add 64, [351, 357]); (*outputaddrs*) (*361*) (old 64 361, []); (*stack_base*) (*362*) (old 64 362, []); (*stack_base*) (*363*) (old 64 363, []); (*stack_base*) (*364*) (old 64 364, []); (*stack_base*) (*365*) (old 64 365, []); (*stack_base*) (*366*) (old 64 366, []); (*stack_base*) (*367*) (old 64 367, []); (*stack_base*) (*368*) (old 64 368, []); (*stack_base*) (*369*) (old 64 369, []); (*stack_base*) (*370*) (old 64 370, []); (*stack_base*) (*371*) (old 64 371, []); (*stack_base*) (*372*) (old 64 372, []); (*stack_base*) (*373*) (old 64 373, []); (*stack_base*) (*374*) (old 64 374, []); (*stack_base*) (*375*) (old 64 375, []); (*stack_base*) (*376*) (old 64 376, []); (*stack_base*) (*377*) (old 64 377, []); (*stack_base*) (*378*) (old 64 378, []); (*stack_base*) (*379*) (old 64 379, []); (*stack_base*) (*380*) (old 64 380, []); (*stack_base*) (*381*) (old 64 381, []); (*stack_base*) (*382*) (old 64 382, []); (*stack_base*) (*383*) (old 64 383, []); (*stack_base*) (*384*) (old 64 384, []); (*stack_base*) (*385*) (old 64 385, []); (*stack_base*) (*386*) (old 64 386, []); (*stack_base*) (*387*) (old 64 387, []); (*stack_base*) (*388*) (old 64 388, []); (*stack_base*) (*389*) (old 64 389, []); (*stack_base*) (*390*) (old 64 390, []); (*stack_base*) (*391*) (old 64 391, []); (*stack_base*) (*392*) (old 64 392, []); (*stack_base*) (*393*) (old 64 393, []); (*stack_base*) (*394*) (old 64 394, []); (*stack_base*) (*395*) (old 64 395, []); (*stack_base*) (*396*) (old 64 396, []); (*stack_base*) (*397*) (old 64 397, []); (*stack_base*) (*398*) (old 64 398, []); (*stack_base*) (*399*) (old 64 399, []); (*stack_base*) (*400*) (old 64 400, []); (*stack_base*) (*401*) (old 64 401, []); (*stack_base*) (*402*) (old 64 402, []); (*stack_base*) (*403*) (old 64 403, []); (*stack_base*) (*404*) (old 64 404, []); (*stack_base*) (*405*) (old 64 405, []); (*stack_base*) (*406*) (old 64 406, []); (*stack_base*) (*407*) (old 64 407, []); (*stack_base*) (*408*) (old 64 408, []); (*stack_base*) (*409*) (old 64 409, []); (*stack_base*) (*410*) (const 18446744073709551232, []); (*stack_base*) (*411*) (add 64, [409, 410]); (*stack_base*) (*412*) (const 18446744073709551240, []); (*stack_base*) (*413*) (add 64, [409, 412]); (*stack_base*) (*414*) (const 18446744073709551248, []); (*stack_base*) (*415*) (add 64, [409, 414]); (*stack_base*) (*416*) (const 18446744073709551256, []); (*stack_base*) (*417*) (add 64, [409, 416]); (*stack_base*) (*418*) (const 32, []); (*stack_base*) (*419*) (const 18446744073709551264, []); (*stack_base*) (*420*) (add 64, [409, 419]); (*stack_base*) (*421*) (const 40, []); (*stack_base*) (*422*) (const 18446744073709551272, []); (*stack_base*) (*423*) (add 64, [409, 422]); (*stack_base*) (*424*) (const 48, []); (*stack_base*) (*425*) (const 18446744073709551280, []); (*stack_base*) (*426*) (add 64, [409, 425]); (*stack_base*) (*427*) (const 56, []); (*stack_base*) (*428*) (const 18446744073709551288, []); (*stack_base*) (*429*) (add 64, [409, 428]); (*stack_base*) (*430*) (const 18446744073709551296, []); (*stack_base*) (*431*) (add 64, [409, 430]); (*stack_base*) (*432*) (const 72, []); (*stack_base*) (*433*) (const 18446744073709551304, []); (*stack_base*) (*434*) (add 64, [409, 433]); (*stack_base*) (*435*) (const 80, []); (*stack_base*) (*436*) (const 18446744073709551312, []); (*stack_base*) (*437*) (add 64, [409, 436]); (*stack_base*) (*438*) (const 88, []); (*stack_base*) (*439*) (const 18446744073709551320, []); (*stack_base*) (*440*) (add 64, [409, 439]); (*stack_base*) (*441*) (const 96, []); (*stack_base*) (*442*) (const 18446744073709551328, []); (*stack_base*) (*443*) (add 64, [409, 442]); (*stack_base*) (*444*) (const 104, []); (*stack_base*) (*445*) (const 18446744073709551336, []); (*stack_base*) (*446*) (add 64, [409, 445]); (*stack_base*) (*447*) (const 112, []); (*stack_base*) (*448*) (const 18446744073709551344, []); (*stack_base*) (*449*) (add 64, [409, 448]); (*stack_base*) (*450*) (const 120, []); (*stack_base*) (*451*) (const 18446744073709551352, []); (*stack_base*) (*452*) (add 64, [409, 451]); (*stack_base*) (*453*) (const 128, []); (*stack_base*) (*454*) (const 18446744073709551360, []); (*stack_base*) (*455*) (add 64, [409, 454]); (*stack_base*) (*456*) (const 136, []); (*stack_base*) (*457*) (const 18446744073709551368, []); (*stack_base*) (*458*) (add 64, [409, 457]); (*stack_base*) (*459*) (const 144, []); (*stack_base*) (*460*) (const 18446744073709551376, []); (*stack_base*) (*461*) (add 64, [409, 460]); (*stack_base*) (*462*) (const 152, []); (*stack_base*) (*463*) (const 18446744073709551384, []); (*stack_base*) (*464*) (add 64, [409, 463]); (*stack_base*) (*465*) (const 160, []); (*stack_base*) (*466*) (const 18446744073709551392, []); (*stack_base*) (*467*) (add 64, [409, 466]); (*stack_base*) (*468*) (const 168, []); (*stack_base*) (*469*) (const 18446744073709551400, []); (*stack_base*) (*470*) (add 64, [409, 469]); (*stack_base*) (*471*) (const 176, []); (*stack_base*) (*472*) (const 18446744073709551408, []); (*stack_base*) (*473*) (add 64, [409, 472]); (*stack_base*) (*474*) (const 184, []); (*stack_base*) (*475*) (const 18446744073709551416, []); (*stack_base*) (*476*) (add 64, [409, 475]); (*stack_base*) (*477*) (const 192, []); (*stack_base*) (*478*) (const 18446744073709551424, []); (*stack_base*) (*479*) (add 64, [409, 478]); (*stack_base*) (*480*) (const 200, []); (*stack_base*) (*481*) (const 18446744073709551432, []); (*stack_base*) (*482*) (add 64, [409, 481]); (*stack_base*) (*483*) (const 208, []); (*stack_base*) (*484*) (const 18446744073709551440, []); (*stack_base*) (*485*) (add 64, [409, 484]); (*stack_base*) (*486*) (const 216, []); (*stack_base*) (*487*) (const 18446744073709551448, []); (*stack_base*) (*488*) (add 64, [409, 487]); (*stack_base*) (*489*) (const 224, []); (*stack_base*) (*490*) (const 18446744073709551456, []); (*stack_base*) (*491*) (add 64, [409, 490]); (*stack_base*) (*492*) (const 232, []); (*stack_base*) (*493*) (const 18446744073709551464, []); (*stack_base*) (*494*) (add 64, [409, 493]); (*stack_base*) (*495*) (const 240, []); (*stack_base*) (*496*) (const 18446744073709551472, []); (*stack_base*) (*497*) (add 64, [409, 496]); (*stack_base*) (*498*) (const 248, []); (*stack_base*) (*499*) (const 18446744073709551480, []); (*stack_base*) (*500*) (add 64, [409, 499]); (*stack_base*) (*501*) (const 256, []); (*stack_base*) (*502*) (const 18446744073709551488, []); (*stack_base*) (*503*) (add 64, [409, 502]); (*stack_base*) (*504*) (const 264, []); (*stack_base*) (*505*) (const 18446744073709551496, []); (*stack_base*) (*506*) (add 64, [409, 505]); (*stack_base*) (*507*) (const 272, []); (*stack_base*) (*508*) (const 18446744073709551504, []); (*stack_base*) (*509*) (add 64, [409, 508]); (*stack_base*) (*510*) (const 280, []); (*stack_base*) (*511*) (const 18446744073709551512, []); (*stack_base*) (*512*) (add 64, [409, 511]); (*stack_base*) (*513*) (const 288, []); (*stack_base*) (*514*) (const 18446744073709551520, []); (*stack_base*) (*515*) (add 64, [409, 514]); (*stack_base*) (*516*) (const 296, []); (*stack_base*) (*517*) (const 18446744073709551528, []); (*stack_base*) (*518*) (add 64, [409, 517]); (*stack_base*) (*519*) (const 304, []); (*stack_base*) (*520*) (const 18446744073709551536, []); (*stack_base*) (*521*) (add 64, [409, 520]); (*stack_base*) (*522*) (const 312, []); (*stack_base*) (*523*) (const 18446744073709551544, []); (*stack_base*) (*524*) (add 64, [409, 523]); (*stack_base*) (*525*) (const 320, []); (*stack_base*) (*526*) (const 18446744073709551552, []); (*stack_base*) (*527*) (add 64, [409, 526]); (*stack_base*) (*528*) (const 328, []); (*stack_base*) (*529*) (const 18446744073709551560, []); (*stack_base*) (*530*) (add 64, [409, 529]); (*stack_base*) (*531*) (const 336, []); (*stack_base*) (*532*) (const 18446744073709551568, []); (*stack_base*) (*533*) (add 64, [409, 532]); (*stack_base*) (*534*) (const 344, []); (*stack_base*) (*535*) (const 18446744073709551576, []); (*stack_base*) (*536*) (add 64, [409, 535]); (*stack_base*) (*537*) (const 352, []); (*stack_base*) (*538*) (const 18446744073709551584, []); (*stack_base*) (*539*) (add 64, [409, 538]); (*stack_base*) (*540*) (const 360, []); (*stack_base*) (*541*) (const 18446744073709551592, []); (*stack_base*) (*542*) (add 64, [409, 541]); (*stack_base*) (*543*) (const 368, []); (*stack_base*) (*544*) (const 18446744073709551600, []); (*stack_base*) (*545*) (add 64, [409, 544]); (*stack_base*) (*546*) (const 376, []); (*stack_base*) (*547*) (const 18446744073709551608, []); (*stack_base*) (*548*) (add 64, [409, 547]); (*stack_base*) (*549*) (addoverflow 64, [20, 26]); (*550*) (add 64, [22, 29]); (*551*) (addcarry 64, [22, 29]); (*552*) (addoverflow 64, [22, 29]); (*553*) (addcarry 64, [16, 550]); (*554*) (addoverflow 64, [16, 550]); (*555*) (add 64, [18, 553]); (*556*) (addcarry 64, [18, 553]); (*557*) (addoverflow 64, [18, 553]); (*558*) (add 64, [12, 18, 553]); (*559*) (addcarry 64, [12, 555]); (*560*) (addoverflow 64, [12, 555]); (*561*) (add 64, [14, 559]); (*562*) (addcarry 64, [14, 559]); (*563*) (addoverflow 64, [14, 559]); (*564*) (xorZ, [339, 339]); (*565*) (const 63, []); (*566*) (const 4294967296, []); (*567*) (mul 64, [3, 7, 566]); (*568*) (shr 64, [24, 418]); (*569*) (add 64, [20, 26, 567]); (*570*) (addcarry 64, [27, 567]); (*571*) (addoverflow 64, [27, 567]); (*572*) (add 64, [16, 22, 29, 568, 570]); (*573*) (addcarry 64, [30, 568, 570]); (*574*) (addoverflow 64, [30, 568, 570]); (*575*) (add 64, [12, 18, 40, 553, 573]); (*576*) (addcarry 64, [40, 558, 573]); (*577*) (addoverflow 64, [40, 558, 573]); (*578*) (add 64, [14, 42, 559, 576]); (*579*) (addcarry 64, [42, 561, 576]); (*580*) (addoverflow 64, [42, 561, 576]); (*581*) (xorZ, [567, 567]); (*582*) (add 64, [20, 26, 87, 567]); (*583*) (addcarry 64, [87, 569]); (*584*) (addoverflow 64, [87, 569]); (*585*) (add 64, [89, 583]); (*586*) (addcarry 64, [89, 583]); (*587*) (addoverflow 64, [89, 583]); (*588*) (add 64, [16, 22, 29, 89, 568, 570, 583]); (*589*) (addcarry 64, [572, 585]); (*590*) (addoverflow 64, [572, 585]); (*591*) (add 64, [85, 589]); (*592*) (addcarry 64, [85, 589]); (*593*) (addoverflow 64, [85, 589]); (*594*) (add 64, [16, 22, 29, 83, 89, 568, 570, 583]); (*595*) (addcarry 64, [83, 588]); (*596*) (addoverflow 64, [83, 588]); (*597*) (add 64, [85, 589, 595]); (*598*) (addcarry 64, [591, 595]); (*599*) (addoverflow 64, [591, 595]); (*600*) (add 64, [12, 18, 40, 85, 553, 573, 589, 595]); (*601*) (addcarry 64, [575, 597]); (*602*) (addoverflow 64, [575, 597]); (*603*) (add 64, [81, 601]); (*604*) (addcarry 64, [81, 601]); (*605*) (addoverflow 64, [81, 601]); (*606*) (add 64, [12, 18, 40, 79, 85, 553, 573, 589, 595]); (*607*) (addcarry 64, [79, 600]); (*608*) (addoverflow 64, [79, 600]); (*609*) (add 64, [81, 601, 607]); (*610*) (addcarry 64, [603, 607]); (*611*) (addoverflow 64, [603, 607]); (*612*) (add 64, [14, 42, 81, 559, 576, 601, 607]); (*613*) (addcarry 64, [578, 609]); (*614*) (addoverflow 64, [578, 609]); (*615*) (add 64, [77, 613]); (*616*) (addcarry 64, [77, 613]); (*617*) (addoverflow 64, [77, 613]); (*618*) (add 64, [14, 42, 75, 81, 559, 576, 601, 607]); (*619*) (addcarry 64, [75, 612]); (*620*) (addoverflow 64, [75, 612]); (*621*) (add 64, [77, 579, 613, 619]); (*622*) (addcarry 64, [579, 615, 619]); (*623*) (addoverflow 64, [579, 615, 619]); (*624*) (mul 64, [566, 582]); (*625*) (mulZ, [38, 582]); (*626*) (shrZ, [625, 10]); (*627*) (mul 64, [38, 582]); (*628*) (shr 64, [625, 10]); (*629*) (shr 64, [582, 418]); (*630*) (add 64, [16, 22, 29, 83, 89, 568, 570, 583, 624]); (*631*) (addcarry 64, [594, 624]); (*632*) (addoverflow 64, [594, 624]); (*633*) (add 64, [12, 18, 40, 79, 85, 553, 573, 589, 595, 629, 631]); (*634*) (addcarry 64, [606, 629, 631]); (*635*) (addoverflow 64, [606, 629, 631]); (*636*) (add 64, [14, 42, 75, 81, 559, 576, 601, 607, 627, 634]); (*637*) (addcarry 64, [618, 627, 634]); (*638*) (addoverflow 64, [618, 627, 634]); (*639*) (add 64, [77, 579, 613, 619, 628, 637]); (*640*) (addcarry 64, [621, 628, 637]); (*641*) (addoverflow 64, [621, 628, 637]); (*642*) (add 64, [622, 640]); (*643*) (xorZ, [624, 624]); (*644*) (add 64, [16, 22, 29, 83, 89, 163, 568, 570, 583, 624]); (*645*) (addcarry 64, [163, 630]); (*646*) (addoverflow 64, [163, 630]); (*647*) (add 64, [165, 645]); (*648*) (addcarry 64, [165, 645]); (*649*) (addoverflow 64, [165, 645]); (*650*) (add 64, [12, 18, 40, 79, 85, 165, 553, 573, 589, 595, 629, 631, 645]); (*651*) (addcarry 64, [633, 647]); (*652*) (addoverflow 64, [633, 647]); (*653*) (add 64, [161, 651]); (*654*) (addcarry 64, [161, 651]); (*655*) (addoverflow 64, [161, 651]); (*656*) (add 64, [12, 18, 40, 79, 85, 159, 165, 553, 573, 589, 595, 629, 631, 645]); (*657*) (addcarry 64, [159, 650]); (*658*) (addoverflow 64, [159, 650]); (*659*) (add 64, [161, 651, 657]); (*660*) (addcarry 64, [653, 657]); (*661*) (addoverflow 64, [653, 657]); (*662*) (add 64, [14, 42, 75, 81, 161, 559, 576, 601, 607, 627, 634, 651, 657]); (*663*) (addcarry 64, [636, 659]); (*664*) (addoverflow 64, [636, 659]); (*665*) (add 64, [157, 663]); (*666*) (addcarry 64, [157, 663]); (*667*) (addoverflow 64, [157, 663]); (*668*) (add 64, [14, 42, 75, 81, 155, 161, 559, 576, 601, 607, 627, 634, 651, 657]); (*669*) (addcarry 64, [155, 662]); (*670*) (addoverflow 64, [155, 662]); (*671*) (add 64, [157, 663, 669]); (*672*) (addcarry 64, [665, 669]); (*673*) (addoverflow 64, [665, 669]); (*674*) (add 64, [77, 157, 579, 613, 619, 628, 637, 663, 669]); (*675*) (addcarry 64, [639, 671]); (*676*) (addoverflow 64, [639, 671]); (*677*) (add 64, [153, 675]); (*678*) (addcarry 64, [153, 675]); (*679*) (addoverflow 64, [153, 675]); (*680*) (add 64, [77, 151, 157, 579, 613, 619, 628, 637, 663, 669]); (*681*) (addcarry 64, [151, 674]); (*682*) (addoverflow 64, [151, 674]); (*683*) (add 64, [153, 622, 640, 675, 681]); (*684*) (addcarry 64, [642, 677, 681]); (*685*) (addoverflow 64, [642, 677, 681]); (*686*) (mul 64, [566, 644]); (*687*) (mulZ, [38, 644]); (*688*) (shrZ, [687, 10]); (*689*) (mul 64, [38, 644]); (*690*) (shr 64, [687, 10]); (*691*) (shr 64, [644, 418]); (*692*) (add 64, [12, 18, 40, 79, 85, 159, 165, 553, 573, 589, 595, 629, 631, 645, 686]); (*693*) (addcarry 64, [656, 686]); (*694*) (addoverflow 64, [656, 686]); (*695*) (add 64, [14, 42, 75, 81, 155, 161, 559, 576, 601, 607, 627, 634, 651, 657, 691, 693]); (*696*) (addcarry 64, [668, 691, 693]); (*697*) (addoverflow 64, [668, 691, 693]); (*698*) (add 64, [77, 151, 157, 579, 613, 619, 628, 637, 663, 669, 689, 696]); (*699*) (addcarry 64, [680, 689, 696]); (*700*) (addoverflow 64, [680, 689, 696]); (*701*) (add 64, [153, 622, 640, 675, 681, 690, 699]); (*702*) (addcarry 64, [683, 690, 699]); (*703*) (addoverflow 64, [683, 690, 699]); (*704*) (add 64, [684, 702]); (*705*) (xorZ, [686, 686]); (*706*) (add 64, [12, 18, 40, 79, 85, 159, 165, 239, 553, 573, 589, 595, 629, 631, 645, 686]); (*707*) (addcarry 64, [239, 692]); (*708*) (addoverflow 64, [239, 692]); (*709*) (add 64, [241, 707]); (*710*) (addcarry 64, [241, 707]); (*711*) (addoverflow 64, [241, 707]); (*712*) (add 64, [14, 42, 75, 81, 155, 161, 241, 559, 576, 601, 607, 627, 634, 651, 657, 691, 693, 707]); (*713*) (addcarry 64, [695, 709]); (*714*) (addoverflow 64, [695, 709]); (*715*) (add 64, [237, 713]); (*716*) (addcarry 64, [237, 713]); (*717*) (addoverflow 64, [237, 713]); (*718*) (add 64, [14, 42, 75, 81, 155, 161, 235, 241, 559, 576, 601, 607, 627, 634, 651, 657, 691, 693, 707]); (*719*) (addcarry 64, [235, …
Hopefully this is fast enough to merge the other changes, and we can revert this at some future point when I figure out how to annotate dag indices with actual descriptions without making everything insanely slow.
<details><summary>Timing Diff since 831d611</summary> <p> ``` After | Peak Mem | File Name | Before | Peak Mem || Change || Change (mem) | % Change | % Change (mem) ---------------------------------------------------------------------------------------------------------------------------------------------------- 122m52.12s | 302392 ko | Total Time / Peak Mem | 103m15.27s | 327788 ko || +19m36.84s || -25396 ko | +18.99% | -7.74% ---------------------------------------------------------------------------------------------------------------------------------------------------- 52m14.52s | 302392 ko | fiat-amd64/p434-mul.only-status | 44m39.92s | 327788 ko || +7m34.59s || -25396 ko | +16.96% | -7.74% 49m00.50s | 289860 ko | fiat-amd64/p434-square.only-status | 41m46.12s | 280424 ko || +7m14.38s || 9436 ko | +17.33% | +3.36% 7m03.23s | 93856 ko | fiat-amd64/p384-mul.only-status | 5m42.95s | 93804 ko || +1m20.28s || 52 ko | +23.40% | +0.05% 6m17.51s | 93864 ko | fiat-amd64/p384-square.only-status | 5m09.63s | 84120 ko || +1m07.87s || 9744 ko | +21.92% | +11.58% 2m30.23s | 54948 ko | fiat-amd64/p448-mul.only-status | 1m46.05s | 50276 ko || +0m44.17s || 4672 ko | +41.65% | +9.29% 1m47.63s | 50300 ko | fiat-amd64/p521-mul.only-status | 1m10.69s | 45816 ko || +0m36.93s || 4484 ko | +52.25% | +9.78% 0m57.09s | 42144 ko | fiat-amd64/p448-square.only-status | 0m41.84s | 42168 ko || +0m15.25s || -24 ko | +36.44% | -0.05% 0m33.92s | 38808 ko | fiat-amd64/p224-square.only-status | 0m21.42s | 38620 ko || +0m12.50s || 188 ko | +58.35% | +0.48% 0m33.25s | 36436 ko | fiat-amd64/p224-mul.only-status | 0m22.94s | 36476 ko || +0m10.30s || -40 ko | +44.94% | -0.10% 0m17.85s | 36252 ko | fiat-amd64/p256-mul.only-status | 0m11.08s | 36188 ko || +0m06.77s || 64 ko | +61.10% | +0.17% 0m14.26s | 38104 ko | fiat-amd64/p256-square.only-status | 0m09.72s | 37620 ko || +0m04.53s || 484 ko | +46.70% | +1.28% 0m24.48s | 38740 ko | fiat-amd64/secp256k1-mul.only-status | 0m20.92s | 38780 ko || +0m03.55s || -40 ko | +17.01% | -0.10% 0m23.26s | 36288 ko | fiat-amd64/p521-square.only-status | 0m20.08s | 36152 ko || +0m03.18s || 136 ko | +15.83% | +0.37% 0m22.36s | 38900 ko | fiat-amd64/secp256k1-square.only-status | 0m19.29s | 39028 ko || +0m03.07s || -128 ko | +15.91% | -0.32% 0m04.70s | 26684 ko | fiat-amd64/curve25519-mul.only-status | 0m05.02s | 25720 ko || -0m00.31s || 964 ko | -6.37% | +3.74% 0m03.60s | 26752 ko | fiat-amd64/curve25519-square.only-status | 0m03.85s | 25700 ko || -0m00.25s || 1052 ko | -6.49% | +4.09% 0m01.88s | 22584 ko | fiat-amd64/poly1305-mul.only-status | 0m01.93s | 22620 ko || -0m00.05s || -36 ko | -2.59% | -0.15% 0m01.66s | 22168 ko | fiat-amd64/poly1305-square.only-status | 0m01.70s | 22296 ko || -0m00.04s || -128 ko | -2.35% | -0.57% 0m00.19s | 21616 ko | fiat-amd64/p224-sub.only-status | 0m00.13s | 21284 ko || +0m00.06s || 332 ko | +46.15% | +1.55% ``` </p> </details>
e02b6bf
to
fc67b9d
Compare
Here's some performance data for p256 mul (compare with #1207 (comment)) |
Some of the overhead seems to come from the extra detupling in the function passed to |
Will try to performance optimize later, got OOB ack from Andres on merging |
Here's some perf data from p448, compare with #1207 (comment): |
It seems like another significant chunk of the overhead is caused by increase GC (which would explain the high variance in overheads), and the a significant chunk of the time the GC is invoked by |
The dag now prints out where some indices came from, and assembly lines
generating values that don't exist in PHOAS are annotated with the dag
indices they generate.
We also pave the way to display information about indices that appear
only in the assembly and not in the PHOAS. I'm not sure what the right strategy here is.
One idea:
When we transition from an inequality that contains indices appearing only in assembly to an inequality containing indices that appear in both, recursively expand all indices appearing only in asm
Edit: performance is pretty terrible:
Timing Diff since 831d6114a2c4b13b51c5985a48185c7bd2d5a3a8