Potential overflow in mask computation #22

dmohns · 2024-05-10T14:11:38Z

We discovered a potential overflow in mask computation at https://github.com/EnAccess/OpenPAYGO-HW/blob/main/src/extended/opaygo_core_extended.c#L17

uint64_t extractBitsExtended(uint64_t source, unsigned from, unsigned to)
{
    unsigned long long mask = ( (1ull<<(to-from+1ull))-1ull) << from;
    return (source & mask) >> 24;
}

The additional +1ull does not have an effect on the computation. However, internally it makes the number flow out of uint64_t bounds. The result is still correct when compiled with GCC for example for 64/24 the result

0xFFFFFFFFFF000000

is produced.

There are two concerns about it

Various C compiler might have different overflow behaviour on different platform. It's hard to guarantee this will be working consistently on all platforms and hardware.
This behaviour can encounter problems when porting the feature to different programming languages with stricter overflow rules. See the following example in go

package main

import "fmt"

func main() {
	test := ((uint64(1) << (64 - 24 + 1)) - 1) << 24
	fmt.Printf("Mask in hexadecimal: %X\n", test)
}

yields

./prog.go:6:10: ((uint64(1) << (64 - 24 + 1)) - 1) << 24 (constant 36893488147402326016 of type uint64) overflows uint64

The text was updated successfully, but these errors were encountered:

danirafi · 2024-05-16T04:44:43Z

If I have understood the code correctly, there are also two other concerns if we really want a dynamically masking function; the hardcoded "24" on the return shift and the overflow behaviour described above prevent correct extraction of internal bits e.g. bits 24-48. If we want a dynamic solution the quickest fix is to remove the +1 and put "from" in the return shift.

uint64_t extractBitsExtended(uint64_t source, unsigned from, unsigned to)
{
    unsigned long long mask = ( (1ull<<(to-from))-1ull) << from;
    return (source & mask) >> from;
}

This should work for most values except, as far as I can tell, extracting all 0-64 bits, in which case the 1 still shifts to the 65th bit. It would defeat the point of extracting bits if you take them all anyway but could present unexpected behaviour under that edge case.

Similar code is present in the non-extended version of the code with 32-bit values except the return shift is correctly in place. In the python port the "to" and "from" values are hardcoded.

For our application I would recommend simply hardcoding the two different bitmasks required for the upper 40 bits of 64 bit hashes and the upper 30 bits of the 32 bit hashes. This should remove any risk of weird behaviour and make porting easier/clearer in future.

#define UPPER_40BIT_MASK        0xFFFFFFFFFF000000ull
#define UPPER_40BIT_MASK_SHIFT                  24            
//extract upper 40 bits
uint64_t result = ( this_hash & UPPER_40BIT_MASK ) >> UPPER_40BIT_MASK_SHIFT;

or

#define UPPER_30BIT_MASK        0xFFFFFFFC
#define UPPER_30BIT_MASK_SHIFT                  2            
//extract upper 30 bits
uint32_t result = ( this_hash & UPPER_30BIT_MASK ) >> UPPER_30BIT_MASK_SHIFT;

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Potential overflow in mask computation #22

Potential overflow in mask computation #22

dmohns commented May 10, 2024 •

edited

Loading

danirafi commented May 16, 2024 •

edited

Loading

Potential overflow in mask computation #22

Potential overflow in mask computation #22

Comments

dmohns commented May 10, 2024 • edited Loading

danirafi commented May 16, 2024 • edited Loading

dmohns commented May 10, 2024 •

edited

Loading

danirafi commented May 16, 2024 •

edited

Loading