Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add NOINLINE to a couple of test functions #41

Merged
merged 1 commit into from
Feb 6, 2025
Merged

Conversation

Gerold103
Copy link

One would expect that inline or not, a function's behaviour must not change. But it doesn't seem to be the case when the tests are built with clang in Release mode. Both normal LLVM clang and Apple clang.

It is not clear why, but the function
test_encode_uint_custom_size() behaves wrong when inlined. Specifically in Release mode with clang. Even with ASAN it works. Still, this specific build fails reliably.

Debugging is quite complicated due to the optimizations messing up the code. What was found is that seems like when the function is called 2 times on 2 different buffers, the second time it is like if the function had no effect.

Here is what was tried:

  • Any attempt to observe the buffers with prints made the tests pass. Even if the prints are added after the loop in test_compare_uint().

  • In RelWithDebInfo it fails, even though prints in LLDB show the buffers being filled with correct data. However when the code enters mp_compare_uint(), the buffers again look the same.

  • The attempts to replace with 2-dimensional array with an array of structs didn't change anything.

  • The increase of MP_NUMBER_MAX_LEN and MP_NUMBER_CODEC_COUNT didn't have any effect either.

  • Inlining of test_compare_uint() body into main() with the values causing the failure would somehow pass.

  • Adding a printf("...") to each iteration in test_encode_uint_all_sizes() would pass.

Given that this doesn't seem to be anyhow related to memory corruption (including stack overflow), the conclusion so far is that it could be a compiler bug.

@Gerold103 Gerold103 self-assigned this Feb 4, 2025
One would expect that inline or not, a function's behaviour must
not change. But it doesn't seem to be the case when the tests are
built with clang in Release mode. Both normal LLVM clang and Apple
clang.

It is not clear why, but the function
test_encode_uint_custom_size() behaves wrong when inlined.
Specifically in Release mode with clang. Even with ASAN it
works. Still, this specific build fails reliably.

Debugging is quite complicated due to the optimizations messing up
the code. What was found is that seems like when the function is
called 2 times on 2 different buffers, the second time it is like
if the function had no effect.

Here is what was tried:

* Any attempt to observe the buffers with prints made the tests
  pass. Even if the prints are added **after** the loop in
  test_compare_uint().

* In RelWithDebInfo it fails, even though prints in LLDB show the
  buffers being filled with correct data. However when the code
  enters mp_compare_uint(), the buffers again look the same.

* The attempts to replace with 2-dimensional array with an array
  of structs didn't change anything.

* The increase of MP_NUMBER_MAX_LEN and MP_NUMBER_CODEC_COUNT
  didn't have any effect either.

* Inlining of test_compare_uint() body into main() with the values
  causing the failure would somehow pass.

* Adding a printf("...") to each iteration in
  test_encode_uint_all_sizes() would pass.

Given that this doesn't seem to be anyhow related to memory
corruption (including stack overflow), the conclusion so far is
that it could be a compiler bug.
@Gerold103 Gerold103 force-pushed the gerold103/fix-tests branch from 66327ce to 4d2b6c6 Compare February 4, 2025 23:01
@Gerold103 Gerold103 marked this pull request as ready for review February 4, 2025 23:44
@Gerold103 Gerold103 requested review from locker and Gumix February 4, 2025 23:44
@Gerold103
Copy link
Author

@locker and @Gumix, I've stumbled into a weird compiler behaviour, which I was debugging for the past couple of days. I managed to work it around, and this looks like a compiler bug. At least, that is the only plausible explanation I could come up with.

Without this patch the CI in the main tarantool repo fails on release clang build, also with LTO. But passes with ASAN. And all the other jobs also pass. The release clang fail I also reliably reproduced locally, at least on MacOS M3. And this patch also fixes it.

I've written my investigation attempts in the commit message. But I easily could have missed something. Can you please have a fresh very attentive look at what could be wrong here except for the compiler? I really highly doubt that the compiler could misbehave this such seemingly simple code, and yet I am completely out of other options.

@Gumix
Copy link

Gumix commented Feb 5, 2025

But passes with ASAN.

What about UBSan?

@Gerold103
Copy link
Author

@locker and @Gumix, it works with UBSan. Meaning that it shows no errors and passes successfully. But as soon as I remove -fsanitize=undefined, it fails again.

With @igormunkin we managed to build a minimal reproducer, which works on clang-12 and earlier, but fails starting from clang-13 until the latest. The architecture doesn't seem to matter.

It starts working as soon as make test_encode_uint_custom_size() no-inline. Or when I do -O3 but with -fno-inline-functions.

With that I conclude that we have a compiler bug here, and the solution in the patch seems to me like a good enough workaround then.

The reproducer is a single cpp file. Igor said he will try to submit it to clang's bug tracker. I can't attach the file here, GitHub doesn't support that. But I am copy-pasting it below under a drop-down block.

This is how it fails: clear && clang++ msgpuck.cc -O3 -DNDEBUG.
This is how it works: clear && clang++ msgpuck.cc -O3 -DNDEBUG -fno-inline-functions.
Also can use the first command and make the function test_encode_uint_custom_size() NOINLINE.

msgpuck.cc
/*
 * Copyright (c) 2013-2016 MsgPuck Authors
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 * 1. Redistributions of source code must retain the above
 *    copyright notice, this list of conditions and the
 *    following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above
 *    copyright notice, this list of conditions and the following
 *    disclaimer in the documentation and/or other materials
 *    provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
 * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
 * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
 * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
 * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
 * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
 * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
 * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 */
#include <assert.h>
#include <inttypes.h>
#include <limits.h>
#include <math.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define UNIT_TAP_COMPATIBLE 1

#if defined(__cplusplus) && !defined(__STDC_CONSTANT_MACROS)
#define __STDC_CONSTANT_MACROS 1 /* make С++ to be happy */
#endif
#if defined(__cplusplus) && !defined(__STDC_LIMIT_MACROS)
#define __STDC_LIMIT_MACROS 1    /* make С++ to be happy */
#endif

#if defined(__CC_ARM)         /* set the alignment to 1 for armcc compiler */
#define MP_PACKED    __packed
#define MP_PACKED_END
#define MP_CONST __attribute__((const))
#define MP_PURE __attribute__((pure))
#elif defined(__GNUC__)
#define MP_PACKED  __attribute__((packed))
#define MP_PACKED_END
#define MP_CONST __attribute__((const))
#define MP_PURE __attribute__((pure))
#elif defined(_MSC_VER)
#define MP_CONST
#define MP_PURE
#define MP_PACKED __pragma(pack(push, 1))
#define MP_PACKED_END __pragma(pack(pop))
#endif

#if defined(__GNUC__) && !defined(__GNUC_STDC_INLINE__)
#if !defined(MP_LIBRARY)
#define MP_PROTO extern inline
#define MP_IMPL extern inline
#else /* defined(MP_LIBRARY) */
#define MP_PROTO
#define MP_IMPL
#endif
#define MP_ALWAYSINLINE
#else /* C99 inline */
#if !defined(MP_LIBRARY)
#define MP_PROTO inline
#define MP_IMPL inline
#else /* defined(MP_LIBRARY) */
#define MP_PROTO extern inline
#define MP_IMPL inline
#endif
#ifndef _WIN32
#define MP_ALWAYSINLINE __attribute__((always_inline))
#else
#define MP_ALWAYSINLINE
#endif
#endif /* GNU inline or C99 inline */

#if !defined __GNUC_MINOR__ || defined __INTEL_COMPILER || \
	defined __SUNPRO_C || defined __SUNPRO_CC
#define MP_GCC_VERSION(major, minor) 0
#else
#define MP_GCC_VERSION(major, minor) (__GNUC__ > (major) || \
	(__GNUC__ == (major) && __GNUC_MINOR__ >= (minor)))
#endif

#if !defined(__has_builtin)
#define __has_builtin(x) 0 /* clang */
#endif

#if MP_GCC_VERSION(2, 9) || __has_builtin(__builtin_expect)
#define mp_likely(x) __builtin_expect((x), 1)
#define mp_unlikely(x) __builtin_expect((x), 0)
#else
#define mp_likely(x) (x)
#define mp_unlikely(x) (x)
#endif

#if MP_GCC_VERSION(4, 5) || __has_builtin(__builtin_unreachable)
#define mp_unreachable() (assert(0), __builtin_unreachable())
#else
#ifndef _WIN32
MP_PROTO void
mp_unreachable(void) __attribute__((noreturn));
#else
static void __declspec(noreturn) mp_unreachable(void) {
	assert(0);
}
#endif
#endif

#define mp_identity(x) (x) /* just to simplify mp_load/mp_store macroses */

#if MP_GCC_VERSION(4, 8) || __has_builtin(__builtin_bswap16)
#define mp_bswap_u16(x) __builtin_bswap16(x)
#else /* !MP_GCC_VERSION(4, 8) */
#define mp_bswap_u16(x) ( \
	(((x) <<  8) & 0xff00) | \
	(((x) >>  8) & 0x00ff) )
#endif

#if MP_GCC_VERSION(4, 3) || __has_builtin(__builtin_bswap32)
#define mp_bswap_u32(x) __builtin_bswap32(x)
#else /* !MP_GCC_VERSION(4, 3) */
#define mp_bswap_u32(x) ( \
	(((x) << 24) & UINT32_C(0xff000000)) | \
	(((x) <<  8) & UINT32_C(0x00ff0000)) | \
	(((x) >>  8) & UINT32_C(0x0000ff00)) | \
	(((x) >> 24) & UINT32_C(0x000000ff)) )
#endif

#if MP_GCC_VERSION(4, 3) || __has_builtin(__builtin_bswap64)
#define mp_bswap_u64(x) __builtin_bswap64(x)
#else /* !MP_GCC_VERSION(4, 3) */
#define mp_bswap_u64(x) (\
	(((x) << 56) & UINT64_C(0xff00000000000000)) | \
	(((x) << 40) & UINT64_C(0x00ff000000000000)) | \
	(((x) << 24) & UINT64_C(0x0000ff0000000000)) | \
	(((x) <<  8) & UINT64_C(0x000000ff00000000)) | \
	(((x) >>  8) & UINT64_C(0x00000000ff000000)) | \
	(((x) >> 24) & UINT64_C(0x0000000000ff0000)) | \
	(((x) >> 40) & UINT64_C(0x000000000000ff00)) | \
	(((x) >> 56) & UINT64_C(0x00000000000000ff)) )
#endif

#define MP_LOAD_STORE(name, type, bswap)					\
MP_PROTO type									\
mp_load_##name(const char **data);						\
MP_IMPL type									\
mp_load_##name(const char **data)						\
{										\
	struct MP_PACKED cast { type val; } MP_PACKED_END ;					\
	type val = bswap(((struct cast *) *data)->val);				\
	*data += sizeof(type);							\
	return val;								\
}										\
MP_PROTO char *									\
mp_store_##name(char *data, type val);						\
MP_IMPL char *									\
mp_store_##name(char *data, type val)						\
{										\
	struct MP_PACKED cast { type val; } MP_PACKED_END ;					\
	((struct cast *) (data))->val = bswap(val);				\
	return data + sizeof(type);						\
}

MP_LOAD_STORE(u8, uint8_t, mp_identity);

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__

MP_LOAD_STORE(u16, uint16_t, mp_bswap_u16);
MP_LOAD_STORE(u32, uint32_t, mp_bswap_u32);
MP_LOAD_STORE(u64, uint64_t, mp_bswap_u64);

#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__

MP_LOAD_STORE(u16, uint16_t, mp_identity);
MP_LOAD_STORE(u32, uint32_t, mp_identity);
MP_LOAD_STORE(u64, uint64_t, mp_identity);

#else
#error Unsupported __BYTE_ORDER__
#endif

#if !defined(__FLOAT_WORD_ORDER__)
#define __FLOAT_WORD_ORDER__ __BYTE_ORDER__
#endif /* defined(__FLOAT_WORD_ORDER__) */

#if __FLOAT_WORD_ORDER__ == __ORDER_LITTLE_ENDIAN__

union MP_PACKED mp_float_cast  {
	uint32_t u32;
	float f;
} MP_PACKED_END ;

union MP_PACKED mp_double_cast {
	uint64_t u64;
	double d;
} MP_PACKED_END ;

MP_PROTO float
mp_load_float(const char **data);
MP_PROTO double
mp_load_double(const char **data);
MP_PROTO char *
mp_store_float(char *data, float val);
MP_PROTO char *
mp_store_double(char *data, double val);

MP_IMPL float
mp_load_float(const char **data)
{
	union mp_float_cast cast = *(union mp_float_cast *) *data;
	*data += sizeof(cast);
	cast.u32 = mp_bswap_u32(cast.u32);
	return cast.f;
}

MP_IMPL double
mp_load_double(const char **data)
{
	union mp_double_cast cast = *(union mp_double_cast *) *data;
	*data += sizeof(cast);
	cast.u64 = mp_bswap_u64(cast.u64);
	return cast.d;
}

MP_IMPL char *
mp_store_float(char *data, float val)
{
	union mp_float_cast cast;
	cast.f = val;
	cast.u32 = mp_bswap_u32(cast.u32);
	*(union mp_float_cast *) (data) = cast;
	return data + sizeof(cast);
}

MP_IMPL char *
mp_store_double(char *data, double val)
{
	union mp_double_cast cast;
	cast.d = val;
	cast.u64 = mp_bswap_u64(cast.u64);
	*(union mp_double_cast *) (data) = cast;
	return data + sizeof(cast);
}

#elif __FLOAT_WORD_ORDER__ == __ORDER_BIG_ENDIAN__

MP_LOAD_STORE(float, float, mp_identity);
MP_LOAD_STORE(double, double, mp_identity);

#else
#error Unsupported __FLOAT_WORD_ORDER__
#endif

#undef mp_identity
#undef MP_LOAD_STORE

enum mp_type {
	MP_NIL = 0,
	MP_UINT,
	MP_INT,
	MP_STR,
	MP_BIN,
	MP_ARRAY,
	MP_MAP,
	MP_BOOL,
	MP_FLOAT,
	MP_DOUBLE,
	MP_EXT
};

/** \cond false */
extern const enum mp_type mp_type_hint[];
extern const int8_t mp_parser_hint[];
extern const char *const mp_char2escape[];
extern const uint8_t mp_ext_hint[];

MP_IMPL MP_ALWAYSINLINE enum mp_type
mp_typeof(const char c)
{
	return mp_type_hint[(uint8_t) c];
}

MP_IMPL uint32_t
mp_sizeof_uint(uint64_t num)
{
	if (num <= 0x7f) {
		return 1;
	} else if (num <= UINT8_MAX) {
		return 1 + sizeof(uint8_t);
	} else if (num <= UINT16_MAX) {
		return 1 + sizeof(uint16_t);
	} else if (num <= UINT32_MAX) {
		return 1 + sizeof(uint32_t);
	} else {
		return 1 + sizeof(uint64_t);
	}
}

MP_IMPL uint32_t
mp_sizeof_int(int64_t num)
{
	assert(num < 0);
	if (num >= -0x20) {
		return 1;
	} else if (num >= INT8_MIN && num <= INT8_MAX) {
		return 1 + sizeof(int8_t);
	} else if (num >= INT16_MIN && num <= UINT16_MAX) {
		return 1 + sizeof(int16_t);
	} else if (num >= INT32_MIN && num <= UINT32_MAX) {
		return 1 + sizeof(int32_t);
	} else {
		return 1 + sizeof(int64_t);
	}
}

MP_IMPL char *
mp_encode_uint(char *data, uint64_t num)
{
	if (num <= 0x7f) {
		return mp_store_u8(data, (uint8_t)num);
	} else if (num <= UINT8_MAX) {
		data = mp_store_u8(data, 0xcc);
		return mp_store_u8(data, (uint8_t)num);
	} else if (num <= UINT16_MAX) {
		data = mp_store_u8(data, 0xcd);
		return mp_store_u16(data, (uint16_t)num);
	} else if (num <= UINT32_MAX) {
		data = mp_store_u8(data, 0xce);
		return mp_store_u32(data, (uint32_t)num);
	} else {
		data = mp_store_u8(data, 0xcf);
		return mp_store_u64(data, num);
	}
}

MP_IMPL char *
mp_encode_int(char *data, int64_t num)
{
	assert(num < 0);
	if (num >= -0x20) {
		return mp_store_u8(data, (uint8_t)(0xe0 | num));
	} else if (num >= INT8_MIN) {
		data = mp_store_u8(data, 0xd0);
		return mp_store_u8(data, (uint8_t)num);
	} else if (num >= INT16_MIN) {
		data = mp_store_u8(data, 0xd1);
		return mp_store_u16(data, (uint16_t)num);
	} else if (num >= INT32_MIN) {
		data = mp_store_u8(data, 0xd2);
		return mp_store_u32(data, (uint32_t)num);
	} else {
		data = mp_store_u8(data, 0xd3);
		return mp_store_u64(data, num);
	}
}

MP_IMPL uint64_t
mp_decode_uint_data(uint8_t type, const char **data)
{
	switch (type) {
	case 0xcc:
		return mp_load_u8(data);
	case 0xcd:
		return mp_load_u16(data);
	case 0xce:
		return mp_load_u32(data);
	case 0xcf:
		return mp_load_u64(data);
	default:
		if (mp_unlikely(type > 0x7f))
			mp_unreachable();
		return type;
	}
}

MP_IMPL uint64_t
mp_decode_uint(const char **data)
{
	return mp_decode_uint_data(mp_load_u8(data), data);
}

MP_IMPL int
mp_compare_uint(const char *data_a, const char *data_b)
{
	uint8_t ca = mp_load_u8(&data_a);
	uint8_t cb = mp_load_u8(&data_b);
	uint64_t a, b;
	if (ca != cb) {
		a = mp_decode_uint_data(ca, &data_a);
		b = mp_decode_uint_data(cb, &data_b);
		return a < b ? -1 : a > b;
	}
	switch (ca) {
	case 0xcc:
		a = mp_load_u8(&data_a);
		b = mp_load_u8(&data_b);
		break;
	case 0xcd:
		a = mp_load_u16(&data_a);
		b = mp_load_u16(&data_b);
		break;
	case 0xce:
		a = mp_load_u32(&data_a);
		b = mp_load_u32(&data_b);
		break;
	case 0xcf:
		a = mp_load_u64(&data_a);
		b = mp_load_u64(&data_b);
		break;
	default:
		return 0;
	}
	return a < b ? -1 : a > b;
}

/*
 * }}}
 */

#undef MP_LIBRARY
#undef MP_PROTO
#undef MP_IMPL
#undef MP_ALWAYSINLINE
#undef MP_GCC_VERSION

#define fail(expr, result) do {					\
	fprintf(stderr, "Test failed: %s is %s at %s:%d, in function '%s'\n",\
		expr, result, __FILE__, __LINE__, __func__);		\
	exit(-1);							\
} while (0)

#define fail_if(expr) do { if (expr) fail(#expr, "true"); } while (0)
#define fail_unless(expr) do { if (!(expr)) fail(#expr, "false"); } while (0)

#define note(...) msg(stdout, __VA_ARGS__)
#define diag(...) msg(stderr, __VA_ARGS__)

#define msg(stream, ...) ({ _space(stream); fprintf(stream, "# "); \
	fprintf(stream, __VA_ARGS__); fprintf(stream, "\n"); })

void
_ok(int condition, const char *expr, const char *file, int line,
    const char *fmt, ...);

/* private function, use note(...) or diag(...) instead */
void _space(FILE *stream);

void
_plan(int count, bool tap);

#define _select_10th(f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, ...) f10
#define _ok0(cond, expr, ...)						\
	_select_10th(, ##__VA_ARGS__,					\
		     _ok(cond, expr, __FILE__, __LINE__, __VA_ARGS__),	\
		     _ok(cond, expr, __FILE__, __LINE__, __VA_ARGS__),	\
		     _ok(cond, expr, __FILE__, __LINE__, __VA_ARGS__),	\
		     _ok(cond, expr, __FILE__, __LINE__, __VA_ARGS__),	\
		     _ok(cond, expr, __FILE__, __LINE__, __VA_ARGS__),	\
		     _ok(cond, expr, __FILE__, __LINE__, __VA_ARGS__),	\
		     _ok(cond, expr, __FILE__, __LINE__, __VA_ARGS__),	\
		     _ok(cond, expr, __FILE__, __LINE__, __VA_ARGS__),	\
		     _ok(cond, expr, __FILE__, __LINE__, "line %d", __LINE__))

#define ok(cond, ...)		_ok0(cond, #cond, ##__VA_ARGS__)
#define is(a, b, ...)		_ok0((a) == (b), #a " == " #b, ##__VA_ARGS__)
#define isnt(a, b, ...)		_ok0((a) != (b), #a " != " #b, ##__VA_ARGS__)

#define is_str(a, b, fmt, args...) ok(strcmp(a, b) == 0, fmt, ##args)

#if UNIT_TAP_COMPATIBLE

#define header()					\
	do {						\
		_space(stdout);				\
		printf("# *** %s ***\n", __func__);	\
	} while (0)

#define footer()					\
	do { 						\
		_space(stdout);				\
		printf("# *** %s: done ***\n", __func__); \
	} while (0)

#define plan(count) _plan(count, true)

#else /* !UNIT_TAP_COMPATIBLE */

#define header() printf("\t*** %s ***\n", __func__)
#define footer() printf("\t*** %s: done ***\n", __func__)
#define plan(count) _plan(count, false)

#endif /* !UNIT_TAP_COMPATIBLE */

int check_plan(void);

enum { MAX_LEVELS = 10 };
static int tests_done[MAX_LEVELS];
static int tests_failed[MAX_LEVELS];
static int plan_test[MAX_LEVELS];
static int level = -1;

void
_space(FILE *stream)
{
	for (int i = 0 ; i < level; i++) {
		fprintf(stream, "    ");
	}
}

void
_plan(int count, bool tap)
{
	++level;
	plan_test[level] = count;
	tests_done[level] = 0;
	tests_failed[level] = 0;

	if (tap && level == 0)
		printf("TAP version 13\n");

	_space(stdout);
	printf("%d..%d\n", 1, plan_test[level]);
}

int
check_plan(void)
{
	int r = 0;
	if (tests_done[level] != plan_test[level]) {
		_space(stderr);
		fprintf(stderr,
			"# Looks like you planned %d tests but ran %d.\n",
			plan_test[level], tests_done[level]);
		r = -1;
	}

	if (tests_failed[level]) {
		_space(stderr);
		fprintf(stderr,
			"# Looks like you failed %d test of %d run.\n",
			tests_failed[level], tests_done[level]);
		r = tests_failed[level];
	}
	--level;
	if (level >= 0) {
		is(r, 0, "subtests");
	}
	return r;
}

void
_ok(int condition, const char *expr, const char *file, int line,
    const char *fmt, ...)
{
	va_list ap;

	_space(stdout);
	printf("%s %d - ", condition ? "ok" : "not ok", ++tests_done[level]);
	va_start(ap, fmt);
	vprintf(fmt, ap);
	printf("\n");
	va_end(ap);
	if (!condition) {
		tests_failed[level]++;
		_space(stderr);
		fprintf(stderr, "#   Failed test `%s'\n", expr);
		_space(stderr);
		fprintf(stderr, "#   in %s at line %d\n", file, line);
	}
}

#include <type_traits>

#define MP_NUMBER_MAX_LEN 16
#define MP_NUMBER_CODEC_COUNT 16

#ifndef lengthof
#define lengthof(array) (sizeof(array) / sizeof((array)[0]))
#endif

#ifndef __has_attribute
#  define __has_attribute(x) 0
#endif

#if __has_attribute(noinline) || defined(__GNUC__)
#  define NOINLINE __attribute__((noinline))
#else
#  define NOINLINE
#endif

// MAKE THIS NOINLINE TO MAKE TESTS PASS
static bool
test_encode_uint_custom_size(char *buf, uint64_t val, int size)
{
	char *pos;
	switch (size) {
	case 1:
		if (val > 0x7f)
			return false;
		mp_store_u8(buf, val);
		return true;
	case 2:
		if (val > UINT8_MAX)
			return false;
		pos = mp_store_u8(buf, 0xcc);
		mp_store_u8(pos, (uint8_t)val);
		return true;
	case 3:
		if (val > UINT16_MAX)
			return false;
		pos = mp_store_u8(buf, 0xcd);
		mp_store_u16(pos, (uint16_t)val);
		return true;
	case 5:
		if (val > UINT32_MAX)
			return false;
		pos = mp_store_u8(buf, 0xce);
		mp_store_u32(pos, (uint32_t)val);
		return true;
	case 9:
		pos = mp_store_u8(buf, 0xcf);
		mp_store_u64(pos, val);
		return true;
	}
	abort();
	return false;
}

static int
test_encode_uint_all_sizes(char mp_nums[][MP_NUMBER_MAX_LEN], uint64_t val)
{
	int sizes[] = {1, 2, 3, 5, 9};
	int count = lengthof(sizes);
	int used = 0;
	for (int i = 0; i < count; ++i) {
		assert(used < MP_NUMBER_CODEC_COUNT);
		if (test_encode_uint_custom_size(mp_nums[used], val, sizes[i]))
			++used;
	}
	return used;
}

static void
test_compare_uint(uint64_t a, uint64_t b)
{
	char mp_nums_a[MP_NUMBER_CODEC_COUNT][MP_NUMBER_MAX_LEN];
	int count_a = test_encode_uint_all_sizes(mp_nums_a, a);

	char mp_nums_b[MP_NUMBER_CODEC_COUNT][MP_NUMBER_MAX_LEN];
	int count_b = test_encode_uint_all_sizes(mp_nums_b, b);

	for (int ai = 0; ai < count_a; ++ai) {
		for (int bi = 0; bi < count_b; ++bi) {
			int r = mp_compare_uint(mp_nums_a[ai], mp_nums_b[bi]);
			if (a < b) {
				ok(r < 0, "mp_compare_uint(%" PRIu64 ", "
				   "%" PRIu64 ") < 0", a, b);
			} else if (a > b) {
				ok(r > 0, "mp_compare_uint(%" PRIu64 ", "
				   "%" PRIu64 ") > 0", a, b);
			} else {
				ok(r == 0, "mp_compare_uint(%" PRIu64 ", "
				   "%" PRIu64 ") == 0", a, b);
			}
		}
	}
}

static int
test_compare_uints(void)
{
	plan(100);
	header();

	uint64_t nums[] = {
		0, 1
	};
	int count = sizeof(nums) / sizeof(*nums);
	for (int i = 0; i < count; i++) {
		for (int j = 0; j < count; j++) {
			test_compare_uint(nums[i], nums[j]);
		}
	}

	footer();
	return check_plan();
}

int main()
{
	plan(2);
	header();

	test_compare_uints();
	test_compare_uints();

	footer();
	return check_plan();
}

@igormunkin
Copy link

The reproducer is a single cpp file.

I've created a gist with a quite minimal reproducer: https://gist.github.com/igormunkin/476e29d7c737f108774b4616a079a2cd.

I tried to remove all "unit-tap"-related code, but the issue is gone, if ok is replace with a plain assert (everything works fine as a result of such substitute). Anyway, this is fine reproducer to file an issue against the compiler, but I'll re-check the code for possible UBs once more later.

All in all, I believe @Gerold103 et al. can proceed with this patch at the moment.

@Gumix
Copy link

Gumix commented Feb 6, 2025

I've created a gist with a quite minimal reproducer:

Simplified it a bit:

test.cc
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

static uint64_t
unused_func(uint8_t type, const char *data)
{
        switch (type) {
        case 8: {
                struct cast { uint8_t val; };
                return ((struct cast *)data)->val;
        }
        case 16: {
                struct cast { uint16_t val; };
                uint16_t val = ((struct cast *)data)->val;
                return __builtin_bswap16(val);
        }
        case 32: {
                struct cast { uint32_t val; };
                uint32_t val = ((struct cast *)data)->val;
                return __builtin_bswap32(val);
        }
        case 64: {
                struct cast { uint64_t val; };
                uint64_t val = ((struct cast *)data)->val;
                return __builtin_bswap64(val);
        }
        }
        abort();
}

//__attribute__((noinline))
static void
store_0_to_buf(void *buf)
{
        struct cast { uint8_t val; };
        ((struct cast *)buf)->val = 0;
}

int main()
{
        char buf_a[2][8];
        char buf_b[2][8];
        store_0_to_buf(buf_a[0]);
        store_0_to_buf(buf_a[1]);
        store_0_to_buf(buf_b[0]);
        store_0_to_buf(buf_b[1]);

        for (int ai = 0; ai < 2; ++ai) {
                for (int bi = 0; bi < 2; ++bi) {
                        const char *data_a = buf_a[ai];
                        const char *data_b = buf_b[bi];
                        struct cast { uint8_t val; };
                        uint8_t ca = ((struct cast *) data_a)->val;
                        uint8_t cb = ((struct cast *) data_b)->val;
                        if (ca != cb) {
                                uint64_t a = unused_func(ca, data_a + 1);
                                uint64_t b = unused_func(cb, data_b + 1);
                                printf("??? %d\n", a > b);
                        }
                        printf("ok ");
                }
        }
}

With clang++ -O2 -fno-inline-functions test.cc the output is "ok ok ok ok".
With clang++ -O2 test.cc it is optimized to an empty main function.

@igormunkin
Copy link

I've created a gist with a quite minimal reproducer:

Simplified it a bit:

test.cc

@Gumix, neat, thanks! Do you mind me using your reproducer for the bug report?

With clang++ -O2 -fno-inline-functions test.cc the output is "ok ok ok ok". With clang++ -O2 test.cc it is optimized to an empty main function.

So, Clang aggressively inlines zero-initialization for both arrays, considers them equal (since no modifications are done below) and purges the whole loop body (and, ergo, the whole function body) as a result of optimization. No magic, only speculations :)

Since the whole input is statically generated within test_encode_uint_all_sizes and is used nearby in test_compare_uint, inlining might use the input values for optimizations either.

Anyway, the fact it purges the functions with side effects (such as printf) does bother me.


Fun fact: your reproducer works fine with -O3 for all clang++ I have on my machine.


In addition, I've failed to find the first poisoned version.

Several "latest" clang++ available in Ubuntu 20.04 docker (i.e. clang++-14 and `clang++-15) are fine:

$ clang++-14 --version
Ubuntu clang version 14.0.0-1ubuntu1.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ clang++-14 -O2 -fno-inline-functions test.cc
$ ./a.out
ok ok ok ok 
$ clang++-14 -O2 test.cc
$ ./a.out
ok ok ok ok 
$ clang++-14 -O3 test.cc
$ ./a.out
ok ok ok ok 
$ clang++-15 --version
Ubuntu clang version 15.0.7
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ clang++-15 -O2 -fno-inline-functions test.cc
$ ./a.out
ok ok ok ok 
$ clang++-15 -O2 test.cc
$ ./a.out
ok ok ok ok 
$ clang++-15 -O3 test.cc
$ ./a.out
ok ok ok ok

But using these versions in Ubuntu 22.04 docker causing the crash:

$ clang++-14 --version
Ubuntu clang version 14.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ clang++-14 -O2 -fno-inline-functions test.cc
$ ./a.out
ok ok ok ok 
$ clang++-14 -O2 test.cc
$ ./a.out
./test.sh: line 9:  4760 Aborted                 (core dumped) ./a.out
$ clang++-14 -O3 test.cc
$ ./a.out
ok ok ok ok 
$ clang++-15 --version
Ubuntu clang version 15.0.7
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ clang++-15 -O2 -fno-inline-functions test.cc
$ ./a.out
ok ok ok ok 
$ clang++-15 -O2 test.cc
$ ./a.out
./test.sh: line 9:  4774 Aborted                 (core dumped) ./a.out
$ clang++-15 -O3 test.cc
$ ./a.out
ok ok ok ok

All in all, I hope this is enough for bug report. I'll make the one a bit later.

@Gumix
Copy link

Gumix commented Feb 6, 2025

@Gumix, neat, thanks! Do you mind me using your reproducer for the bug report?

Sure. I guess the unused_func() can be simplified further, but if I even change the numbers in the switch, the issue disappears.

@Gerold103
Copy link
Author

Thanks everyone, this was an interesting issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants