ASAN (clean!), UBSAN (clean!) jobs added -- nicely sanitized! MSAN also added but disabled, as our environment is not quite suitable. #35

Merged
merged 67 commits into from
Dec 16, 2023
dc5e829
Minor syntax thing. It was just bugging me. Not sure if will fix it…
ygoldfeld Dec 13, 2023
9217e6f
(further work is experimental) Turning off TSAN for convenience. Fir…
ygoldfeld Dec 13, 2023
1d55fc8
(cont.)
ygoldfeld Dec 13, 2023
d91e56e
Seeing some cryptic linker errors with ASAN. Perhaps disabling LTO w…
ygoldfeld Dec 13, 2023
0c54b12
typo
ygoldfeld Dec 13, 2023
61b2db9
Ah those were removed by mistake (in this branch); fixing.
ygoldfeld Dec 13, 2023
c277925
Pretty sweet: ASAN passes beautifully for clang-17. Now expanding to…
ygoldfeld Dec 13, 2023
47a0aca
OK; let us try UBSAN and MSAN because why not. Also renamed undefine…
ygoldfeld Dec 13, 2023
e360f45
(cont.)
ygoldfeld Dec 13, 2023
c15f8f7
Supposedly UBSAN is helped with a certain print-stacktrace env var; a…
ygoldfeld Dec 13, 2023
52e8e35
Merge branch 'main' into asan
ygoldfeld Dec 13, 2023
0282ad3
Getting around MSAN-related compile error using ignore-list (ignoring…
ygoldfeld Dec 14, 2023
0dcfe05
More timeout loosening. This time it is for the *SAN runs some of wh…
ygoldfeld Dec 14, 2023
6be712f
Build fix.
ygoldfeld Dec 14, 2023
f0ca197
Build fix.
ygoldfeld Dec 14, 2023
3a58694
Do not need UBSAN suppressions yet....
ygoldfeld Dec 14, 2023
98a187d
Fix for long-standing problem/implied TODO: pack up the logs even if …
ygoldfeld Dec 14, 2023
3bd11d3
Build fix.
ygoldfeld Dec 14, 2023
1dc3019
Bug fix.
ygoldfeld Dec 14, 2023
5b512c6
Build fix?
ygoldfeld Dec 14, 2023
836fda9
Build fix?
ygoldfeld Dec 14, 2023
1e8219d
Build fix.
ygoldfeld Dec 14, 2023
955ac91
Build fix.
ygoldfeld Dec 14, 2023
305a377
Build fix.
ygoldfeld Dec 14, 2023
3228968
Build fix.
ygoldfeld Dec 14, 2023
457dc75
(debug)
ygoldfeld Dec 14, 2023
daacdb7
(debug)
ygoldfeld Dec 14, 2023
20ea5c5
(debug)
ygoldfeld Dec 14, 2023
b895750
(debug)
ygoldfeld Dec 14, 2023
6716679
(debug)
ygoldfeld Dec 14, 2023
ce018d3
(debug)
ygoldfeld Dec 14, 2023
378298d
Saving all logs (not just transport_test all modes); meaning now also…
ygoldfeld Dec 14, 2023
b07b922
MSAN: so the problem, I finally figured out, during the build-targets…
ygoldfeld Dec 14, 2023
8673be9
MSAN regex format tweak.
ygoldfeld Dec 14, 2023
a05ff81
Bug fix.
ygoldfeld Dec 14, 2023
c19f8bf
MSAN regex format tweak.
ygoldfeld Dec 14, 2023
f3a832b
Updating submodules.
ygoldfeld Dec 14, 2023
5302f6e
Suppressing next capnp compiler binary MSAN failure.
ygoldfeld Dec 14, 2023
d6efe69
Updating submodules.
ygoldfeld Dec 15, 2023
852bec1
Suppressing next MSAN failure (this one is from global Boost.chrono i…
ygoldfeld Dec 15, 2023
da55ee8
Suppressing first UBSAN thing: a somewhat sloppy but harmless jemallo…
ygoldfeld Dec 15, 2023
da8a6af
Forward progress for both UBSAN and MSAN issues. UBSAN has bifurcate…
ygoldfeld Dec 15, 2023
7ed1e1e
Splitting up MSAN ignore-list into a separate suppression file and on…
ygoldfeld Dec 15, 2023
700b472
Okay, trying for the real full matrix. MSAN will probably still give…
ygoldfeld Dec 15, 2023
7622254
Fixes....
ygoldfeld Dec 15, 2023
f052af6
The gtest options thing puts in a newline which messes up the redirec…
ygoldfeld Dec 15, 2023
5213c35
The gtest options thing puts in a newline which messes up the redirec…
ygoldfeld Dec 15, 2023
7500480
The gtest options thing puts in a newline which messes up the redirec…
ygoldfeld Dec 15, 2023
ce2e73a
(debug)
ygoldfeld Dec 15, 2023
3b05f85
(debug)
ygoldfeld Dec 15, 2023
b9a3f34
(debug)
ygoldfeld Dec 15, 2023
f07f497
(debug)
ygoldfeld Dec 15, 2023
7c42e3d
(debug)
ygoldfeld Dec 15, 2023
c247f47
(debug)
ygoldfeld Dec 15, 2023
23661c0
(debug)
ygoldfeld Dec 15, 2023
edccbd2
(debug)
ygoldfeld Dec 15, 2023
814871a
Finally got a handle on the crazy suppressions-parse errors; trying s…
ygoldfeld Dec 15, 2023
552433f
Build fix.
ygoldfeld Dec 15, 2023
d4093ca
Merge branch 'main' into aubmsan
ygoldfeld Dec 15, 2023
eb4e910
Per preceding TODO -- parameterizing Conan profile to optionally fore…
ygoldfeld Dec 15, 2023
aed2f0a
Build fix.
ygoldfeld Dec 15, 2023
fe8f19e
Bug fix.
ygoldfeld Dec 15, 2023
59278a1
Bug fix.
ygoldfeld Dec 16, 2023
44c437a
Bug fix.
ygoldfeld Dec 16, 2023
4876544
Bug fix.
ygoldfeld Dec 16, 2023
be6bdd1
Bug fix.
ygoldfeld Dec 16, 2023
dba721f
Bug fix.
ygoldfeld Dec 16, 2023
573 changes: 412 additions & 161 deletions .github/workflows/main.yml

Large diffs are not rendered by default.

26 changes: 16 additions & 10 deletions conanfile.py
@@ -7,35 +7,39 @@ class IpcRecipe(ConanFile):
settings = "os", "compiler", "build_type", "arch"

options = {
"build": [True, False],
"build": [True, False],
"build_no_lto": [True, False],
"doc": [True, False],
}

default_options = {
"build": True,
"build": True,
"build_no_lto": False,
"doc": False,
}

def configure(self):
if self.options.build:
self.options["jemalloc"].enable_cxx = False
self.options["jemalloc"].enable_cxx = False
self.options["jemalloc"].prefix = "je_"

def generate(self):
deps = CMakeDeps(self)
if self.options.doc:
deps.build_context_activated = ["doxygen/1.9.4"]
deps.generate()

toolchain = CMakeToolchain(self)
if self.options.build:
toolchain.variables["CFG_ENABLE_TEST_SUITE"] = "ON"
toolchain.variables["JEMALLOC_PREFIX"] = self.options["jemalloc"].prefix
if self.options.build_no_lto:
toolchain.variables["CFG_NO_LTO"] = "ON"
if self.options.doc:
toolchain.variables["CFG_ENABLE_DOC_GEN"] = "ON"
toolchain.variables["CFG_SKIP_CODE_GEN"] = "ON"
toolchain.generate()

def build(self):
cmake = CMake(self)
cmake.configure()
@@ -44,15 +48,17 @@ def build(self):
if self.options.build:
self.run("cmake --build . -- --keep-going VERBOSE=1")
if self.options.doc:
# Note: `flow_doc_public flow_doc_full` could also be added here and work; however
# we leave that to `flow` and its own Conan setup.
self.run("cmake --build . -- ipc_doc_public ipc_doc_full --keep-going VERBOSE=1")

def requirements(self):
if self.options.build:
self.requires("capnproto/1.0.1")
self.requires("flow/1.0")
self.requires("gtest/1.14.0")
self.requires("jemalloc/5.2.1")

def build_requirements(self):
self.tool_requires("cmake/3.26.3")
if self.options.doc:
@@ -61,6 +67,6 @@ def build_requirements(self):
def package(self):
cmake = CMake(self)
cmake.install()

def layout(self):
cmake_layout(self)
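To make the effect of the new `build_no_lto` option concrete, here is a minimal sketch in plain Python (no Conan import) of how the recipe's options appear to map onto CMake variables in `generate()`. The function name `toolchain_variables` is hypothetical, and the nesting of the checks is an assumption, since the diff view above has lost its indentation.

```python
def toolchain_variables(build=True, build_no_lto=False, doc=False, jemalloc_prefix="je_"):
    """Return the CMake variables the recipe would set for the given options.

    Hypothetical helper mirroring generate() in the diff; it is not part of the recipe.
    """
    variables = {}
    if build:
        variables["CFG_ENABLE_TEST_SUITE"] = "ON"
        variables["JEMALLOC_PREFIX"] = jemalloc_prefix
        # Assumption: CFG_NO_LTO is only considered when `build` is enabled.
        if build_no_lto:
            variables["CFG_NO_LTO"] = "ON"
    if doc:
        # Assumption: both doc-related variables sit under the `doc` check.
        variables["CFG_ENABLE_DOC_GEN"] = "ON"
        variables["CFG_SKIP_CODE_GEN"] = "ON"
    return variables


if __name__ == "__main__":
    # Default options: LTO stays on, so CFG_NO_LTO is absent.
    print(toolchain_variables())
    # A sanitizer-style build that opts out of LTO.
    print(toolchain_variables(build_no_lto=True))
```

With default options the sketch omits `CFG_NO_LTO`, which matches the `"build_no_lto": False` default in the diff.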
5 changes: 5 additions & 0 deletions msan_ignore_list_clang.cfg
@@ -0,0 +1,5 @@
# boost.chrono duration global initializer => std::string uninit value (TODO: too general but...).
# There is a number of these, including things like operator+() which are tough to specify in
# this format; for now going in for a pound, as they say: the entire file.
[memory]
src:*/bits/basic_string.h
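As a rough illustration of what the `src:*/bits/basic_string.h` entry covers, the sketch below parses an ignore list of this shape and checks paths against it. It approximates the wildcard with Python's `fnmatch`, which is an assumption: Clang's own matcher may differ in details. The helper names and the example paths are hypothetical.

```python
from fnmatch import fnmatchcase

IGNORE_LIST = """\
# boost.chrono duration global initializer => std::string uninit value
[memory]
src:*/bits/basic_string.h
"""


def parse_ignore_list(text):
    """Parse 'prefix:pattern' entries grouped by [section]; '#' starts a comment line."""
    sections = {}
    current = "*"  # Entries before any [section] header are filed under "*".
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            continue
        prefix, _, pattern = line.partition(":")
        sections.setdefault(current, []).append((prefix, pattern))
    return sections


def source_is_ignored(sections, section, path):
    """True if `path` matches any `src:` pattern listed under `section`."""
    return any(
        prefix == "src" and fnmatchcase(path, pattern)
        for prefix, pattern in sections.get(section, [])
    )


if __name__ == "__main__":
    sections = parse_ignore_list(IGNORE_LIST)
    # Illustrative paths only; real header locations depend on the toolchain.
    print(source_is_ignored(sections, "memory", "/usr/include/c++/13/bits/basic_string.h"))
    print(source_is_ignored(sections, "memory", "/usr/include/c++/13/bits/stl_vector.h"))
```

The sketch shows why the comment in the file calls the entry "too general": every function defined in any file matching that pattern is exempted, not only the Boost.chrono initializer path.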
22 changes: 12 additions & 10 deletions test/suite/transport_test/cli-script.txt
@@ -15,6 +15,8 @@
# See the License for the specific language governing
# permissions and limitations under the License.

# Please see "Timeout notes" in srv-script.txt. Then come back here.

# <cmd> <sh-name> <timeout> <timeout-units>
SOCK_STREAM_CONNECT _socks_1 50 ms
# -> stm 0
@@ -115,35 +117,35 @@ CHAN_BUNDLE_POSIX_CREATE _mqd_4 0 2
# Other side of 3-way handshake.
# <cmd> <stm-slot> <blob-sz> <expected-ipc-err-or-success> <expected-blob-sz>
# <timeout> <timeout-units>
CHAN_BUNDLE_POSIX_RECV_BLOB 0 8192 0 1 3 ms
CHAN_BUNDLE_POSIX_RECV_BLOB 0 8192 0 1 15 ms
# <cmd> <stm-slot> <blob-sz> <expected-ipc-err-or-success>
CHAN_BUNDLE_POSIX_SEND_BLOB 0 1 0
CHAN_BUNDLE_POSIX_RECV_BLOB 0 8192 0 1 3 ms
CHAN_BUNDLE_POSIX_RECV_BLOB 0 8192 0 1 15 ms
# <cmd> <stm-slot> <blob-sz-or-0> <expected-ipc-err-or-success> <expected-sock?> <expected-blob-sz>
# <timeout> <timeout-units>
CHAN_BUNDLE_POSIX_RECV 0 10101 0 1 10101 3 ms
CHAN_BUNDLE_POSIX_RECV 0 10101 0 1 10101 15 ms
# -> sock 2
# Similar to other side.
# <cmd> <stm-slot> <sock-slot-or-none> <blob-sz-or-0> <expected-ipc-err-or-success>
CHAN_BUNDLE_POSIX_SEND 0 1 20202 0
CHAN_BUNDLE_POSIX_SEND 0 -1 30303 0
CHAN_BUNDLE_POSIX_RECV_BLOB 0 8192 RECEIVES_FINISHED_CANNOT_RECEIVE 0 3 ms
CHAN_BUNDLE_POSIX_RECV 0 8192 RECEIVES_FINISHED_CANNOT_RECEIVE 0 0 3 ms
CHAN_BUNDLE_POSIX_RECV_BLOB 0 8192 RECEIVES_FINISHED_CANNOT_RECEIVE 0 15 ms
CHAN_BUNDLE_POSIX_RECV 0 8192 RECEIVES_FINISHED_CANNOT_RECEIVE 0 0 15 ms

# Now bipc instead of POSIX.

SOCK_STREAM_CONNECT _socks_1 1 s
# -> stm 3
SLEEP 100 ms
CHAN_BUNDLE_BIPC_CREATE _mqd_4 0 3
CHAN_BUNDLE_BIPC_RECV_BLOB 0 8192 0 1 3 ms
CHAN_BUNDLE_BIPC_RECV_BLOB 0 8192 0 1 15 ms
CHAN_BUNDLE_BIPC_SEND_BLOB 0 1 0
CHAN_BUNDLE_BIPC_RECV_BLOB 0 8192 0 1 3 ms
CHAN_BUNDLE_BIPC_RECV 0 10101 0 1 10101 3 ms
CHAN_BUNDLE_BIPC_RECV_BLOB 0 8192 0 1 15 ms
CHAN_BUNDLE_BIPC_RECV 0 10101 0 1 10101 15 ms
CHAN_BUNDLE_BIPC_SEND 0 1 20202 0
CHAN_BUNDLE_BIPC_SEND 0 -1 30303 0
CHAN_BUNDLE_BIPC_RECV_BLOB 0 8192 RECEIVES_FINISHED_CANNOT_RECEIVE 0 3 ms
CHAN_BUNDLE_BIPC_RECV 0 8192 RECEIVES_FINISHED_CANNOT_RECEIVE 0 0 3 ms
CHAN_BUNDLE_BIPC_RECV_BLOB 0 8192 RECEIVES_FINISHED_CANNOT_RECEIVE 0 15 ms
CHAN_BUNDLE_BIPC_RECV 0 8192 RECEIVES_FINISHED_CANNOT_RECEIVE 0 0 15 ms

# TODO: Moar....

59 changes: 34 additions & 25 deletions test/suite/transport_test/srv-script.txt
@@ -15,6 +15,22 @@
# See the License for the specific language governing
# permissions and limitations under the License.

# Timeout notes:
#
# - There are very tight timeouts below -- in "microseconds" -- and the idea is to ensure
# it doesn't block at all in situations, e.g., where we know there are N messages queued-up
# locally at that point. However the actual time depends on computer speed, and on slower machines
# like default GitHub runners it can take longer than one experiences in a powerful lab machine.
# We are not benchmarking here or perf-testing, so to avoid false failures we use a less tight
# timeout that's still in the microseconds. So 50usec is realistic usually, but we use 250usec.
# - Moreover... particularly in open-source automated CI/CD pipeline... we sometimes run these
# tests (1) on slow machines *and* (2) while compiled with run-time sanitizers (ASAN, MSAN, etc.),
# some of which slow-down the code by up to, like, 5x.
# So bottom line is some of the below (e.g., 15ms) will look pretty generous even for slower machines;
# but the *SANs are the reason it is like that. TODO: Consider varying this depending on build type (via CMake
# build). That said, again, we are not benchmarking or perf-testing here; timeouts here are to ensure stuff
# does not hang for seconds mainly.

# <cmd> <sh-name>
SOCK_STREAM_ACC_LISTEN _socks_1
# -> acc 0
@@ -107,13 +123,6 @@ BLOB_STREAM_MQ_BIPC_SND_CREATE _mqd_0 0 0 0 0 0

# Let's actually do some IPC (between processes).

# Note: There are very tight timeouts below -- in "microseconds" -- and the idea is to ensure
# it doesn't block at all in situations, e.g., where we know there are N messages queued-up
# locally at that point. However the actual time depends on computer speed, and on slower machines
# like default GitHub runners it can take longer than one experiences in a powerful lab machine.
# We are not benchmarking here or perf-testing, so to avoid false failures we use a less tight
# timeout that's still in the microseconds. So 50usec is realistic usually, but we use 250usec.

BLOB_STREAM_MQ_POSIX_RCV_CREATE _mqd_1 1 3 10 0 0
# -> Posix: rcv 2
# <cmd> <stm-slot> <blob-sz> <expected-ipc-err-or-success> <expected-blob-sz>
@@ -149,20 +158,20 @@ BLOB_STREAM_MQ_BIPC_RCV_CREATE _mqd_2 0 0 0 0 0
BLOB_STREAM_MQ_BIPC_SND_CREATE _mqd_3 0 0 0 0 0
# -> Bipc:snd 2
# Do other side of 3-way handshake (see other side which does the opposite).
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 1 3 ms
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 1 15 ms
BLOB_STREAM_MQ_BIPC_SEND_BLOB 2 1 0
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 1 3 ms
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 1 15 ms
# Now just receive some stuff -- first 2 sans send-would-block -- last 5 after partial would-block.
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 8 3 ms
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 8 3 ms
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 8 15 ms
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 8 15 ms
SLEEP 2 s
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 8 3 ms
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 8 3 ms
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 8 3 ms
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 8 3 ms
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 8 3 ms
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 8 15 ms
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 8 15 ms
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 8 15 ms
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 8 15 ms
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 8 15 ms
# Then graceful-close.
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 8192 RECEIVES_FINISHED_CANNOT_RECEIVE 0 3 ms
BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 8192 RECEIVES_FINISHED_CANNOT_RECEIVE 0 15 ms

# Mess with channels now.

@@ -190,12 +199,12 @@ CHAN_BUNDLE_POSIX_SEND_BLOB 0 1 0
CHAN_BUNDLE_POSIX_SEND 0 2 10101 0
# <cmd> <stm-slot> <blob-sz-or-0> <expected-ipc-err-or-success> <expected-sock?> <expected-blob-sz>
# <timeout> <timeout-units>
CHAN_BUNDLE_POSIX_RECV 0 20202 0 1 20202 10 ms
CHAN_BUNDLE_POSIX_RECV 0 20202 0 1 20202 30 ms
# -> sock 3
CHAN_BUNDLE_POSIX_RECV 0 30303 0 0 30303 10 ms
CHAN_BUNDLE_POSIX_RECV 0 30303 0 0 30303 30 ms
# <cmd> <stm-slot> <dupe-call?> <expected-ipc-err-or-success> <timeout> <timeout-units>
CHAN_BUNDLE_POSIX_SEND_END 0 0 0 3 ms
CHAN_BUNDLE_POSIX_SEND_END 0 1 0 3 ms
CHAN_BUNDLE_POSIX_SEND_END 0 0 0 15 ms
CHAN_BUNDLE_POSIX_SEND_END 0 1 0 15 ms

# Now bipc instead of POSIX.

@@ -206,10 +215,10 @@ CHAN_BUNDLE_BIPC_SEND_BLOB 0 1 0
CHAN_BUNDLE_BIPC_RECV_BLOB 0 8192 0 1 500 ms
CHAN_BUNDLE_BIPC_SEND_BLOB 0 1 0
CHAN_BUNDLE_BIPC_SEND 0 2 10101 0
CHAN_BUNDLE_BIPC_RECV 0 20202 0 1 20202 10 ms
CHAN_BUNDLE_BIPC_RECV 0 30303 0 0 30303 10 ms
CHAN_BUNDLE_BIPC_SEND_END 0 0 0 3 ms
CHAN_BUNDLE_BIPC_SEND_END 0 1 0 3 ms
CHAN_BUNDLE_BIPC_RECV 0 20202 0 1 20202 30 ms
CHAN_BUNDLE_BIPC_RECV 0 30303 0 0 30303 30 ms
CHAN_BUNDLE_BIPC_SEND_END 0 0 0 15 ms
CHAN_BUNDLE_BIPC_SEND_END 0 1 0 15 ms

# TODO: Moar....
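
The "Timeout notes" in this file leave a TODO about varying timeouts by build type. One way that could look, sketched here as a standalone Python helper rather than the CMake-driven approach the TODO suggests, is to rewrite the trailing `<timeout> ms` of a script line by a slowdown factor. The function name and the choice to skip `SLEEP` lines are assumptions for illustration.

```python
def scale_trailing_timeout(line, factor):
    """Multiply a trailing '<timeout> ms' by `factor`; return other lines unchanged."""
    tokens = line.split()
    if not tokens or tokens[0].startswith("#") or tokens[0] == "SLEEP":
        return line  # Comments, blank lines, and SLEEP durations are not timeouts.
    if len(tokens) >= 3 and tokens[-1] == "ms" and tokens[-2].isdigit():
        tokens[-2] = str(int(tokens[-2]) * factor)
        return " ".join(tokens)
    return line  # No trailing millisecond timeout on this command.


if __name__ == "__main__":
    for script_line in [
        "BLOB_STREAM_MQ_BIPC_RECV_BLOB 2 10 0 1 3 ms",
        "CHAN_BUNDLE_BIPC_SEND 0 2 10101 0",
        "SLEEP 2 s",
    ]:
        print(scale_trailing_timeout(script_line, 5))
```

A factor of 5 reproduces the 3 ms to 15 ms changes in this diff and echoes the "up to, like, 5x" slowdown mentioned in the notes. The 10 ms to 30 ms changes are a factor of 3, so the committed values were evidently tuned by hand rather than scaled uniformly.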

3 changes: 3 additions & 0 deletions ubsan_suppressions_clang_13.cfg
@@ -0,0 +1,3 @@
# (See explanation in higher-version suppressions file.)
shift-base:je_mallocx
shift-exponent:je_mallocx
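For context on how a per-compiler-version suppressions file like this one is typically consumed: UBSAN reads a `suppressions=<path>` setting from the `UBSAN_OPTIONS` environment variable, and `print_stacktrace=1` is the stack-trace option one of the commits in this PR alludes to. The sketch below builds such a value. How the workflow actually selects the file is not visible here (the `main.yml` diff is not rendered), so the helper name, the `repo_root` parameter, and the selection by Clang major version are assumptions.

```python
import os


def ubsan_options(clang_major, repo_root="."):
    """Build a UBSAN_OPTIONS value pointing at the suppressions file for one Clang version.

    Hypothetical helper: the file-name pattern follows the files added in this PR,
    but the selection logic itself is an assumption.
    """
    suppressions = f"{repo_root}/ubsan_suppressions_clang_{clang_major}.cfg"
    return f"suppressions={suppressions}:print_stacktrace=1"


if __name__ == "__main__":
    # Environment for launching a UBSAN-instrumented test binary (not launched here).
    env = dict(os.environ, UBSAN_OPTIONS=ubsan_options(17))
    print(env["UBSAN_OPTIONS"])
```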
1 change: 1 addition & 0 deletions ubsan_suppressions_clang_15.cfg
1 change: 1 addition & 0 deletions ubsan_suppressions_clang_16.cfg
15 changes: 15 additions & 0 deletions ubsan_suppressions_clang_17.cfg
@@ -0,0 +1,15 @@
# src/jemalloc.c:3133:16: runtime error: left shift of 4095 by 20 places cannot be represented in type 'int'
# Looks harmless... a macro is doing essentially `((1 << 12) - 1) << 20`, which is a negative int -- used as an & mask.
# jemalloc should be more civilized IMO, but it is fine.
# Lastly: docs say `shift` is a usable suppression type, but it is not; it is a grouping; one must use these
# in the file. Possibly just one of the two is enough, but let us not quibble.
shift-base:je_mallocx
shift-exponent:je_mallocx
# tcache.c:144:2: runtime error: variable length array bound evaluates to non-positive value 0
# Gets invoked from some kind of cleanup hook. Also looks harmless in context, as the actual bound
# being 0 controls various code touching the "array." The var-length array is a gcc extension;
# probably clang too then.
# jemalloc should really not do this sort of thing though.
vla-bound:je_tcache_bin_flush_small
# (Very similar situation; skipping details.)
vla-bound:je_tcache_bin_flush_large
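
The arithmetic behind the first suppression can be checked directly. The sketch below evaluates `((1 << 12) - 1) << 20` with Python's unbounded integers and then reinterprets the low 32 bits as a signed value. It assumes a 32-bit two's-complement `int`, which is an assumption about the target platform rather than something the diff states; `as_int32` is a hypothetical helper.

```python
INT_MAX = 2**31 - 1  # Largest value of a 32-bit signed int (assumed width).

# Python integers do not overflow, so this is the mathematically exact result.
mask = ((1 << 12) - 1) << 20
print(mask, hex(mask))   # 4293918720 0xfff00000
print(mask > INT_MAX)    # True: the value does not fit in a 32-bit signed int


def as_int32(value):
    """Reinterpret the low 32 bits of `value` as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


print(as_int32(mask))           # -1048576, the "negative int" mentioned above
print(hex(0x12345678 & mask))   # 0x12300000: used as a mask, it keeps the top 12 bits
```

The bit pattern is the same either way, which is consistent with the comment's reading that the shift is harmless when the result is only used as an `&` mask, even though UBSAN reports the overflowing shift itself.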