Data race during shutdown #26

michihenning · 2023-04-24T02:08:00Z

I have a client implemented with NTF that talks to a pre-existing server. Everything works fine. I can resolve, connect, and read and write messages. The code runs clean under address sanitizer. However, thread sanitizer complains about data races whenever I shut down the interface. The interface is created like so:

    ntca::InterfaceConfig icf;
    string name = "prx_cli_" + to_string(id_);
    icf.setThreadName(name);
    icf.setMinThreads(1);
    icf.setMaxThreads(1);

    ntca::ResolverConfig resolver_config;
    resolver_config.setClientTimeout(10); // We wait for up to 10 seconds for the DNS to respond.
    icf.setResolverConfig(resolver_config);
    icf.setResolverEnabled(true);

    {
        lock_guard<decltype(mutex_)> lock(mutex_);
        interface_ = ntcf::System::createInterface(icf);
        interface_->start();
        strand_ = interface_->createStrand();  // Single strand to serialize callbacks.
    }

    auto promise = make_shared<std::promise<void>>();
    auto fut = promise->get_future();
    interface_->execute([this, promise]() {
        thread_id_ = this_thread::get_id();
        promise->set_value();
    });

    fut.get();  // Wait until the interface thread has come up.

    interface_->execute([this]() { resolve_and_connect(true); });

There is a single thread in the interface. I have assertions all through the code to verify that callbacks are invoked on the single interface thread. All callbacks are created using the appropriate createCallback functions, and they specify the single strand for the interface.

To shut down the interface, I have this code:

    bsl::shared_ptr<ntci::Interface> interface;
    {
        lock_guard<decltype(mutex_)> lock(mutex_);
        interface = interface_;
    }

    interface->resolver()->shutdown();
    interface->resolver()->linger(
    interface->closeAll();
    interface->shutdown();
    interface->linger();

At the time I call stop(), there is no activity, other than a single receive() call pending, whose completion handler has not popped yet because the server has not written anything. There are no pending send() calls or other async operations.

In case it matters, my reads specify a timeout:

    ntca::ReceiveOptions r_opts;
    r_opts.setMinSize(GW_HDR_SIZE);
    r_opts.setMaxSize(GW_HDR_SIZE);
    const int64_t secs =  2 * heartbeat_interval_sec;
    bsls::TimeInterval deadline(socket_->currentTime().addSeconds(secs));
    r_opts.setDeadline(deadline);

    auto receive_done = [this](auto receiver, auto blob, auto event) {
        BSLS_ASSERT(receiver == socket_);
        assert(this_thread::get_id() == thread_id_.load());
        if (event.isError()) {
            if (state_ != ShuttingDown) {
                // Print error, close the connection, and reconnect...
                try_reconnect();
            }
            return;
        }
        read_gw_hdr_handler(std::move(blob));
    };
    auto callback = socket_->createReceiveCallback(receive_done, strand_);
    auto error = socket_->receive(r_opts, callback);
    if (error) {

        CS_LOG(FATAL) << "Cannot read: " << error.text();
        abort();
    }

Once I call my stop code, thread sanitizer complains. I have pasted some of the salient output here. T11 is the interface thread.

==================
WARNING: ThreadSanitizer: data race (pid=1625761)
  Read of size 8 at 0x7b2400004f38 by main thread:
    #0 bsl::shared_ptr<BloombergLP::ntci::Authorization>::~shared_ptr() /opt/bb/include/bslstl_sharedptr.h:4702 (ProxyClientStream_test+0x1d6f5b)
    #1 BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)>::~Invoker() /opt/bb/include/ntci_invoker.h:324 (ProxyClientStream_test+0x1d44e5)
    #2 void BloombergLP::bslma::SharedPtrInplaceRep_ImpUtil::dispose<BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> >(BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&) /opt/bb/include/bslma_sharedptrinplacerep.h:332 (ProxyClientStream_test+0x1dfadb)
    #3 BloombergLP::bslma::SharedPtrInplaceRep<BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> >::disposeObject() /opt/bb/include/bslma_sharedptrinplacerep.h:371 (ProxyClientStream_test+0x1dee5b)
    #4 BloombergLP::bslma::SharedPtrRep::releaseRef() <null> (ProxyClientStream_test+0x4a6b85)
    #5 ProxyClientStreamTest_connect_after_send_Test::TestBody() /home/mhenning/code/cpp/ts-gateway/test/ProxyClientStream/ProxyClientStream_test.cc:205 (ProxyClientStream_test+0x8f165)
    #6 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x15c619)
    #7 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x151809)
    #8 testing::Test::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2508 (ProxyClientStream_test+0x12331b)
    #9 testing::TestInfo::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2684 (ProxyClientStream_test+0x123fd6)
    #10 testing::TestSuite::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2816 (ProxyClientStream_test+0x124a88)
    #11 testing::internal::UnitTestImpl::RunAllTests() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:5338 (ProxyClientStream_test+0x132d28)
    #12 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2433 (ProxyClientStream_test+0x15e571)
    #13 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) <null> (ProxyClientStream_test+0x15344b)
    #14 testing::UnitTest::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:4925 (ProxyClientStream_test+0x130e8f)
    #15 RUN_ALL_TESTS() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/include/gtest/gtest.h:2473 (ProxyClientStream_test+0x179df2)
    #16 main /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googlemock/src/gmock_main.cc:63 (ProxyClientStream_test+0x179ced)

  Previous write of size 8 at 0x7b2400004f38 by thread T11 (mutexes: write M1597):
    #0 operator new(unsigned long) ../../../../src/libsanitizer/tsan/tsan_new_delete.cpp:64 (libtsan.so.0+0x8c032)
    #1 operator new(unsigned long, BloombergLP::bslma::Allocator&) /opt/bb/include/bslma_allocator.h:632 (ProxyClientStream_test+0x1be4cc)
    #2 void bsl::shared_ptr<BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> >::createInplace<bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*&>(BloombergLP::bslma::Allocator*, bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*&) /opt/bb/include/bslstl_sharedptr.h:4944 (ProxyClientStream_test+0x1cbca8)
    #3 BloombergLP::ntci::Callback<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)>::Callback(bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, bsl::shared_ptr<BloombergLP::ntci::Strand> const&, BloombergLP::bslma::Allocator*) /home/mhenning/code/ntf-core/groups/ntc/ntci/ntci_callback.h:940 (ProxyClientStream_test+0x3141a0)
    #4 BloombergLP::ntci::TimerCallbackFactory::createTimerCallback(bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*) /home/mhenning/code/ntf-core/groups/ntc/ntci/ntci_timercallbackfactory.h:97 (ProxyClientStream_test+0x3141a0)
    #5 BloombergLP::ntcr::StreamSocket::receive(BloombergLP::ntca::ReceiveOptions const&, BloombergLP::ntci::Callback<void (bsl::shared_ptr<BloombergLP::ntci::Receiver> const&, bsl::shared_ptr<BloombergLP::bdlbb::Blob> const&, BloombergLP::ntca::ReceiveEvent const&)> const&) /home/mhenning/code/ntf-core/groups/ntc/ntcr/ntcr_streamsocket.cpp:4923 (ProxyClientStream_test+0x3141a0)
    #6 operator() /home/mhenning/code/cpp/ts-gateway/src/ProxyClientStream.cc:1133 (ProxyClientStream_test+0x1ae284)
    #7 invoke /opt/bb/include/bslstl_function_invokerutil.h:927 (ProxyClientStream_test+0x1b78ef)
    #8 BloombergLP::bslstl::Function_Variadic<void ()>::operator()() const /opt/bb/include/bslstl_function.h:1518 (ProxyClientStream_test+0x3e6fd9)
    #9 BloombergLP::ntcs::Chronology::announce() /home/mhenning/code/ntf-core/groups/ntc/ntcs/ntcs_chronology.cpp:1157 (ProxyClientStream_test+0x3e6fd9)

  As if synchronized via sleep:
    #0 sleep ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:348 (libtsan.so.0+0x6314b)
    #1 ProxyClientStreamTest_connect_after_send_Test::TestBody() /home/mhenning/code/cpp/ts-gateway/test/ProxyClientStream/ProxyClientStream_test.cc:203 (ProxyClientStream_test+0x8f14e)
    #2 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x15c619)
    #3 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x151809)
    #4 testing::Test::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2508 (ProxyClientStream_test+0x12331b)
    #5 testing::TestInfo::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2684 (ProxyClientStream_test+0x123fd6)
    #6 testing::TestSuite::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2816 (ProxyClientStream_test+0x124a88)
    #7 testing::internal::UnitTestImpl::RunAllTests() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:5338 (ProxyClientStream_test+0x132d28)
    #8 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2433 (ProxyClientStream_test+0x15e571)
    #9 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) <null> (ProxyClientStream_test+0x15344b)
    #10 testing::UnitTest::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:4925 (ProxyClientStream_test+0x130e8f)
    #11 RUN_ALL_TESTS() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/include/gtest/gtest.h:2473 (ProxyClientStream_test+0x179df2)
    #12 main /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googlemock/src/gmock_main.cc:63 (ProxyClientStream_test+0x179ced)
==================
WARNING: ThreadSanitizer: data race (pid=1625761)
  Read of size 8 at 0x7b2400004ed0 by main thread:
    #0 BloombergLP::bslma::SharedPtrInplaceRep<BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> >::disposeRep() /opt/bb/include/bslma_sharedptrinplacerep.h:378 (ProxyClientStream_test+0x1dee90)
    #1 BloombergLP::ntcs::Chronology::Timer::close() /opt/bb/include/bslstl_sharedptr.h:4703 (ProxyClientStream_test+0x3e3e85)
    #2 ProxyClientStreamTest_connect_after_send_Test::TestBody() /home/mhenning/code/cpp/ts-gateway/test/ProxyClientStream/ProxyClientStream_test.cc:205 (ProxyClientStream_test+0x8f165)
    #3 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x15c619)
    #4 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x151809)
    #5 testing::Test::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2508 (ProxyClientStream_test+0x12331b)
    #6 testing::TestInfo::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2684 (ProxyClientStream_test+0x123fd6)
    #7 testing::TestSuite::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2816 (ProxyClientStream_test+0x124a88)
    #8 testing::internal::UnitTestImpl::RunAllTests() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:5338 (ProxyClientStream_test+0x132d28)
    #9 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2433 (ProxyClientStream_test+0x15e571)
    #10 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) <null> (ProxyClientStream_test+0x15344b)
    #11 testing::UnitTest::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:4925 (ProxyClientStream_test+0x130e8f)
    #12 RUN_ALL_TESTS() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/include/gtest/gtest.h:2473 (ProxyClientStream_test+0x179df2)
    #13 main /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googlemock/src/gmock_main.cc:63 (ProxyClientStream_test+0x179ced)

  Previous write of size 8 at 0x7b2400004ed0 by thread T11 (mutexes: write M1597):
    #0 operator new(unsigned long) ../../../../src/libsanitizer/tsan/tsan_new_delete.cpp:64 (libtsan.so.0+0x8c032)
    #1 operator new(unsigned long, BloombergLP::bslma::Allocator&) /opt/bb/include/bslma_allocator.h:632 (ProxyClientStream_test+0x1be4cc)
    #2 void bsl::shared_ptr<BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> >::createInplace<bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*&>(BloombergLP::bslma::Allocator*, bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*&) /opt/bb/include/bslstl_sharedptr.h:4944 (ProxyClientStream_test+0x1cbca8)
    #3 BloombergLP::ntci::Callback<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)>::Callback(bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, bsl::shared_ptr<BloombergLP::ntci::Strand> const&, BloombergLP::bslma::Allocator*) /home/mhenning/code/ntf-core/groups/ntc/ntci/ntci_callback.h:940 (ProxyClientStream_test+0x3141a0)
    #4 BloombergLP::ntci::TimerCallbackFactory::createTimerCallback(bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*) /home/mhenning/code/ntf-core/groups/ntc/ntci/ntci_timercallbackfactory.h:97 (ProxyClientStream_test+0x3141a0)
    #5 BloombergLP::ntcr::StreamSocket::receive(BloombergLP::ntca::ReceiveOptions const&, BloombergLP::ntci::Callback<void (bsl::shared_ptr<BloombergLP::ntci::Receiver> const&, bsl::shared_ptr<BloombergLP::bdlbb::Blob> const&, BloombergLP::ntca::ReceiveEvent const&)> const&) /home/mhenning/code/ntf-core/groups/ntc/ntcr/ntcr_streamsocket.cpp:4923 (ProxyClientStream_test+0x3141a0)
    #6 operator() /home/mhenning/code/cpp/ts-gateway/src/ProxyClientStream.cc:1133 (ProxyClientStream_test+0x1ae284)
    #7 invoke /opt/bb/include/bslstl_function_invokerutil.h:927 (ProxyClientStream_test+0x1b78ef)
    #8 BloombergLP::bslstl::Function_Variadic<void ()>::operator()() const /opt/bb/include/bslstl_function.h:1518 (ProxyClientStream_test+0x3e6fd9)
    #9 BloombergLP::ntcs::Chronology::announce() /home/mhenning/code/ntf-core/groups/ntc/ntcs/ntcs_chronology.cpp:1157 (ProxyClientStream_test+0x3e6fd9)

  As if synchronized via sleep:
    #0 sleep ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:348 (libtsan.so.0+0x6314b)
    #1 ProxyClientStreamTest_connect_after_send_Test::TestBody() /home/mhenning/code/cpp/ts-gateway/test/ProxyClientStream/ProxyClientStream_test.cc:203 (ProxyClientStream_test+0x8f14e)
    #2 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x15c619)
    #3 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x151809)
    #4 testing::Test::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2508 (ProxyClientStream_test+0x12331b)
    #5 testing::TestInfo::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2684 (ProxyClientStream_test+0x123fd6)
    #6 testing::TestSuite::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2816 (ProxyClientStream_test+0x124a88)
    #7 testing::internal::UnitTestImpl::RunAllTests() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:5338 (ProxyClientStream_test+0x132d28)
    #8 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2433 (ProxyClientStream_test+0x15e571)
    #9 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) <null> (ProxyClientStream_test+0x15344b)
    #10 testing::UnitTest::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:4925 (ProxyClientStream_test+0x130e8f)
    #11 RUN_ALL_TESTS() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/include/gtest/gtest.h:2473 (ProxyClientStream_test+0x179df2)
    #12 main /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googlemock/src/gmock_main.cc:63 (ProxyClientStream_test+0x179ced)
==================
WARNING: ThreadSanitizer: data race (pid=1625761)
  Write of size 8 at 0x7b2400004ec0 by main thread:
    #0 operator delete(void*) ../../../../src/libsanitizer/tsan/tsan_new_delete.cpp:126 (libtsan.so.0+0x8b2c8)
    #1 BloombergLP::bslma::SharedPtrInplaceRep<BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> >::disposeRep() /opt/bb/include/bslma_sharedptrinplacerep.h:378 (ProxyClientStream_test+0x1deee9)
    #2 BloombergLP::ntcs::Chronology::Timer::close() /opt/bb/include/bslstl_sharedptr.h:4703 (ProxyClientStream_test+0x3e3e85)
    #3 ProxyClientStreamTest_connect_after_send_Test::TestBody() /home/mhenning/code/cpp/ts-gateway/test/ProxyClientStream/ProxyClientStream_test.cc:205 (ProxyClientStream_test+0x8f165)
    #4 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x15c619)
    #5 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x151809)
    #6 testing::Test::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2508 (ProxyClientStream_test+0x12331b)
    #7 testing::TestInfo::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2684 (ProxyClientStream_test+0x123fd6)
    #8 testing::TestSuite::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2816 (ProxyClientStream_test+0x124a88)
    #9 testing::internal::UnitTestImpl::RunAllTests() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:5338 (ProxyClientStream_test+0x132d28)
    #10 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2433 (ProxyClientStream_test+0x15e571)
    #11 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) <null> (ProxyClientStream_test+0x15344b)
    #12 testing::UnitTest::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:4925 (ProxyClientStream_test+0x130e8f)
    #13 RUN_ALL_TESTS() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/include/gtest/gtest.h:2473 (ProxyClientStream_test+0x179df2)
    #14 main /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googlemock/src/gmock_main.cc:63 (ProxyClientStream_test+0x179ced)

  Previous write of size 8 at 0x7b2400004ec0 by thread T11 (mutexes: write M1597):
    #0 operator new(unsigned long) ../../../../src/libsanitizer/tsan/tsan_new_delete.cpp:64 (libtsan.so.0+0x8c032)
    #1 operator new(unsigned long, BloombergLP::bslma::Allocator&) /opt/bb/include/bslma_allocator.h:632 (ProxyClientStream_test+0x1be4cc)
    #2 void bsl::shared_ptr<BloombergLP::ntci::Invoker<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> >::createInplace<bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*&>(BloombergLP::bslma::Allocator*, bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*&) /opt/bb/include/bslstl_sharedptr.h:4944 (ProxyClientStream_test+0x1cbca8)
    #3 BloombergLP::ntci::Callback<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)>::Callback(bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, bsl::shared_ptr<BloombergLP::ntci::Strand> const&, BloombergLP::bslma::Allocator*) /home/mhenning/code/ntf-core/groups/ntc/ntci/ntci_callback.h:940 (ProxyClientStream_test+0x3141a0)
    #4 BloombergLP::ntci::TimerCallbackFactory::createTimerCallback(bsl::function<void (bsl::shared_ptr<BloombergLP::ntci::Timer> const&, BloombergLP::ntca::TimerEvent const&)> const&, BloombergLP::bslma::Allocator*) /home/mhenning/code/ntf-core/groups/ntc/ntci/ntci_timercallbackfactory.h:97 (ProxyClientStream_test+0x3141a0)
    #5 BloombergLP::ntcr::StreamSocket::receive(BloombergLP::ntca::ReceiveOptions const&, BloombergLP::ntci::Callback<void (bsl::shared_ptr<BloombergLP::ntci::Receiver> const&, bsl::shared_ptr<BloombergLP::bdlbb::Blob> const&, BloombergLP::ntca::ReceiveEvent const&)> const&) /home/mhenning/code/ntf-core/groups/ntc/ntcr/ntcr_streamsocket.cpp:4923 (ProxyClientStream_test+0x3141a0)
    #6 operator() /home/mhenning/code/cpp/ts-gateway/src/ProxyClientStream.cc:1133 (ProxyClientStream_test+0x1ae284)
    #7 invoke /opt/bb/include/bslstl_function_invokerutil.h:927 (ProxyClientStream_test+0x1b78ef)
    #8 BloombergLP::bslstl::Function_Variadic<void ()>::operator()() const /opt/bb/include/bslstl_function.h:1518 (ProxyClientStream_test+0x3e6fd9)
    #9 BloombergLP::ntcs::Chronology::announce() /home/mhenning/code/ntf-core/groups/ntc/ntcs/ntcs_chronology.cpp:1157 (ProxyClientStream_test+0x3e6fd9)

  As if synchronized via sleep:
    #0 sleep ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:348 (libtsan.so.0+0x6314b)
    #1 ProxyClientStreamTest_connect_after_send_Test::TestBody() /home/mhenning/code/cpp/ts-gateway/test/ProxyClientStream/ProxyClientStream_test.cc:203 (ProxyClientStream_test+0x8f14e)
    #2 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x15c619)
    #3 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) <null> (ProxyClientStream_test+0x151809)
    #4 testing::Test::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2508 (ProxyClientStream_test+0x12331b)
    #5 testing::TestInfo::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2684 (ProxyClientStream_test+0x123fd6)
    #6 testing::TestSuite::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2816 (ProxyClientStream_test+0x124a88)
    #7 testing::internal::UnitTestImpl::RunAllTests() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:5338 (ProxyClientStream_test+0x132d28)
    #8 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:2433 (ProxyClientStream_test+0x15e571)
    #9 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) <null> (ProxyClientStream_test+0x15344b)
    #10 testing::UnitTest::Run() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/src/gtest.cc:4925 (ProxyClientStream_test+0x130e8f)
    #11 RUN_ALL_TESTS() /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googletest/include/gtest/gtest.h:2473 (ProxyClientStream_test+0x179df2)
    #12 main /home/mhenning/code/cpp/ts-gateway/build/third-party/gtest-src/googlemock/src/gmock_main.cc:63 (ProxyClientStream_test+0x179ced)

The text was updated successfully, but these errors were encountered:

mattrm456 · 2023-04-25T17:55:36Z

Are you able to post the complete code for your client? Stripped down if possible, but something that still reproduces this data race report from tsan.

michihenning · 2023-04-27T04:17:50Z

I've distilled this down as much as I can. Attached is the source for a trivial program. It implements a very simple server and client.
To avoid any potential confusion, the server is implemented using system calls only, not using NTF. It binds to port 10001, listens, accepts a single incoming connection, and then goes into a read loop that runs until the corresponding file descriptor is closed. (This is just to make the server thread block without doing anything.)

The client is implemented using NTF. It initializes the interface and connects to the server, which works fine. The main() function instantiates the server and the client and sleeps for five seconds. Then it shuts down the NTF run time. At this point, I'm getting complaints from thread sanitizer (attached).

The code is compiled with --std=c++17 -g -Wall -fsanitize=thread -fsanitize-recover=address -fno-omit-frame-pointer

If port 10001 is in use on your machine, change it to something unused at the top of the source. To keep things simple, I hardwired it. I also used assert() to bail out if there is an error, rather than confusing the issue with irrelevant error-handling noise.
tsan.txt
ntf_test.cc.txt

michihenning · 2023-04-27T04:20:39Z

Apologies, I closed this accidentally.

rdilipk · 2023-05-23T21:40:58Z

@michihenning I played around with this and the problem seems to be with the way the shutdown sequence is initiated in the Client destructor. You have:

Client::~Client()
{
    interface_->closeAll(); // <-- this one should go below everything
    interface_->shutdown();
    interface_->linger();
}

It looks like you want to call closeAll() as the very last thing to do after shutdown() and linger() complete. Once I did that, the tsan complaints went away in the dozen times I tried. Of course, this is just experimentation. I don't know if it's actually true but if it is, maybe it should be called out in the docs explicitly.

michihenning · 2023-05-24T01:43:51Z

It never occurred to me to do this. I find this highly surprising. After all, I want to tell it to tear everything down and then wait until it’s all finished. Seems really counter-intuitive.

In particular, if I call closeAll() as the last thing, how do I know when it’s done? I don’t want to exit with threads still running, and I might want to shut down something else (not NTF-related) after NTF has finished shutting down.

I had a look at InterfaceStopGuard. It calls shutdown() and linger(), but not closeAll(). The doc is really lacking here. There is no mention of what actually happens when I call closeAll() or shutdown(). Does shutdown() imply closeAll()? When would I call closeAll() instead of shutdown()? When I call shutdown(), will the completion handlers for everything get invoked with a special error? I would hope so, because might have to do some cleanup of my own.

rdilipk · 2023-05-24T11:21:28Z

To clarify, I was just experimenting with various ways of shutting down. It may have gotten rid of the tsan complaints but certainly is not necessarily the right thing to do. Let's wait for the official word.

smtrfnv · 2024-06-07T17:54:09Z

Hello @michihenning and @rdilipk. First of all sorry that it took us so long to check that issue.
Currently I am playing with ntf_test.cc.txt and so far I was not able to get any tsan warning/error.
I am using clang-18.1.5, both bde and ntf-core are built with -fsanitize=thread, as well as a simple project based on ntf_test.cc.txt.

It may be that the issue is no longet there as sometime ago we had a major redesign of the way how sockets are shutdown and closed in ntc. I plan to examine it further.

rdilipk · 2024-06-07T18:01:22Z

I don't speak for @michihenning but I think he gave up on NTF long ago. It may not matter at this point.

michihenning closed this as completed Apr 27, 2023

michihenning reopened this Apr 27, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Data race during shutdown #26

Data race during shutdown #26

michihenning commented Apr 24, 2023

mattrm456 commented Apr 25, 2023

michihenning commented Apr 27, 2023 •

edited

Loading

michihenning commented Apr 27, 2023

rdilipk commented May 23, 2023

michihenning commented May 24, 2023

rdilipk commented May 24, 2023

smtrfnv commented Jun 7, 2024

rdilipk commented Jun 7, 2024

Data race during shutdown #26

Data race during shutdown #26

Comments

michihenning commented Apr 24, 2023

mattrm456 commented Apr 25, 2023

michihenning commented Apr 27, 2023 • edited Loading

michihenning commented Apr 27, 2023

rdilipk commented May 23, 2023

michihenning commented May 24, 2023

rdilipk commented May 24, 2023

smtrfnv commented Jun 7, 2024

rdilipk commented Jun 7, 2024

michihenning commented Apr 27, 2023 •

edited

Loading