double free exception #55

Open
DorisBDR opened this issue Jan 9, 2025 · 10 comments

@DorisBDR
Collaborator

DorisBDR commented Jan 9, 2025

We had to patch our UDA server (both 2.7 and 2.8.0) due to a double free error.
It happens when the client is multi-threaded and some requests generate an error.
For the fix we removed these lines (around line 400 of client/udaClient.cpp):

    if (tv_server_start != 0) {
        if (server_block.idamerrorstack.idamerror != NULL) {
            free(server_block.idamerrorstack.idamerror);
        }

        freeIdamErrorStack(&server_block.idamerrorstack);    // Free Previous Stack Heap
    }

The full stack trace shown by gdb:

#3  0x00007f3fdb01c5ec in malloc_printerr () from /lib64/libc.so.6
No symbol table info available.
#4  0x00007f3fdb01e454 in _int_free () from /lib64/libc.so.6
No symbol table info available.
#5  0x00007f3ec192d701 in idamClient (request_block=0x7f3ed0fbf370, indices=0x7f3ed0fbf36c)
    at /mnt/ITER/abadiel/udacert/uda280/mudaidamccs720/target/main/c/source/client/udaClient.cpp:418
        err = 1
        serverside = 0
        allocMetaHeap = false
        startupStates = 1
        system_startup = false
        reopen_logs = false
        client_input = 0x7f3ec1c8c9e0 <clientCreateXDRStream()::client_input>
        client_output = 0x7f3ec1c8ca20 <clientCreateXDRStream()::client_output>
        malloc_source = 0
        log_struct_list = {listcount = 0, listsize = 0, logstruct = 0x0}
        private_flags = 0x7f3ec1c9201c <udaPrivateFlags::private_flags>
        client_flags = 0x7f3ec1c91fe0 <udaClientFlags::client_flags>
        protocol_time = 94663234402856
        cached_data_block_list = {count = -2139768280, data = 0x56188075ba28}
        data_received = false
        data_block_indices = std::vector of length 0, capacity 0
        data_system = 0x56188075ba28
        system_config = 0x56188075ba28
        data_source = 0x56188075ba28
        signal_rec = 0x56188075ba28
        signal_desc = 0x56188075ba28
        __func__ = " to empty b"
#6  0x00007f3ec1912a33 in idamGetAPIWithHost (data_object=0x7f3ed0fbf450 "UDA::getMeta(path='/ITER/CRYO-CH-CB3-N2SC:MP4100-AA-YL/fields', startTime=-1)", 
    data_source=0x7f3eba92ccee "", host=0x56187f7249a0 "io-ls-udasrv1.iter.org", port=3090)
    at /mnt/ITER/abadiel/udacert/uda280/mudaidamccs720/target/main/c/source/client/udaGetAPI.cpp:235
        client_flags = 0x7f3ec1c91fe0 <udaClientFlags::client_flags>
        err = 0
        startup = false
        reopen_logs = true
        request_block = {num_requests = 1, requests = 0x56187f7df400}
        __func__ = "g the Data Source\000"
        handle = 22040
#7  0x00007f3eba926cbf in UdaClientReaderBase::callIDAM (this=this@entry=0x56187f7de9c0, 
    CMD=CMD@entry=0x7f3ed0fbf450 "UDA::getMeta(path='/ITER/CRYO-CH-CB3-N2SC:MP4100-AA-YL/fields', startTime=-1)") at /usr/include/c++/8/bits/basic_string.h:2294
        THI = 0x56187f7de9c0
        runCounter = 238
        timestamp = 1736157676460169659
        handle = <optimized out>
#8  0x00007f3eba92a4bd in UdaClientReaderBase::getFieldsList (this=0x56187f7de9c0, aVariableName=..., timeFrom="-1") at ../common/uda_client_reader_base.cpp:1181
        VariableName = "/ITER/CRYO-CH-CB3-N2SC:MP4100-AA-YL"
        CMD = "UDA::getMeta(path='/ITER/CRYO-CH-CB3-N2SC:MP4100-AA-YL/fields', startTime=-1)\000\000\000\000\000\000\000\071\062\062\061\000\023)\340TFԓ\360\364\373\320>\177\000\000\360\364\373\320>\177\000\000\376\003\000\000\000\000\000\000\026\v\031\303>\177\000\000h\366\373\320>\177\000\061\000\023)\340TFԓ \365\373\320>\177\000\000 \365\373\320>\177\000\000\376\003\000\000\000\000\000\000_\v\031\303>\177\000\000\230\366\373\320>\177\000\000\000\300\005\234?\177\000\000\260\366\373\320>\177\000\000\204G"...
        handle = <optimized out>

@stephen-dixon
Contributor

The lines you refer to don't seem to exist in either 2.7.x or 2.8.0?

udaClient.cpp looks like this:

    if (tv_server_start != 0) {
        freeIdamErrorStack(&server_block.idamerrorstack);    // Free Previous Stack Heap
    }

and freeIdamErrorStack doesn't actually contain a free anymore; it just assigns a nullptr:

void freeIdamErrorStack(UDA_ERROR_STACK* errorstack)
{
    // "FIX" : this is causing segfaults when using multiple clients (eg. get and put)
    //         apparently due to both trying to free the same memory. Needs fixing properly.
    //    free(errorstack->idamerror);

    errorstack->nerrors = 0;
    errorstack->idamerror = nullptr;
}

Ideally I'd like to put the free back into the freeIdamErrorStack function to avoid memory leaks.

Could you please instead test with this updated freeIdamErrorStack function (and then only call this function in udaClient)? Or provide me with a minimal example of the multi-threaded client code that causes the fault you see, so I can test. Do I just need to make two UDA requests in separate threads, where the first call fails, in order to reproduce?

void freeIdamErrorStack(UDA_ERROR_STACK* errorstack)
{ 
    if (errorstack->idamerror != nullptr)
    {
        free(errorstack->idamerror);
    }
    errorstack->nerrors = 0;
    errorstack->idamerror = nullptr;
}

@doris75

doris75 commented Jan 13, 2025

Yes, you are right; sorry, it is a commit we made on our side, but it actually causes other issues.
In any case, the error handling needs to be rewritten for the multi-threaded client case; we are going to make a proposal soon.

@stephen-dixon
Contributor

Managed to reproduce. I think adding these two lines in the udaLockThread function (client/accAPI.cpp:151) may work. Without this change, the same (global) server_block structure is always used in udaClient.cpp for all threads; this change (or something similar) ensures that each thread-specific server_block from the idamState store gets used instead.

diff --git a/source/client/accAPI.cpp b/source/client/accAPI.cpp
index 0d41ac0d..b7d91b68 100755
--- a/source/client/accAPI.cpp
+++ b/source/client/accAPI.cpp
@@ -156,6 +156,8 @@ void udaLockThread()
         client_flags->flags = idamState[id].client_block.clientFlags;
         putIdamThreadLastHandle(idamState[id].lastHandle);
     } else {
+        int id = getThreadId(threadId);
+        putIdamThreadServerBlock(&idamState[id].server_block);
         putIdamThreadLastHandle(-1);
     }
 }

Let me know what you think.

For info: This is the errorstack free function I tested it with.

void freeIdamErrorStack(UDA_ERROR_STACK* errorstack)
{
    if (errorstack->idamerror != nullptr)
    {
       free(errorstack->idamerror);
    }

    errorstack->idamerror = nullptr;
    errorstack->nerrors = 0;
}

@doris75

doris75 commented Jan 16, 2025

I think it is more complex than that, because we have public API functions that retrieve information directly from server_block when data retrieval fails. Some of them can be called directly from the user client library. All these methods need to be reviewed...
E.g. in udaClient.cpp:

const char* getIdamServerErrorStackRecordMsg(int record)
{
    UDA_LOG(UDA_LOG_DEBUG, "record %d\n", record);
    UDA_LOG(UDA_LOG_DEBUG, "count  %d\n", server_block.idamerrorstack.nerrors);
    if (record < 0 || (unsigned int)record >= server_block.idamerrorstack.nerrors) {
        return nullptr;
    }
    return server_block.idamerrorstack.idamerror[record].msg;   // Server Error Stack Record Message
}

@stephen-dixon
Contributor

stephen-dixon commented Jan 16, 2025

Do you mean you will need additional calls to udaLockThread/udaUnlockThread around all of your calls to these functions in a multi-threaded context?

I think this is the way to ensure you access the correct server block in each thread.

@doris75

doris75 commented Jan 16, 2025

Yes, even though honestly I'm not a big fan of having server_block as a global variable.
udaFree/udaFreeAll also need to be reviewed...

@jholloc
Collaborator

jholloc commented Jan 16, 2025

With UDA 3.0 and the new C++ client, we should hopefully be able to review a lot of this code, as the client will encapsulate all state and the API functions will retrieve the state from the relevant thread local client.

In this way the above API function would become something like:

const char* getIdamServerErrorStackRecordMsg(int record)
{
    auto* client = getThreadClient();
    return client->getErrorMsg(record); // implementation and state of error records are now private to client instance
}

@doris75

doris75 commented Jan 16, 2025

OK, that would be much better. So are you suggesting we wait for 3.0 for the complete rework? When is it planned, then?

@stephen-dixon
Contributor

I agree: I'm not a fan of having any global state. This is the main improvement in the client2 and server2 implementations, as well as in the further v3.0 work that Jonathan has mentioned.

My initial feeling, though, is that for advanced use cases such as multi-threaded client applications, handling thread locks can be the responsibility of the application developer, using e.g. manual calls to udaLockThread. (Though we'll happily merge improvements into a 2.8.X release if you have work in progress that does something better.)

On the uda side we can improve documentation for this (for the existing client with all its global state) and otherwise just continue to work on enhanced concurrency support for release 3.0.

We are planning to release version 3.0 by April.

@doris75

doris75 commented Jan 16, 2025

OK, I think we will try to make some suggestions so that we can push a few things into 2.8.X.
