double free exception #55

Open
DorisBDR opened this issue Jan 9, 2025 · 10 comments

@DorisBDR
Collaborator

DorisBDR commented Jan 9, 2025

We had to patch our UDA server (both 2.7 and 2.8.0) due to a double free error.
It happens when the client is multi-threaded and some requests generate an error.
For the fix we removed these lines (around line 400 of client/udaClient.cpp):

    if (tv_server_start != 0) {
        if (server_block.idamerrorstack.idamerror != NULL) {
            free(server_block.idamerrorstack.idamerror);
        }

        freeIdamErrorStack(&server_block.idamerrorstack);    // Free Previous Stack Heap
    }

The full stack trace shown by gdb:

#3  0x00007f3fdb01c5ec in malloc_printerr () from /lib64/libc.so.6
No symbol table info available.
#4  0x00007f3fdb01e454 in _int_free () from /lib64/libc.so.6
No symbol table info available.
#5  0x00007f3ec192d701 in idamClient (request_block=0x7f3ed0fbf370, indices=0x7f3ed0fbf36c)
    at /mnt/ITER/abadiel/udacert/uda280/mudaidamccs720/target/main/c/source/client/udaClient.cpp:418
        err = 1
        serverside = 0
        allocMetaHeap = false
        startupStates = 1
        system_startup = false
        reopen_logs = false
        client_input = 0x7f3ec1c8c9e0 <clientCreateXDRStream()::client_input>
        client_output = 0x7f3ec1c8ca20 <clientCreateXDRStream()::client_output>
        malloc_source = 0
        log_struct_list = {listcount = 0, listsize = 0, logstruct = 0x0}
        private_flags = 0x7f3ec1c9201c <udaPrivateFlags::private_flags>
        client_flags = 0x7f3ec1c91fe0 <udaClientFlags::client_flags>
        protocol_time = 94663234402856
        cached_data_block_list = {count = -2139768280, data = 0x56188075ba28}
        data_received = false
        data_block_indices = std::vector of length 0, capacity 0
        data_system = 0x56188075ba28
        system_config = 0x56188075ba28
        data_source = 0x56188075ba28
        signal_rec = 0x56188075ba28
        signal_desc = 0x56188075ba28
        __func__ = " to empty b"
#6  0x00007f3ec1912a33 in idamGetAPIWithHost (data_object=0x7f3ed0fbf450 "UDA::getMeta(path='/ITER/CRYO-CH-CB3-N2SC:MP4100-AA-YL/fields', startTime=-1)", 
    data_source=0x7f3eba92ccee "", host=0x56187f7249a0 "io-ls-udasrv1.iter.org", port=3090)
    at /mnt/ITER/abadiel/udacert/uda280/mudaidamccs720/target/main/c/source/client/udaGetAPI.cpp:235
        client_flags = 0x7f3ec1c91fe0 <udaClientFlags::client_flags>
        err = 0
        startup = false
        reopen_logs = true
        request_block = {num_requests = 1, requests = 0x56187f7df400}
        __func__ = "g the Data Source\000"
        handle = 22040
#7  0x00007f3eba926cbf in UdaClientReaderBase::callIDAM (this=this@entry=0x56187f7de9c0, 
    CMD=CMD@entry=0x7f3ed0fbf450 "UDA::getMeta(path='/ITER/CRYO-CH-CB3-N2SC:MP4100-AA-YL/fields', startTime=-1)") at /usr/include/c++/8/bits/basic_string.h:2294
        THI = 0x56187f7de9c0
        runCounter = 238
        timestamp = 1736157676460169659
        handle = <optimized out>
#8  0x00007f3eba92a4bd in UdaClientReaderBase::getFieldsList (this=0x56187f7de9c0, aVariableName=..., timeFrom="-1") at ../common/uda_client_reader_base.cpp:1181
        VariableName = "/ITER/CRYO-CH-CB3-N2SC:MP4100-AA-YL"
        CMD = "UDA::getMeta(path='/ITER/CRYO-CH-CB3-N2SC:MP4100-AA-YL/fields', startTime=-1)\000\000\000\000\000\000\000\071\062\062\061\000\023)\340TFԓ\360\364\373\320>\177\000\000\360\364\373\320>\177\000\000\376\003\000\000\000\000\000\000\026\v\031\303>\177\000\000h\366\373\320>\177\000\061\000\023)\340TFԓ \365\373\320>\177\000\000 \365\373\320>\177\000\000\376\003\000\000\000\000\000\000_\v\031\303>\177\000\000\230\366\373\320>\177\000\000\000\300\005\234?\177\000\000\260\366\373\320>\177\000\000\204G"...
        handle = <optimized out>

@stephen-dixon
Contributor

The lines you refer to don't seem to exist in either 2.7.x or 2.8.0?

udaClient.cpp looks like this:

    if (tv_server_start != 0) {
        freeIdamErrorStack(&server_block.idamerrorstack);    // Free Previous Stack Heap
    }

and freeIdamErrorStack doesn't actually contain a free anymore; it just assigns a nullptr:

void freeIdamErrorStack(UDA_ERROR_STACK* errorstack)
{
    // "FIX" : this is causing segfaults when using multiple clients (eg. get and put)
    //         apparently due to both trying to free the same memory. Needs fixing properly.
    //    free(errorstack->idamerror);

    errorstack->nerrors = 0;
    errorstack->idamerror = nullptr;
}

Ideally I'd like to put the free back into the freeIdamErrorStack function to avoid memory leaks.

Could you please instead test with this updated freeIdamErrorStack function (and then only call this function in udaClient)? Or provide me with a minimal example of the multi-threaded client code that causes the fault you see, so I can test. Do I just need to make two UDA requests in separate threads, where the first call fails, in order to reproduce?

void freeIdamErrorStack(UDA_ERROR_STACK* errorstack)
{ 
    if (errorstack->idamerror != nullptr)
    {
        free(errorstack->idamerror);
    }
    errorstack->nerrors = 0;
    errorstack->idamerror = nullptr;
}

@doris75

doris75 commented Jan 13, 2025

Yes, you are right; sorry, it is a commit we made on our side, but it actually causes other issues.
In any case, the error handling needs to be rewritten for the multi-threaded client case; we are going to make a proposal soon.

@stephen-dixon
Contributor

Managed to reproduce. I think adding these two lines in the udaLockThread function (client/accAPI.cpp:151) may work. Without this change, the same (global) server_block structure is always used in udaClient.cpp for all threads; this change (or something similar) ensures that each thread-specific server_block from the idamState store gets used instead.

diff --git a/source/client/accAPI.cpp b/source/client/accAPI.cpp
index 0d41ac0d..b7d91b68 100755
--- a/source/client/accAPI.cpp
+++ b/source/client/accAPI.cpp
@@ -156,6 +156,8 @@ void udaLockThread()
         client_flags->flags = idamState[id].client_block.clientFlags;
         putIdamThreadLastHandle(idamState[id].lastHandle);
     } else {
+        int id = getThreadId(threadId);
+        putIdamThreadServerBlock(&idamState[id].server_block);
         putIdamThreadLastHandle(-1);
     }
 }

Let me know what you think.

For info: This is the errorstack free function I tested it with.

void freeIdamErrorStack(UDA_ERROR_STACK* errorstack)
{
    if (errorstack->idamerror != nullptr)
    {
       free(errorstack->idamerror);
    }

    errorstack->idamerror = nullptr;
    errorstack->nerrors = 0;
}

@doris75

doris75 commented Jan 16, 2025

I think it is more complex than that, because we have public API functions that retrieve information directly from server_block when data retrieval fails. Some of them can be called directly from the user client library. All these methods need to be reviewed...
E.g. in udaClient.cpp:

const char* getIdamServerErrorStackRecordMsg(int record)
{
    UDA_LOG(UDA_LOG_DEBUG, "record %d\n", record);
    UDA_LOG(UDA_LOG_DEBUG, "count  %d\n", server_block.idamerrorstack.nerrors);
    if (record < 0 || (unsigned int)record >= server_block.idamerrorstack.nerrors) {
        return nullptr;
    }
    return server_block.idamerrorstack.idamerror[record].msg;   // Server Error Stack Record Message
}

@stephen-dixon
Contributor

stephen-dixon commented Jan 16, 2025

Do you mean you will need additional calls to udaLockThread/udaUnlockThread around all of your calls to these functions in a multi-threaded context?

I think this is the way to ensure you access the correct server block in each thread.

@doris75

doris75 commented Jan 16, 2025

Yes, even though honestly I'm not a big fan of having server_block as a global variable.
udaFree/udaFreeAll also need to be reviewed...

@jholloc
Collaborator

jholloc commented Jan 16, 2025

With UDA 3.0 and the new C++ client, we should hopefully be able to review a lot of this code, as the client will encapsulate all state and the API functions will retrieve the state from the relevant thread local client.

In this way the above API function would become something like:

const char* getIdamServerErrorStackRecordMsg(int record)
{
    auto* client = getThreadClient();
    return client->getErrorMsg(record); // implementation and state of error records are now private to client instance
}

@doris75

doris75 commented Jan 16, 2025

OK, that would be much better. So are you suggesting we wait for 3.0 for the complete rework? When is it planned, then?

@stephen-dixon
Contributor

I agree: I'm not a fan of having any global state. This is the main improvement in the client2 and server2 implementations, as well as in the further v3.0 work that Jonathan has mentioned.

My initial feeling, though, is that for advanced use cases such as multi-threaded client applications, handling thread locks can be the responsibility of the application developer, using e.g. manual calls to udaLockThread. (Though we'll happily merge improvements into a 2.8.X release if you have work in progress that does something better.)

On the uda side we can improve documentation for this (for the existing client with all its global state) and otherwise just continue to work on enhanced concurrency support for release 3.0.

We are planning to release version 3.0 by April.

@doris75

doris75 commented Jan 16, 2025

OK, I think we will try to make some suggestions so that we can push a few things into 2.8.X.
