-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix parent refcount (huge memory leak) on non-last thread exit #51
base: master
Are you sure you want to change the base?
Conversation
how do you get to these changes? Is that a backport from a new qemu version? |
@@ -8323,6 +8324,8 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1, | |||
do_sys_futex(g2h(cpu, ts->child_tidptr), | |||
FUTEX_WAKE, INT_MAX, NULL, NULL, 0); | |||
} | |||
|
|||
object_unref(OBJECT(cpu)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at current QEMU, this should be done before unlocking clone_lock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the thing is that:
from comments in object_initialize_child_with_propsv():
- We want @obj's reference to be 1 on success, 0 on failure.
* On success, it's 2: one taken by object_initialize(), and one
* by object_property_add_child().
* On failure in object_initialize() or earlier, it's 1.
* On failure afterwards, it's also 1: object_unparent() releases
* the reference taken by object_property_add_child().
next, when thread done, obj->ref == 2;
We doing just single obj_unref(), which leads to:
if (qatomic_fetch_dec(&obj->ref) == 1) {
, which executed as if (2 == 1)
,
and decreasing obj->ref by one, after fetch...
So, the cpu
object, which is analogue to task_struct
kernel object, still allocated and forgotten.
Moreover, even if we pass this branch, then in
static void object_finalize(void *data)
we have g_assert(obj->parent == NULL);
which expect no-linkage to parent.
Finally, we remove linkage to parent while holding a lock, and freeing (unknown to parent now) object without.
@@ -1179,10 +1179,10 @@ void object_unref(void *objptr) | |||
if (!obj) { | |||
return; | |||
} | |||
g_assert(obj->ref > 0); | |||
g_assert(qatomic_mb_read(&obj->ref) > 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
object_unref in current QEMU is still the same of this one
void object_unref(void *objptr)
{
Object *obj = OBJECT(objptr);
if (!obj) {
return;
}
g_assert(obj->ref > 0);
/* parent always holds a reference to its children */
if (qatomic_fetch_dec(&obj->ref) == 1) {
object_finalize(obj);
}
}
If you spotted a bug in QEMU you should report to them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (qatomic_fetch_dec(&obj->ref) == 1) {
is equal to if (qatomic_dec_fetch(&obj->ref) == 0) {
, we can left it as is and the patch above would still working.
Anyway, we are fuzzing and don't care about hw emulation, so we can just change the things.
Same method, as with rcu_disable_atfork()
, right?
/* parent always holds a reference to its children */
so the child left allocated forever.
No, it is not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, this is a simplest PoC:
#include <assert.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <inttypes.h>
#include <pthread.h>
#include <sys/wait.h>
#include <sched.h>
void *thread_func(void *arg)
{
return NULL;
}
void test_pthread(void)
{
pthread_t tid1, tid2;
pthread_create(&tid1, NULL, thread_func, "hello1");
pthread_create(&tid2, NULL, thread_func, "hello2");
pthread_join(tid1, NULL);
pthread_join(tid2, NULL);
}
int main(int argc, char **argv)
{
int i;
for (i=0;i<0x10000;i++)
test_pthread();
return 0;
}
cc threadsexit.c -lpthread -o /tmp/t
and try to run both unpatched and patched versions while looking at htop.
qemu-x86_64 /tmp/t
@@ -8323,6 +8324,8 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1, | |||
do_sys_futex(g2h(cpu, ts->child_tidptr), | |||
FUTEX_WAKE, INT_MAX, NULL, NULL, 0); | |||
} | |||
|
|||
object_unref(OBJECT(cpu)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the thing is that:
from comments in object_initialize_child_with_propsv():
- We want @obj's reference to be 1 on success, 0 on failure.
* On success, it's 2: one taken by object_initialize(), and one
* by object_property_add_child().
* On failure in object_initialize() or earlier, it's 1.
* On failure afterwards, it's also 1: object_unparent() releases
* the reference taken by object_property_add_child().
next, when thread done, obj->ref == 2;
We doing just single obj_unref(), which leads to:
if (qatomic_fetch_dec(&obj->ref) == 1) {
, which executed as if (2 == 1)
,
and decreasing obj->ref by one, after fetch...
So, the cpu
object, which is analogue to task_struct
kernel object, still allocated and forgotten.
Moreover, even if we pass this branch, then in
static void object_finalize(void *data)
we have g_assert(obj->parent == NULL);
which expect no-linkage to parent.
Finally, we remove linkage to parent while holding a lock, and freeing (unknown to parent now) object without.
@@ -1179,10 +1179,10 @@ void object_unref(void *objptr) | |||
if (!obj) { | |||
return; | |||
} | |||
g_assert(obj->ref > 0); | |||
g_assert(qatomic_mb_read(&obj->ref) > 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (qatomic_fetch_dec(&obj->ref) == 1) {
is equal to if (qatomic_dec_fetch(&obj->ref) == 0) {
, we can left it as is and the patch above would still working.
Anyway, we are fuzzing and don't care about hw emulation, so we can just change the things.
Same method, as with rcu_disable_atfork()
, right?
/* parent always holds a reference to its children */
so the child left allocated forever.
@andreafioraldi you know the code much better (or - at all - ) than me. can you react to @kotee4ko 's replies? thank you! |
I'll look at it on monday |
This is a QEMU bug, not related to qemuafl, it is now fixed
https://gitlab.com/qemu-project/qemu/-/issues/866
IMO we should simply cherry-pick the commits and backport to qemuafl.
Please in case of bugs in QEMU report them upstream not to us.
Il giorno sab 21 set 2024 alle ore 08:35 van Hauser <
***@***.***> ha scritto:
… @andreafioraldi <https://github.com/andreafioraldi> ?
—
Reply to this email directly, view it on GitHub
<#51 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AD3LJ6WLKEE5CUQKDIMTTZTZXUHU5AVCNFSM6AAAAABOTJ4U2WVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGNRVGAZDQMZWG4>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
@kotee4ko I agree, cherry picking fix commits is better, it will it more likely that future cherry picked fix commits can be applied too easily. do you want to do that? |
fix it plz