Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion failure: ip.to_data_ptr<void>() < AddressSpace::rr_page_start() || ip.to_data_ptr<void>() >= AddressSpace::rr_page_end() #3880

Open
Keno opened this issue Nov 14, 2024 · 0 comments · May be fixed by #3881

Comments

@Keno
Copy link
Member

Keno commented Nov 14, 2024

See log tail below. I believe what's happening is that we're interrupting a futex operation which sets the deferred patching bit and then in the interrupting signal handler, we're attempting to do a (syscall-buffered) read which gets descheduled. Upon exiting that syscall, we then see the deferred patching bit set, but get confused because we're attempting to patch the wrong syscall.

I see two related issues:

  1. I don't think we actually check whether there are any other tasks in the same address space that happen to be inside the patching region, which I think could cause the tracee to crash.
  2. The signal handler sigreturns to a different ip.

I think the proper fix for this is probably to record the patching retry request in the syscall event to make sure we look at the right one. We can then simultaneously check that no other tasks are in the region and the last task to exit can do the patching.

[INFO log_pending_events()] SYSCALL: read
[INFO log_pending_events()] DESCHED
[INFO log_pending_events()] SIGNAL_HANDLER: SIGUSR2(async)
[INFO log_pending_events()] SYSCALL_INTERRUPTION: futex
[INFO log_pending_events()] (none)
[RecordSession] EXEC_SYSCALL_DONE: status=0x857f (SYSCALL)
[RecordSession]   original_syscallno:0 (read); return val:0x8
[record_syscall] 865053: processing: SYSCALL: read -- time: 75445
[Task]   (refreshing extra-register cache using XSAVE)
[RecordTask] Wrote event SYSCALL: read for time 75445
[RecordSession]   exiting desched critical section
[RecordSession] desched: IN_SYSCALL
[RecordTask] Wrote event SYSCALLBUF_ABORT_COMMIT for time 75446
[RecordSession] desched: DISARMING_DESCHED_EVENT
[Task] resuming execution of 865053 with PTRACE_SYSCALL tick_period -1 wait 2
[PerfCounters] Resetting counters with period 1152921504606846976
[Scheduler] Starting 865053
[Task] going into blocking wait for 865053 ...
[Task]   Task 865053 changed status to 0x857f (SYSCALL)
[Task]   (refreshing register cache)
[Scheduler] Stopping 865053
[Task] Requesting registers from tracee 865053
[Task] resuming execution of 865053 with PTRACE_SYSCALL tick_period -2 wait 2
[Task] Flushing registers for tid 865053 { ip:0x70000017 args:(0x64,0x2401,0,0,0,0) orig_syscall: 16 syscallno: -38 }
[Scheduler] Starting 865053
[Task] going into blocking wait for 865053 ...
[Task]   Task 865053 changed status to 0x857f (SYSCALL)
[Task]   (refreshing register cache)
[Scheduler] Stopping 865053
[Task] Requesting registers from tracee 865053
[RecordSession] Retrying deferred syscall patching
[FATAL src/Monkeypatcher.cc:1270:try_patch_syscall()] 
 (task 865053 (rec:865053) at time 75447)
 -> Assertion `ip.to_data_ptr<void>() < AddressSpace::rr_page_start() || ip.to_data_ptr<void>() >= AddressSpace::rr_page_end()' failed to hold. 
Keno added a commit that referenced this issue Nov 15, 2024
This fixes #3880 by moving the deferred syscall patching flag from
Task to the corresponding syscall event. Both to avoid the assertion
in that issue and for performance (since unpatched syscalls are traced),
it is important that we (attempt to) patch the correct syscall ip.

Additionally, we add the ability for tests to make sure that a particular
syscall was actually patched. In particular, our logic for bailing out
when another task is in the patch region was accidentally preventing
all future attempts at patching the same region, even if doing so
would have succeeded. With that fixed up, the new test will make sure
this doesn't regress.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant