add rrcall to get current time #2827

vchuravy · 2021-03-16T15:32:24Z

Exposes the current rr time through a syscall. The intended use is for an application that knows
it is running under rr to record the current rr time, so that it can direct the user to jump to
this point in time. For example, the Julia test suite could record the rr time when a test fails
and report it together with the test failure.
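To make the intended use concrete, here is a minimal sketch of how a test harness might report a failure together with an rr time. The helper name `format_failure_report` and the message format are hypothetical; the only pieces taken from this thread are that the time is an `int` and that `rr replay -g <time>` starts a replay at a given time. The harness is assumed to obtain `rr_time` from the new rrcall.

```c
#include <stdio.h>

/* Hypothetical helper (not part of rr): formats a test-failure message that
 * embeds an rr time. `rr_time` is assumed to come from the rrcall added in
 * this PR; here it is simply passed in by the caller. */
int format_failure_report(char *buf, size_t len, const char *test_name,
                          int rr_time) {
  /* Include a ready-to-paste replay command so the user can jump there. */
  return snprintf(buf, len, "FAILED %s at rr time %d; try: rr replay -g %d",
                  test_name, rr_time, rr_time);
}
```

Keeping the formatting separate from the call that fetches the time means the harness can fall back to a plain failure message when it is not running under rr.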

I noticed that for mark_stdio we also get t->tgid(). Is the time sufficient, or do I also need that
information?

cc: @Keno, @neboat

Keno · 2021-03-16T18:42:38Z

Implementation-wise, I think this is fine, though I wonder if it wouldn't be better to instead have the ability to mark a particular point in the execution, which rr would then record in the trace and the frontend (rr or pernosco) could offer various navigation to. Could even be just recorded in the trace buffer if syscallbuf is enabled to make it really low overhead.

rocallahan · 2021-03-16T21:41:28Z

> I wonder if it wouldn't be better to instead have the ability to mark a particular point in the execution, which rr would then record in the trace and the frontend (rr or pernosco) could offer various navigation to.

I guess you would need to flesh out what that feature would look like at rr replay time. If you want the tracee to provide a string label for each time, and you want to be able to get the list of labels without doing a full replay or full read of the trace, then we have to store the labels in some new area in the trace, which means extending the trace format.

Keno · 2021-03-16T21:43:35Z

Yeah, it's all a bit complicated. Maybe let's just do this for now then and we'll see if people find it useful. If so, we can come back later and engineer something fancier.

rocallahan · 2021-03-16T21:48:12Z

src/test/util_internal.h

@@ -15,4 +15,8 @@ void rr_detach_teleport(void) {
  test_assert(err == 0);
 }

+int rr_current_time(void) {


Is something supposed to use this function?

neboat · 2021-03-23T13:10:06Z

Hey all, I've been playing around with this PR recently, and although it works, I am encountering some issues that I would like your input on. (Sorry in advance for the long message.)

Here's my situation: I have a program-analysis tool, similar to a Google sanitizer, that identifies interesting pairs of executed instructions in a serial program's execution, and I would like to integrate this tool with RR. Normally the tool works similarly to a sanitizer: the user compiles and links her program with the tool, and then, when she runs the executable, the tool runs as a shadow computation and reports interesting pairs of instructions that it detects as the program executes. I would like to make this tool interact with RR intelligently. In particular, I would like the user to be able to compile and link her program with the tool and use RR to record a run of that tool-instrumented executable. Then, during RR's replay, I would like the user to have some way to easily navigate between the interesting pairs of executed instructions and to move back and forth between the two instructions in each pair.

(For more context, the tool in question is a race detector, and the interesting pairs of instructions are logically parallel instructions involved in a race. But it turns out that I'm not actually concerned with RR's support for multithreading, because this race detector is able to detect races even when the program is run on 1 thread. Hopefully, you shouldn't need to grok the details of this race detector to understand the problem. It should instead suffice to think of this tool as a sanitizer that identifies interesting pairs of instructions in a serial program's execution.)

To this end, I started playing with this PR, and it gets me part of the way there. With this PR, the tool can read and store the RR times for executed instructions. Then, for each interesting instruction pair, the tool can report the RR times of the two instructions in the pair.

However, this solution falls short in a few key ways.

- The tool needs to get the current time from RR very frequently, approximately once per memory read or write in the program under test. The system-call interface to get the current RR time is OK for tiny programs, but it introduces huge overheads that make it unusable on anything real. (I currently estimate a slowdown of thousands of times compared to a normal execution of the program with the tool.)
- I'm not sure how to provide a nice interface for users to navigate between the two instructions involved in a race.
  - I can start the replay using rr replay -g <time> to navigate to a particular RR time, but I'm not sure how to navigate to a particular time during a replay. (I admit that I might be unaware of some part of RR's interface that would let me do this.)
  - Ideally, I would like to provide users an even more friendly interface, so that users don't have to copy and paste the RR times recorded as having been involved in a race. But I'm not sure how to do this.

I've tried some things to work around these issues. For example, I tried modifying the tool itself to maintain its own event counter of interesting executed instructions and then report instruction pairs in terms of this event counter. This approach dramatically reduces the time needed to record the program execution. But using gdb's conditional breakpoints to navigate to instruction pairs during RR replay based on this user-defined event counter turns out to be far too slow.

Do you have any ideas for how to support the functionality I'm looking for, possibly by extending RR? Right now I'm imagining some possible RR features might be able to solve this problem effectively:

- A much faster way to get the current RR time during RR record.
- A facility in RR replay to get the current RR time and quickly navigate to different RR times. With this feature, I could imagine writing a gdb command to binary-search for the correct RR time based on the tool's internal event counter.
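The binary search mentioned above could look roughly like the following sketch. It assumes the tool's event counter never decreases as RR time advances. The callback type `counter_at_fn`, the function `find_time_for_counter`, and the table-backed stand-in `counter_from_table` are all hypothetical names for illustration; a real debugger command would have to seek the replay to the probed time and read the counter from the tracee, which RR does not expose in this form today.

```c
#include <assert.h>

/* Hypothetical callback: returns the tool's event counter as observed at a
 * given RR time. A real implementation would seek the replay to `rr_time`
 * and read the counter from tracee memory. */
typedef unsigned long (*counter_at_fn)(long rr_time, void *ctx);

/* Returns the smallest RR time in [lo, hi] whose counter value is >= target,
 * or hi + 1 if no such time exists. Assumes the counter is non-decreasing
 * in RR time. */
long find_time_for_counter(counter_at_fn counter_at, void *ctx,
                           long lo, long hi, unsigned long target) {
  assert(lo <= hi);
  long end = hi + 1; /* search the half-open range [lo, end) */
  while (lo < end) {
    long mid = lo + (end - lo) / 2;
    if (counter_at(mid, ctx) >= target) {
      end = mid;      /* mid qualifies; look for an earlier time */
    } else {
      lo = mid + 1;   /* counter too small; answer is after mid */
    }
  }
  return lo;
}

/* Stand-in for experimentation: ctx points to an array indexed by RR time. */
unsigned long counter_from_table(long rr_time, void *ctx) {
  return ((const unsigned long *)ctx)[rr_time];
}
```

Each probe costs one seek in the replay, so the search needs on the order of log2(hi - lo) seeks rather than one condition evaluation per memory access.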

This being said, I'm open to suggestions.

I believe some of the issues I've encountered are generic to anyone who wants to make a "smart" dynamic-analysis productivity tool that integrates with RR to allow users to quickly navigate to times when errors are detected. As such, I'm hoping that any extensions to RR to support this functionality would be generally useful for tool writers.

Thoughts? Thanks in advance for your input.

Cheers,
TB

Keno · 2021-03-24T00:47:42Z

> A much faster way to get the current RR time during RR record.

We can syscallbuffer this rr call by having rr write the current event time to userspace memory and just returning that. Doesn't even really need to allocate a syscallbuf record.
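As a rough illustration of the idea, the sketch below simulates a page shared between the supervisor and the tracee, with the supervisor publishing the event time and the tracee reading it with a plain load. The struct `time_page`, its field, and both function names are assumptions made for this example; they do not describe rr's actual syscallbuf layout or any existing rr interface.

```c
#include <stdint.h>

/* Hypothetical layout of memory shared between rr and the tracee. The field
 * name is an assumption for illustration, not rr's real data structure. */
struct time_page {
  volatile uint64_t current_event_time;
};

/* Supervisor side (simulated here): publish the time whenever it advances. */
void publish_event_time(struct time_page *page, uint64_t t) {
  page->current_event_time = t;
}

/* Tracee side: a plain memory load, with no kernel entry or context switch. */
uint64_t read_event_time(const struct time_page *page) {
  return page->current_event_time;
}
```

The point of the design is that the per-query cost drops from a trapped syscall to a single memory read, which matters when the query happens on every memory access.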

> A facility in RR replay to get the current RR time and quickly navigate to different RR times. With this feature, I could imagine writing a gdb command to binary-search for the correct RR time based on the tool's internal event counter.

I think this could just be handled by rr having a counter of how many times the rrcall was called (and returning that also), and then using the replay assist trick to let rr set a "breakpoint" on the appropriate counter value.

khuey · 2021-03-24T02:38:59Z

Why can't the race detector maintain its own counter of the number of memory reads/writes that have been made in userspace without any cooperation from the rr supervisor at all, and then just use conditional watchpoints on the location in gdb to find it again during the replay?
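A minimal sketch of such a userspace counter follows, under the assumption that the tool's instrumentation can call a hook on every memory access. The names `access_counter`, `on_memory_access`, `race_report`, and `make_race_report` are hypothetical and are not taken from any existing race detector.

```c
#include <stdint.h>

/* Hypothetical global maintained entirely in the tracee, with no help from
 * the rr supervisor. */
uint64_t access_counter = 0;

/* A reported race, identified by the counter values of its two accesses. */
struct race_report {
  uint64_t first_access;
  uint64_t second_access;
};

/* Called by the instrumentation on every memory read or write; returns the
 * id assigned to this access. */
uint64_t on_memory_access(void) {
  return ++access_counter;
}

/* Builds a report with the two access ids in execution order. */
struct race_report make_race_report(uint64_t a, uint64_t b) {
  struct race_report r = { a < b ? a : b, a < b ? b : a };
  return r;
}
```

During replay, a conditional watchpoint in gdb such as `watch access_counter if access_counter == 1234` would then stop at the access with that id, at the cost of evaluating the condition every time the counter changes.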

neboat · 2021-03-24T03:02:02Z

@khuey I tried something similar using conditional breakpoints in gdb, but that approach turns out to be very slow during replay. As I understand it, the slowdown comes from the overhead of switching between gdb and the program itself to repeatedly evaluate the condition. Conditional watchpoints don't seem to be appreciably faster, I presume for the same reason.

@Keno Both of those changes sound good to me, though I'm not terribly familiar with rr internals.

khuey · 2021-03-24T03:15:19Z

I think the overhead is more likely to be the cost of switching between rr and the tracee during replay each time the watchpoint fires and the condition needs to be evaluated. gdb should be sending the condition to rr so we shouldn't need to go back to gdb each time.

neboat · 2021-03-24T03:17:51Z

Ah, that makes sense to me. Thanks for clarifying.

add rrcall to get current time

ef14b7e
