Make it possible to detect publishing times of the messages in the pipelines #6255

brkay54 · 2024-01-31T10:21:55Z

Checklist

I've read the contribution guidelines.
I've searched other issues and no duplicate issues were found.
I've agreed with the maintainers that I can plan this task.

Description

It is mainly needed for:

Add a new tool to measure end-to-end delay of the Autoware for sudden obstacles #6547

In the current Autoware, all the pipelines are designed to use their input's timestamp while publishing output as shown below.

However, this means we cannot know when each message was actually published, so we cannot analyze reaction times. This makes it hard to find bottlenecks inside the pipelines.

While developing the tool named reaction_analyzer, we changed the header times in the perception, sensing, planning, and control pipelines to check the nodes' reaction times. (Here you can see the results.) However, as we discussed in the Perception Sensing WG, it is not possible to change the header times of the nodes' output messages. Without that change, it remains hard to find the exact timing of the outputs inside the pipeline.

Purpose

Find new methods to observe the exact timing of each output so that the nodes' reaction times inside the pipeline can be analyzed.

Possible approaches

Add a new small message that reports each node's input origin time and output time.

Definition of done

Merge these PRs in order:

VRichardJP · 2024-02-01T06:00:12Z

A long time ago I wanted to solve the same issue and wrote a simple Python script to watch predefined pipelines (= lists of topics). For example, if I want to analyze the following pipeline:

/input -> (nodeA) -> /after_A -> (nodeB) -> /after_B -> (nodeC) -> /output

I just subscribe to all the topics /input, /after_A, /after_B and /output (Python is great for that because you can easily subscribe to any topic type) and keep track of the time I receive each message. Since the data header is not modified along the pipeline, I can use the header timestamp as an ID to associate messages from multiple topics. You can also detect frame drops by finding IDs that never reached the end of the pipeline (e.g. if a later ID reached some topic before the current one). Unless the data transfer delay is very unbalanced between input and output, comparing the times at which you receive two messages with the same ID gives a fairly accurate measurement of the node processing time.
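The bookkeeping described above can be sketched without any ROS dependency. This is only an illustration, not the original script: the class name `PipelineWatcher`, its methods, and the integer millisecond times are assumptions made for the example. In a real setup, `on_message` would be called from the subscription callbacks with the message's header stamp and the local receive time.

```python
PIPELINE = ["/input", "/after_A", "/after_B", "/output"]


class PipelineWatcher:
    """Track when each header stamp (used as an ID) is seen on each topic.

    Hypothetical helper for illustration only.
    """

    def __init__(self, topics):
        self.topics = list(topics)
        self.seen = {}  # stamp_id -> {topic: receive_time}

    def on_message(self, topic, stamp_id, receive_time):
        # Keep only the first reception of a given ID on a given topic.
        self.seen.setdefault(stamp_id, {}).setdefault(topic, receive_time)

    def stage_delays(self, stamp_id):
        """Receive-time difference between consecutive topics for one ID."""
        times = self.seen.get(stamp_id, {})
        delays = {}
        for src, dst in zip(self.topics, self.topics[1:]):
            if src in times and dst in times:
                delays[f"{src} -> {dst}"] = times[dst] - times[src]
        return delays

    def dropped_ids(self):
        """IDs that never reached the last topic although a later ID did."""
        last = self.topics[-1]
        finished = [s for s, t in self.seen.items() if last in t]
        if not finished:
            return []
        newest_finished = max(finished)
        return sorted(
            s for s, t in self.seen.items()
            if last not in t and s < newest_finished
        )


watcher = PipelineWatcher(PIPELINE)
# Stamp 100 travels through the whole pipeline (receive times in ms).
for topic, t in zip(PIPELINE, [0, 20, 50, 60]):
    watcher.on_message(topic, 100, t)
# Stamp 200 is lost after nodeA.
watcher.on_message("/input", 200, 100)
watcher.on_message("/after_A", 200, 120)
# Stamp 300 completes, which reveals that 200 was dropped.
for topic, t in zip(PIPELINE, [200, 220, 250, 265]):
    watcher.on_message(topic, 300, t)

print(watcher.stage_delays(100))
print(watcher.dropped_ids())
```

Each delay here includes the transfer time to the watcher, so it approximates the node processing time only under the balanced-transfer assumption mentioned above.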

Wouldn't this do the job?

brkay54 · 2024-02-01T16:13:36Z

@VRichardJP Thanks for the answer, it is a very sensible method. However, what I am trying to do is catch the first reacting message of each node (at predefined checkpoints) to understand which node or pipeline is late to react. (You can see the sample test video here.)

Using the origin time of the message as an ID is a great idea. However, when I catch the reacted message, I also need its publishing time. Maybe adding a new small message to each node that includes the origin time and the publishing time would be a solution. We can use the origin time as an ID, as you said, and after catching the reacted message, we can look up its publishing time in the small message by using the header time of the reacted message.

VRichardJP · 2024-02-02T00:59:41Z

I am not sure I understand the issue.
Is the point of collecting publish timestamps to precisely measure ROS2/DDS performance? For example, to measure the blue arrows in the diagram below (based on the method presented above)?

Edit: the "delay" calculation in the diagram above is not correct. It is:
delay = data A transfer time - data B transfer time + data A interprocess transfer time
Which is roughly delay = data A interprocess transfer time if input and output data have similar size (unless there is a lag at DDS level).

brkay54 · 2024-02-02T13:11:38Z

@VRichardJP The purpose is not to measure DDS or ROS 2 performance. The purpose is to measure total system performance, and we want to be able to find the bottlenecks in that system.

What we do is record two pointcloud messages: a pointcloud without an object (let's say pointcloud A) and a pointcloud with an object (pointcloud B). We publish pointcloud A until all Autoware nodes and stacks are ready, and then we start publishing pointcloud B. The time at which we start publishing pointcloud B is recorded as the spawn_time of the object.

After the object is spawned, we check each node's output to see how long it takes for the object to show up in that output. With this method, we can see the performance of the whole system, and we can also check each node's reaction to analyze the bottlenecks in that system.

I hope this explains the purpose. Please let me know if you have any questions.
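As a rough sketch of the measurement described above (not the reaction_analyzer implementation), the reaction time of each checkpoint can be taken as the time of its first output that shows the object, minus spawn_time. The function name, the topic names, the `has_object` field, and the integer millisecond times are all assumptions made for this example.

```python
def first_reaction_times(spawn_time, observations, reacted):
    """Return {topic: delay} for the first reacting message on each topic.

    observations: iterable of (topic, time, message) tuples
    reacted: predicate deciding whether a message shows the object
    """
    reactions = {}
    # Process observations in time order so the first reaction wins.
    for topic, time, message in sorted(observations, key=lambda o: o[1]):
        if time < spawn_time or topic in reactions:
            continue
        if reacted(message):
            reactions[topic] = time - spawn_time
    return reactions


# Hypothetical data: pointcloud B starts being published at t = 1000 ms.
spawn_time = 1000
observations = [
    ("/perception/objects", 900, {"has_object": False}),
    ("/perception/objects", 1080, {"has_object": True}),
    ("/perception/objects", 1180, {"has_object": True}),
    ("/planning/trajectory", 1100, {"has_object": False}),
    ("/planning/trajectory", 1250, {"has_object": True}),
    ("/control/command", 1300, {"has_object": False}),
]

result = first_reaction_times(
    spawn_time, observations, lambda m: m["has_object"]
)
print(result)
```

The accuracy of the result depends on which time is passed in for each observation. If only the unchanged header stamp is available, the delay cannot be computed correctly, which is the gap this issue is about.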

VRichardJP · 2024-02-03T00:44:11Z

@brkay54 Thank you, I understand now.

brkay54 · 2024-02-14T08:32:55Z

We had a meeting with @xmfcx and @mitsudome-r and discussed the possible solutions.

Possible Solutions:
Option 1:
Modify each node to add a new debug topic publisher.
The topic will contain:

Original header timestamp
Timestamp of the publishing time

The reaction analyzer will subscribe to the main topic and the debug topic with an exact time filter, and internally replace the time of the main topic message with the published time.
This topic can be turned on/off by a parameter.

Option 2:
Modify each node to do the condition checks that are currently done in the reaction analyzer,
e.g., if the node detects an object in the trajectory, it will publish "true" in the debug topic with the timestamp.

The reaction analyzer subscribes to the debug topic and calculates the time difference from the timestamps in the debug topic.

We decided to implement Option 1. We are going to add a new debug publisher to each node with an on/off flag. This option will be presented in the Perception/Sensing WG.
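A minimal sketch of how an analyzer could pair the two topics in Option 1, assuming the debug message carries the original header timestamp plus the publishing time. The names `PublishedTime`, `header_stamp`, `published_stamp`, and `PublishedTimeIndex` are chosen for this illustration and are not taken from the merged message definition. Times are integer nanoseconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishedTime:
    """Illustrative stand-in for the debug message (field names assumed)."""
    header_stamp: int     # original header timestamp of the main message
    published_stamp: int  # time at which the node published that message


class PublishedTimeIndex:
    """Look up publishing times by the header stamp of the main message."""

    def __init__(self, max_entries=100):
        self._by_stamp = {}
        self._max_entries = max_entries

    def add(self, debug_msg):
        self._by_stamp[debug_msg.header_stamp] = debug_msg.published_stamp
        # Bound memory: dicts keep insertion order, so drop the oldest key.
        while len(self._by_stamp) > self._max_entries:
            self._by_stamp.pop(next(iter(self._by_stamp)))

    def published_time_of(self, main_header_stamp):
        """Exact-match lookup; returns None if no debug message was seen."""
        return self._by_stamp.get(main_header_stamp)


index = PublishedTimeIndex()
index.add(PublishedTime(header_stamp=1_000, published_stamp=1_350))

# A main-topic message with header stamp 1_000 was published at 1_350,
# i.e. 350 ns after the origin time carried in its header.
main_header_stamp = 1_000
published = index.published_time_of(main_header_stamp)
print(published - main_header_stamp)
```

The exact-match lookup plays the role of the exact time filter mentioned above: it only works because nodes keep the input header stamp unchanged on their outputs.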

brkay54 self-assigned this Jan 31, 2024

brkay54 added this to Sensing & Perception Working Group Jan 31, 2024

brkay54 added component:perception Advanced sensor data processing and environment understanding. (auto-assigned) component:sensing Data acquisition from sensors, drivers, preprocessing. (auto-assigned) labels Jan 31, 2024

This was referenced Feb 5, 2024

feat(reaction_analyzer): add reaction anaylzer tool to measure end-to-end delay in sudden obstacle braking response #5954

Merged

Investigate end-to-end delay in sudden obstacle braking response #5540

Closed

idorobotics moved this to In Progress in Sensing & Perception Working Group Feb 6, 2024

brkay54 changed the title ~~It is not possible to detect reaction times of the nodes in the perception & sensing pipeline~~ It is not possible to detect reaction times of the nodes in the pipelines Feb 12, 2024

brkay54 changed the title ~~It is not possible to detect reaction times of the nodes in the pipelines~~ It is not possible to detect publishing times of the messages in the pipelines Feb 12, 2024

This was referenced Feb 16, 2024

feat(autoware_msgs): add new optional messages package with published time debug message autowarefoundation/autoware_msgs#83

Closed

feat(tier4_autoware_utils): add published time debug class into utils #6440

Merged

This was referenced Feb 26, 2024

feat(autoware_internal_msgs): add PublishedTime debug info message autowarefoundation/autoware_internal_msgs#1

Merged

feat: add published_time publisher debug to packages #6490

Merged

xmfcx changed the title ~~It is not possible to detect publishing times of the messages in the pipelines~~ Make it possible to detect publishing times of the messages in the pipelines Mar 5, 2024

This was linked to pull requests Mar 5, 2024

feat: add published_time publisher debug to packages #6490

Merged

feat(tier4_autoware_utils): add published time debug class into utils #6440

Merged

brkay54 mentioned this issue Mar 6, 2024

Add a new tool to measure end-to-end delay of the Autoware for sudden obstacles #6547

Closed

16 tasks

xmfcx closed this as completed in #6440 Mar 12, 2024

github-project-automation bot moved this from In Progress to Done in Sensing & Perception Working Group Mar 12, 2024

brkay54 mentioned this issue Mar 13, 2024

feat(published_time_publisher): add unit test #6610

Merged

4 tasks
