[Bug] build wrong runtime tree relationship #1257

shaoeric · 2023-06-30T01:28:53Z

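The runtime names in the figure have been anonymized.
A compute calls two CUDA runtime functions, cudaRT_P and cudaLaunchKernel, and cudaRT_P in turn calls cudaRT_C. However, the following function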

VisualDL/visualdl/component/profiler/parser/event_node.py

Line 457 in e420b8c

def _build_tree_relationship( # noqa: C901

treats all three CUDA runtime calls as runtime child nodes of A compute, i.e. len(Acompute.runtime_node) == 3
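To make the difference concrete, here is a minimal sketch. It is not VisualDL's actual code: `Node`, `flat_children`, and `nested_children` are hypothetical names, and the timestamps are made up for illustration. It contrasts attaching every contained runtime call directly to the op with nesting the calls by time-interval containment.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical stand-in for a profiler event; not VisualDL's node class.
    name: str
    start: int
    end: int
    children: List["Node"] = field(default_factory=list)


def flat_children(op: Node, runtimes: List[Node]) -> List[Node]:
    """Every runtime call whose interval lies inside the op becomes a direct child."""
    return [r for r in runtimes if op.start <= r.start and r.end <= op.end]


def nested_children(op: Node, runtimes: List[Node]) -> List[Node]:
    """Nest runtime calls by interval containment, returning only the top-level ones."""
    top_level: List[Node] = []
    stack: List[Node] = []  # chain of currently open enclosing calls
    ordered = sorted(flat_children(op, runtimes), key=lambda n: (n.start, -n.end))
    for r in ordered:
        node = Node(r.name, r.start, r.end)
        # Drop enclosing candidates that end before this call ends.
        while stack and node.end > stack[-1].end:
            stack.pop()
        (stack[-1].children if stack else top_level).append(node)
        stack.append(node)
    return top_level


# Made-up timestamps that mirror the situation described in this issue.
op = Node("A compute", 0, 100)
runtimes = [
    Node("cudaRT_P", 10, 40),
    Node("cudaRT_C", 15, 30),  # runs inside cudaRT_P
    Node("cudaLaunchKernel", 50, 60),
]

flat = flat_children(op, runtimes)
nested = nested_children(op, runtimes)
print(len(flat))    # 3
print(len(nested))  # 2
```

In the nested version, cudaRT_C ends up under cudaRT_P rather than directly under A compute.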

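I have already submitted a PR for this, see https://github.com/PaddlePaddle/VisualDL/pull/1256. I would like to take part in the follow-up discussion and look forward to your reply.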

rainyfly · 2023-06-30T02:30:44Z

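Hi, thanks for the careful analysis. At the moment we do indeed treat every CudaRuntime call as a child node of the op, and we currently see no need to split CudaRuntime calls into further levels. The main considerations are:
(1) The statistics tables we need do not require CudaRuntime hierarchy information. Today CudaRuntime is generally called by Paddle Ops, and for the statistics we at most need to know how many times an Op called CudaRuntime. So it is enough to store every CudaRuntime call whose timestamps fall within the Op directly on that Op.
(2) If you need the detailed timing information, the timeline automatically shows all temporal containment relationships, as you have seen, so the CudaRuntime containment can be read from the timeline. No current statistical metric takes this containment into account, though, which is why we do not build a deeper tree for CudaRuntime.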
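As a rough illustration of point (1), the sketch below counts runtime calls per op from a flat list alone. The tuple layout, variable names, and timestamps are assumptions made for this example and are not taken from VisualDL's code.

```python
from collections import Counter

# Hypothetical flat records: (name, start, end). Timestamps are made up.
runtime_calls = [
    ("cudaRT_P", 10, 40),
    ("cudaRT_C", 15, 30),
    ("cudaLaunchKernel", 50, 60),
]
op_start, op_end = 0, 100  # hypothetical time span of the op

# Keep every runtime call whose timestamps fall inside the op's span.
calls_in_op = [
    name for name, start, end in runtime_calls
    if op_start <= start and end <= op_end
]

call_counts = Counter(calls_in_op)       # calls per runtime function
total_calls = sum(call_counts.values())  # all runtime calls made within the op
print(total_calls)  # 3
```

A per-op call count like this needs no parent/child information between the runtime calls themselves.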
