perf_cnt: improve avg/peak accuracy for component perf measurements #9664

Merged 4 commits on Nov 22, 2024
20 changes: 20 additions & 0 deletions Kconfig.sof
@@ -200,6 +200,26 @@ config PERFORMANCE_COUNTERS
use the stamp() macro periodically to find out how long the cpu
was in active/sleep state between the calls and estimate the cpu load.

config PERFORMANCE_COUNTERS_COMPONENT
bool "Use performance counters to track component execution"
default n
depends on PERFORMANCE_COUNTERS
help
Use performance counters to trace audio component execution.
This makes it possible to observe average and peak execution
times at audio component level granularity.
Results are reported via logging subsystem.

config PERFORMANCE_COUNTERS_LL_TASKS
bool "Use performance counters to track LL task execution"
default n
depends on PERFORMANCE_COUNTERS
help
Use performance counters to trace low-latency task execution.
This makes it possible to observe average and peak execution
times at task level granularity.
Results are reported via logging subsystem.

config DSP_RESIDENCY_COUNTERS
bool "DSP residency counters"
default n
11 changes: 11 additions & 0 deletions app/perf_overlay.conf
@@ -1,5 +1,16 @@
CONFIG_PERFORMANCE_COUNTERS=y
CONFIG_PERFORMANCE_COUNTERS_COMPONENT=y
# disable ll task level statistics to reduce logging overhead
#CONFIG_PERFORMANCE_COUNTERS_LL_TASKS=y
Collaborator

BTW, perf_overlay.conf is for performance monitoring, not for performance enhancement (why would you not have enhanced performance in your default configuration), right? The name sounds potentially a bit confusing, can we rename it? Or at least add a comment at the top

Collaborator Author

@kv2019i, Nov 20, 2024

@lyakh I'll let @singalsu comment, I think he's the only known user of this at the moment. Not sure if we have some CI jobs somewhere that have the name hardcoded, so I'm not sure it's worth the hassle TBH.

Collaborator

No problem to rename it. I don't think we have CI builds and tests with it. It was planned but our plans got changed.

Member

This is fine as it is. The feature is for performance monitoring, but it can have an impact on peak values when logging.

CONFIG_SYS_HEAP_RUNTIME_STATS=y
CONFIG_TIMING_FUNCTIONS=y
CONFIG_ADSP_IDLE_CLOCK_GATING=n
CONFIG_KCPS_DYNAMIC_CLOCK_CONTROL=n
# disable top-level statistics to reduce logging overhead
CONFIG_SCHEDULE_LL_STATS_LOG=n

# vendor/target dependent options
#
# uncomment to disable Intel HD-DMA L1 exit ISR. this affects
# the peak execution times at component level
#CONFIG_DMA_INTEL_ADSP_HDA_TIMING_L1_EXIT=n
4 changes: 2 additions & 2 deletions src/audio/component.c
@@ -490,7 +490,7 @@ int comp_copy(struct comp_dev *dev)
*/
if (cpu_is_me(dev->ipc_config.core) ||
dev->ipc_config.proc_domain == COMP_PROCESSING_DOMAIN_DP) {
#if CONFIG_PERFORMANCE_COUNTERS
#if CONFIG_PERFORMANCE_COUNTERS_COMPONENT
perf_cnt_init(&dev->pcd);
#endif

@@ -506,7 +506,7 @@ int comp_copy(struct comp_dev *dev)
comp_update_performance_data(dev, cycles_consumed);
#endif

#if CONFIG_PERFORMANCE_COUNTERS
#if CONFIG_PERFORMANCE_COUNTERS_COMPONENT
perf_cnt_stamp(&dev->pcd, perf_trace_null, dev);
perf_cnt_average(&dev->pcd, comp_perf_avg_info, dev);
#endif
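The guarded calls in this hunk bracket the component's processing: `perf_cnt_init()` runs before the work and `perf_cnt_stamp()` after it. The following is a minimal sketch of that bracket pattern, assuming a simulated cycle counter so the result is deterministic. The names `demo_perf`, `demo_read_cycles()` and `demo_measure()` are hypothetical and only illustrate the idea; this is not the SOF implementation.

```c
#include <stdint.h>

/* Hypothetical cycle source. Real firmware would read a hardware
 * counter; here the value is advanced by hand to keep the demo
 * deterministic.
 */
static uint32_t demo_cycles;

uint32_t demo_read_cycles(void)
{
    return demo_cycles;
}

/* Hypothetical per-component measurement state. */
struct demo_perf {
    uint32_t start;      /* counter value captured at init */
    uint32_t delta_last; /* cycles spent in the last bracket */
    uint32_t delta_peak; /* largest delta seen so far */
};

/* Start of the bracket: remember the current counter value. */
void demo_perf_init(struct demo_perf *p)
{
    p->start = demo_read_cycles();
}

/* End of the bracket: compute the delta and track the peak. */
void demo_perf_stamp(struct demo_perf *p)
{
    /* Unsigned subtraction stays correct across counter wrap-around. */
    p->delta_last = demo_read_cycles() - p->start;
    if (p->delta_last > p->delta_peak)
        p->delta_peak = p->delta_last;
}

/* Measure one simulated processing call that costs 'cost' cycles. */
uint32_t demo_measure(struct demo_perf *p, uint32_t cost)
{
    demo_perf_init(p);
    demo_cycles += cost; /* stands in for the component's processing */
    demo_perf_stamp(p);
    return p->delta_last;
}
```

Calling `demo_measure()` with costs of 100 and then 40 leaves `delta_last` at 40 and `delta_peak` at 100, which is the kind of last/peak pair the averaging step later consumes.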
2 changes: 1 addition & 1 deletion src/include/sof/audio/component.h
@@ -627,7 +627,7 @@ struct comp_dev {
/* private data - core does not touch this */
void *priv_data; /**< private data */

#if CONFIG_PERFORMANCE_COUNTERS
#if CONFIG_PERFORMANCE_COUNTERS_COMPONENT
struct perf_cnt_data pcd;
#endif

14 changes: 8 additions & 6 deletions src/include/sof/lib/perf_cnt.h
@@ -95,11 +95,11 @@ struct perf_cnt_data {
(uint32_t)((pcd)->cpu_delta_peak))
#define task_perf_cnt_avg(pcd, trace_m, arg, class) do { \
(pcd)->cpu_delta_sum += (pcd)->cpu_delta_last; \
if (++(pcd)->period_cnt == 1 << PERF_CNT_CHECK_WINDOW_SIZE) { \
if (!(++(pcd)->period_cnt & MASK(PERF_CNT_CHECK_WINDOW_SIZE - 1, 0))) { \
(pcd)->cpu_delta_sum >>= PERF_CNT_CHECK_WINDOW_SIZE; \
trace_m(pcd, arg, class); \
if ((pcd)->period_cnt & BIT(PERF_CNT_CHECK_WINDOW_SIZE)) \
trace_m(pcd, arg, class); \
(pcd)->cpu_delta_sum = 0; \
(pcd)->period_cnt = 0; \
(pcd)->plat_delta_peak = 0; \
(pcd)->cpu_delta_peak = 0; \
} \
@@ -115,11 +115,13 @@ struct perf_cnt_data {
*/
#define perf_cnt_average(pcd, trace_m, arg) do { \
(pcd)->cpu_delta_sum += (pcd)->cpu_delta_last; \
if (++(pcd)->period_cnt == 1 << PERF_CNT_CHECK_WINDOW_SIZE) {\
if (!(++(pcd)->period_cnt & MASK(PERF_CNT_CHECK_WINDOW_SIZE - 1, 0))) { \
(pcd)->cpu_delta_sum >>= PERF_CNT_CHECK_WINDOW_SIZE; \
trace_m(pcd, arg); \
(pcd)->peak_mcps_period_cnt &= MASK(PERF_CNT_CHECK_WINDOW_SIZE - 1, 0); \
if ((pcd)->period_cnt & BIT(PERF_CNT_CHECK_WINDOW_SIZE)) { \
trace_m(pcd, arg); \
} \
(pcd)->cpu_delta_sum = 0; \
(pcd)->period_cnt = 0; \
(pcd)->plat_delta_peak = 0; \
(pcd)->cpu_delta_peak = 0; \
(pcd)->peak_mcps_period_cnt = 0; \
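In the new macro bodies `period_cnt` is no longer reset. The low `PERF_CNT_CHECK_WINDOW_SIZE` bits mark the window boundary, and bit `PERF_CNT_CHECK_WINDOW_SIZE` decides whether the trace callback runs, so statistics are reset at every window but reported only for every other one. The sketch below reproduces that control flow in a standalone function. `DEMO_WINDOW_SIZE`, `DEMO_BIT`, `DEMO_MASK` and the `demo_*` names are assumptions made for illustration, and the window size of 3 (8 samples) is chosen only to keep the numbers small.

```c
#include <stdint.h>

/* Assumed demo values; the real PERF_CNT_CHECK_WINDOW_SIZE may differ. */
#define DEMO_WINDOW_SIZE 3u
#define DEMO_BIT(b) (1u << (b))
/* Bit mask covering bits lo..hi inclusive. */
#define DEMO_MASK(hi, lo) (((1u << ((hi) - (lo) + 1u)) - 1u) << (lo))

struct demo_pcd {
    uint32_t period_cnt;    /* free-running sample counter */
    uint32_t delta_sum;     /* sum of deltas in the current window */
    uint32_t delta_peak;    /* peak delta in the current window */
    uint32_t reported_avg;  /* last average that was reported */
    uint32_t reported_peak; /* last peak that was reported */
};

/* Feed one sample; returns 1 when this call produced a report. */
int demo_average(struct demo_pcd *pcd, uint32_t delta)
{
    int reported = 0;

    if (delta > pcd->delta_peak)
        pcd->delta_peak = delta;
    pcd->delta_sum += delta;

    /* Window ends when the low DEMO_WINDOW_SIZE bits wrap to zero. */
    if (!(++pcd->period_cnt & DEMO_MASK(DEMO_WINDOW_SIZE - 1u, 0u))) {
        pcd->delta_sum >>= DEMO_WINDOW_SIZE; /* sum -> average */

        /* Bit DEMO_WINDOW_SIZE toggles each window, so only every
         * other window is reported.
         */
        if (pcd->period_cnt & DEMO_BIT(DEMO_WINDOW_SIZE)) {
            pcd->reported_avg = pcd->delta_sum;
            pcd->reported_peak = pcd->delta_peak;
            reported = 1;
        }

        /* Statistics are cleared at every window, reported or not. */
        pcd->delta_sum = 0;
        pcd->delta_peak = 0;
    }
    return reported;
}

/* Count how many reports a run of identical samples produces. */
unsigned int demo_count_reports(unsigned int samples)
{
    struct demo_pcd pcd = {0, 0, 0, 0, 0};
    unsigned int i, reports = 0;

    for (i = 0; i < samples; i++)
        reports += (unsigned int)demo_average(&pcd, 10u);
    return reports;
}
```

With an 8-sample window, reports occur at samples 8, 24, 40 and so on, while the windows ending at 16, 32 and so on are cleared without a report. One plausible reading, not stated in the diff itself, is that the skipped windows absorb the overhead of the preceding log output so that it does not distort the next reported peak.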
4 changes: 2 additions & 2 deletions src/schedule/ll_schedule.c
@@ -68,15 +68,15 @@ DECLARE_TR_CTX(ll_tr, SOF_UUID(ll_sched_uuid), LOG_LEVEL_INFO);
struct ll_schedule_data {
struct list_item tasks; /* list of ll tasks */
atomic_t num_tasks; /* number of ll tasks */
#if CONFIG_PERFORMANCE_COUNTERS
#if CONFIG_PERFORMANCE_COUNTERS_LL_TASKS
struct perf_cnt_data pcd;
#endif
struct ll_schedule_domain *domain; /* scheduling domain */
};

static const struct scheduler_ops schedule_ll_ops;

#if CONFIG_PERFORMANCE_COUNTERS
#if CONFIG_PERFORMANCE_COUNTERS_LL_TASKS
static void perf_ll_sched_trace(struct perf_cnt_data *pcd, int ignored)
{
tr_info(&ll_tr, "perf ll_work peak plat %u cpu %u",
4 changes: 2 additions & 2 deletions src/schedule/zephyr_ll.c
@@ -134,13 +134,13 @@ static inline enum task_state do_task_run(struct task *task)
{
enum task_state state;

#if CONFIG_PERFORMANCE_COUNTERS
#if CONFIG_PERFORMANCE_COUNTERS_LL_TASKS
perf_cnt_init(&task->pcd);
#endif

state = task_run(task);

#if CONFIG_PERFORMANCE_COUNTERS
#if CONFIG_PERFORMANCE_COUNTERS_LL_TASKS
perf_cnt_stamp(&task->pcd, perf_trace_null, NULL);
task_perf_cnt_avg(&task->pcd, task_perf_avg_info, &ll_tr, task);
#endif
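All of the guards in these hunks use the `#if CONFIG_...` form. In an `#if` expression the C preprocessor replaces any identifier that is not defined as a macro with 0, so a guard only takes effect when its spelling matches the generated Kconfig symbol exactly. The sketch below illustrates this with hypothetical `CONFIG_DEMO_*` symbols that are not part of SOF.

```c
/* Assumption for this demo: the build system defines this symbol,
 * as a Kconfig-generated header would for an enabled option.
 */
#define CONFIG_DEMO_COUNTERS_LL_TASKS 1

/* Guard spelled exactly like the defined symbol: the code is built. */
int demo_guard_matching(void)
{
#if CONFIG_DEMO_COUNTERS_LL_TASKS
    return 1;
#else
    return 0;
#endif
}

/* Guard that differs by one character (double underscore): the
 * identifier is undefined, evaluates to 0, and the guarded branch
 * is dropped without any compile error.
 */
int demo_guard_misspelled(void)
{
#if CONFIG_DEMO_COUNTERS__LL_TASKS
    return 1;
#else
    return 0;
#endif
}
```

Here `demo_guard_matching()` returns 1 and `demo_guard_misspelled()` returns 0. Compilers such as GCC and Clang can flag the second case with `-Wundef`, which is not enabled by default.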