SWDEV-301543 SWDEV-276146 : Fix profile output buff allocation
An L2 flush is triggered by an explicit cache flush PM4 packet in the
aqlprofile packets sent to the GPU. This cache flush syncs the CPU and GPU so
that the performance counters copied to the profile output buffer are visible
to the CPU. To get rid of this cache flush, the following is done:
  1. The explicit cache flush packet is removed from the aqlprofile code
     (in a separate commit to aqlprofile).
  2. This commit changes the profile output buffer to use kernarg
     memory, since it is uncached for the GPU.
After these changes, profile counter values copied by the GPU to the output
buffer are guaranteed to be visible to the CPU.

Change-Id: Ie953949c85fbee2f4369f1de966bcfb33daec084
(cherry picked from commit 2b79931)
cyamder committed Nov 15, 2021
1 parent e140f47 commit 8d934be
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion src/core/profile.h
@@ -331,7 +331,10 @@ class PmcProfile : public Profile {
   hsa_status_t Allocate(util::HsaRsrcFactory* rsrc) {
     profile_.command_buffer.ptr =
         rsrc->AllocateSysMemory(agent_info_, profile_.command_buffer.size);
-    profile_.output_buffer.ptr = rsrc->AllocateSysMemory(agent_info_, profile_.output_buffer.size);
+    // Allocate profile output buffer from kernarg memory pool since kernarg
+    // memory buffer is uncached. So when GPU copies performance counter values
+    // to this buffer they are guaranteed to be visible to CPU.
+    profile_.output_buffer.ptr = rsrc->AllocateKernArgMemory(agent_info_, profile_.output_buffer.size);
     return (profile_.command_buffer.ptr && profile_.output_buffer.ptr) ? HSA_STATUS_SUCCESS
                                                                         : HSA_STATUS_ERROR;
   }
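
The allocation pattern in this hunk can be sketched in isolation as follows. This is a minimal mock, not the real code: `MockRsrcFactory`, `ProfileBuffers`, `Pool`, and `Status` are hypothetical stand-ins for `util::HsaRsrcFactory`, `PmcProfile`, and the HSA types, and plain `malloc` replaces the HSA memory pools. It only illustrates which pool each buffer is requested from and the combined success check; it does not model GPU caching or visibility.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-ins for illustration only; these are NOT the real
// util::HsaRsrcFactory, hsa_status_t, or HSA memory pool types.
enum class Pool { kSystem, kKernArg };
enum class Status { kSuccess, kError };

struct Buffer {
  void* ptr = nullptr;
  std::size_t size = 0;
  Pool pool = Pool::kSystem;  // pool the buffer was requested from
};

// Mock factory: both "pools" are backed by malloc. The kernarg path can be
// forced to fail so the error branch of Allocate() can be exercised.
class MockRsrcFactory {
 public:
  explicit MockRsrcFactory(bool fail_kernarg = false) : fail_kernarg_(fail_kernarg) {}

  void* AllocateSysMemory(std::size_t size) { return std::malloc(size); }

  void* AllocateKernArgMemory(std::size_t size) {
    return fail_kernarg_ ? nullptr : std::malloc(size);
  }

  void Free(void* ptr) { std::free(ptr); }

 private:
  bool fail_kernarg_;
};

struct ProfileBuffers {
  Buffer command_buffer;
  Buffer output_buffer;

  // Mirrors the shape of the patched Allocate(): the command buffer stays in
  // system memory, the output buffer is requested from the kernarg pool, and
  // the call succeeds only if both pointers are non-null.
  Status Allocate(MockRsrcFactory* rsrc) {
    command_buffer.ptr = rsrc->AllocateSysMemory(command_buffer.size);
    command_buffer.pool = Pool::kSystem;
    output_buffer.ptr = rsrc->AllocateKernArgMemory(output_buffer.size);
    output_buffer.pool = Pool::kKernArg;
    return (command_buffer.ptr && output_buffer.ptr) ? Status::kSuccess : Status::kError;
  }

  // Releases whatever was allocated; freeing a null pointer is a no-op.
  void Release(MockRsrcFactory* rsrc) {
    rsrc->Free(command_buffer.ptr);
    rsrc->Free(output_buffer.ptr);
    command_buffer.ptr = nullptr;
    output_buffer.ptr = nullptr;
  }
};
```

To use the sketch, set nonzero sizes on both buffers, call `Allocate` with a factory, and check the returned status before touching either pointer. The `Release` helper is an addition of this sketch and has no counterpart in the hunk shown.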