Skip to content

Commit

Permalink
x86/CPU/AMD: Bring back Compute Unit ID
Browse files Browse the repository at this point in the history
Commit:

  a33d331 ("x86/CPU/AMD: Fix Bulldozer topology")

restored the initial approach we had with the Fam15h topology of
enumerating CU (Compute Unit) threads as cores. And this is still
correct - they're beefier than HT threads but still have some
shared functionality.

Our current approach has a problem with the Mad Max Steam game, for
example. Yves Dionne reported a certain "choppiness" while playing on
v4.9.5.

That problem stems most likely from the fact that the CU threads share
resources within one CU and when we schedule to a thread of a different
compute unit, this incurs latency due to migrating the working set to a
different CU through the caches.

When the thread siblings mask mirrors that aspect of the CUs and
threads, the scheduler pays attention to it and tries to schedule within
one CU first. Which takes care of the latency, of course.

Reported-by: Yves Dionne <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: <[email protected]> # 4.9
Cc: Brice Goglin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Yazen Ghannam <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
  • Loading branch information
suryasaimadhu authored and Ingo Molnar committed Feb 5, 2017
1 parent a0a2864 commit 79a8b9a
Show file tree
Hide file tree
Showing 4 changed files with 19 additions and 4 deletions.
1 change: 1 addition & 0 deletions arch/x86/include/asm/processor.h
Original file line number Diff line number Diff line change
Expand Up @@ -104,6 +104,7 @@ struct cpuinfo_x86 {
__u8 x86_phys_bits;
/* CPUID returned core id bits: */
__u8 x86_coreid_bits;
__u8 cu_id;
/* Max extended CPUID function supported: */
__u32 extended_cpuid_level;
/* Maximum supported CPUID level, -1=no CPUID: */
Expand Down
9 changes: 8 additions & 1 deletion arch/x86/kernel/cpu/amd.c
Original file line number Diff line number Diff line change
Expand Up @@ -309,8 +309,15 @@ static void amd_get_topology(struct cpuinfo_x86 *c)

/* get information required for multi-node processors */
if (boot_cpu_has(X86_FEATURE_TOPOEXT)) {
u32 eax, ebx, ecx, edx;

node_id = cpuid_ecx(0x8000001e) & 7;
cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);

node_id = ecx & 0xff;
smp_num_siblings = ((ebx >> 8) & 0xff) + 1;

if (c->x86 == 0x15)
c->cu_id = ebx & 0xff;

/*
* We may have multiple LLCs if L3 caches exist, so check if we
Expand Down
1 change: 1 addition & 0 deletions arch/x86/kernel/cpu/common.c
Original file line number Diff line number Diff line change
Expand Up @@ -1015,6 +1015,7 @@ static void identify_cpu(struct cpuinfo_x86 *c)
c->x86_model_id[0] = '\0'; /* Unset */
c->x86_max_cores = 1;
c->x86_coreid_bits = 0;
c->cu_id = 0xff;
#ifdef CONFIG_X86_64
c->x86_clflush_size = 64;
c->x86_phys_bits = 36;
Expand Down
12 changes: 9 additions & 3 deletions arch/x86/kernel/smpboot.c
Original file line number Diff line number Diff line change
Expand Up @@ -433,9 +433,15 @@ static bool match_smt(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
int cpu1 = c->cpu_index, cpu2 = o->cpu_index;

if (c->phys_proc_id == o->phys_proc_id &&
per_cpu(cpu_llc_id, cpu1) == per_cpu(cpu_llc_id, cpu2) &&
c->cpu_core_id == o->cpu_core_id)
return topology_sane(c, o, "smt");
per_cpu(cpu_llc_id, cpu1) == per_cpu(cpu_llc_id, cpu2)) {
if (c->cpu_core_id == o->cpu_core_id)
return topology_sane(c, o, "smt");

if ((c->cu_id != 0xff) &&
(o->cu_id != 0xff) &&
(c->cu_id == o->cu_id))
return topology_sane(c, o, "smt");
}

} else if (c->phys_proc_id == o->phys_proc_id &&
c->cpu_core_id == o->cpu_core_id) {
Expand Down

0 comments on commit 79a8b9a

Please sign in to comment.