UBUNTU: [Config] EFI: set CAPSULE_LOADER=y #13

Open
wants to merge 82 commits into
base: main

KobaKoNvidia
Contributor

BugLink: https://nvbugspro.nvidia.com/bug/4601764

[Description]
NVIDIA provides a way to flash the UEFI firmware via the capsule loader.
CAPSULE_LOADER is also built in to the L4T kernel,
so for ease of use, CAPSULE_LOADER needs to be built in here as well.

[Test Cases]

  1. build kernel with nvidia-64k and nvidia flavours
* Because some CONFIGs are misconfigured and the annotation check complains, use do_skip_checks=true to skip the checks during compilation.
time env CONCURRENCY_LEVEL=73 sh -c 'rm -f ../*.deb;fakeroot debian/rules clean && fakeroot debian/rules binary-headers binary-nvidia-64k do_linux_tools=false do_zfs=false do_mstflint_access=false do_nvidia-fs=false do_skip_checks=true' 2>&1 | tee build.log
  2. boot kernel with the modification.
    • currently, use the latest Canonical kernel (6.5.0-1019-nvidia-64k) to verify
  3. check if CONFIG_EFI_CAPSULE_LOADER is "y".
  4. check if /dev/efi_capsule_loader exists.
# get nb4601764_verify.sh from [0]
$ sh nb4601764_verify.sh 
### Checking CONFIG_EFI_CAPSULE_LOADER in /boot/config-6.5.0-1019-nvidia-64k
CONFIG_EFI_CAPSULE_LOADER=y

### Checking /dev/efi_capsule_loader
/dev/efi_capsule_loader exists.

[0], verification scripts, http://nv/ecM
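
The two checks in the test cases can be sketched in C as below. This is not the nb4601764_verify.sh script referenced in [0], whose contents are not shown here; the helper names config_option_is_builtin() and node_exists() are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: return true if the kernel config file at `path`
 * contains the exact line "<option>=y". */
static bool config_option_is_builtin(const char *path, const char *option)
{
    char want[256];
    char line[512];
    bool found = false;
    FILE *fp = fopen(path, "r");

    if (!fp)
        return false;

    snprintf(want, sizeof(want), "%s=y", option);
    while (fgets(line, sizeof(line), fp)) {
        line[strcspn(line, "\r\n")] = '\0';   /* strip the trailing newline */
        if (strcmp(line, want) == 0) {
            found = true;
            break;
        }
    }
    fclose(fp);
    return found;
}

/* Hypothetical helper: return true if a file or device node exists at `path`. */
static bool node_exists(const char *path)
{
    return access(path, F_OK) == 0;
}
```

On a booted test kernel, one would call config_option_is_builtin("/boot/config-6.5.0-1019-nvidia-64k", "CONFIG_EFI_CAPSULE_LOADER") and node_exists("/dev/efi_capsule_loader") and expect both to return true.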

[Where things could go wrong]
Low risk: setting CONFIG_EFI_CAPSULE_LOADER=y has only a minor effect.

ianmay81 and others added 30 commits May 6, 2024 13:19
…dversion"

This reverts commit 47d27f2.

We need to revert this to avoid regressing any modules used in Jammy.

Signed-off-by: Ian May <[email protected]>
Ignore: yes
Signed-off-by: Ian May <[email protected]>
Ignore: yes
Signed-off-by: Ian May <[email protected]>
… a pasid support

BugLink: https://bugs.launchpad.net/bugs/2031320

When an iommu_domain is set to IOMMU_DOMAIN_IDENTITY, the driver would
skip the allocation of a CD table and set the CONFIG field of the STE
to STRTAB_STE_0_CFG_BYPASS. This works well for devices that only have
one substream, i.e. PASID disabled.

However, there could be a use case, for a pasid capable device, that
allows bypassing the translation at the default substream while still
enabling the pasid feature, which means the driver should not skip the
allocation of a CD table nor simply bypass the CONFIG field. Instead,
the S1DSS field should be set to STRTAB_STE_1_S1DSS_BYPASS and the
SHCFG field should be set to STRTAB_STE_1_SHCFG_INCOMING.

Add s1dss in struct arm_smmu_s1_cfg, to allow a configuration in the
finalise() to support this use case.

Also, according to "13.5 Summary of attribute/permission configuration
fields" in the reference manual, the SHCFG field value is irrelevant.
So, set the SHCFG field of the STE always to STRTAB_STE_1_SHCFG_INCOMING
for simplification.

Signed-off-by: Nicolin Chen <[email protected]>
Reviewed-by: Pritesh Raithatha <[email protected]>
Acked-by: Jamie Nguyen <[email protected]>
Acked-by: Nicolin Chen <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Ian May <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
Ignore: yes
Signed-off-by: Ian May <[email protected]>
…rnel

BugLink: https://bugs.launchpad.net/bugs/2043132

With this change, the NVMe and NVMeOF drivers are enabled to support
GPUDirect Storage (GDS). The change is around nvme/nvme rdma map_data()
and unmap_data(), where the IO request is first intercepted to check for
GDS pages. If it is a GDS page, the request is served by the GDS driver
component called nvidia-fs; otherwise the request is served by the
standard NVMe driver code.

Signed-off-by: Sourab Gupta <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Acked-by: Ian May <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2043132

With this change, the NFS driver would be enabled to support GPUDirectStorage(GDS).
The change is around frwr_map and frwr_unmap in the NFS driver, where the IO request
is first intercepted to check for GDS pages and if it is a GDS page then the
request is served by GDS driver component called nvidia-fs,
else the request would be served by the standard NFS driver code.

Signed-off-by: Sourab Gupta <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Acked-by: Ian May <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
This reverts commit b2638e9.

This feature is not targeted for Jammy.

Signed-off-by: Brad Figg <[email protected]>
Signed-off-by: Ian May <[email protected]>
Add support for exposing rprovides data for standalone modules
too. Switch to exposing provides as a shared debian/substvar file
and use that in the templates.

Signed-off-by: Brad Figg <[email protected]>
Signed-off-by: Ian May <[email protected]>
…e_range

BugLink: https://bugs.launchpad.net/bugs/2048966

When running an SVA case, the following soft lockup is triggered:
--------------------------------------------------------------------
watchdog: BUG: soft lockup - CPU#244 stuck for 26s!
pstate: 83400009 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
pc : arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
lr : arm_smmu_cmdq_issue_cmdlist+0x150/0xa50
sp : ffff8000d83ef290
x29: ffff8000d83ef290 x28: 000000003b9aca00 x27: 0000000000000000
x26: ffff8000d83ef3c0 x25: da86c0812194a0e8 x24: 0000000000000000
x23: 0000000000000040 x22: ffff8000d83ef340 x21: ffff0000c63980c0
x20: 0000000000000001 x19: ffff0000c6398080 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: ffff3000b4a3bbb0
x14: ffff3000b4a30888 x13: ffff3000b4a3cf60 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000 x9 : ffffc08120e4d6bc
x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000048cfa
x5 : 0000000000000000 x4 : 0000000000000001 x3 : 000000000000000a
x2 : 0000000080000000 x1 : 0000000000000000 x0 : 0000000000000001
Call trace:
 arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
 __arm_smmu_tlb_inv_range+0x118/0x254
 arm_smmu_tlb_inv_range_asid+0x6c/0x130
 arm_smmu_mm_invalidate_range+0xa0/0xa4
 __mmu_notifier_invalidate_range_end+0x88/0x120
 unmap_vmas+0x194/0x1e0
 unmap_region+0xb4/0x144
 do_mas_align_munmap+0x290/0x490
 do_mas_munmap+0xbc/0x124
 __vm_munmap+0xa8/0x19c
 __arm64_sys_munmap+0x28/0x50
 invoke_syscall+0x78/0x11c
 el0_svc_common.constprop.0+0x58/0x1c0
 do_el0_svc+0x34/0x60
 el0_svc+0x2c/0xd4
 el0t_64_sync_handler+0x114/0x140
 el0t_64_sync+0x1a4/0x1a8
--------------------------------------------------------------------

Note that since 6.6-rc1 the arm_smmu_mm_invalidate_range above is renamed
to "arm_smmu_mm_arch_invalidate_secondary_tlbs", yet the problem remains.

The commit 06ff87b ("arm64: mm: remove unused functions and variable
protoypes") fixed a similar lockup on the CPU MMU side. Yet, it can occur
to SMMU too, since arm_smmu_mm_arch_invalidate_secondary_tlbs() is called
typically next to MMU tlb flush function, e.g.
        tlb_flush_mmu_tlbonly {
                tlb_flush {
                        __flush_tlb_range {
                                // check MAX_TLBI_OPS
                        }
                }
                mmu_notifier_arch_invalidate_secondary_tlbs {
                        arm_smmu_mm_arch_invalidate_secondary_tlbs {
                                // does not check MAX_TLBI_OPS
                        }
                }
        }

Clone a CMDQ_MAX_TLBI_OPS from the MAX_TLBI_OPS in tlbflush.h, since in an
SVA case SMMU uses the CPU page table, so it makes sense to align with the
tlbflush code. Then, replace per-page TLBI commands with a single per-asid
TLBI command, if the request size hits this threshold.
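
A minimal sketch of the threshold decision described above. The 4 KiB page size and the exact value of the threshold (one page table page worth of 8-byte entries, which is what MAX_TLBI_OPS is assumed to correspond to) are assumptions for illustration; the SKETCH_ names and use_per_asid_invalidation() are hypothetical and not taken from the driver.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12                                  /* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Assumption: the threshold equals the number of 8-byte PTEs that fit in
 * one page, mirroring what MAX_TLBI_OPS is understood to be. */
#define SKETCH_CMDQ_MAX_TLBI_OPS ((size_t)1 << (SKETCH_PAGE_SHIFT - 3))

/* Return true when a range of `size` bytes is large enough that a single
 * per-ASID invalidation should replace the per-page TLBI commands. */
static bool use_per_asid_invalidation(size_t size)
{
    return size >= SKETCH_CMDQ_MAX_TLBI_OPS * SKETCH_PAGE_SIZE;
}
```

Under these assumptions the cut-over point is 512 pages, i.e. 2 MiB; smaller ranges would still be invalidated page by page.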

Signed-off-by: Nicolin Chen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
[backported from commit d5afb4b]
[ian-may: change made to function arm_smmu_mm_invalidate_range]
Acked-by: Noah Wager <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2048815

TPM devices may insert wait state on last clock cycle of ADDR phase.
For SPI controllers that support full-duplex transfers, this can be
detected using software by reading the MISO line. For SPI controllers
that only support half-duplex transfers, such as the Tegra QSPI, it is
not possible to detect the wait signal from software. The QSPI
controller in Tegra234 and Tegra241 implements hardware detection of the
wait signal which can be enabled in the controller for TPM devices.

The current TPM TIS driver only supports software detection of the wait
signal. To support SPI controllers that use hardware to detect the wait
signal, add the function tpm_tis_spi_transfer_half() and move the
existing code for software based detection into a function called
tpm_tis_spi_transfer_full(). SPI controllers that only support
half-duplex transfers will always call tpm_tis_spi_transfer_half()
because they cannot support software based detection. The bit
SPI_TPM_HW_FLOW is set to indicate to the SPI controller that hardware
detection is required and it is the responsibility of the SPI controller
driver to determine if this is supported or not.

For hardware flow control, CMD-ADDR-DATA messages are combined into a
single message, whereas for software flow control the existing method of
CMD-ADDR in one message and DATA in another is followed.

[jarkko: Fixed the function names to match the code change, and the tag
in the short summary.]
Signed-off-by: Krishna Yarlagadda <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Signed-off-by: Jarkko Sakkinen <[email protected]>
(cherry picked from commit a86a42a)
Acked-by: Jamie Nguyen <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Acked-by: Ian May <[email protected]>
Signed-off-by: Ian May <[email protected]>
Ignore: yes
Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2049537

Add support of "Thermal fast Sampling Period (_TFP)" for passive
cooling.

As per the ACPI specification (ACPI 6.5, Section 11.4.17 "_TFP (Thermal
fast Sampling Period)"), _TFP overrides _TSP ("Thermal Sampling Period")
if both are present in a Thermal zone.
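
The override rule can be sketched as follows. The struct and function names are hypothetical, and the units are an assumption based on my reading of the ACPI specification (_TFP in milliseconds, _TSP in tenths of a second); this is not the ACPI thermal driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the two optional ACPI objects of a thermal zone. */
struct thermal_sampling {
    bool     has_tfp;
    uint64_t tfp_ms;            /* assumption: _TFP is reported in milliseconds */
    bool     has_tsp;
    uint64_t tsp_deciseconds;   /* assumption: _TSP is reported in tenths of a second */
};

/* Pick the passive cooling polling delay in milliseconds:
 * _TFP wins when present, otherwise fall back to _TSP. */
static uint64_t passive_delay_ms(const struct thermal_sampling *s)
{
    if (s->has_tfp)
        return s->tfp_ms;
    if (s->has_tsp)
        return s->tsp_deciseconds * 100;
    return 0;   /* neither object present: no passive polling period */
}
```

With both objects present, for example _TFP = 50 and _TSP = 3, the sketch returns 50 ms rather than 300 ms.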

Signed-off-by: Jeff Brasen <[email protected]>
Co-developed-by: Sumit Gupta <[email protected]>
Signed-off-by: Sumit Gupta <[email protected]>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Jamie Nguyen <[email protected]>
Acked-by: Sumit Gupta <[email protected]>
(backported from commit a2ee758 linux-next)
Acked-by: Brad Figg <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Acked-by: Ian May <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
ianmay81 and others added 15 commits May 6, 2024 13:19
Ignore: yes
Signed-off-by: Ian May <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2061930

There are systems in production that don't have
firmware that supports coresight_etm4x.  Instead of
removing completely, blacklist coresight_etm4x so
systems with the correct firmware can use the module.

Signed-off-by: Ian May <[email protected]>
Ignore: yes
Signed-off-by: Ian May <[email protected]>
Ignore: yes
Signed-off-by: Jacob Martin <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2063578
Properties: no-test-build
Signed-off-by: Jacob Martin <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2063461

N2 r0p3 doesn't require the workaround [1], so gating on (#slots - 5) no
longer works because all N2s have 5 slots. Use the new expression
builtin that allows calling strcmp_cpuid_str() and comparing CPUIDs in
metric formulas.

In this case, the commented formula looks like this:

  strcmp_cpuid_str(0x410fd493)        # greater than or equal to N2 r0p3
  | strcmp_cpuid_str(0x410fd490) ^ 1  # OR NOT any version of N2

[1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/blob/main/data/pmu/cpu/neoverse/neoverse-n2-r0p3.json
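
To make the formula concrete, here is a sketch in C of how it evaluates for a few CPUIDs. It assumes that the builtin yields 1 when the running CPU is the same part at an equal or newer version (variant compared first, then revision, using the MIDR field layout) and 0 otherwise; cpuid_at_least() and workaround_not_needed() are hypothetical names, not perf functions.

```c
#include <stdint.h>

#define MIDR_REVISION_MASK 0x0000000fULL   /* bits [3:0]  */
#define MIDR_VARIANT_MASK  0x00f00000ULL   /* bits [23:20] */

static unsigned midr_variant(uint64_t midr)
{
    return (unsigned)((midr & MIDR_VARIANT_MASK) >> 20);
}

static unsigned midr_revision(uint64_t midr)
{
    return (unsigned)(midr & MIDR_REVISION_MASK);
}

/* Hypothetical model of the builtin: 1 if `cpu` is the same part as `want`
 * and its rNpM version is greater than or equal to the one in `want`. */
static int cpuid_at_least(uint64_t want, uint64_t cpu)
{
    uint64_t id_fields = ~(MIDR_VARIANT_MASK | MIDR_REVISION_MASK);

    if ((want & id_fields) != (cpu & id_fields))
        return 0;                                   /* different part */
    if (midr_variant(cpu) != midr_variant(want))
        return midr_variant(cpu) > midr_variant(want);
    return midr_revision(cpu) >= midr_revision(want);
}

/* The commented formula: (>= N2 r0p3) OR NOT (any version of N2). */
static int workaround_not_needed(uint64_t cpu)
{
    return cpuid_at_least(0x410fd493, cpu) | (cpuid_at_least(0x410fd490, cpu) ^ 1);
}
```

Under these assumptions the expression is 0 only for N2 parts older than r0p3, and 1 for N2 r0p3 or newer and for any non-N2 part.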

Signed-off-by: James Clark <[email protected]>
Reviewed-by: John Garry <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Eduard Zingerman <[email protected]>
Cc: Haixin Yu <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jing Zhang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nick Forrington <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Sohom Datta <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
(cherry picked from commit d43f549)
Signed-off-by: Brad Figg <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Acked-by: Noah Wager <[email protected]>
…rm telemetry repo

BugLink: https://bugs.launchpad.net/bugs/2063461

Apart from some slight naming and grouping differences, the new metrics
are functionally the same as the existing ones. Any missing metrics were
manually appended to the end of the auto generated file.

For the events, the new data includes descriptions that may have product
specific details and new groupings that will be consistent with other
products.

After generating the metrics from the telemetry repo [1], the following
manual steps were performed:

 * Change the topdown expressions to compare on CPUID and use
   #slots so that the same data can be shared between N2 and V2. Apart
   from these modifications, the expressions now match more closely with
   the Arm telemetry data which will hopefully make future updates
   easier.

 * Append some metrics from the old N2/V2 data that aren't present in
   the telemetry data. These will possibly be added to the
   telemetry-solution repo at a later time:

    l3d_cache_mpki, l3d_cache_miss_rate, branch_pki, ipc_rate, spec_ipc,
    retired_rate, wasted_rate, branch_immed_spec_rate,
    branch_return_spec_rate, branch_indirect_spec_rate

[1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/blob/main/data/pmu/cpu/neoverse/neoverse-n2.json

Signed-off-by: James Clark <[email protected]>
Reviewed-by: John Garry <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Eduard Zingerman <[email protected]>
Cc: Haixin Yu <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jing Zhang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nick Forrington <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Sohom Datta <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
(cherry picked from commit 4473949)
Signed-off-by: Brad Figg <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Acked-by: Noah Wager <[email protected]>
…ion before access.""

BugLink: https://bugs.launchpad.net/bugs/2064549

This reverts commit 9cfa409.

This effectively restores the functionality added by:
b2b56a1 ("gpio: tegra186: Check GPIO pin permission before access.")

We restore this and, now that the patch has been upstreamed, will apply
the proper fix atop.

Signed-off-by: Jamie Nguyen <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Noah Wager <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2064549

The controller has several register bits describing access control
information for a given GPIO pin. When SCR_SEC_[R|W]EN is unset, it
means we have full read/write access to all the registers for the given
GPIO pin. When SCR_SEC_[R|W]EN is set, it means we need to further check
the accompanying SCR_SEC_G1[R|W] bit to determine read/write access to
all the registers for the given GPIO pin.

This check was previously declaring that a GPIO pin was accessible
only if either of the following conditions were met:

  - SCR_SEC_REN + SCR_SEC_WEN both unset

    or

  - SCR_SEC_REN + SCR_SEC_WEN both set and
    SCR_SEC_G1R + SCR_SEC_G1W both set

Update the check to properly handle cases where only one of
SCR_SEC_REN or SCR_SEC_WEN is set.
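
A minimal sketch of a check that treats the read and write sides independently, following the rule stated above. The bit positions and the name gpio_is_accessible_sketch() are illustrative assumptions and should not be read as the driver's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define SCR_SEC_WEN (1u << 28)
#define SCR_SEC_REN (1u << 27)
#define SCR_SEC_G1W (1u << 9)
#define SCR_SEC_G1R (1u << 1)

/* A pin is accessible when, for each direction, either the security
 * enable bit is unset or the matching G1 permission bit is set. */
static bool gpio_is_accessible_sketch(uint32_t scr)
{
    bool can_read  = !(scr & SCR_SEC_REN) || (scr & SCR_SEC_G1R);
    bool can_write = !(scr & SCR_SEC_WEN) || (scr & SCR_SEC_G1W);

    return can_read && can_write;
}
```

With this shape, a value with only SCR_SEC_REN and SCR_SEC_G1R set is reported as accessible, which is the "only one of SCR_SEC_REN or SCR_SEC_WEN is set" case the commit describes.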

Fixes: b2b56a1 ("gpio: tegra186: Check GPIO pin permission before access.")
Signed-off-by: Prathamesh Shete <[email protected]>
Acked-by: Thierry Reding <[email protected]>
(cherry-picked from commit d806f474a9a7993648a2c70642ee129316d8deff linux-next)
Signed-off-by: Jamie Nguyen <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Noah Wager <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://nvbugspro.nvidia.com/bug/4601764

NVIDIA provides a way to flash the UEFI firmware via the capsule loader on arm64.
CAPSULE_LOADER is also built in to the L4T kernel,
so for ease of use, CAPSULE_LOADER needs to be built in on arm64 as well.

Signed-off-by: kobak <[email protected]>
KobaKoNvidia reopened this May 9, 2024
KobaKoNvidia and others added 3 commits May 22, 2024 07:24
BugLink: https://nvbugspro.nvidia.com/bug/4601764

NVIDIA provides a way to flash the UEFI firmware via the capsule loader on arm64.
CAPSULE_LOADER is also built in to the L4T kernel,
so for ease of use, CAPSULE_LOADER needs to be built in on arm64 as well.

Signed-off-by: kobak <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Acked-by: Noah Wager <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
BugLink: https://bugs.launchpad.net/bugs/2065721

While calculating the hardware interrupt number for an MSI interrupt, the
higher bits (i.e. from bit-5 onwards, a.k.a. domain_nr >= 32) of the PCI
domain number get truncated because the shifted value is cast to the return
type of pci_domain_nr(), which is 'int'. For example, this results in the
same hardware interrupt number for devices 0019:00:00.0 and 0039:00:00.0.

To address this, cast the PCI domain number to 'irq_hw_number_t' before left
shifting it to calculate the hardware interrupt number.

Please note that this fixes the issue only on 64-bit systems and doesn't
change the behavior for 32-bit systems, i.e. 32-bit systems continue to
have the issue. This is acceptable because the issue surfaces only if there
are many PCIe controllers in the system, which is usually the case in modern
server systems, and those don't tend to run 32-bit kernels.
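
The truncation can be reproduced in a few lines of standalone C. The shift of 27 is an assumption chosen to be consistent with "from bit-5 onwards" being lost in a 32-bit value; unsigned 32-bit arithmetic stands in for the 'int' result, and the function names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_DOMAIN_SHIFT 27   /* assumption: domain number starts at bit 27 */

/* Shift performed in 32 bits: domain bits 5 and above fall off the top. */
static uint32_t domain_bits_truncated(uint32_t domain_nr)
{
    return domain_nr << SKETCH_DOMAIN_SHIFT;
}

/* Widen first, then shift: all domain bits survive in a 64-bit result. */
static uint64_t domain_bits_widened(uint32_t domain_nr)
{
    return (uint64_t)domain_nr << SKETCH_DOMAIN_SHIFT;
}
```

For domains 0x19 and 0x39, the 32-bit variant yields the same value for both, while the widened variant keeps them distinct, which matches the collision between 0019:00:00.0 and 0039:00:00.0 described above.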

Fixes: 3878eae ("PCI/MSI: Enhance core to support hierarchy irqdomain")
Signed-off-by: Vidya Sagar <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Shanker Donthineni <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
(cherry-picked from commit db744dd)
Signed-off-by: Jamie Nguyen <[email protected]>
Acked-by: Brad Figg <[email protected]>
Acked-by: Jacob Martin <[email protected]>
Acked-by: Noah Wager <[email protected]>
Signed-off-by: Brad Figg <[email protected]>
nvidia-bfigg pushed a commit that referenced this pull request May 25, 2024
BugLink: https://bugs.launchpad.net/bugs/2059284

With latest clang18, I hit test_progs failures for the following tests:

  #13/2    bpf_cookie/multi_kprobe_link_api:FAIL
  #13/3    bpf_cookie/multi_kprobe_attach_api:FAIL
  #13      bpf_cookie:FAIL
  #75      fentry_fexit:FAIL
  #76/1    fentry_test/fentry:FAIL
  #76      fentry_test:FAIL
  #80/1    fexit_test/fexit:FAIL
  #80      fexit_test:FAIL
  #110/1   kprobe_multi_test/skel_api:FAIL
  #110/2   kprobe_multi_test/link_api_addrs:FAIL
  #110/3   kprobe_multi_test/link_api_syms:FAIL
  #110/4   kprobe_multi_test/attach_api_pattern:FAIL
  #110/5   kprobe_multi_test/attach_api_addrs:FAIL
  #110/6   kprobe_multi_test/attach_api_syms:FAIL
  #110     kprobe_multi_test:FAIL

For example, for #13/2, the error messages are:

  [...]
  kprobe_multi_test_run:FAIL:kprobe_test7_result unexpected kprobe_test7_result: actual 0 != expected 1
  [...]
  kprobe_multi_test_run:FAIL:kretprobe_test7_result unexpected kretprobe_test7_result: actual 0 != expected 1

clang17 does not have this issue.

Further investigation shows that kernel func bpf_fentry_test7(), used in
the above tests, is inlined by the compiler although it is marked as
noinline.

  int noinline bpf_fentry_test7(struct bpf_fentry_test_t *arg)
  {
        return (long)arg;
  }

It is known that for simple functions like the above (e.g. just returning
a constant or an input argument), the clang compiler may still do inlining
for a noinline function. Adding 'asm volatile ("")' in the beginning of the
bpf_fentry_test7() can prevent inlining.
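
A standalone sketch of the described workaround, assuming GCC or Clang. The struct body and the function name fentry_test7_sketch() are placeholders so the snippet compiles outside the kernel tree; only the empty asm statement at the top of the body reflects the change discussed here.

```c
/* Kernel-style shorthand; assumes a GCC-compatible compiler. */
#define noinline __attribute__((noinline))

/* Placeholder type so the sketch is self-contained. */
struct bpf_fentry_test_t {
    struct bpf_fentry_test_t *a;
};

int noinline fentry_test7_sketch(struct bpf_fentry_test_t *arg)
{
    /* Empty asm statement: the compiler cannot see through it, so the
     * body is no longer a trivially foldable "return the argument". */
    __asm__ __volatile__("");
    return (int)(long)arg;
}
```

The `__asm__ __volatile__` spelling is used instead of `asm volatile` only so that the sketch also builds in strict ISO C modes.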

Signed-off-by: Yonghong Song <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Tested-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
(cherry picked from commit 32337c0)
Signed-off-by: Manuel Diewald <[email protected]>
Signed-off-by: Stefan Bader <[email protected]>
nvidia-bfigg pushed a commit that referenced this pull request May 25, 2024
BugLink: https://bugs.launchpad.net/bugs/2059284

[ Upstream commit b16904f ]

With latest upstream llvm18, the following test cases failed:

  $ ./test_progs -j
  #13/2    bpf_cookie/multi_kprobe_link_api:FAIL
  #13/3    bpf_cookie/multi_kprobe_attach_api:FAIL
  #13      bpf_cookie:FAIL
  #77      fentry_fexit:FAIL
  #78/1    fentry_test/fentry:FAIL
  #78      fentry_test:FAIL
  #82/1    fexit_test/fexit:FAIL
  #82      fexit_test:FAIL
  #112/1   kprobe_multi_test/skel_api:FAIL
  #112/2   kprobe_multi_test/link_api_addrs:FAIL
  [...]
  #112     kprobe_multi_test:FAIL
  #356/17  test_global_funcs/global_func17:FAIL
  #356     test_global_funcs:FAIL

Further analysis shows llvm upstream patch [1] is responsible for the above
failures. For example, for function bpf_fentry_test7() in net/bpf/test_run.c,
without [1], the asm code is:

  0000000000000400 <bpf_fentry_test7>:
     400: f3 0f 1e fa                   endbr64
     404: e8 00 00 00 00                callq   0x409 <bpf_fentry_test7+0x9>
     409: 48 89 f8                      movq    %rdi, %rax
     40c: c3                            retq
     40d: 0f 1f 00                      nopl    (%rax)

... and with [1], the asm code is:

  0000000000005d20 <bpf_fentry_test7.specialized.1>:
    5d20: e8 00 00 00 00                callq   0x5d25 <bpf_fentry_test7.specialized.1+0x5>
    5d25: c3                            retq

... and <bpf_fentry_test7.specialized.1> is called instead of <bpf_fentry_test7>
and this caused test failures for #13/#77 etc. except #356.

For test case #356/17, with [1] (progs/test_global_func17.c), the main prog
looks like:

  0000000000000000 <global_func17>:
       0:       b4 00 00 00 2a 00 00 00 w0 = 0x2a
       1:       95 00 00 00 00 00 00 00 exit

... which passed verification while the test itself expects a verification
failure.

Let us add 'barrier_var' style asm code in both places to prevent function
specialization which caused selftests failure.

  [1] llvm/llvm-project#72903
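
For readers unfamiliar with the idiom, a 'barrier_var' style statement can be sketched as below, assuming GCC or Clang. The "+r" constraint tells the compiler that the asm may read and modify the variable, so its value can no longer be treated as a known constant and a specialized clone of the function becomes less attractive. The function identity_sketch() is hypothetical and is not the code touched by this commit.

```c
/* Empty asm with a read-write register operand: emits no instructions,
 * but makes the value of `var` opaque to the optimizer. */
#define barrier_var(var) __asm__ __volatile__("" : "+r"(var))

__attribute__((noinline)) long identity_sketch(long arg)
{
    barrier_var(arg);   /* the compiler must assume arg may have changed */
    return arg;
}
```

At run time the value is unchanged; the statement only affects what the compiler is allowed to assume.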

Signed-off-by: Yonghong Song <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Manuel Diewald <[email protected]>
Signed-off-by: Stefan Bader <[email protected]>