Support for the Xilinx ZCU102 #8

Closed: Ivan-Velickovic wants to merge 6 commits from zcu102_support

Ivan-Velickovic
Collaborator

Current status: Linux partially boots and then faults for a reason I have yet to discover.

VMM|INFO: starting guest at 0x10000000, DTB at 0x3f000000, initial RAM disk at 0x3d700000
[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
[    0.000000] Linux version 6.1.0 (ivanv@elementaryOStsdesktop) (aarch64-none-elf-gcc (GNU Toolchain for the A-profile Architecture 10.2-2020.11 (arm-10.16)) 10.2.1 20201103, GNU ld (GNU Toolchain for the A-profile Architecture 10.2-2020.11 (arm-10.16)) 2.35.1.20201028) #5 SMP PREEMPT Thu Jun 29 15:47:33 AEST 2023
[    0.000000] Machine model: ZynqMP ZCU102 Rev1.1
[    0.000000] efi: UEFI not found.
[    0.000000] earlycon: cdns0 at MMIO 0x00000000ff000000 (options '115200n8')
[    0.000000] printk: bootconsole [cdns0] enabled
[    0.000000] NUMA: No NUMA configuration found
[    0.000000] NUMA: Faking a node at [mem 0x0000000000000000-0x000000003fffffff]
[    0.000000] NUMA: NODE_DATA [mem 0x3fdd0a00-0x3fdd2fff]
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000000000-0x000000003fffffff]
[    0.000000]   DMA32    empty
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x000000003fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000003fffffff]
[    0.000000] cma: Reserved 32 MiB at 0x000000003b600000
[    0.000000] psci: probing for conduit method from DT.
[    0.000000] psci: PSCIv1.2 detected in firmware.
[    0.000000] psci: Using standard PSCI v0.2 function IDs
[    0.000000] psci: Trusted OS migration not required
[    0.000000] psci: SMC Calling Convention v1.0
[    0.000000] percpu: Embedded 19 pages/cpu s48424 r0 d29400 u77824
[    0.000000] pcpu-alloc: s48424 r0 d29400 u77824 alloc=19*4096
[    0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3
[    0.000000] Detected VIPT I-cache on CPU0
[    0.000000] CPU features: detected: ARM erratum 843419
[    0.000000] CPU features: detected: ARM erratum 845719
[    0.000000] alternatives: applying boot alternatives
[    0.000000] Fallback order for Node 0: 0
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 258048
[    0.000000] Policy zone: DMA
[    0.000000] Kernel command line: earlycon debug loglevel=8 initcall_debug earlyprintk=serial
[    0.000000] Unknown kernel command line parameters "earlyprintk=serial", will be passed to user space.
[    0.000000] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
[    0.000000] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 931252K/1048576K available (29184K kernel code, 5564K rwdata, 16716K rodata, 10752K init, 672K bss, 84556K reserved, 32768K cma-reserved)
VMM|ERROR: unexpected memory fault on address: 0x20, FSR: 0x93d48046, IP: 0xffff8000082647e8, is_prefetch: false, is_write: true
VMM|INFO: VCPU registers:
    SCTLR: 0x200000034f4d91d
    TTBR0: 0x12ced000
    TTBR1: 0x12cee000
    TCR:   0x400032b5503510
    MAIR:  0x40044ffff
    AMAIR: 0x0
    CIDR:  0x0
    ACTLR: 0x0
    CPACR: 0x300000
    AFSR0: 0x0
    AFSR1: 0x0
    ESR:   0x0
    FAR:   0x0
    ISR:   0x0
    VBAR:  0xffff800008010800
    TPIDR_EL1: 0xffff800034ead000
    SP_EL1: 0xffff80000b773ce0
    ELR_EL1: 0x12cf0008
    SPSR_EL1: 0x3c5
    CNTV_CTL: 0x0
    CNTV_CVAL: 0x0
    CNTVOFF: 0x538b5b6
    CNTKCTL_EL1: 0x1
VMM|ERROR: Failed to handle virtual memory fault
<<seL4(CPU 0) [receiveIPC/142 T0x8000003400 "child of: 'rootserver'" @207124]: Reply object already has unexecuted reply!>>
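
For reference, the FSR value in the fault above can be decoded by hand. The sketch below assumes the VMM prints the raw AArch64 ESR syndrome as `FSR` and uses the data-abort field layout from the Arm architecture manual; `decode_data_abort` and the struct are illustrative names, not part of libvmm or seL4.

```c
#include <stdbool.h>
#include <stdint.h>

/* Selected ESR fields for a data abort (illustrative struct, not a libvmm type). */
struct data_abort_info {
    uint8_t ec;       /* exception class, bits [31:26] */
    bool isv;         /* bit 24: the size/register fields below are valid */
    uint8_t size;     /* access size in bytes, derived from SAS bits [23:22] */
    uint8_t reg;      /* SRT, bits [20:16]: general-purpose register transferred */
    bool is_write;    /* WnR, bit 6: true for a write */
    uint8_t dfsc;     /* data fault status code, bits [5:0] */
};

static struct data_abort_info decode_data_abort(uint32_t esr)
{
    struct data_abort_info info;

    info.ec = (esr >> 26) & 0x3f;
    info.isv = (esr >> 24) & 0x1;
    info.size = (uint8_t)(1u << ((esr >> 22) & 0x3));
    info.reg = (esr >> 16) & 0x1f;
    info.is_write = (esr >> 6) & 0x1;
    info.dfsc = esr & 0x3f;

    return info;
}
```

For 0x93d48046 this gives EC 0x24 (data abort from a lower exception level), a valid syndrome, an 8-byte write from x20, and DFSC 0x06 (translation fault, level 2). That matches is_write: true and suggests the guest touched an address with no stage-2 mapping. The boot log reports RAM starting at 0x0, so it may be worth checking whether the VMM maps that range, assuming the reported address is guest-physical.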

cc: @sitestudio

@Ivan-Velickovic
Collaborator Author

The Linux kernel at least boots now. Next we have to figure out why Linux user-space isn't working:

[    9.858543] Btrfs loaded, crc32c=crc32c-generic, zoned=no, fsverity=no
[   10.549905] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[   10.641286] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[   10.666106] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[   10.669282] cfg80211: failed to load regulatory.db
[   10.673463] ALSA device list:
[   10.673825]   No soundcards found.
[   10.678710] Warning: unable to open an initial console.
[   10.812887] Freeing unused kernel memory: 10752K
[   10.825434] Run /init as init process
[   10.827976]   with arguments:
[   10.828846]     /init
[   10.829606]   with environment:
[   10.830497]     HOME=/
[   10.835214]     TERM=linux
[   10.836235]     earlyprintk=serial
[   11.032427] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
[   11.034551] CPU: 0 PID: 1 Comm: init Not tainted 6.1.0 #5
[   11.035976] Hardware name: ZynqMP ZCU102 Rev1.1 (DT)
[   11.037005] Call trace:
[   11.037431]  dump_backtrace.part.0+0xd8/0xe4
[   11.039199]  show_stack+0x14/0x40
[   11.039546]  dump_stack_lvl+0x64/0x7c
[   11.039936]  dump_stack+0x14/0x2c
[   11.040423]  panic+0x180/0x340
[   11.040765]  make_task_dead+0x0/0x100
[   11.041149]  do_group_exit+0x30/0x8c
[   11.041483]  __wake_up_parent+0x0/0x2c
[   11.041988]  invoke_syscall+0x44/0x110
[   11.042353]  el0_svc_common.constprop.0+0x40/0xe0
[   11.042780]  do_el0_svc+0x28/0xc0
[   11.043110]  el0_svc+0x1c/0x50
[   11.043380]  el0t_64_sync_handler+0xb0/0xb4
[   11.043891]  el0t_64_sync+0x15c/0x160
[   11.046003] Kernel Offset: disabled
[   11.046456] CPU features: 0x00000,00c00080,0000421b
[   11.047594] Memory Limit: none
[   11.048818] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00 ]---
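
The exitcode in the panic is a wait status, not the raw exit code. A minimal sketch of unpacking it, assuming the usual Linux encoding (low 7 bits are the terminating signal, bits 8 to 15 the exit code); the helper names are made up for illustration:

```c
#include <stdbool.h>

/* Terminating signal number, or 0 if the process exited normally. */
static int wait_term_signal(int status)
{
    return status & 0x7f;
}

/* True if the process called exit() rather than being killed by a signal. */
static bool wait_exited(int status)
{
    return wait_term_signal(status) == 0;
}

/* Exit code passed to exit(); only meaningful when wait_exited() is true. */
static int wait_exit_code(int status)
{
    return (status >> 8) & 0xff;
}
```

For 0x00007f00 this gives a normal exit with code 127. Shells conventionally return 127 when a command cannot be found, so it may be worth checking what /init tries to execute; the earlier "unable to open an initial console" warning might also be related.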

@Ivan-Velickovic
Collaborator Author

I'm assuming the failure comes from the zynqmp_firmware driver failing to probe:

[    0.909962] calling  zynqmp_firmware_driver_init+0x0/0x2c @ 1
VMM|INFO: Handling SiP call with function number: 63
VMM|INFO: Handling SiP call with function number: 1
[    0.911086] zynqmp_firmware: probe of firmware:zynqmp-firmware failed with error -22
[    0.912231] initcall zynqmp_firmware_driver_init+0x0/0x2c returned 0 after 2162 usecs
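
The function numbers printed by the VMM are the low 16 bits of the SMC function ID. A sketch of splitting an ID into its fields, following the Arm SMC Calling Convention; the struct and function names are illustrative, and the example IDs in the note below assume Linux's 0xC2000000 SiP base for ZynqMP, which is not shown in the log:

```c
#include <stdbool.h>
#include <stdint.h>

#define SMCCC_OWNER_SIP 2u /* owning entity number for SiP service calls */

/* Fields of a 32-bit SMC function ID (illustrative struct). */
struct smc_function_id {
    bool fast_call;    /* bit 31: fast call rather than yielding call */
    bool smc64;        /* bit 30: SMC64 calling convention */
    uint8_t owner;     /* bits [29:24]: owning entity (2 = SiP) */
    uint16_t function; /* bits [15:0]: function number within the owner */
};

static struct smc_function_id smc_decode(uint32_t id)
{
    struct smc_function_id f;

    f.fast_call = (id >> 31) & 0x1;
    f.smc64 = (id >> 30) & 0x1;
    f.owner = (id >> 24) & 0x3f;
    f.function = id & 0xffff;

    return f;
}
```

With that base, 0xC200003F decodes to a fast SMC64 SiP call with function number 63, and 0xC2000001 to function number 1. In mainline Linux's xlnx-zynqmp.h these appear to be PM_FEATURE_CHECK and PM_GET_API_VERSION, but that should be checked against the 6.1 source.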

@Ivan-Velickovic
Collaborator Author

Well, it looks like the Microkit SDK changes I made for SMC forwarding were wrong, which is weird as seL4 should have at least printed an error. So that's one thing to investigate.

The next thing to investigate: now that the SMC forwarding is fixed, seL4 halts for some reason when the VMM makes the system call to perform the SMC:

halting...
Kernel entry via Syscall, number: 1, Call
Cap type: 25, Invocation tag: 40

@sitestudio

Not sure if you are already past this one:

I added some extra print statements and set CONFIG_ZYNQMP_FIRMWARE_DEBUG=y in linux_config, and am now getting an seL4_IllegalOperation.

VMM|INFO: Handling SiP call with function number: 63 0
VMM|INFO: PRE seL4_ARM_SMC_Call
<<seL4(CPU 0) [decodeSchedContextInvocation/278 T0x8000003400 "child of: 'rootserver'" @203d9c]: seL4_IllegalOperation SchedContext invocation: Illegal operation attempted.>>
VMM|INFO: POST seL4_ARM_SMC_Call
VMM|INFO: PRE fault_advance_vcpu 0
VMM|INFO: POST fault_advance_vcpu 0
VMM|INFO: POST SUCCESS
VMM|INFO: returning TRUE
VMM|INFO: Handling SiP call with function number: 1 0
VMM|INFO: PRE seL4_ARM_SMC_Call
<<seL4(CPU 0) [decodeSchedContextInvocation/278 T0x8000003400 "child of: 'rootserver'" @203d9c]: seL4_IllegalOperation SchedContext invocation: Illegal operation attempted.>>
VMM|INFO: POST seL4_ARM_SMC_Call
VMM|INFO: PRE fault_advance_vcpu 0
VMM|INFO: POST fault_advance_vcpu 0
VMM|INFO: POST SUCCESS
VMM|INFO: returning TRUE
[ 1.048455] zynqmp_firmware: probe of firmware:zynqmp-firmware failed with error -22
[ 1.049461] initcall zynqmp_firmware_driver_init+0x0/0x2c returned 0 after 2746 usecs

@Ivan-Velickovic
Collaborator Author

> Not sure if you are already past this one:
>
> I added some extra print statements and set CONFIG_ZYNQMP_FIRMWARE_DEBUG=y in linux_config, and am now getting an seL4_IllegalOperation.

Thanks, it looks like that error is from me messing up the SMC cap number, which is now fixed in Ivan-Velickovic/microkit@a1165fa. That got me to the latest update, which is seL4 printing halting... I haven't investigated further yet but will do so soon.

@Ivan-Velickovic
Collaborator Author

Ivan-Velickovic commented Oct 3, 2023

Okay, so on hardware we get into user-space and can log into the Buildroot/BusyBox CLI.

The seL4 halting that I observed only happens on QEMU, not on real hardware. I don't have the time to investigate, but the first thing I would check is whether this Linux kernel works on QEMU at all (without seL4 virtualisation). I assume it doesn't.

There are a couple of remaining issues before this can be merged:

  • An IRQ is failing to be injected: VMM|ERROR: Failed to inject vIRQ 0x5 into vGIC on vCPU 0x0. I don't know what hardware this IRQ corresponds to and couldn't find anything in the DTS. I assume the injection fails because we aren't registering this IRQ with the vGIC. The main question is why Linux tries to get this IRQ injected.
  • We should stop changing the default linux.dts and instead put the chosen node (with comments) and any disabled devices in a DTS overlay. The default DTS is too large and incomprehensible for someone to see later which changes were needed to get virtualisation working.
  • Sort out SMC forwarding support properly, probably in a separate PR. The current support works but makes too many assumptions.
  • Add comments to simple.system (e.g. why SMC forwarding is needed).
  • Check that the images README is correct.
  • Update the main README with the correct SDK version, as SMC forwarding needs a new version of Microkit.
  • With zig build, Linux fails to enter user-space. Needs investigation.
  • Others have said that ZCU102 and similar platforms need SMC forwarding for Linux to boot, but none have said why. I'd like to understand why and what Linux is actually doing with these SiP calls into firmware.
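
On the vIRQ 0x5 item: GIC interrupt IDs are split into fixed ranges, which may explain why nothing in the DTS matches. A small sketch of the classification, using the GICv2 ranges; the enum and function names are illustrative only:

```c
#include <stdint.h>

enum gic_irq_class {
    GIC_SGI,   /* software generated interrupts, IDs 0 to 15 */
    GIC_PPI,   /* private peripheral interrupts, IDs 16 to 31 */
    GIC_SPI,   /* shared peripheral interrupts, IDs 32 to 1019 */
    GIC_OTHER  /* special or reserved IDs, 1020 and above */
};

static enum gic_irq_class gic_classify(uint32_t intid)
{
    if (intid < 16) {
        return GIC_SGI;
    }
    if (intid < 32) {
        return GIC_PPI;
    }
    if (intid < 1020) {
        return GIC_SPI;
    }
    return GIC_OTHER;
}
```

ID 0x5 falls in the SGI range. SGIs are raised by software writing to the GIC, typically for inter-processor interrupts, and are not wired to a device, so they would not show up as a device interrupt in the DTS. Whether that is what happens here is a guess and still needs confirming.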

@Ivan-Velickovic Ivan-Velickovic marked this pull request as ready for review October 3, 2023 04:15
@Ivan-Velickovic Ivan-Velickovic force-pushed the zcu102_support branch 3 times, most recently from f7dac35 to 6ac1554 Compare October 3, 2023 06:36
@Ivan-Velickovic Ivan-Velickovic changed the title WIP: initial support for the Xilinx ZCU102 Support for the Xilinx ZCU102 Oct 3, 2023
@@ -90,6 +91,69 @@ static void smc_set_arg(seL4_UserContext *u, size_t arg, size_t val)
}
}

static void dump_smc_request(seL4_ARM_SMCContext *request) {
Collaborator Author

Should be smc_print_request to be consistent with the TCB and VCPU functions.

Ivan-Velickovic and others added 6 commits January 16, 2024 14:38
These devices are not passed through for the simple example.
This commit should be changed to instead disable these devices
in the overlay and leave the base `linux.dts` file unchanged
from Linux source code.
@chrisguikema

chrisguikema commented Apr 16, 2024

@Ivan-Velickovic I was looking through the LionsOS source and noticed this PR.

> Others have said that ZCU102 and similar platforms need SMC forwarding for Linux to boot, but none have said why. I'd like to understand why and what Linux is actually doing with these SiP calls into firmware.

I did notice something when running a 2018.3 Petalinux on the CAmkES-VM. If the pinctrl nodes were specified in the guest device tree, Linux would attempt to use SMC calls to configure the hardware resources. When those nodes weren't included, no SMC calls were made and we didn't need to provide the VMM with any SMC capabilities.

I noticed this when using the kernel DTB as a base vs. a DTB generated by Petalinux.

https://github.com/seL4/seL4/blob/master/tools/dts/zynqmp.dts#L990

https://github.com/seL4/camkes-vm-images/blob/master/zynqmp/2018_3/linux.dts#L1899

Might be something worth looking into. Obviously, for any more complex passthrough devices, the pinctrl node is probably necessary, but for a minimal system you should be able to get away with no SMC calls.

@Furao

Furao commented Aug 9, 2024

> @Ivan-Velickovic I was looking through the LionsOS source and noticed this PR.
>
> > Others have said that ZCU102 and similar platforms need SMC forwarding for Linux to boot, but none have said why. I'd like to understand why and what Linux is actually doing with these SiP calls into firmware.
>
> I did notice something when running a 2018.3 Petalinux on the CAmkES-VM. If the pinctrl nodes were specified in the guest device tree, Linux would attempt to use SMC calls to configure the hardware resources. When those nodes weren't included, no SMC calls were made and we didn't need to provide the VMM with any SMC capabilities.
>
> I noticed this when using the kernel DTB as a base vs. a DTB generated by Petalinux.
>
> https://github.com/seL4/seL4/blob/master/tools/dts/zynqmp.dts#L990
>
> https://github.com/seL4/camkes-vm-images/blob/master/zynqmp/2018_3/linux.dts#L1899
>
> Might be something worth looking into. Obviously, for any more complex passthrough devices, the pinctrl node is probably necessary, but for a minimal system you should be able to get away with no SMC calls.

For more specifics on why, see this doc: https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/18842107/Arm+Trusted+Firmware

Functions like power management, clocking, pin control, and others are mainly handled by co-processors that are accessed via ATF. If a Linux device driver needs to change any of these settings during initialisation, the driver makes an SMC call.
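
Building on that, one way to make SMC forwarding less assumption-heavy would be for the VMM to forward only the SiP calls it explicitly allows. This is a rough sketch, not what the PR does: the bit layout is from the Arm SMC Calling Convention, while `smc_should_forward` and the allowlist contents (just the two function numbers seen in the logs earlier in this thread) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SMC_OWNER(id)    (((id) >> 24) & 0x3fu) /* bits [29:24] */
#define SMC_FUNCTION(id) ((id) & 0xffffu)       /* bits [15:0] */
#define SMC_OWNER_SIP    2u

/* Hypothetical allowlist of SiP function numbers the guest may issue. */
static const uint16_t sip_allowed[] = { 1, 63 };

/* Decide whether the VMM should pass this SMC through to firmware. */
static bool smc_should_forward(uint32_t function_id)
{
    if (SMC_OWNER(function_id) != SMC_OWNER_SIP) {
        /* Not a SiP call (e.g. PSCI), so it is not handled by this path. */
        return false;
    }

    uint16_t fn = (uint16_t)SMC_FUNCTION(function_id);
    for (size_t i = 0; i < sizeof(sip_allowed) / sizeof(sip_allowed[0]); i++) {
        if (sip_allowed[i] == fn) {
            return true;
        }
    }
    return false;
}
```

Anything not on the list would be rejected or emulated by the VMM instead of reaching ATF, which also gives one place to log which firmware services a given guest device tree actually needs.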

@Ivan-Velickovic
Collaborator Author

This is fairly out of date, and the PR does a couple of other things in addition to supporting the ZCU102.

@Ivan-Velickovic Ivan-Velickovic deleted the zcu102_support branch September 4, 2024 00:44

4 participants