Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supports XCVMEM extension instructions. #2

Open
wants to merge 10,000 commits into
base: corev-mcu-dev
Choose a base branch
from

Conversation

pz9115
Copy link

@pz9115 pz9115 commented Jul 22, 2024

Supports XCVMEM extension instructions on QEMU 9.0 version, please update the dev branch first,

commit see pz9115@265859e

rth7680 and others added 30 commits March 27, 2024 12:15
The return address comes from IA*Q_Next, and IASQ_Next
is always equal to IASQ_Back, not IASQ_Front.

Tested-by: Helge Deller <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Do not clobber the high bits of the address by using a 32-bit deposit.

Reviewed-by: Helge Deller <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Wide mode provides two more conditions, add them.

Fixes: 59963d8 ("target/hppa: Pass d to do_unit_cond")
Signed-off-by: Sven Schnelle <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Fixes: c53e401 ("target/hppa: Remove TARGET_REGISTER_BITS")
Signed-off-by: Sven Schnelle <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Reviewed-by: Helge Deller <[email protected]>
Tested-by: Helge Deller <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
The call to gen_helper_read_interval_timer is
identical on both sides of the IF.

Reviewed-by: Helge Deller <[email protected]>
Tested-by: Helge Deller <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Call translator_io_start before write to EIRR.
Move evaluation of EIRR vs EIEM to hppa_cpu_exec_interrupt.
Exit TB after write to EIEM, but otherwise use a straight store.

Reviewed-by: Helge Deller <[email protected]>
Tested-by: Helge Deller <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Move it to cpu.h, so it can also be used in hppa_form_gva_psw().

Signed-off-by: Sven Schnelle <[email protected]>
Reviewed-by: Helge Deller <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
The carry bits for each nibble N are located in bit (N+1)*4,
so the shift by 3 was off by one.  Furthermore, the carry bit
for the most significant carry bit is indeed located in bit 64,
which is located in a different storage word.

Use a double-word shift-right to reassemble into a single word
and place them all at bit 0 of their respective nibbles.

Tested-by: Helge Deller <[email protected]>
Reviewed-by: Helge Deller <[email protected]>
Fixes: b216745 ("target-hppa: Implement basic arithmetic")
Signed-off-by: Richard Henderson <[email protected]>
With r1 as zero is by far the most common usage of UADDCM, as the
easiest way to invert a register.  The compiler does occasionally
use the addition step as well, and we can simplify that to avoid
a temp and write directly into the destination.

Tested-by: Helge Deller <[email protected]>
Reviewed-by: Helge Deller <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Split do_unit_cond to do_unit_zero_cond to only handle conditions
versus zero.  These are the only ones that are legal for UXOR.
Simplify trans_uxor accordingly.

Rename do_unit to do_unit_addsub, since xor has been split.
Properly compute carry-out bits for add and subtract, mirroring
the code in do_add and do_sub.

Tested-by: Helge Deller <[email protected]>
Reviewed-by: Helge Deller <[email protected]>
Fixes: b216745 ("target-hppa: Implement basic arithmetic")
Signed-off-by: Richard Henderson <[email protected]>
The cond_need_ext predicate was created while we still had a
32-bit compilation mode.  It now makes more sense to treat D
as an absolute indicator of a 64-bit operation.

Tested-by: Helge Deller <[email protected]>
Reviewed-by: Helge Deller <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Prepare for proper indication of shladd unsigned overflow.
The UV indicator will be zero/not-zero instead of a single bit.

Tested-by: Helge Deller <[email protected]>
Reviewed-by: Helge Deller <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Overflow indicator should include the effect of the shift step.
We had previously left ??? comments about the issue.

Tested-by: Helge Deller <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
The local 9p driver in virtio-9p-test.c its temporary dir right at the
start of qos-test (via virtio_9p_create_local_test_dir()) and only
deletes it after qos-test is finished (via
virtio_9p_remove_local_test_dir()).

This means that any qos-test machine that ends up running virtio-9p-test
local tests more than once will end up re-using the same temp dir. This
is what's happening in [1] after we introduced the riscv machine nodes:
if we enable slow tests with the '-m slow' flag using
qemu-system-riscv64, this is what happens:

- a temp dir is created;

- virtio-9p-device tests will run virtio-9p-test successfully;

- virtio-9p-pci tests will run virtio-9p-test, and fail right at the
  first slow test at fs_create_dir() because the "01" file was already
created by fs_create_dir() test when running with the virtio-9p-device.

The root cause is that we're creating a single temporary dir, via the
construct/destruct callbacks, and this temp dir is kept for the entire
qos-test run.

We can change each test to clean after themselves. This approach would
make the 'create' tests obsolete since we would need to create and
delete dirs/files/symlinks for the cleanup, turning them into the
'unlinkat' tests that comes right after.

We chose a different approach that handles the root cause: do not use
constructor/destructor to create the temp dir. Create one temp dir for
each test, and remove it after the test is complete. This is the
approach taken for other qtests like vhost-user-test.c where each test
requires a setup() and a subsequent cleanup(), all of those instantiated
in the .before callback.

[1] https://mail.gnu.org/archive/html/qemu-devel/2024-03/msg05807.html

Reported-by: Thomas Huth <[email protected]>
Signed-off-by: Daniel Henrique Barboza <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Reviewed-by: Christian Schoenebeck <[email protected]>
Tested-by: Thomas Huth <[email protected]>
Signed-off-by: Christian Schoenebeck <[email protected]>
Commit 558f5c4 gated the local tests with g_test_slow() to skip them
in 'make check'. The reported issue back then was this following CI
problem:

https://lists.nongnu.org/archive/html/qemu-devel/2020-11/msg05510.html

This problem ended up being fixed after it was detected with the
recently added risc-v machine nodes [1]. virtio-9p-test.c is now
creating and removing temporary dirs for each test run, instead of
creating a single dir for the entire qos-test scope.

We're now able to run these tests with 'make check' in the CI, so let's
go ahead and re-enable them.

This reverts commit 558f5c4.

[1] https://mail.gnu.org/archive/html/qemu-devel/2024-03/msg05807.html

Signed-off-by: Daniel Henrique Barboza <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Reviewed-by: Christian Schoenebeck <[email protected]>
Tested-by: Thomas Huth <[email protected]>
Signed-off-by: Christian Schoenebeck <[email protected]>
virtio_net_guest_notifier_pending() and virtio_net_guest_notifier_mask()
checked VIRTIO_NET_F_MQ to know there are multiple queues, but
VIRTIO_NET_F_RSS also enables multiple queues. Refer to n->multiqueue,
which is set to true either of VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS is
enabled.

Fixes: 68b0a63 ("virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa")
Signed-off-by: Akihiko Odaki <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
The kernel documentation says:
> The value stored can be of any size, however, all array elements are
> aligned to 8 bytes.
https://www.kernel.org/doc/html/v6.8/bpf/map_array.html

Fixes: 333b3e5 ("ebpf: Added eBPF map update through mmap.")
Signed-off-by: Akihiko Odaki <[email protected]>
Acked-by: Andrew Melnychenko <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
It is incorrect to have the VIRTIO_NET_HDR_F_NEEDS_CSUM set when
checksum offloading is disabled so clear the bit.

TCP/UDP checksum is usually offloaded when the peer requires virtio
headers because they can instruct the peer to compute checksum. However,
igb disables TX checksum offloading when a VF is enabled whether the
peer requires virtio headers because a transmitted packet can be routed
to it and it expects the packet has a proper checksum. Therefore, it
is necessary to have a correct virtio header even when checksum
offloading is disabled.

A real TCP/UDP checksum will be computed and saved in the buffer when
checksum offloading is disabled. The virtio specification requires to
set the packet checksum stored in the buffer to the TCP/UDP pseudo
header when the VIRTIO_NET_HDR_F_NEEDS_CSUM bit is set so the bit must
be cleared in that case.

Fixes: ffbd2db ("e1000e: Perform software segmentation for loopback")
Buglink: https://issues.redhat.com/browse/RHEL-23067
Signed-off-by: Akihiko Odaki <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
Some of them are only necessary for POSIX systems. The others are
assigned to function pointers in NetClientInfo that can actually be
NULL.

Signed-off-by: Akihiko Odaki <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
…e()"

This reverts commit 46d4d36.

The reverted commit changed to emit warnings instead of errors when
vhost is requested but vhost initialization fails if vhostforce option
is not set.

However, vhostforce is not meant to ignore vhost errors. It was once
introduced as an option to commit 5430a28 ("vhost: force vhost off
for non-MSI guests") to force enabling vhost for non-MSI guests, which
will have worse performance with vhost. The option was deprecated with
commit 1e7398a ("vhost: enable vhost without without MSI-X") and
changed to behave identical with the vhost option for compatibility.

Worse, commit bf769f7 ("virtio: del net client if net_init_tap_one
failed") changed to delete the client when vhost fails even when the
failure only results in a warning. The leads to an assertion failure
for the -netdev command line option.

The reverted commit was intended to avoid that the vhost initialization
failure won't result in a corrupted netdev. This problem should have
been fixed by deleting netdev when the initialization fails instead of
ignoring the failure with an arbitrary option. Fortunately, commit
bf769f7 ("virtio: del net client if net_init_tap_one failed"),
mentioned earlier, implements this behavior.

Restore the correct semantics and fix the assertion failure for the
-netdev command line option by reverting the problematic commit.

Signed-off-by: Akihiko Odaki <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
This operation is trivial and does not require a helper.

Reviewed-by: Helge Deller <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Split trans_diag into per-operation functions.

Reviewed-by: Helge Deller <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
The 32-bit PA-7300LC (PCX-L2) CPU and the 64-bit PA8700 (PCX-W2) CPU
use different diag instructions to save or restore the CPU registers
to/from the shadow registers.

Implement those per-CPU architecture diag instructions to fix those
parts of the HP ODE testcases (L2DIAG and WDIAG, section 1) which test
the shadow registers.

Signed-off-by: Helge Deller <[email protected]>
[rth: Use decodetree to distinguish cases]
Signed-off-by: Richard Henderson <[email protected]>
Reviewed-by: Helge Deller <[email protected]>
Tested-by: Helge Deller <[email protected]>
Along this path we have already skipped the insn to be
nullified, so the subsequent insn should be executed.

Cc: [email protected]
Reported-by: Sven Schnelle <[email protected]>
Tested-by: Sven Schnelle <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
The 'sign' computation is attempting to locate the sign bit that has
been repeated, so that we can test if that bit is known zero.  That
computation can be zero if there are no known sign repetitions.

Cc: [email protected]
Fixes: 93a967f ("tcg/optimize: Propagate sign info for shifting")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2248
Signed-off-by: Richard Henderson <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Fixes: 83b4613 ("disas: introduce show_opcodes")
Signed-off-by: Richard Henderson <[email protected]>
Using log_pc produces the pc at the beginning of TB,
not the actual pc installed by cpu_restore_state_from_tb,
which could be any of the guest instructions within TB.

Signed-off-by: Richard Henderson <[email protected]>
Check for flag bit in H_GUEST_GETSET_STATE_FLAG_GUEST_WIDE need to use
bitwise NOT operator to ensure no other flag bits are set.

Resolves: Coverity CID 1540008
Resolves: Coverity CID 1540009
Reported-by: Peter Maydell <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off by: Harsh Prateek Bora <[email protected]>
Signed-off-by: Nicholas Piggin <[email protected]>
"sysemu/tcg.h" declares tcg_enabled(), and is implicitly included.
Include it explicitly to avoid the following error when refactoring
headers:

  hw/ppc/spapr.c:2612:9: error: call to undeclared function 'tcg_enabled'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
    if (tcg_enabled()) {
        ^

Reviewed-by: Harsh Prateek Bora <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Signed-off-by: Nicholas Piggin <[email protected]>
hrw and others added 25 commits June 1, 2024 07:20
Cc: [email protected]
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2304
Reported-by: Marcin Juszkiewicz <[email protected]>
Signed-off-by: Richard Henderson <[email protected]>
Signed-off-by: Marcin Juszkiewicz <[email protected]>
Message-id: [email protected]
Reviewed-by: Peter Maydell <[email protected]>
Signed-off-by: Peter Maydell <[email protected]>
(cherry picked from commit daf9748)
Signed-off-by: Michael Tokarev <[email protected]>
Since qemu 8.2, the combination of NBD + TLS + iothread crashes on an
assertion failure:

qemu-kvm: ../io/channel.c:534: void qio_channel_restart_read(void *): Assertion `qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)' failed.

It turns out that when we removed AioContext locking, we did so by
having NBD tell its qio channels that it wanted to opt in to
qio_channel_set_follow_coroutine_ctx(); but while we opted in on the
main channel, we did not opt in on the TLS wrapper channel.
qemu-iotests has coverage of NBD+iothread and NBD+TLS, but apparently
no coverage of NBD+TLS+iothread, or we would have noticed this
regression sooner.  (I'll add that in the next patch)

But while we could manually opt in to the TLS channel in nbd/server.c
(a one-line change), it is more generic if all qio channels that wrap
other channels inherit the follow status, in the same way that they
inherit feature bits.

CC: Stefan Hajnoczi <[email protected]>
CC: Daniel P. Berrangé <[email protected]>
CC: [email protected]
Fixes: https://issues.redhat.com/browse/RHEL-34786
Fixes: 06e0f09 ("io: follow coroutine AioContext in qio_channel_yield()", v8.2.0)
Signed-off-by: Eric Blake <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
Message-ID: <[email protected]>
(cherry picked from commit 199e84d)
Signed-off-by: Michael Tokarev <[email protected]>
Prevent regressions when using NBD with TLS in the presence of
iothreads, adding coverage the fix to qio channels made in the
previous patch.

The shell function pick_unused_port() was copied from
nbdkit.git/tests/functions.sh.in, where it had all authors from Red
Hat, agreeing to the resulting relicensing from 2-clause BSD to GPLv2.

CC: [email protected]
CC: "Richard W.M. Jones" <[email protected]>
Signed-off-by: Eric Blake <[email protected]>
Message-ID: <[email protected]>
Reviewed-by: Daniel P. Berrangé <[email protected]>
(cherry picked from commit a73c993)
Signed-off-by: Michael Tokarev <[email protected]>
Since only root APLICs can have hw IRQ lines, aplic->parent should
be initialized first.

Fixes: e8f7934 ("hw/intc: Add RISC-V AIA APLIC device emulation")
Reviewed-by: Daniel Henrique Barboza <[email protected]>
Signed-off-by: yang.zhang <[email protected]>
Cc: qemu-stable <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
(cherry picked from commit c76b121)
Signed-off-by: Michael Tokarev <[email protected]>
The Zkr extension may only be exposed to KVM guests if the VMM
implements the SEED CSR. Use the same implementation as TCG.

Without this patch, running with a KVM which does not forward the
SEED CSR access to QEMU will result in an ILL exception being
injected into the guest (this results in Linux guests crashing on
boot). And, when running with a KVM which does forward the access,
QEMU will crash, since QEMU doesn't know what to do with the exit.

Fixes: 3108e2f ("target/riscv/kvm: update KVM exts to Linux 6.8")
Signed-off-by: Andrew Jones <[email protected]>
Reviewed-by: Daniel Henrique Barboza <[email protected]>
Cc: qemu-stable <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
(cherry picked from commit 8699777)
Signed-off-by: Michael Tokarev <[email protected]>
Running a KVM guest using a 6.9-rc3 kernel, in a 6.8 host that has zkr
enabled, will fail with a kernel oops SIGILL right at the start. The
reason is that we can't expose zkr without implementing the SEED CSR.
Disabling zkr in the guest would be a workaround, but if the KVM doesn't
allow it we'll error out and never boot.

In hindsight this is too strict. If we keep proceeding, despite not
disabling the extension in the KVM vcpu, we'll not add the extension in
the riscv,isa. The guest kernel will be unaware of the extension, i.e.
it doesn't matter if the KVM vcpu has it enabled underneath or not. So
it's ok to keep booting in this case.

Change our current logic to not error out if we fail to disable an
extension in kvm_set_one_reg(), but show a warning and keep booting. It
is important to throw a warning because we must make the user aware that
the extension is still available in the vcpu, meaning that an
ill-behaved guest can ignore the riscv,isa settings and  use the
extension.

The case we're handling happens with an EINVAL error code. If we fail to
disable the extension in KVM for any other reason, error out.

We'll also keep erroring out when we fail to enable an extension in KVM,
since adding the extension in riscv,isa at this point will cause a guest
malfunction because the extension isn't enabled in the vcpu.

Suggested-by: Andrew Jones <[email protected]>
Signed-off-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Cc: qemu-stable <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
(cherry picked from commit 1215d45)
Signed-off-by: Michael Tokarev <[email protected]>
In RVV and vcrypto instructions, the masked and tail elements are set to 1s
using vext_set_elems_1s function if the vma/vta bit is set. It is the element
agnostic policy.

However, this function can't deal the big endian situation. This patch fixes
the problem by adding handling of such case.

Signed-off-by: Huang Tao <[email protected]>
Suggested-by: Richard Henderson <[email protected]>
Reviewed-by: LIU Zhiwei <[email protected]>
Cc: qemu-stable <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
(cherry picked from commit 75115d8)
Signed-off-by: Michael Tokarev <[email protected]>
This code has a typo that writes zvkb to zvkg, causing users can't
enable zvkb through the config. This patch gets this fixed.

Signed-off-by: Yangyu Chen <[email protected]>
Fixes: ea61ef7 ("target/riscv: Move vector crypto extensions to riscv_cpu_extensions")
Reviewed-by: LIU Zhiwei <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Reviewed-by: Max Chou <[email protected]>
Reviewed-by:  Weiwei Li <[email protected]>
Message-ID: <[email protected]>
Cc: qemu-stable <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
(cherry picked from commit ff33b7a)
Signed-off-by: Michael Tokarev <[email protected]>
….f.w instructions

According v spec 18.4, only the vfwcvt.f.f.v and vfncvt.f.f.w
instructions will be affected by Zvfhmin extension.
And the vfwcvt.f.f.v and vfncvt.f.f.w instructions only support the
conversions of

* From 1*SEW(16/32) to 2*SEW(32/64)
* From 2*SEW(32/64) to 1*SEW(16/32)

Signed-off-by: Max Chou <[email protected]>
Reviewed-by: Daniel Henrique Barboza <[email protected]>
Cc: qemu-stable <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
(cherry picked from commit 17b713c)
Signed-off-by: Michael Tokarev <[email protected]>
…structions

The require_scale_rvf function only checks the double width operator for
the vector floating point widen instructions, so most of the widen
checking functions need to add require_rvf for single width operator.

The vfwcvt.f.x.v and vfwcvt.f.xu.v instructions convert single width
integer to double width float, so the opfxv_widen_check function doesn’t
need require_rvf for the single width operator(integer).

Signed-off-by: Max Chou <[email protected]>
Reviewed-by: Daniel Henrique Barboza <[email protected]>
Cc: qemu-stable <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
(cherry picked from commit 7a999d4)
Signed-off-by: Michael Tokarev <[email protected]>
The opfv_narrow_check needs to check the single width float operator by
require_rvf.

Signed-off-by: Max Chou <[email protected]>
Reviewed-by: Daniel Henrique Barboza <[email protected]>
Cc: qemu-stable <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
(cherry picked from commit 692f33a)
Signed-off-by: Michael Tokarev <[email protected]>
…widen instructions

If the checking functions check both the single and double width
operators at the same time, then the single width operator checking
functions (require_rvf[min]) will check whether the SEW is 8.

Signed-off-by: Max Chou <[email protected]>
Reviewed-by: Daniel Henrique Barboza <[email protected]>
Cc: qemu-stable <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
(cherry picked from commit 93cb52b)
Signed-off-by: Michael Tokarev <[email protected]>
raise_mmu_exception(), as is today, is prioritizing guest page faults by
checking first if virt_enabled && !first_stage, and then considering the
regular inst/load/store faults.

There's no mention in the spec about guest page fault being a higher
priority that PMP faults. In fact, privileged spec section 3.7.1 says:

"Attempting to fetch an instruction from a PMP region that does not have
execute permissions raises an instruction access-fault exception.
Attempting to execute a load or load-reserved instruction which accesses
a physical address within a PMP region without read permissions raises a
load access-fault exception. Attempting to execute a store,
store-conditional, or AMO instruction which accesses a physical address
within a PMP region without write permissions raises a store
access-fault exception."

So, in fact, we're doing it wrong - PMP faults should always be thrown,
regardless of also being a first or second stage fault.

The way riscv_cpu_tlb_fill() and get_physical_address() work is
adequate: a TRANSLATE_PMP_FAIL error is immediately reported and
reflected in the 'pmp_violation' flag. What we need is to change
raise_mmu_exception() to prioritize it.

Reported-by: Joseph Chan <[email protected]>
Fixes: 82d53ad ("target/riscv/cpu_helper.c: Invalid exception on MMU translation stage")
Signed-off-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Message-ID: <[email protected]>
Cc: qemu-stable <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
(cherry picked from commit 68e7c86)
Signed-off-by: Michael Tokarev <[email protected]>
Previous patch fixed the PMP priority in raise_mmu_exception() but we're still
setting mtval2 incorrectly. In riscv_cpu_tlb_fill(), after pmp check in 2 stage
translation part, mtval2 will be set in case of successes 2 stage translation but
failed pmp check.

In this case we gonna set mtval2 via env->guest_phys_fault_addr in context of
riscv_cpu_tlb_fill(), as this was a guest-page-fault, but it didn't and mtval2
should be zero, according to RISCV privileged spec sect. 9.4.4: When a guest
page-fault is taken into M-mode, mtval2 is written with either zero or guest
physical address that faulted, shifted by 2 bits. *For other traps, mtval2
is set to zero...*

Signed-off-by: Alexei Filippov <[email protected]>
Reviewed-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Message-ID: <[email protected]>
Cc: qemu-stable <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
(cherry picked from commit 6c9a344)
Signed-off-by: Michael Tokarev <[email protected]>
When running the instruction

```
    cbo.flush 0(x0)
```

QEMU would segfault.

The issue was in cpu_gpr[a->rs1] as QEMU does not have cpu_gpr[0]
allocated.

In order to fix this let's use the existing get_address()
helper. This also has the benefit of performing pointer mask
calculations on the address specified in rs1.

The pointer masking specificiation specifically states:

"""
Cache Management Operations: All instructions in Zicbom, Zicbop and Zicboz
"""

So this is the correct behaviour and we previously have been incorrectly
not masking the address.

Signed-off-by: Alistair Francis <[email protected]>
Reported-by: Fabian Thomas <[email protected]>
Fixes: e05da09 ("target/riscv: implement Zicbom extension")
Reviewed-by: Richard Henderson <[email protected]>
Cc: qemu-stable <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
(cherry picked from commit c5eb8d6)
Signed-off-by: Michael Tokarev <[email protected]>
In AIA spec, each hart (or each hart within a group) has a unique hart
number to locate the memory pages of interrupt files in the address
space. The number of bits required to represent any hart number is equal
to ceil(log2(hmax + 1)), where hmax is the largest hart number among
groups.

However, if the largest hart number among groups is a power of 2, QEMU
will pass an inaccurate hart-index-bit setting to Linux. For example, when
the guest OS has 4 harts, only ceil(log2(3 + 1)) = 2 bits are sufficient
to represent 4 harts, but we passes 3 to Linux. The code needs to be
updated to ensure accurate hart-index-bit settings.

Additionally, a Linux patch[1] is necessary to correctly recover the hart
index when the guest OS has only 1 hart, where the hart-index-bit is 0.

[1] https://lore.kernel.org/lkml/[email protected]/t/

Signed-off-by: Yong-Xuan Wang <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Cc: qemu-stable <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
(cherry picked from commit 190b867)
Signed-off-by: Michael Tokarev <[email protected]>
Commit 33a2491 changed 'reg_width' to use 'vlenb', i.e. vector length
in bytes, when in this context we want 'reg_width' as the length in
bits.

Fix 'reg_width' back to the value in bits like 7cb5992
("target/riscv/gdbstub.c: use 'vlenb' instead of shifting 'vlen'") set
beforehand.

While we're at it, rename 'reg_width' to 'bitsize' to provide a bit more
clarity about what the variable represents. 'bitsize' is also used in
riscv_gen_dynamic_csr_feature() with the same purpose, i.e. as an input to
gdb_feature_builder_append_reg().

Cc: Akihiko Odaki <[email protected]>
Cc: Alex Bennée <[email protected]>
Reported-by: Robin Dapp <[email protected]>
Fixes: 33a2491 ("target/riscv: Use GDBFeature for dynamic XML")
Signed-off-by: Daniel Henrique Barboza <[email protected]>
Reviewed-by: LIU Zhiwei <[email protected]>
Acked-by: Alex Bennée <[email protected]>
Reviewed-by: Akihiko Odaki <[email protected]>
Reviewed-by: Alistair Francis <[email protected]>
Cc: qemu-stable <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
(cherry picked from commit 583edc4)
Signed-off-by: Michael Tokarev <[email protected]>
Previously we only listed a single pmpcfg CSR and the first 16 pmpaddr
CSRs. This patch fixes this to list all 16 pmpcfg and all 64 pmpaddr
CSRs are part of the disassembly.

Reported-by: Eric DeVolder <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
Fixes: ea10325 ("RISC-V Disassembler")
Reviewed-by: Daniel Henrique Barboza <[email protected]>
Cc: qemu-stable <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Alistair Francis <[email protected]>
(cherry picked from commit 915758c)
Signed-off-by: Michael Tokarev <[email protected]>
xsave.flat checks that "executing the XSETBV instruction causes a general-
protection fault (#GP) if ECX = 0 and EAX[2:1] has the value 10b".  QEMU allows
that option, so the test fails.  Add the condition.

Cc: [email protected]
Fixes: 8925443 ("target/i386: implement XSAVE and XRSTOR of AVX registers", 2022-10-18)
Reported-by: Thomas Huth <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
(cherry picked from commit 7604bbc)
Signed-off-by: Michael Tokarev <[email protected]>
Features check of CPUID_SSE and CPUID_SSE2 should use cpuid_features,
rather than cpuid_ext_features.

Signed-off-by: Xinyu Li <[email protected]>
Reviewed-by: Zhao Liu <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
(cherry picked from commit da7c959)
Signed-off-by: Michael Tokarev <[email protected]>
Commit dfcf74f ("virtio-gpu: fix scanout migration post-load") broke
forward/backward version migration. Versioning of nested VMSD structures
is not straightforward, as the wire format doesn't have nested
structures versions. Introduce x-scanout-vmstate-version and a field
test to save/load appropriately according to the machine version.

Fixes: dfcf74f ("virtio-gpu: fix scanout migration post-load")
Signed-off-by: Marc-André Lureau <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: Fiona Ebner <[email protected]>
Tested-by: Fiona Ebner <[email protected]>
[fixed long lines]
Signed-off-by: Fabiano Rosas <[email protected]>
(cherry picked from commit 40a23ef)
Signed-off-by: Michael Tokarev <[email protected]>
By default, SDL disables the screen saver which prevents the host from powering
down the screen even if the screen is locked. This results in draining the
battery needlessly when the host isn't connected to a wall charger. Fix that by
enabling the screen saver.

Signed-off-by: Bernhard Beschow <[email protected]>
Acked-by: Marc-André Lureau <[email protected]>
Message-ID: <[email protected]>
(cherry picked from commit 2e701e6)
Signed-off-by: Michael Tokarev <[email protected]>
description:
    loongarch_cpu_dump_state() want to dump all loongarch cpu
state registers, but there is a tiny typographical error when
printing "PRCFG2".

Cc: [email protected]
Signed-off-by: lanyanzhi <[email protected]>
Reviewed-by: Richard Henderson <[email protected]>
Reviewed-by: Song Gao <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Song Gao <[email protected]>
(cherry picked from commit 78f932e)
Signed-off-by: Michael Tokarev <[email protected]>
Signed-off-by: Michael Tokarev <[email protected]>
@pz9115
Copy link
Author

pz9115 commented Jul 22, 2024

Here we can have a simple c program to verify the QEMU changes, since the logical is same, just verify cv.lb/cv.sb instructions in this case.

Firstly, we set the cvls.c to generate normal lb/sb instructions

#include <stdio.h>
#include <stdint.h>

int foo(int8_t a,int b){
   return a+b;
}

int main() {

    int res = foo(170,85);

    printf("%x\n", res);
    return 0;
}

Compile with corev-gcc ( use -S), we got the assemble code cvls.s

	.file	"cvls.c"
	.option nopic
	.attribute arch, "rv32i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_xcvmem1p0"
	.attribute unaligned_access, 0
	.attribute stack_align, 16
	.text
	.align	1
	.globl	foo
	.type	foo, @function
foo:
	addi	sp,sp,-32
	sw	ra,28(sp)
	sw	s0,24(sp)
	addi	s0,sp,32
	mv	a5,a0
	sw	a1,-24(s0)
	sb	a5,-17(s0)
	lb	a4,-17(s0)
	lw	a5,-24(s0)
	add	a5,a4,a5
	mv	a0,a5
	lw	ra,28(sp)
	lw	s0,24(sp)
	addi	sp,sp,32
	jr	ra
	.size	foo, .-foo
	.section	.rodata
	.align	2
.LC0:
	.string	"%x\n"
	.text
	.align	1
	.globl	main
	.type	main, @function
main:
	addi	sp,sp,-32
	sw	ra,28(sp)
	sw	s0,24(sp)
	addi	s0,sp,32
	li	a1,85
	li	a0,-86
	call	foo
	sw	a0,-20(s0)
	lw	a1,-20(s0)
	lui	a5,%hi(.LC0)
	addi	a0,a5,%lo(.LC0)
	call	printf
	li	a5,0
	mv	a0,a5
	lw	ra,28(sp)
	lw	s0,24(sp)
	addi	sp,sp,32
	jr	ra
	.size	main, .-main
	.ident	"GCC: (gd3150e4de84) 14.1.0"
	.section	.note.GNU-stack,"",@progbits

then we need to modify the sb/lb into cv.sb/cv.lb in function foo, here is an example

foo:
        addi    sp, sp, -32
        sw      ra, 28(sp)
        sw      s0, 24(sp)
        addi    s0, sp, 32
        mv      a5, a0
        sw      a1, -24(s0)
        addi    t0, s0, -17     // store the address into t0
        cv.sb   a5, (t0), 0     // use cv.sb to store then t0 += Sext(0)
        cv.lb   a4, (t0), 0     //  use cv.lb to load then t0 += Sext(0)
        lw      a5, -24(s0)
        add     a5, a4, a5
        mv      a0, a5
        lw      ra, 28(sp)
        lw      s0, 24(sp)
        addi    sp, sp, 32
        jr      ra

Use corev-gcc to generate two elf files with different source assemble codes, the modified assemble codes need assign -march with xcvmem extension.

Finally, use qemu to run both elf file, see if we can get the same result.

To verify the post-increment feature, we can adjust the third argument in cv.sb, here is an example

foo:
        addi    sp, sp, -32
        sw      ra, 28(sp)
        sw      s0, 24(sp)
        addi    s0, sp, 32
        mv      a5, a0
        sw      a1, -24(s0)
        addi    t0, s0, -17     // store the address into t0
        cv.sb   a5, (t0), 20     // use cv.sb to store then t0 += Sext(20)
        addi    t0, t0, -20      // subtract the addition 20
        cv.lb   a4, (t0), -24     //  use cv.lb to load then t0 += Sext(0)
        lw      a5, t0(s0)
        add     a5, a4, a5
        mv      a0, a5
        lw      ra, 28(sp)
        lw      s0, 24(sp)
        addi    sp, sp, 32
        jr      ra

@pz9115
Copy link
Author

pz9115 commented Aug 2, 2024

This based on QEMU version for 9.0.1 release, need to checkout the branch.

commit 60b4f3a (tag: v9.0.1)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.