Linux 4.9: nvme-pci: No irq handler for vector #27

Open
akaher opened this issue Dec 21, 2018 · 6 comments

Comments

akaher commented Dec 21, 2018

Back-ported the required pci-hyperv changes to Linux 4.9, and the pci-hyperv driver works.

But I am getting the following issues with nvme:

[74530.686555] do_IRQ: 10.232 No irq handler for vector
[74530.712068] do_IRQ: 10.232 No irq handler for vector
[74530.737579] do_IRQ: 10.232 No irq handler for vector
[74530.763092] do_IRQ: 10.232 No irq handler for vector
[74532.832221] nvme nvme1: I/O 206 QID 6 timeout, reset controller
[74532.873967] nvme nvme1: completing aborted command with status: fffffffc
[74532.873971] blk_update_request: I/O error, dev nvme1n1, sector 1048320

Back-ported the following patch, but I am still facing the same issue:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/patch/drivers/nvme/host/pci.c?id=0ff199cb48b4af6f29a1bf15d92d93f44a22eeb4

akaher commented Jan 18, 2019

Found the reason why I am getting "No irq handler for vector":
nvme/host/pci.c allocates the IRQ vectors and assigns each of them to a set of CPUs. That CPU assignment has to be passed to the hypervisor, and the same mapping is recorded in the per-CPU VECTOR_IRQ table. However, the CPU assignment used by the driver does not match the one passed to the hypervisor and programmed into VECTOR_IRQ. So when the interrupt arrives in the VM, the lookup on the receiving CPU finds no matching entry and the interrupt is dropped.

Please help me find out why this happens with v4.9 while it works fine with v4.14.
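
To make the failure mode above concrete, here is a minimal user-space sketch of a per-CPU vector table: an interrupt is only dispatched if it arrives on the CPU whose table actually has an entry for that vector. This is a toy model, not the kernel's do_IRQ() code; CPU 10 and vector 232 are taken from the dmesg line above, and IRQ 45 is made up for illustration.

```c
/*
 * Toy user-space model of the per-CPU vector table mismatch described above.
 * Each CPU has its own vector_irq[] table; an interrupt is only handled if
 * the CPU it actually arrives on has an entry for that vector.
 */
#include <stdio.h>

#define NR_CPUS       16
#define NR_VECTORS    256
#define VECTOR_UNUSED (-1)

static int vector_irq[NR_CPUS][NR_VECTORS];   /* per-CPU vector -> IRQ map */

static void deliver(int cpu, int vector)
{
	int irq = vector_irq[cpu][vector];

	if (irq == VECTOR_UNUSED)
		/* the situation behind "do_IRQ: 10.232 No irq handler for vector" */
		printf("do_IRQ: %d.%d No irq handler for vector\n", cpu, vector);
	else
		printf("CPU%d: vector %d -> IRQ %d handled\n", cpu, vector, irq);
}

int main(void)
{
	for (int c = 0; c < NR_CPUS; c++)
		for (int v = 0; v < NR_VECTORS; v++)
			vector_irq[c][v] = VECTOR_UNUSED;

	/* The guest programmed vector 232 for (hypothetical) IRQ 45 on CPU 2 ... */
	vector_irq[2][232] = 45;

	/* ... but the hypervisor was told to target CPU 10, so the interrupt
	 * shows up on a CPU whose table has no entry and is dropped. */
	deliver(10, 232);   /* mismatch: message printed, interrupt lost */
	deliver(2, 232);    /* matching CPU: the IRQ would be handled    */
	return 0;
}
```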

dcui commented Jan 18, 2019

Can the 2 top patches on this branch help? https://github.com/dcui/linux/commits/decui/SLES12-SP3-AZURE-2018-1029.

I don't know what goes wrong here, as I don't have an NVMe device to test with. I hope the two patches help, but I'm really not sure.

akaher commented Jan 22, 2019

Thanks Dexuan. After applying the following patches, the NVMe IRQs are scheduled on CPU0 and CPU8 (the VM has 15 CPUs in total), and there are no more "No irq handler for vector" messages in dmesg:
dcui/linux@cd09cb7
dcui/linux@eba61d2

I am now looking into how to get the interrupts scheduled on the other CPUs as well.
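
For reference, a small user-space helper (my own sketch, not part of any patch) that reads /proc/interrupts and /proc/irq/<n>/smp_affinity_list to show which CPUs each nvme vector is allowed to run on; the "nvme" match string may need adjusting depending on how the kernel names the vectors.

```c
/* Print the allowed CPU list for every IRQ whose /proc/interrupts line
 * mentions "nvme". Uses only the standard procfs interfaces. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
	char line[4096];
	FILE *f = fopen("/proc/interrupts", "r");

	if (!f) {
		perror("/proc/interrupts");
		return 1;
	}

	while (fgets(line, sizeof(line), f)) {
		if (!strstr(line, "nvme"))
			continue;

		int irq = atoi(line);               /* leading "  45:" -> 45 */
		char path[64], cpus[256] = "";
		FILE *a;

		snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity_list", irq);
		a = fopen(path, "r");
		if (a) {
			if (fgets(cpus, sizeof(cpus), a))
				cpus[strcspn(cpus, "\n")] = '\0';
			fclose(a);
		}
		printf("IRQ %3d -> CPUs %s\n", irq, cpus);
	}
	fclose(f);
	return 0;
}
```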

dcui commented Jan 22, 2019

Glad to know the 2 patches can help!

Now it looks to me that the pci-hyperv driver is good, and you might need to improve the NVMe driver in v4.9 to spread interrupts across more CPUs if necessary (I assume the NVMe driver in the latest mainline kernel does a better job of this).
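
For context, this is roughly how a PCI driver asks the IRQ core to spread its queue vectors across CPUs with managed affinity. It is a kernel-side sketch using the standard pci_alloc_irq_vectors()/pci_irq_vector() APIs, not the actual nvme code in any particular stable branch, and example_setup_queue_irqs() is a made-up name.

```c
/*
 * Sketch: request up to nr_queues interrupt vectors and let the IRQ core
 * distribute them over the online CPUs (managed affinity), instead of
 * leaving everything on CPU0.
 */
#include <linux/pci.h>
#include <linux/interrupt.h>

static int example_setup_queue_irqs(struct pci_dev *pdev, unsigned int nr_queues)
{
	int nvecs, i;

	/* PCI_IRQ_AFFINITY asks the core to spread the vectors across CPUs. */
	nvecs = pci_alloc_irq_vectors(pdev, 1, nr_queues,
				      PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY);
	if (nvecs < 0)
		return nvecs;

	for (i = 0; i < nvecs; i++)
		dev_info(&pdev->dev, "queue %d -> Linux IRQ %d\n",
			 i, pci_irq_vector(pdev, i));

	return nvecs;
}
```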

akaher commented Jan 24, 2019

Dexuan, is there any specific reason for not upstreaming the following patch to the stable mainline kernels?
dcui/linux@eba61d2

dcui commented Jan 24, 2019

The patch was made for v4.12.14, which has reached End-of-Life: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/?h=linux-4.12.y

irq_data_get_effective_affinity_mask() is in v4.14+, so it's not needed there.

The other long-term stable kernels (4.9, 4.4, 3.x) seem a little old, and it looks like they're not widely used any more.
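
For illustration (v4.14+ only, per the above), this is how the effective affinity of an interrupt can be read, i.e. the CPU(s) the vector is actually programmed to target as opposed to the requested affinity mask. The function example_show_effective_affinity() and the IRQ number passed to it are hypothetical.

```c
/* Print the effective affinity of a given Linux IRQ (v4.14+ API). */
#include <linux/irq.h>
#include <linux/cpumask.h>
#include <linux/printk.h>

static void example_show_effective_affinity(unsigned int irq)
{
	struct irq_data *d = irq_get_irq_data(irq);
	const struct cpumask *eff;

	if (!d)
		return;

	eff = irq_data_get_effective_affinity_mask(d);
	pr_info("irq %u effective affinity: %*pbl\n", irq, cpumask_pr_args(eff));
}
```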
