[BUG] DSP Panic on Intel MTL #9695

Open
as400l opened this issue Nov 29, 2024 · 8 comments
Labels
bug Something isn't working as expected P2 Critical bugs or normal features

@as400l

as400l commented Nov 29, 2024

Describe the bug
A DSP panic occurs, followed by a full freeze of the OS.

To Reproduce
Open pavucontrol and mute/unmute the microphone a few times. Close pavucontrol. Wait for the freeze.

Reproduction Rate
100%

Expected behavior
No DSP Panic.

Impact
Cannot use builtin microphone.

Environment

  1. Branch name and commit hash of the 2 repositories: sof (firmware/topology) and linux (kernel driver).
    • Kernel: 6.12.1
    • SOF: sof-bin 2024.09.1
  2. Name of the topology file
    • Topology: sof-hda-generic-2ch.tplg
  3. Name of the platform(s) on which the bug is observed.
    • Platform: Intel Meteor Lake Ultra 9 185H, Asus Zenbook 14 OLED UX3405M, Alpine Linux

Screenshots or console output

[  186.448058] sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ DSP dump start ]------------
[  186.448069] sof-audio-pci-intel-mtl 0000:00:1f.3: DSP panic!
[  186.448071] sof-audio-pci-intel-mtl 0000:00:1f.3: fw_state: SOF_FW_BOOT_COMPLETE (7)
[  186.448078] sof-audio-pci-intel-mtl 0000:00:1f.3: 0x50000005: module: ROM_EXT, state: FW_ENTERED, running
[  186.448083] sof-audio-pci-intel-mtl 0000:00:1f.3: Firmware state: 0x5, status/error code: 0x0
[  186.448116] sof-audio-pci-intel-mtl 0000:00:1f.3: Unknown toolchain is used
[  186.448120] sof-audio-pci-intel-mtl 0000:00:1f.3: error: DSP Firmware Oops
[  186.448121] sof-audio-pci-intel-mtl 0000:00:1f.3: error: Exception Cause: AllocaCause, MOVSP instruction, if caller’s registers are not in the register file
[  186.448123] sof-audio-pci-intel-mtl 0000:00:1f.3: EXCCAUSE 0x00000005 EXCVADDR 0x00000000 PS       0x00060d20 SAR     0x0000000c
[  186.448126] sof-audio-pci-intel-mtl 0000:00:1f.3: EPC1     0xa007626d EPC2     0x00000000 EPC3     0x00000000 EPC4    0x00000000
[  186.448128] sof-audio-pci-intel-mtl 0000:00:1f.3: EPC5     0x00000000 EPC6     0x00000000 EPC7     0x00000000 DEPC    0x00000000
[  186.448129] sof-audio-pci-intel-mtl 0000:00:1f.3: EPS2     0x00000000 EPS3     0x00000000 EPS4     0x00000000 EPS5    0x00000000
[  186.448131] sof-audio-pci-intel-mtl 0000:00:1f.3: EPS6     0x00000000 EPS7     0x00000000 INTENABL 0x00000000 INTERRU 0x00000000
[  186.448132] sof-audio-pci-intel-mtl 0000:00:1f.3: stack dump from 0x00000000
[  186.448134] sof-audio-pci-intel-mtl 0000:00:1f.3: AR registers:
[  186.448136] sof-audio-pci-intel-mtl 0000:00:1f.3: 0x0: a004ed15 a0111680 00000000 4015a7c0
[  186.448138] sof-audio-pci-intel-mtl 0000:00:1f.3: 0x10: a0166b00 00000018 401492b0 a0111680
[  186.448140] sof-audio-pci-intel-mtl 0000:00:1f.3: 0x20: a005fb41 a0111640 401492b0 a006506c
[  186.448142] sof-audio-pci-intel-mtl 0000:00:1f.3: 0x30: a005fb41 a0111640 401492b0 a006506c
[  186.448144] sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ DSP dump end ]------------
[  186.946817] sof-audio-pci-intel-mtl 0000:00:1f.3: ipc timed out for 0xe030001|0x300
[  186.946837] sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ IPC dump start ]------------
[  186.946851] sof-audio-pci-intel-mtl 0000:00:1f.3: Host IPC initiator: 0x8e030001|0x300|0x0, target: 0x1b0a0000|0x0|0x0, ctl: 0x3
[  186.946856] sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ IPC dump end ]------------
[  186.946859] sof-audio-pci-intel-mtl 0000:00:1f.3: IPC timeout
[  186.946866] sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at soc_component_trigger on 0000:00:1f.3: -110
[  186.946878]  HDMI2: ASoC: trigger FE cmd: 1 failed: -110
[  186.946897] sof-audio-pci-intel-mtl 0000:00:1f.3: ipc4_tx_msg_unlocked: ipc message send for 0xe010001|0x0 failed: -19
[  186.946902] sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at soc_component_trigger on 0000:00:1f.3: -19
[  186.946904]  HDMI2: ASoC: trigger FE cmd: 0 failed: -19
[  186.947086] sof-audio-pci-intel-mtl 0000:00:1f.3: ipc4_tx_msg_unlocked: ipc message send for 0x13000003|0x1 failed: -19
[  186.947091] sof-audio-pci-intel-mtl 0000:00:1f.3: failed to pause all pipelines
[  186.947093] sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at soc_component_trigger on 0000:00:1f.3: -19
[  186.947096]  DMIC Raw: ASoC: trigger FE cmd: 0 failed: -19
[  186.947198] sof-audio-pci-intel-mtl 0000:00:1f.3: ipc4_tx_msg_unlocked: ipc message send for 0x46060004|0x19 failed: -19
[  186.947203] sof-audio-pci-intel-mtl 0000:00:1f.3: failed to unbind modules module-copier.12.2:0 -> tdfb.11.1:0
[  186.947208] sof-audio-pci-intel-mtl 0000:00:1f.3: ipc4_tx_msg_unlocked: ipc message send for 0x12040000|0x0 failed: -19
[  186.947212] sof-audio-pci-intel-mtl 0000:00:1f.3: failed to free pipeline widget pipeline.12
[  186.947219] sof-audio-pci-intel-mtl 0000:00:1f.3: ipc4_tx_msg_unlocked: ipc message send for 0x12050000|0x0 failed: -19
[  186.947222] sof-audio-pci-intel-mtl 0000:00:1f.3: failed to free pipeline widget pipeline.11
[  186.947225] sof-audio-pci-intel-mtl 0000:00:1f.3: Failed to free connected widgets
[  186.947233] sof-audio-pci-intel-mtl 0000:00:1f.3: sof_pcm_stream_free: sof_widget_list_free failed -19
[  186.947236] sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_prepare on 0000:00:1f.3: -19
[  186.947240]  DMIC Raw: ASoC: error at __soc_pcm_prepare on DMIC Raw: -19
[  186.947243]  DMIC Raw: ASoC: error at dpcm_fe_dai_prepare on DMIC Raw: -19
[  186.947344] sof-audio-pci-intel-mtl 0000:00:1f.3: ipc4_tx_msg_unlocked: ipc message send for 0x13020003|0x0 failed: -19
[  186.947348] sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at soc_dai_trigger on Analog CPU DAI: -19
[  186.947354]  HDA Analog: ASoC: error at dpcm_be_dai_trigger on HDA Analog: -19
[  186.947357]  HDA Analog: ASoC: trigger FE cmd: 0 failed: -19

Full dmesg in attachment.
dmesg.txt
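
For readers triaging the log above, here is a minimal Python sketch that pulls the register values out of the DSP dump and tallies the trailing error codes. The helper names (`parse_registers`, `count_errors`) are hypothetical and not part of any SOF tooling; the two errno names are the standard Linux meanings of the values 19 and 110 that appear in the log.

```python
import re
from collections import Counter

# Errno values that appear in the log above (standard Linux numbering).
ERRNO_NAMES = {19: "ENODEV", 110: "ETIMEDOUT"}

# Matches "NAME 0x12345678" pairs such as "EXCCAUSE 0x00000005".
REG_RE = re.compile(r"\b([A-Z][A-Z0-9]*)\s+(0x[0-9a-fA-F]{8})\b")
# Matches a trailing negative error code such as "failed: -19".
ERR_RE = re.compile(r": -(\d+)\s*$")


def parse_registers(lines):
    """Collect NAME/value pairs from the register lines of a DSP dump."""
    regs = {}
    for line in lines:
        for name, value in REG_RE.findall(line):
            regs[name] = int(value, 16)
    return regs


def count_errors(lines):
    """Count how often each trailing error code occurs."""
    counts = Counter()
    for line in lines:
        match = ERR_RE.search(line)
        if match:
            code = int(match.group(1))
            counts[ERRNO_NAMES.get(code, str(code))] += 1
    return counts


# A few lines copied from the log in this report.
sample = [
    "[  186.448123] sof-audio-pci-intel-mtl 0000:00:1f.3: EXCCAUSE 0x00000005 EXCVADDR 0x00000000 PS       0x00060d20 SAR     0x0000000c",
    "[  186.448126] sof-audio-pci-intel-mtl 0000:00:1f.3: EPC1     0xa007626d EPC2     0x00000000 EPC3     0x00000000 EPC4    0x00000000",
    "[  186.946866] sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at soc_component_trigger on 0000:00:1f.3: -110",
    "[  186.946897] sof-audio-pci-intel-mtl 0000:00:1f.3: ipc4_tx_msg_unlocked: ipc message send for 0xe010001|0x0 failed: -19",
]

regs = parse_registers(sample)
errors = count_errors(sample)
print(hex(regs["EPC1"]), dict(errors))
```

Read this way, the log shows a single IPC timeout (-110) right after the panic, and every later IPC attempt fails with -19.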

@as400l as400l added the bug Something isn't working as expected label Nov 29, 2024
@lgirdwood
Member

@as400l are you able to see this with alsamixer? And if so, with which DMIC kcontrol?
@ujfalusi any additional kernel debug options to enable?

@lgirdwood lgirdwood added this to the v2.12 milestone Nov 29, 2024
@lgirdwood lgirdwood added the P2 Critical bugs or normal features label Nov 29, 2024
@ujfalusi
Contributor

As usual, @as400l:
Can you add this file, sof-dyndbg.conf.txt, as /etc/modprobe.d/sof-dyndbg.conf, reboot, and re-attach the dmesg log that contains both the boot and the error itself?

In case the log is truncated because of a small log buffer, please add log_buf_len=4M to the kernel command line (passed by the bootloader to the kernel).
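
To confirm that the log_buf_len setting actually reached the kernel after a reboot, one option is to inspect /proc/cmdline. The following is a minimal Python sketch; the helper names and the sample command line are hypothetical, and returning the last occurrence of a repeated parameter is an assumption of this sketch rather than a statement about how the kernel resolves duplicates.

```python
def get_cmdline_param(cmdline, name):
    """Return the value of name=value in a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == name:
            value = val  # assumption of this sketch: keep the last occurrence
    return value


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().strip()


# Hypothetical command line used only for illustration.
example = "BOOT_IMAGE=/boot/vmlinuz-lts root=/dev/sda2 quiet log_buf_len=4M"
print(get_cmdline_param(example, "log_buf_len"))
```

On the affected machine, `get_cmdline_param(read_cmdline(), "log_buf_len")` returning `None` would mean the bootloader change did not take effect.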

@as400l
Author

as400l commented Nov 29, 2024

Here is dmesg with the error and sof-dyndbg.conf enabled.

BTW, isn't it strange that it uses the sof-hda-generic-2ch.tplg file?

@lgirdwood - I tried with alsamixer but can't reproduce it. On the other hand, with alsamixer I can't unmute the mic. I have an LED on the keyboard, and no matter what I tried with alsamixer it stays constantly on, which means that the mic was not unmuted.

dmesg.log.gz

@ujfalusi
Contributor

@as400l, for some reason dyndbg did not enable the debug prints, so we don't see the last message that was sent to the firmware. We know that the next one would have been 0xe010002|0x0, which was not sent because the firmware had crashed.
Can you check again if the dyndbg is in place? The probing should be much more verbose with lots of prints about modules and stuff.
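
One way to check whether dyndbg took effect is to look at the kernel's dynamic debug control file, which is only present when the kernel is built with CONFIG_DYNAMIC_DEBUG. The Python sketch below counts how many call sites in modules whose name starts with `snd_sof` have the `p` (print) flag set. The path and the `filename:lineno [module]function flags format` line layout follow the kernel's dynamic debug documentation as I understand it; the helper name and the sample lines are hypothetical.

```python
import re

# Typical location; needs debugfs mounted and usually root privileges.
CONTROL_PATH = "/sys/kernel/debug/dynamic_debug/control"

# Expected layout: filename:lineno [module]function flags format
CONTROL_RE = re.compile(
    r"^(?P<file>\S+):(?P<line>\d+) \[(?P<module>[^\]]+)\](?P<func>\S+) =(?P<flags>\S+)"
)


def enabled_callsites(control_lines, module_prefix="snd_sof"):
    """Return (enabled, total) call sites for modules matching the prefix."""
    enabled = total = 0
    for line in control_lines:
        match = CONTROL_RE.match(line)
        if not match or not match.group("module").startswith(module_prefix):
            continue
        total += 1
        if "p" in match.group("flags"):
            enabled += 1
    return enabled, total


# Hypothetical control-file lines, for illustration only.
sample_control = [
    "# filename:lineno [module]function flags format",
    'sound/soc/sof/example.c:10 [snd_sof]example_func =pmf "hypothetical message"',
    'sound/soc/sof/example.c:20 [snd_sof_pci]other_func =_ "hypothetical message"',
    'init/main.c:30 [main]unrelated_func =p "hypothetical message"',
]
print(enabled_callsites(sample_control))
```

If the control file does not exist at all, the kernel was most likely built without dynamic debug support, in which case the modprobe option has nothing to act on.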

sof-hda-generic-2ch.tplg is chosen because you have DMICs in your system:

[   15.097047] sof-audio-pci-intel-mtl 0000:00:1f.3: DMICs detected in NHLT tables: 2

You also have BT offload advertised:

[   15.097044] sof-audio-pci-intel-mtl 0000:00:1f.3: NHLT device BT(0) detected, ssp_mask 0x4
[   15.097046] sof-audio-pci-intel-mtl 0000:00:1f.3: BT link detected in NHLT tables: 0x4

I'm not sure if that can cause any issues.

You can disable the DMIC for testing the analog path (you will lose the laptop microphones):

options snd_sof_intel_hda_generic dyndbg=+pmf dmic_num=0

in, for example, /etc/modprobe.d/no-dmic.conf
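
A typo in a modprobe.d file tends to fail silently, so it can help to sanity-check the line before rebooting. Below is a minimal Python sketch that splits an `options` line into the module name and its parameters. The function name is hypothetical, and it only handles the simple whitespace-separated `key=value` form used above, not quoted values.

```python
def parse_options_line(line):
    """Split 'options <module> key=value ...' into (module, params)."""
    parts = line.split()
    if len(parts) < 2 or parts[0] != "options":
        return None  # not an options line
    params = {}
    for token in parts[2:]:
        key, _, value = token.partition("=")
        params[key] = value
    return parts[1], params


# The line suggested in this thread.
conf_line = "options snd_sof_intel_hda_generic dyndbg=+pmf dmic_num=0"
module, params = parse_options_line(conf_line)
print(module, params)
```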

@as400l
Author

as400l commented Nov 29, 2024

I tried multiple times with "wpctl set-mute @DEFAULT_AUDIO_SOURCE@ toggle", but could not reproduce this behaviour.

So maybe the real cause of this is actually an XE DRM module crash or hang related to pavucontrol, which may be seen at the end of the dmesg I sent? Is this even possible?

As for the debug prints: my kernel is really slimmed down, so that may be the reason. I may have to try with the default distro kernel.

@lgirdwood
Member

OK, it's strange that alsamixer won't unmute the mic. I assume you tried alsamixer -c N (where N is the card number) to make sure all kcontrols have been tried.

Btw, is the keyboard LED on a key? I.e., can the key be pressed with Fn/Alt/Ctrl/Shift combinations to switch the LED on/off? This should be mapped to the kcontrol that will mute/unmute the mic.

Please do try the stock kernel. We need to figure out what has happened here with stock kernel logs.

@as400l
Author

as400l commented Dec 2, 2024

@lgirdwood - as I mentioned above, I tried with "wpctl" and it correctly mutes/unmutes the microphone. The LED goes off/on as it should. But I could not reproduce this error.

I'm leaning towards something else causing this panic.

The stock Alpine kernel was also not helpful, since it's probably also stripped down.

@as400l
Author

as400l commented Dec 2, 2024

@lgirdwood
@ujfalusi

I compiled a kernel with DYNAMIC_DEBUG and here are the logs with the error. I had to try to trigger it multiple times, as this time it wasn't so eager to panic.
The panic is at "313.393689".

dmesg.log.gz
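
With debug prints enabled the log gets long, so it can be handy to cut out just the lines leading up to the panic timestamp. Here is a minimal Python sketch of that idea. The function name and the window size are hypothetical choices, and the sample lines are taken from the earlier log in this thread, since the attached file is not reproduced here.

```python
import re

# dmesg lines start with a timestamp such as "[  186.448058]".
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")


def lines_before(lines, panic_ts, window=2.0):
    """Return lines stamped within `window` seconds up to and including panic_ts."""
    selected = []
    for line in lines:
        match = TS_RE.match(line)
        if match and panic_ts - window <= float(match.group(1)) <= panic_ts:
            selected.append(line)
    return selected


# Lines copied from the first log posted in this report.
sample_log = [
    "[   15.097047] sof-audio-pci-intel-mtl 0000:00:1f.3: DMICs detected in NHLT tables: 2",
    "[  186.448058] sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ DSP dump start ]------------",
    "[  186.946817] sof-audio-pci-intel-mtl 0000:00:1f.3: ipc timed out for 0xe030001|0x300",
]
recent = lines_before(sample_log, 186.946817, window=0.6)
print(len(recent))
```

For the new log, calling `lines_before(all_lines, 313.393689)` would isolate the last IPC messages sent before the firmware crashed.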
