• Bug#1107521: linux-image-6.12.27+bpo-amd64: ath12k_pci errors and event

    From Matt Mower@1:229/2 to All on Sun Jun 8 17:10:02 2025
    XPost: linux.debian.bugs.dist
    From: [email protected]

    Package: src:linux
    Version: 6.12.27-1~bpo12+1
    Severity: important
    X-Debbugs-Cc: [email protected]

    Dear Maintainer,

    After updating from linux-image-6.12.22+bpo-amd64 to linux-image-6.12.27+bpo-amd64, I am seeing lots of IO_PAGE_FAULT events for device ath12k_pci, and twice my system has completely frozen. The last freeze occurred during shutdown; the dmesg is attached and the tail is pasted here for convenience. My firmware is the latest available from https://git.codelinaro.org/clo/ath-firmware/ath12k-firmware as of this writing:

    Jun 08 07:28:16.437499 AI360 kernel: ath12k_pci 0000:c2:00.0: chip_id 0x2 chip_family 0x4 board_id 0xff soc_id 0x40170200
    Jun 08 07:28:16.437945 AI360 kernel: ath12k_pci 0000:c2:00.0: fw_version 0x1108811c fw_build_timestamp 2025-05-17 00:21 fw_build_id QC_IMAGE_VERSION_STRING=WLAN.HMT.1.1.c5-00284.1-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3

    dmesg tail from last shutdown where system froze:

    Jun 08 07:31:12.531172 AI360 kernel: ath12k_pci 0000:c2:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0xfe980000 flags=0x0020]
    Jun 08 07:31:12.531524 AI360 kernel: ath12k_pci 0000:c2:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0xfe980400 flags=0x0020]
    Jun 08 07:31:12.531654 AI360 kernel: ath12k_pci 0000:c2:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0xfe980500 flags=0x0020]
    Jun 08 07:31:12.531770 AI360 kernel: ath12k_pci 0000:c2:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0xfe980600 flags=0x0020]
    Jun 08 07:31:12.531879 AI360 kernel: ath12k_pci 0000:c2:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0xfe980700 flags=0x0020]
    Jun 08 07:31:12.531988 AI360 kernel: ath12k_pci 0000:c2:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0xfe980800 flags=0x0020]
    Jun 08 07:31:12.532085 AI360 kernel: ath12k_pci 0000:c2:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0xfe980900 flags=0x0020]
    Jun 08 07:31:12.532137 AI360 kernel: ath12k_pci 0000:c2:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0xfe980a00 flags=0x0020]
    Jun 08 07:31:12.532186 AI360 kernel: ath12k_pci 0000:c2:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0xfe980b00 flags=0x0020]
    Jun 08 07:31:12.532237 AI360 kernel: ath12k_pci 0000:c2:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0xfe980c00 flags=0x0020]
    Jun 08 07:31:12.532288 AI360 kernel: AMD-Vi: IOMMU Event log restarting
    Jun 08 07:31:12.627163 AI360 kernel: mhi mhi0: Requested to power ON
    Jun 08 07:31:12.627308 AI360 kernel: mhi mhi0: Power on setup success
    Jun 08 07:32:42.923214 AI360 kernel: mhi mhi0: MHI did not load image over BHI, ret: -5
    Jun 08 07:32:42.924008 AI360 kernel: ath12k_pci 0000:c2:00.0: failed to set mhi state: POWER_ON(2)
    Jun 08 07:32:42.924245 AI360 kernel: ath12k_pci 0000:c2:00.0: failed to start mhi: -110
    Jun 08 07:35:56.055573 AI360 kernel: ath12k_pci 0000:c2:00.0: failed to send WMI_START_SCAN_CMDID
    Jun 08 07:35:56.056269 AI360 kernel: ath12k_pci 0000:c2:00.0: failed to start hw scan: -108
    Jun 08 07:45:36.703208 AI360 kernel: Lockdown: systemd-logind: hibernation is restricted; see man kernel_lockdown.7
    Jun 08 07:45:41.075220 AI360 kernel: rfkill: input handler enabled
    Jun 08 07:45:53.291491 AI360 kernel: ath12k_pci 0000:c2:00.0: failed to submit WMI_VDEV_DELETE_CMDID
    Jun 08 07:45:53.309847 AI360 kernel: ath12k_pci 0000:c2:00.0: failed to delete WMI vdev 1: -108
    Jun 08 07:45:53.345825 AI360 kernel: list_del corruption. next->prev should be ffff8a4f0fc69950, but was ffff8a4f0c33de50. (next=ffff8a4f0c33de50)
    Jun 08 07:45:53.345879 AI360 kernel: ------------[ cut here ]------------
    Jun 08 07:45:53.345888 AI360 kernel: kernel BUG at lib/list_debug.c:65!
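
    A log like the one above can be summarized per device before attaching it to a report. The following is a minimal sketch, assuming the journal text has been saved to a file in the format shown in this message; the names FAULT_RE and summarize_faults are hypothetical and not part of any kernel or Debian tool.

```python
import re
from collections import Counter

# Matches AMD-Vi fault lines in the format quoted in this report, e.g.
#   ath12k_pci 0000:c2:00.0: AMD-Vi: Event logged
#   [IO_PAGE_FAULT domain=0x0010 address=0xfe980000 flags=0x0020]
FAULT_RE = re.compile(
    r"(?P<driver>\S+) (?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"AMD-Vi: Event logged \[IO_PAGE_FAULT domain=(?P<domain>0x[0-9a-f]+) "
    r"address=(?P<address>0x[0-9a-f]+) flags=(?P<flags>0x[0-9a-f]+)\]"
)


def summarize_faults(lines):
    """Return {(driver, pci_address): (count, lowest_addr, highest_addr)}."""
    counts = Counter()
    addresses = {}
    for line in lines:
        match = FAULT_RE.search(line)
        if match is None:
            continue  # not an IO_PAGE_FAULT line
        key = (match["driver"], match["bdf"])
        counts[key] += 1
        addresses.setdefault(key, []).append(int(match["address"], 16))
    return {
        key: (counts[key], min(addrs), max(addrs))
        for key, addrs in addresses.items()
    }


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python3 faults.py dmesg.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for (driver, bdf), (count, low, high) in summarize_faults(handle).items():
            print(f"{driver} {bdf}: {count} faults, {low:#x}..{high:#x}")
```

    The address range per device makes it easier to see whether the faults cluster in one DMA region, as the 0xfe980000 to 0xfe980c00 run here suggests.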


    -- Package-specific info:
    ** Kernel log: boot messages should be attached

    ** Model information
    sys_vendor: LENOVO
    product_name: 21M1001VUS
    product_version: ThinkPad T14s Gen 6
    chassis_vendor: LENOVO
    chassis_version: None
    bios_vendor: LENOVO
    bios_version: R2NET36W (1.10 )
    board_vendor: LENOVO
    board_name: 21M1001VUS
    board_version: SDK0T76576 WIN

    ** Configuration for modprobe:
    blacklist microcode
    blacklist arkfb
    blacklist aty128fb
    blacklist atyfb
    blacklist radeonfb
    blacklist cirrusfb
    blacklist cyber2000fb
    blacklist kyrofb
    blacklist matroxfb_base
    blacklist mb862xxfb
    blacklist neofb
    blacklist pm2fb
    blacklist pm3fb
    blacklist s3fb
    blacklist savagefb
    blacklist sisfb
    blacklist tdfxfb
    blacklist tridentfb
    blacklist vt8623fb
    options snd_pcsp index=-2
    options cx88_alsa index=-2
    options snd_atiixp_modem index=-2
    options snd_intel8x0m index=-2
    options snd_via82xx_modem index=-2
    options bonding max_bonds=0
    options dummy numdummies=0
    options ifb numifbs=0

    ** Network interface configuration:
    *** /etc/network/interfaces:

    source /etc/network/interfaces.d/*

    auto lo
    iface lo inet loopback

    ** PCI devices:
    00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Strix Root Complex [1022:1507]
    Subsystem: Lenovo Strix Root Complex [17aa:50f0]
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

    00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD] Strix IOMMU [1022:1508]
    Subsystem: Lenovo Strix IOMMU [17aa:50f0]
    Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 26
    Capabilities: <access denied>

    00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Strix Dummy Host Bridge [1022:1509]
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    IOMMU group: 1


    [continued in next message]

    --- SoupGate-Win32 v1.05
    * Origin: you cannot sedate... all the things you hate (1:229/2)
  • From Uwe Kleine-König@1:229/2 to Matt Mower on Wed Jun 11 23:00:01 2025
    XPost: linux.debian.bugs.dist
    From: [email protected]

    Control: tag -1 +moreinfo

    Hello Matt,

    On Sun, Jun 08, 2025 at 08:00:43AM -0700, Matt Mower wrote:
    > After updating from linux-image-6.12.22+bpo-amd64 to linux-image-6.12.27+bpo-amd64,
    > I am seeing lots of IO_PAGE_FAULT events for device ath12k_pci, and twice my
    > system has completely frozen.

    There are a few ath12k patches between 6.12.22 and 6.12.27 and a few
    more in later stable releases in the 6.12.x series. Can you please test
    on the latest 6.12 kernel? There is currently no backport kernel for
    that version, but in my experience the kernel from testing should
    install fine on a stable box.

    If the issue also happens on 6.12.32, someone has to bring that forward
    to upstream's attention.

    Best regards
    Uwe


  • From Matt Mower@1:229/2 to [email protected] on Thu Jun 12 22:50:01 2025
    XPost: linux.debian.bugs.dist
    From: [email protected]

    After several suspends and plenty of uptime on battery, I'm fairly
    confident that linux-image-6.12.12+bpo-amd64_6.12.12-1~bpo12+1_amd64 paired with firmware-atheros_20250410-2~bpo12+1_all is stable. There are only 2 commits to the ath12k driver between 6.12.12 and 6.12.22:

    44de00e8bc8f wifi: ath12k: fix handling of 6 GHz rules
    425f6a38173b wifi: ath12k: fix tx power, max reg power update to firmware

    So, I'll next test 6.12.22 with reverts of one or both of these.
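
    The commit list quoted above can be reproduced from a kernel checkout. This is a minimal sketch, assuming a local clone of the linux-stable tree that contains the v6.12.12 and v6.12.22 tags and that the driver sources live under drivers/net/wireless/ath/ath12k; the helper names ath12k_log_cmd and list_commits are hypothetical.

```python
import subprocess

# Assumed location of the ath12k driver inside the kernel source tree.
ATH12K_PATH = "drivers/net/wireless/ath/ath12k"


def ath12k_log_cmd(old_tag, new_tag, path=ATH12K_PATH):
    """Build a `git log` command listing commits in (old_tag, new_tag] touching path."""
    return ["git", "log", "--oneline", f"{old_tag}..{new_tag}", "--", path]


def list_commits(repo_dir, old_tag, new_tag):
    """Run the command inside repo_dir and return one summary line per commit."""
    result = subprocess.run(
        ath12k_log_cmd(old_tag, new_tag),
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    import sys

    # Usage (hypothetical path): python3 ath12k_log.py ~/src/linux-stable
    for line in list_commits(sys.argv[1], "v6.12.12", "v6.12.22"):
        print(line)
```

    Each commit printed this way is a candidate for `git revert` on top of the newer tag when narrowing down a regression.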

    On Wed, Jun 11, 2025 at 9:50 PM Matt Mower <[email protected]> wrote:

    > For what it's worth, if you check the replies after the original report,
    > you can see that I tested 6.12.32 as well as 6.12.27 with ath12k patches
    > reverted. Unfortunately, I can reproduce it in 6.12.22 now as well; the bug
    > only seems to appear when my laptop is not connected to a power source. I
    > reproduce it most often when the device shuts down or goes to sleep, at
    > which point it freezes completely. I snapped a picture when I saw more
    > debugging information than normal on the screen (attached). I've tried
    > downgrading to firmware-atheros_20250410-2~bpo12+1_all as well (dmesg
    > reports firmware fw_version 0x100301e1 fw_build_timestamp 2023-12-06
    > 04:05), and that didn't resolve the issue.
    >
    > I'm continuing to test older kernel and firmware versions until I find
    > something stable.

    > On Wed, Jun 11, 2025 at 1:49 PM Uwe Kleine-König <[email protected]> wrote:
    >
    > > Control: tag -1 +moreinfo
    > >
    > > Hello Matt,
    > >
    > > On Sun, Jun 08, 2025 at 08:00:43AM -0700, Matt Mower wrote:
    > > > After updating from linux-image-6.12.22+bpo-amd64 to
    > > > linux-image-6.12.27+bpo-amd64, I am seeing lots of IO_PAGE_FAULT events
    > > > for device ath12k_pci, and twice my system has completely frozen.
    > >
    > > There are a few ath12k patches between 6.12.22 and 6.12.27 and a few
    > > more in later stable releases in the 6.12.x series. Can you please test
    > > on the latest 6.12 kernel? There is currently no backport kernel for
    > > that version, but in my experience the kernel from testing should
    > > install fine on a stable box.
    > >
    > > If the issue also happens on 6.12.32, someone has to bring that forward
    > > to upstream's attention.
    > >
    > > Best regards
    > > Uwe



  • From Salvatore Bonaccorso@1:229/2 to Matt Mower on Sat Jun 14 21:40:01 2025
    XPost: linux.debian.bugs.dist
    From: [email protected]

    Hi Matt,

    On Thu, Jun 12, 2025 at 01:46:03PM -0700, Matt Mower wrote:
    > After several suspends and plenty of uptime on battery, I'm fairly
    > confident that linux-image-6.12.12+bpo-amd64_6.12.12-1~bpo12+1_amd64
    > paired with firmware-atheros_20250410-2~bpo12+1_all is stable. There are
    > only 2 commits to the ath12k driver between 6.12.12 and 6.12.22:
    >
    > 44de00e8bc8f wifi: ath12k: fix handling of 6 GHz rules
    > 425f6a38173b wifi: ath12k: fix tx power, max reg power update to firmware
    >
    > So, I'll next test 6.12.22 with reverts of one or both of these.

    Looking forward to seeing the results. Can you then please remove the
    moreinfo tag on the bug when providing the information? This brings
    the bug more prominently back on the radar.

    (And having identified the breaking commit helps with filing a proper
    bug report upstream.)

    Regards,
    Salvatore

  • From Matt Mower@1:229/2 to All on Wed Jun 25 07:20:01 2025
    XPost: linux.debian.bugs.dist
    From: [email protected]

    > OK, so I suggest we wait a bit for the problem to happen on 6.15.

    I have not reproduced this issue on 6.15.2 or 6.15.3.

    I tested 6.12.34 tonight and the issue still exists. I have not found
    changes that can be cherry-picked to resolve the issue in 6.12.x.

    > Do you want to care for that then?

    I would appreciate some help with this so that I don't report incorrectly. Happy to continue providing information, but if someone could help with initiating the bug report, that would be helpful.
