• Bug#1110082: clblast: Bug precisions in autopkgtest testing compatibili

    From Gard Spreemann@21:1/5 to Clement LONGEAC on Sat Aug 2 21:00:01 2025
    Clement LONGEAC <[email protected]> writes:

    Source: clblast
    Version: 1.6.3-1
    Severity: normal
    X-Debbugs-Cc: [email protected]

    Dear Maintainer,

    I am an intern at Synchrotron-Soleil; my supervisors are Frederic-Emmanuel PICCA
    and Emmanuel FARHI. I implemented ROCm and PoCL autopkgtests for the amd64 and
    arm64 architectures, using OpenCL, and ran them locally on the clblast package.
    The aim is to get an overview of code compatibility across the AMD boards
    available for CI, using ROCm for the GPU and PoCL for the CPU.

    I implemented the PoCL and ROCm tests in the control file. After running sbuild,
    the tests that run on the CPU, i.e. PoCL, pass. The GPU tests, however, fail on
    accuracy, mainly because of the nature of the tests: the card does not support,
    or supports poorly, FP16 calculations on ROCm. The ROCm libraries (rocBLAS,
    MIOpen and ROCm-Lib) support these calculations, but this may not be enough for
    the tests to pass in terms of precision. We therefore need to increase the
    tolerance to see whether that would be sufficient. The cause may be a kernel
    limitation of the card. Apparently these problems are common with FP16 on ROCm
    and intrinsic to the card in use; they are typical of non-professional cards.

    To summarize the common FP16 accuracy problems on ROCm: CLBlast tests sometimes
    fail because the tolerances are too tight for FP16. ROCm may use implicit FP16
    to FP32 conversions (for compatibility), which may explain the numerical
    differences. Incomplete hardware support may also be the cause: some cards
    (such as the RX 6400) support FP16, but with limited extensions (e.g. no
    FP16_MMA on consumer RDNA 2). The card we have appears to have only partial
    FP16 support, which would explain our problem and any similar problems we
    might encounter in the future.
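
    As a rough illustration of why a tolerance can be too tight for FP16, and why
    an implicit conversion to a wider type changes the result, here is a minimal
    sketch. It is not taken from the CLBlast test suite or from ROCm: the helper
    names (to_fp16, dot_fp16, dot_mixed, rel_error), the input vectors and the
    1e-3 tolerance are all assumptions chosen for the example, and the "wide"
    accumulator is a Python double standing in for FP32.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def dot_fp16(a, b):
    """Dot product where every intermediate result is rounded to FP16."""
    acc = 0.0
    for x, y in zip(a, b):
        prod = to_fp16(to_fp16(x) * to_fp16(y))
        acc = to_fp16(acc + prod)
    return acc


def dot_mixed(a, b):
    """FP16 inputs, but products and the sum are kept in a wider type.

    A double is used here as a stand-in for an FP32 accumulator.
    """
    return sum(to_fp16(x) * to_fp16(y) for x, y in zip(a, b))


def rel_error(value: float, reference: float) -> float:
    """Relative error of value with respect to a non-zero reference."""
    return abs(value - reference) / abs(reference)


if __name__ == "__main__":
    # Hypothetical test data: 1000 products of 0.1 * 0.1.
    a = [0.1] * 1000
    b = [0.1] * 1000
    reference = sum(x * y for x, y in zip(a, b))  # double precision
    tolerance = 1e-3  # assumed tolerance, not the one CLBlast uses

    for name, fn in (("fp16 accumulate", dot_fp16), ("wide accumulate", dot_mixed)):
        err = rel_error(fn(a, b), reference)
        verdict = "pass" if err <= tolerance else "FAIL"
        print(f"{name}: relative error {err:.2e} -> {verdict}")
```

    With these inputs the same data and the same tolerance should give a
    different verdict depending only on where the rounding to FP16 happens,
    which is the kind of discrepancy described above.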

    Dear Clement,

    Thank you for your work and this bug report! I would highly recommend
    you share this information with the CLBlast upstream [1]. They often
    take a while to respond, but usually attend to important matters like
    this in due time.

    Also, I'd be very happy to integrate your ROCm-based tests into the
    CLBlast package – could you provide them, for example as a git merge
    request on Salsa [2]?

    (I currently don't have access to any ROCm hardware, so it may be a
    while until I get around to testing things myself.)


    [1] https://github.com/CNugteren/CLBlast

    [2] https://salsa.debian.org/gspr/clblast


    Best,
    Gard

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)