• Bug#1109685: libsoup-3.0-0: deadlock with gstreamer1.0-plugins-good 1.2

    From Nick Steel@21:1/5 to All on Tue Jul 22 01:00:01 2025
    Package: libsoup-3.0-0
    Version: 3.6.5-2

    Dear Maintainer,

    Users of our GStreamer based application (Mopidy) have experienced deadlocks when using souphttpsrc, a GStreamer plugin built on libsoup-3.0. It was identified as a lock-ordering problem caused by libsoup's long-standing incorrect use of GLib GModule
    functions within its constructor. The libsoup fix was merged on 18/7/2025.

    The deadlock was made possible by changes to gstreamer1.0-plugins-good that are present in the Debian 1.26.2 package shipping in Trixie. I appreciate this isn't a security issue, but souphttpsrc is a fundamental GStreamer plugin that will otherwise be
    broken in the upcoming Trixie release. It would be great if you could consider porting the upstream libsoup fix.

    Thanks,
    Nick

    Upstream error description: https://gitlab.gnome.org/GNOME/libsoup/-/issues/463
    Upstream fix: https://gitlab.gnome.org/GNOME/libsoup/-/merge_requests/475
    Related GStreamer change: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/commit/7c3ee65d60a2d7a1a22ad889083525a38b656eb2
    Related GLib documentation: https://gitlab.gnome.org/GNOME/glib/-/merge_requests/4691

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Simon McVittie@21:1/5 to Nick Steel on Tue Jul 22 13:10:01 2025
    Control: forwarded -1 https://gitlab.gnome.org/GNOME/libsoup/-/issues/463
    Control: tags -1 + fixed-upstream patch
    Control: affects -1 + mopidy
    Control: severity -1 important

    On Mon, 21 Jul 2025 at 23:50:29 +0100, Nick Steel wrote:
    > Users of our GStreamer based application (Mopidy) have experienced
    > deadlocks when using souphttpsrc, a GStreamer plugin built on
    > libsoup-3.0. It was identified as a lock-ordering problem caused by
    > libsoup's long-standing incorrect use of GLib GModule functions within
    > its constructor. The libsoup fix was merged on 18/7/2025.

    (I am not really a libsoup3 maintainer, but I'm a GNOME team member,
    which is the next best thing available here.)

    This fix was only merged very recently and we are about to enter hard
    freeze, so the timeline to get it into 13.0 would be really tight, but
    we can try. If it's too late for 13.0 then addressing it via the 13.1
    stable update seems proportionate - I expect that we will need to update libsoup3 with additional CVE fixes at some point anyway.

    How badly does this affect mopidy? If this had been reported as a bug in mopidy, would it have been grave ("makes the package in question
    unusable by most or all users") or important ("has a major effect on the usability of a package, without rendering it completely unusable to
    everyone") or normal (an ordinary bug)?

    smcv

  • From Simon McVittie@21:1/5 to Nick Steel on Thu Jul 24 11:20:01 2025
    On Tue, 22 Jul 2025 at 13:32:34 +0100, Nick Steel wrote:
    > most of Mopidy's value comes from extensions (mostly distributed via PyPI
    > etc) and it's these that are exposed to the bug, some are consequently
    > rendered unusable

    How can this deadlock be reproduced on a Debian system, preferably with software from Debian only, or with third-party code if necessary?

    GModule uses a recursive lock, so the most obvious reproducers like

    g_module_open_full ("/usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstsoup.so", 0, error);

    do not deadlock: any reproducer would need to be multithreaded.
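
    To illustrate why a single-threaded reproducer cannot hang on a recursive lock, here is a minimal sketch using a plain POSIX recursive mutex as a stand-in for GModule's lock. The helper name relock_recursive_mutex() is hypothetical, and the sketch does not call GModule at all: it only shows that nested acquisition by the owning thread succeeds.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/* Hypothetical helper: acquire a recursive mutex twice from the calling
 * thread and return how many acquisitions succeeded. The owner of a
 * recursive mutex may lock it again, so both attempts succeed. */
static int
relock_recursive_mutex (void)
{
  pthread_mutexattr_t attr;
  pthread_mutex_t lock;
  int acquired = 0;
  int rc;

  rc = pthread_mutexattr_init (&attr);
  assert (rc == 0);
  rc = pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_RECURSIVE);
  assert (rc == 0);
  rc = pthread_mutex_init (&lock, &attr);
  assert (rc == 0);
  pthread_mutexattr_destroy (&attr);

  if (pthread_mutex_trylock (&lock) == 0)
    acquired++;
  if (pthread_mutex_trylock (&lock) == 0)   /* nested acquisition */
    acquired++;

  /* Release once per successful acquisition. */
  for (int i = 0; i < acquired; i++)
    pthread_mutex_unlock (&lock);

  pthread_mutex_destroy (&lock);
  return acquired;
}
```

    With a non-recursive mutex the second attempt would fail (or block, if pthread_mutex_lock() were used), which is why the hang needs a second thread and a second lock.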

    In the stack trace linked from <https://gitlab.gnome.org/GNOME/libsoup/-/issues/463>, the threads that
    are not idle appear to be:

    Thread 1 (Thread 0x7f9414e25bc0 (LWP 737461)):
    #0 futex_wait (futex_word=0x7f9414f8aa08 <_rtld_global+2568>, expected=2, private=0) at ../sysdeps/nptl/futex-internal.h:146
    #1 __GI___lll_lock_wait (futex=futex@entry=0x7f9414f8aa08 <_rtld_global+2568>, private=0) at lowlevellock.c:49
    #2 0x00007f94146a8ec8 in lll_mutex_lock_optimized (mutex=0x7f9414f8aa08 <_rtld_global+2568>) at pthread_mutex_lock.c:48
    #3 ___pthread_mutex_lock (mutex=mutex@entry=0x7f9414f8aa08 <_rtld_global+2568>) at pthread_mutex_lock.c:128
    #4 0x00007f94146a1cbe in dlsym_implementation (handle=<optimized out>, name=<optimized out>, dl_caller=<optimized out>) at dlsym.c:52
    #5 ___dlsym (handle=<optimized out>, name=<optimized out>) at dlsym.c:68
    #6 0x00007f9413e963f0 in _g_module_symbol (handle=0x55c7f88d3ed0, symbol_name=0x7f94122033b0 "gst_message_get_type") at ../glib/gmodule/gmodule-dl.c:205
    #7 g_module_symbol (module=<optimized out>, symbol_name=symbol_name@entry=0x7f94122033b0 "gst_message_get_type", symbol=symbol@entry=0x7fffd0324090) at ../glib/gmodule/gmodule.c:837
    #8 0x00007f94132f1e00 in g_typelib_symbol (typelib=0x55c7f860ff40, symbol_name=0x7f94122033b0 "gst_message_get_type", symbol=0x7fffd0324090) at ../gobject-introspection/girepository/gitypelib.c:2522
    #9 0x00007f94132ef759 in g_registered_type_info_get_g_type (info=0x55c7f9730610) at ../gobject-introspection/girepository/giregisteredtypeinfo.c:136
    #15 0x00007f9412c257c4 in Python Exception <class 'gdb.error'>: value has been optimized out
    #16 0x00007f9412086d9f in gst_bus_async_signal_func (bus=0x7f940400c710, message=0x7f9404044340, data=<optimized out>) at ../gstreamer/subprojects/gstreamer/gst/gstbus.c:1286
    #17 0x00007f94120875c3 in gst_bus_source_dispatch (source=0x7f9404015bf0, callback=0x7f9412086d40 <gst_bus_async_signal_func>, user_data=0x0) at ../gstreamer/subprojects/gstreamer/gst/gstbus.c:841
    #18 0x00007f9412cb087d in g_main_dispatch (context=0x55c7f958a980) at ../glib/glib/gmain.c:3398
    #19 0x00007f9412cb1cd7 in g_main_context_dispatch_unlocked (context=0x55c7f958a980) at ../glib/glib/gmain.c:4249
    #20 g_main_context_iterate_unlocked (context=0x55c7f958a980, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/glib/gmain.c:4314
    #21 0x00007f9412cb2097 in g_main_loop_run (loop=0x55c7f95a92c0) at ../glib/glib/gmain.c:4516
    ...

    and

    Thread 3 (Thread 0x7f940f4486c0 (LWP 737465)):
    #0 futex_wait (futex_word=0x55c7f889a520, expected=2, private=0) at ../sysdeps/nptl/futex-internal.h:146
    #1 __GI___lll_lock_wait (futex=futex@entry=0x55c7f889a520, private=0) at lowlevellock.c:49
    #2 0x00007f94146a8ec8 in lll_mutex_lock_optimized (mutex=0x55c7f889a520) at pthread_mutex_lock.c:48
    #3 ___pthread_mutex_lock (mutex=0x55c7f889a520) at pthread_mutex_lock.c:128
    #4 0x00007f9412ce11ae in g_rec_mutex_lock_impl (mutex=0x7f9413e9a070 <g_module_global_lock>) at ../glib/glib/gthread-posix.c:253
    #5 0x00007f9413e96576 in g_module_open_full (file_name=file_name@entry=0x0, flags=flags@entry=0, error=error@entry=0x0) at ../glib/gmodule/gmodule.c:478
    #6 0x00007f9413e9707c in g_module_open (file_name=file_name@entry=0x0, flags=flags@entry=0) at ../glib/gmodule/gmodule.c:708
    #7 0x00007f93cc73f488 in soup2_is_loaded () at ../libsoup/libsoup/soup-init.c:26
    #8 soup_init () at ../libsoup/libsoup/soup-init.c:54
    #9 soup_init_ctor () at ../libsoup/libsoup/soup-init.c:96
    #10 0x00007f9414f582f7 in call_init (l=<optimized out>, argc=2, argv=0x7fffd03252c8, env=0x7fffd03252e0) at dl-init.c:74
    #11 call_init (l=<optimized out>, argc=2, argv=0x7fffd03252c8, env=0x7fffd03252e0) at dl-init.c:26
    #12 0x00007f9414f583cd in _dl_init (main_map=0x7f9404052090, argc=2, argv=0x7fffd03252c8, env=0x7fffd03252e0) at dl-init.c:121
    #13 0x00007f9414f554b5 in __GI__dl_catch_exception (exception=exception@entry=0x0, operate=operate@entry=0x7f9414f5f160 <call_dl_init>, args=args@entry=0x7f940f446850) at dl-catch.c:215
    #14 0x00007f9414f5f0c9 in dl_open_worker (a=a@entry=0x7f940f446850) at dl-open.c:799
    #15 0x00007f9414f55416 in __GI__dl_catch_exception (exception=exception@entry=0x7f940f446830, operate=operate@entry=0x7f9414f5f000 <dl_open_worker>, args=args@entry=0x7f940f446850) at dl-catch.c:241
    #16 0x00007f9414f5f4de in _dl_open (file=0x7f940c4cfc14 "libsoup-3.0.so.0", mode=-2147483390, caller_dlopen=0x7f940c4c241f <soup_element_init+191>, nsid=-2, argc=2, argv=0x7fffd03252c8, env=0x7fffd03252e0) at dl-open.c:874
    #17 0x00007f94146a1b34 in dlopen_doit (a=a@entry=0x7f940f446b00) at dlopen.c:56
    #18 0x00007f9414f55416 in __GI__dl_catch_exception (exception=exception@entry=0x7f940f446a40, operate=0x7f94146a1ad0 <dlopen_doit>, args=0x7f940f446b00) at dl-catch.c:241
    #19 0x00007f9414f55569 in _dl_catch_error (objname=0x7f940f446aa8, errstring=0x7f940f446ab0, mallocedp=0x7f940f446aa7, operate=<optimized out>, args=<optimized out>) at dl-catch.c:260
    #20 0x00007f94146a1623 in _dlerror_run (operate=operate@entry=0x7f94146a1ad0 <dlopen_doit>, args=args@entry=0x7f940f446b00) at dlerror.c:138
    #21 0x00007f94146a1beb in dlopen_implementation (file=<optimized out>, mode=<optimized out>, dl_caller=<optimized out>) at dlopen.c:71
    #22 ___dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:81
    #23 0x00007f940c4c241f in gst_soup_load_library () at ../gstreamer/subprojects/gst-plugins-good/ext/soup/gstsouploader.c:196
    #24 soup_element_init (plugin=0x55c7f89ac750) at ../gstreamer/subprojects/gst-plugins-good/ext/soup/gstsoupelement.c:60
    #25 0x00007f940c4c5ea1 in souphttpsrc_element_init (plugin=0x55c7f89ac750) at ../gstreamer/subprojects/gst-plugins-good/ext/soup/gstsouphttpsrc.c:2803
    ...

    meaning that thread 3 is holding the libdl lock to load and initialize libsoup3, then entering g_module_open_full(), while thread 1 is holding
    the GModule lock to look up a symbol (possibly in some different
    module), then entering dlsym().
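
    The lock-order inversion described above can be sketched with two ordinary POSIX mutexes. Everything here is an illustrative stand-in: loader_lock and gmodule_lock are hypothetical names for the libdl and GModule locks, and pthread_mutex_trylock() replaces the blocking calls so that the sketch reports the conflict instead of hanging. Barriers force the unlucky timing that the real deadlock needs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the two locks seen in the backtraces. */
static pthread_mutex_t loader_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t gmodule_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_barrier_t both_hold_first;
static pthread_barrier_t both_tried_second;

struct worker
{
  pthread_mutex_t *first;    /* lock taken first and held throughout */
  pthread_mutex_t *second;   /* lock wanted while still holding 'first' */
  int second_was_busy;
};

static void *
worker_cb (void *data)
{
  struct worker *w = data;

  pthread_mutex_lock (w->first);
  pthread_barrier_wait (&both_hold_first);

  /* A blocking pthread_mutex_lock() here would never return. */
  if (pthread_mutex_trylock (w->second) == 0)
    pthread_mutex_unlock (w->second);
  else
    w->second_was_busy = 1;

  pthread_barrier_wait (&both_tried_second);
  pthread_mutex_unlock (w->first);
  return NULL;
}

/* Returns how many of the two threads found their second lock held. */
static int
count_blocked_threads (void)
{
  struct worker dlopen_side = { &loader_lock, &gmodule_lock, 0 };
  struct worker symbol_side = { &gmodule_lock, &loader_lock, 0 };
  pthread_t a, b;
  int rc;

  rc = pthread_barrier_init (&both_hold_first, NULL, 2);
  assert (rc == 0);
  rc = pthread_barrier_init (&both_tried_second, NULL, 2);
  assert (rc == 0);

  rc = pthread_create (&a, NULL, worker_cb, &dlopen_side);
  assert (rc == 0);
  rc = pthread_create (&b, NULL, worker_cb, &symbol_side);
  assert (rc == 0);
  pthread_join (a, NULL);
  pthread_join (b, NULL);

  pthread_barrier_destroy (&both_hold_first);
  pthread_barrier_destroy (&both_tried_second);
  return dlopen_side.second_was_busy + symbol_side.second_was_busy;
}
```

    Both threads find their second lock busy, which corresponds to the state in the backtraces: each side holds the lock the other is waiting for. Taking the two locks in the same order on every code path removes the cycle.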

    I can't see an obvious way to make this happen on-demand: both of those operations should normally be quick, so I would expect that you would
    have to be unlucky with timing for the deadlock to appear. I'll see
    whether I can construct an artificial reproducer with a module that is intentionally slow to initialize, or intentionally spams GModule calls,
    or something like that...

    A likely workaround would be to load some of the relevant modules and/or libraries early in the process's lifetime, from single-threaded code, so
    that they cannot possibly be involved in a deadlock.
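
    A minimal sketch of that workaround for a C program, assuming it is acceptable to keep the library resident for the whole process lifetime. preload_library() is a hypothetical helper, not part of libsoup, GLib or GStreamer, and whether this avoids the deadlock in a given application is untested here.

```c
#define _POSIX_C_SOURCE 200809L
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: load a shared library and keep it resident.
 * Returns 1 on success and 0 on failure. The handle is deliberately
 * never passed to dlclose(), so a later dlopen() of the same library
 * only takes a reference and should not run its constructors again. */
static int
preload_library (const char *soname)
{
  void *handle = dlopen (soname, RTLD_NOW);

  if (handle == NULL)
    {
      fprintf (stderr, "preload of %s failed: %s\n", soname, dlerror ());
      return 0;
    }

  return 1;
}
```

    The intended use is a call such as preload_library ("libsoup-3.0.so.0") at the very start of main(), before any other thread exists, so that libsoup's constructor runs while nothing else can be holding the GModule lock.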

    smcv

  • From Simon McVittie@21:1/5 to Simon McVittie on Thu Jul 24 12:30:01 2025
    On Thu, 24 Jul 2025 at 10:13:08 +0100, Simon McVittie wrote:
    > On Tue, 22 Jul 2025 at 13:32:34 +0100, Nick Steel wrote:
    >> most of Mopidy's value comes from extensions (mostly distributed via PyPI
    >> etc) and it's these that are exposed to the bug, some are consequently
    >> rendered unusable
    >
    > How can this deadlock be reproduced on a Debian system, preferably
    > with software from Debian only, or with third-party code if necessary?
    >
    > GModule uses a recursive lock, so the most obvious reproducers like
    >
    > g_module_open_full ("/usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstsoup.so", 0, error);
    >
    > do not deadlock: any reproducer would need to be multithreaded.

    I was able to reproduce this with the attached artificial test-case.
    (Good result: it runs for 10 seconds of high CPU load and then exits gracefully. Bad result: it continues to run indefinitely, with CPU use
    dropping to zero when it becomes deadlocked.)

    I confirm that the upstream change makes the test-case work as it
    should.

    smcv

    /*
    gcc -o1109685 1109685.c $(pkgconf --cflags --libs gmodule-2.0)
    gdb ./1109685
    */

    #include <dlfcn.h>

    #include <gmodule.h>

    #define RUN_SECONDS 10

    static void *
    gmodule_thread_cb (void *nil)
    {
      g_autoptr(GError) error = NULL;
      /* Spam GModule calls to try to reproduce #1109685 */
      GModule *m;
      gint64 start, now, end;
      void *free_fn;

      g_printerr ("Starting GModule thread\n");

      m = g_module_open_full ("libc.so.6", 0, &error);

      if (m == NULL)
        g_error ("%s", error->message);

      start = g_get_monotonic_time ();
      end = start + RUN_SECONDS * G_USEC_PER_SEC;
      g_printerr ("Starting GModule load generation\n");

      for (now = g_get_monotonic_time ();
           now < end;
           now = g_get_monotonic_time ())
        g_module_symbol (m, "free", &free_fn);

      g_printerr ("Finished GModule load generation\n");
      g_module_close (m);
      return NULL;
    }

    static void *
    dlopen_thread_cb (void *nil)
    {
      /* Spam dlopen calls to try to reproduce #1109685 */
      gint64 start, now, end;
      void *handle;

      start = g_get_monotonic_time ();
      end = start + RUN_SECONDS * G_USEC_PER_SEC;
      g_printerr ("Starting dlopen load generation\n");

      for (now = g_get_monotonic_time ();
           now < end;
           now = g_get_monotonic_time ())
        {
          handle = dlopen ("libsoup-3.0.so.0", RTLD_LAZY | RTLD_LOCAL);
          dlclose (handle);
        }

      g_printerr ("Finished dlopen load generation\n");
      return NULL;
    }

    int
    main (void)
    {
      GThread *gmodule_thread;
      GThread *dlopen_thread;

      g_printerr ("Initializing\n");

      gmodule_thread = g_thread_new ("gmodule_thread", gmodule_thread_cb, NULL);
      dlopen_thread = g_thread_new ("dlopen_thread", dlopen_thread_cb, NULL);

      g_printerr ("Waiting for threads\n");
      g_thread_join (gmodule_thread);
      g_thread_join (dlopen_thread);
      g_printerr ("Finished\n");
      return 0;
    }
