• Bug#1109885: libsqlite3-0: JSON parser ignores extractions when source

    From Manny@21:1/5 to All on Fri Jul 25 15:10:01 2025
    Package: libsqlite3-0
    Version: 3.40.1-2+deb12u1
    Severity: normal
    Tags: upstream
    X-Debbugs-Cc: [email protected]

    This problem was discovered while trying to workaround another
    bug¹. When a JSON array is stored in a column (datatype=text),
    extractions on that array yield nothing.

    For a concrete example, take this JSON (the same sample as that given
    in the other bug report¹):

    $ curl -s 'https://lemmy-federate.com/api/instance.find?input=%7B%22search%22%3A%22%22%2C%22skip%22%3A0%2C%22take%22%3A6%2C%22enabledOnly%22%3Afalse%7D' > lemmy-federate_nodes.json

    One attempt to work around that other bug was to create a table
    (“flatTbl”) as a stage, and stuff a JSON array inside a column therein (“follows_json”):

    ===8<----------------------------------------
    #!/bin/bash

    sqlite3 -echo demo.db <<'EOF'
    create table if not exists flatTbl (
    id integer primary key,
    name text,
    host text,
    host_id integer,
    follows_json text);

    create table if not exists relationTbl (
    id integer primary key,
    community_id integer,
    created text,
    updated text,
    status text,
    error_reason text,
    attempt_count text,
    hostname text,
    host_id text);

    insert or replace into flatTbl select
    json_extract(value,'$.id'),
    json_extract(value,'$.name'),
    json_extract(value,'$.instance.host'),
    json_extract(value,'$.instance.id'),
    json_extract(value,'$.follows')
    from json_each(readfile('lemmy-federate_communities.json'),'$.result.data.communities');
    EOF
    ===8<----------------------------------------

    The above script results in flatTbl being successfully populated. We
    would then attempt to extract from arrays in the follows_json column
    and use that output to populate the relationTbl. A simple extraction
    of the first element gives nothing:

    ===8<----------------------------------------
    $ sqlite3 -line 'file:demo.db?mode=ro' "select json_extract(follows_json,'$.[0]') from flatTbl"
    json_extract(follows_json,'$.[0]') =

    json_extract(follows_json,'$.[0]') =

    json_extract(follows_json,'$.[0]') =

    json_extract(follows_json,'$.[0]') =

    json_extract(follows_json,'$.[0]') =

    json_extract(follows_json,'$.[0]') = ===8<----------------------------------------

    Note that the “-line” parameter is given for debugging purposes
    because without it we just see blank lines for output. There are six
    empty rows output for six records. If '$.[0]' is replaced with
    '$.[#-1]' (to extract the last element of the array), it’s the same
    result (shoots blanks).

    It’s worth noting that an extraction of the whole arrays work as expected:

    ===8<----------------------------------------
    $ sqlite3 -line 'file:demo.db?mode=ro' "select json_extract(follows_json,'$') from flatTbl"
    ===8<----------------------------------------

    Output from the above looks correct (but omitted here because it is
    bulky). However, that is likely a side-effect whereby a simple path
    of '$' requires no JSON parsing. That luck ends as soon as there is
    another attempt to extract from the JSON in an outer operation. E.g.

    ===8<----------------------------------------
    sqlite3 'file:demo.db?mode=ro' "select * from json_each((select json_extract(follows_json,'$') from flatTbl),'$.communityId')"
    ===8<----------------------------------------

    That yields no output. It’s as if JSON text that is stored in a table
    becomes unparsable.

    Bug 2 (docs):

    There is also either a documentation oversight or an anti-feature
    regarding the array notation '$.[#]'. The docs state that sharp (#)
    alone may be used in operations that append to a whole existing
    array. But there are other situations (such as extraction) where there
    is a need to specify all elements of the array. The docs say nothing
    about that. It should be both supported and documented to use sharp to
    extract an entire array (and ideally array slices as well). But if
    that is not the case for whatever reason, perhaps the documentation
    should explicitly disclose that limitation.

    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1109880

    -- System Information:
    Debian Release: 12.11
    APT prefers stable-updates
    APT policy: (990, 'stable-updates'), (990, 'stable-security'), (990, 'stable'), (500, 'oldstable')
    Architecture: amd64 (x86_64)
    Foreign Architectures: i386

    Kernel: Linux 5.10.0-28-amd64 (SMP w/2 CPU threads)
    Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
    Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
    Shell: /bin/sh linked to /usr/bin/dash
    Init: systemd (via /run/systemd/system)
    LSM: AppArmor: enabled

    Versions of packages libsqlite3-0 depends on:
    ii libc6 2.36-9+deb12u10

    libsqlite3-0 recommends no packages.

    libsqlite3-0 suggests no packages.

    -- no debconf information

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)