• Bug#1086976: dpkg: use reflinks on package install

    From Matteo Croce@1:229/2 to All on Thu Nov 7 17:40:02 2024
    XPost: linux.debian.bugs.dist
    From: [email protected]

    Package: dpkg
    Version: 1.21.22
    Severity: wishlist

    Dear Maintainer,

    I'm doing some experiments with dpkg on filesystems which support
    reflinks, like XFS and Btrfs.

    The idea is to uncompress and align the content of the .deb files
    during the download phase, so during the installation phase files
    can be copied by reflinking file content from the archive files
    to the destination paths.
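
    To illustrate what "aligned" means here: disk filesystems generally
    require reflinked ranges to start on a block boundary, so each member's
    data has to begin at a block-aligned offset inside the flattened archive.
    The sketch below is only an illustration, assuming a 4096-byte block size
    and 512-byte tar headers; the helper names are hypothetical and are not
    taken from the patches.

```python
BLOCK_SIZE = 4096  # assumption: filesystem block size
TAR_HEADER = 512   # size of one tar header record


def padding_before_data(header_end, block_size=BLOCK_SIZE):
    """Bytes of padding needed after a header ending at `header_end`
    so that the data following it starts on a block boundary."""
    return -header_end % block_size


def aligned_data_offset(member_start, block_size=BLOCK_SIZE):
    """Offset at which a member's data would start once padded."""
    header_end = member_start + TAR_HEADER
    return header_end + padding_before_data(header_end, block_size)
```

    With these assumptions, a member whose header starts at offset 0 needs
    3584 bytes of padding, so its data starts at offset 4096.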

    This is implemented with the FICLONERANGE ioctl, the same kind of
    reflink that `cp --reflink=always` has offered for some time.
    If reflinks are not supported, the code transparently falls back to
    a regular copy.
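
    The reflink-then-fallback behaviour can be sketched as follows. This is
    a minimal illustration and not the code from the patches: the function
    names are hypothetical, and the FICLONERANGE value is an assumption that
    holds for the ioctl encoding used on x86-64 and arm64 Linux.

```python
import fcntl
import os
import struct

# Assumption: _IOW(0x94, 13, struct file_clone_range) as encoded on
# x86-64 and arm64 Linux; some architectures encode ioctls differently.
FICLONERANGE = 0x4020940D


def copy_range(src_fd, src_off, length, dst_fd, dst_off):
    """Plain read/write copy of `length` bytes (the fallback path)."""
    done = 0
    while done < length:
        chunk = os.pread(src_fd, min(length - done, 1 << 20), src_off + done)
        if not chunk:
            raise EOFError("source ended before the requested range")
        written = 0
        while written < len(chunk):  # pwrite may write less than asked
            written += os.pwrite(dst_fd, chunk[written:],
                                 dst_off + done + written)
        done += len(chunk)


def reflink_or_copy(src_fd, src_off, length, dst_fd, dst_off):
    """Try to share extents; return "reflink" or "copy" to say which ran."""
    if length <= 0:
        # A length of 0 means "until end of file" to the ioctl; this
        # sketch keeps things simple and requires an explicit length.
        raise ValueError("length must be positive")
    # struct file_clone_range { s64 src_fd; u64 src_offset, src_length, dest_offset; }
    arg = struct.pack("=qQQQ", src_fd, src_off, length, dst_off)
    try:
        fcntl.ioctl(dst_fd, FICLONERANGE, arg)
        return "reflink"
    except OSError:
        # For example EOPNOTSUPP, EXDEV, or EINVAL for an unaligned range.
        # A genuine I/O problem will resurface from the copy itself.
        copy_range(src_fd, src_off, length, dst_fd, dst_off)
        return "copy"
```

    Falling back on any OSError keeps the sketch short; a stricter version
    could inspect errno and only fall back when reflinks are unsupported.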

    With reflinks, installing a big package like linux-firmware takes roughly
    a sixth of the time needed with a regular file copy:

    Stock archive:
    # time dpkg -i linux-firmware_20240318.git3b128b60-0ubuntu2.4_amd64.deb
    (Reading database ... 214246 files and directories currently installed.)
    Preparing to unpack linux-firmware_20240318.git3b128b60-0ubuntu2.4_amd64.deb ...
    Unpacking linux-firmware (20240318.git3b128b60-0ubuntu2.4) over (20240318.git3b128b60-0ubuntu2.4) ...
    Setting up linux-firmware (20240318.git3b128b60-0ubuntu2.4) ...

    real 0m39,264s
    user 0m3,141s
    sys 0m4,720s

    Flattened and aligned .deb package:
    # time dpkg -i linux-firmware_20240318.git3b128b60-0ubuntu2.4_amd64.debc
    (Reading database ... 214246 files and directories currently installed.)
    Preparing to unpack linux-firmware_20240318.git3b128b60-0ubuntu2.4_amd64.debc ...
    Unpacking linux-firmware (20240318.git3b128b60-0ubuntu2.4) over (20240318.git3b128b60-0ubuntu2.4) ...
    Setting up linux-firmware (20240318.git3b128b60-0ubuntu2.4) ...

    real 0m6,531s
    user 0m0,306s
    sys 0m1,471s

    On a system with a 2 TB magnetic disk, a full upgrade with 1450 packages
    took 44 minutes instead of 90; the extraction phase alone went from 78
    minutes to just 31.

    Another advantage is that the extraction doesn't use extra space,
    so you don't need to account for both the compressed and uncompressed
    size of the files at the same time.

    The code is available at:
    https://github.com/teknoraver/dpkg/tree/cow

    and consists of three patches:
    1. rework dpkg to avoid using pipes when data.tar is not compressed
    2. add PAX header support to dpkg
    3. add the actual reflink support

    It's only a proof of concept for now. I just want to share the idea, so
    don't focus too much on the code.

    Regards,
    Matteo Croce

    -- Package-specific info:
    This system uses merged-usr-via-aliased-dirs, going behind dpkg's
    back, breaking its core assumptions. This can cause silent file
    overwrites and disappearances, and its general tools misbehavior.
    See <https://wiki.debian.org/Teams/Dpkg/FAQ#broken-usrmerge>.

    -- System Information:
    Debian Release: 12.7
    APT prefers stable-security
    APT policy: (500, 'stable-security'), (500, 'stable')
    Architecture: amd64 (x86_64)

    Kernel: Linux 6.11.0-saturno (SMP w/8 CPU threads)
    Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
    Shell: /bin/sh linked to /usr/bin/dash
    Init: systemd (via /run/systemd/system)

    Versions of packages dpkg depends on:
    ii libbz2-1.0 1.0.8-5+b1
    ii libc6 2.36-9+deb12u8
    ii liblzma5 5.4.1-0.2
    ii libmd0 1.0.4-2
    ii libselinux1 3.4-1+b6
    ii libzstd1 1.5.4+dfsg2-5
    ii tar 1.34+dfsg-1.2+deb12u1
    ii zlib1g 1:1.2.13.dfsg-1

    dpkg recommends no packages.

    Versions of packages dpkg suggests:
    ii apt 2.6.1
    pn debsig-verify <none>

    -- no debconf information
