• Bug#1108534: e2fsprogs: getfattr returns different values for posix acl

    From Theodore Ts'o@21:1/5 to Tim Woodall on Wed Jul 2 23:20:01 2025
    On Wed, Jul 02, 2025 at 07:44:24PM +0100, Tim Woodall wrote:

    So it seems that my change is only needed on older kernels.


    OK, thanks for the hint. Using the same fuse2fs binary (built in a bookworm-amd64 chroot), I can replicate what you're seeing on 6.1.142:

    <tytso@trampoline> {/usr/projects/linux/ext4-6.1} ((v6.1.142))
    1477% kvm-xfstests shell
    ...
    root@kvm-xfstests:~# /vtmp/fuse2fs /vtmp/foo.img /mnt
    FUSE2FS (foo.img): Warning: fuse2fs does not support using the journal.
    There may be file system corruption or data loss if
    the file system is not gracefully unmounted.
    FUSE2FS (foo.img): Warning: Mounting unchecked fs, running e2fsck is recommended.
    root@kvm-xfstests:~# getfattr -m - -d /mnt/x
    getfattr: Removing leading '/' from absolute path names
    # file: mnt/x system.posix_acl_access=0sAgAAAAEABgAAAAAAAgAGAAAAAAAEAAQAAAAAABAABgAAAAAAIAAEAAAAAAA=

    .... but not 6.6.95:

    <tytso@trampoline> {/usr/projects/linux/ext4-6.6} ((v6.6.95))
    1479% kvm-xfstests shell
    ...
    root@kvm-xfstests:~# /vtmp/fuse2fs /vtmp/foo.img /mnt
    FUSE2FS (foo.img): Warning: fuse2fs does not support using the journal.
    There may be file system corruption or data loss if
    the file system is not gracefully unmounted.
    FUSE2FS (foo.img): Warning: Mounting unchecked fs, running e2fsck is recommended.
    root@kvm-xfstests:~# getfattr -m - -d /mnt/x
    getfattr: Removing leading '/' from absolute path names
    # file: mnt/x system.posix_acl_access=0sAgAAAAEABgD/////AgAGAAAAAAAEAAQA/////xAABgD/////IAAEAP////8=

    So it looks like some kind of bug fix or kernel change which landed
    sometime between 6.1 and 6.6, and which apparently was not backported
    to the 6.1 LTS kernel.

    Hmm... digging a bit deeper to see what changed in the kernel's fuse implementation between 6.1 and 6.6, it looks like there were some
    *radical changes*.

    % git log --reverse v6.1.. --grep acl -i fs/fuse

    Turns up these two interesting patches:

    commit cac2f8b8d8b50ef32b3e34f6dcbbf08937e4f616
    Author: Christian Brauner <[email protected]>
    Date: Thu Sep 22 17:17:00 2022 +0200

    fs: rename current get acl method

    The current way of setting and getting posix acls through the generic
    xattr interface is error prone and type unsafe. The vfs needs to
    interpret and fixup posix acls before storing or reporting it to
    userspace. Various hacks exist to make this work. The code is hard to
    understand and difficult to maintain in it's current form. Instead of
    making this work by hacking posix acls through xattr handlers we are
    building a dedicated posix acl api around the get and set inode
    operations. This removes a lot of hackiness and makes the codepaths
    easier to maintain. A lot of background can be found in [1].

    The current inode operation for getting posix acls takes an inode
    argument but various filesystems (e.g., 9p, cifs, overlayfs) need access
    to the dentry. In contrast to the ->set_acl() inode operation we cannot
    simply extend ->get_acl() to take a dentry argument. The ->get_acl()
    inode operation is called from:

    acl_permission_check()
    -> check_acl()
    -> get_acl()

    which is part of generic_permission() which in turn is part of
    inode_permission(). Both generic_permission() and inode_permission() are
    called in the ->permission() handler of various filesystems (e.g.,
    overlayfs). So simply passing a dentry argument to ->get_acl() would
    amount to also having to pass a dentry argument to ->permission(). We
    should avoid this unnecessary change.

    So instead of extending the existing inode operation rename it from
    ->get_acl() to ->get_inode_acl() and add a ->get_acl() method later that
    passes a dentry argument and which filesystems that need access to the
    dentry can implement instead of ->get_inode_acl(). Filesystems like cifs
    which allow setting and getting posix acls but not using them for
    permission checking during lookup can simply not implement
    ->get_inode_acl().

    This is intended to be a non-functional change.

    Link: https://lore.kernel.org/all/[email protected] [1]
    Suggested-by/Inspired-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>

    ... and ...

    commit 6a518afcc2066732e6c5c24281ce017bbbd85506
    Merge: bd90741318ee d6fdf29f7b99
    Author: Linus Torvalds <[email protected]>
    Date: Mon Dec 12 18:46:39 2022 -0800

    Merge tag 'fs.acl.rework.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping

    Pull VFS acl updates from Christian Brauner:
    "This contains the work that builds a dedicated vfs posix acl api.

    The origins of this work trace back to v5.19 but it took quite a while
    to understand the various filesystem specific implementations in
    sufficient detail and also come up with an acceptable solution.

    As we discussed and seen multiple times the current state of how posix
    acls are handled isn't nice and comes with a lot of problems: The
    current way of handling posix acls via the generic xattr api is error
    prone, hard to maintain, and type unsafe for the vfs until we call
    into the filesystem's dedicated get and set inode operations.

    It is already the case that posix acls are special-cased to death all
    the way through the vfs. There are an uncounted number of hacks that
    operate on the uapi posix acl struct instead of the dedicated vfs
    struct posix_acl. And the vfs must be involved in order to interpret
    and fixup posix acls before storing them to the backing store, caching
    them, reporting them to userspace, or for permission checking.

    Currently a range of hacks and duct tape exist to make this work. As
    with most things this is really no ones fault it's just something that
    happened over time. But the code is hard to understand and difficult
    to maintain and one is constantly at risk of introducing bugs and
    regressions when having to touch it.

    Instead of continuing to hack posix acls through the xattr handlers
    this series builds a dedicated posix acl api solely around the get and
    set inode operations.

    Going forward, the vfs_get_acl(), vfs_remove_acl(), and vfs_set_acl()
    helpers must be used in order to interact with posix acls. They
    operate directly on the vfs internal struct posix_acl instead of
    abusing the uapi posix acl struct as we currently do. In the end this
    removes all of the hackiness, makes the codepaths easier to maintain,
    and gets us type safety.

    (several more pagefuls of testing and implementation details removed)

    The thing is, for the ACL types in question, the kernel shouldn't even
    be looking at e_id; that field is undefined for ACL_USER_OBJ,
    ACL_GROUP_OBJ, et.al, since the kernel would be using the inode's user
    and group ownership, and for ACL_MASK and ACL_OTHER there is no id
    number to be used. So the kernel shouldn't even be *looking* at that
    field. So whether the value is 0 or -1 shouldn't make a difference.
    And change that I don't understand always give me pause....

    BTW, what caused you to think that -1 might fix things on a Bookworm
    6.1 kernel?

    - Ted

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)