linux/include
Christian Brauner 344bac8f0d
fs: kill MNT_ONRB
Move mnt->mnt_node into the union with mnt->mnt_rcu and mnt->mnt_llist
instead of keeping it with mnt->mnt_list. This allows us to use
RB_CLEAR_NODE(&mnt->mnt_node) in umount_tree() as well as
list_empty(&mnt->mnt_node). That in turn allows us to remove MNT_ONRB.

This also fixes the bug reported in [1] where seemingly MNT_ONRB wasn't
set in @mnt->mnt_flags even though the mount was present in the mount
rbtree of the mount namespace.

The root cause is the following race. When a btrfs subvolume is mounted
a temporary mount is created:

btrfs_get_tree_subvol()
{
        mnt = fc_mount()
        // Register the newly allocated mount with sb->mounts:
        lock_mount_hash();
        list_add_tail(&mnt->mnt_instance, &mnt->mnt.mnt_sb->s_mounts);
        unlock_mount_hash();
}

and registered on sb->s_mounts. Later it is added to an anonymous mount
namespace via mount_subvol():

-> mount_subvol()
   -> mount_subtree()
      -> alloc_mnt_ns()
         mnt_add_to_ns()
         vfs_path_lookup()
         put_mnt_ns()

The mnt_add_to_ns() call raises MNT_ONRB in @mnt->mnt_flags. If someone
concurrently does a ro remount:

reconfigure_super()
-> sb_prepare_remount_readonly()
   {
           list_for_each_entry(mnt, &sb->s_mounts, mnt_instance) {
   }

all mounts registered in sb->s_mounts are visited and first
MNT_WRITE_HOLD is raised, then MNT_READONLY is raised, and finally
MNT_WRITE_HOLD is removed again.

The flag modification for MNT_WRITE_HOLD/MNT_READONLY and MNT_ONRB race
so MNT_ONRB might be lost.

Fixes: 2eea9ce431 ("mounts: keep list of mounts in an rbtree")
Cc: <stable@kernel.org> # v6.8+
Link: https://lore.kernel.org/r/20241215-vfs-6-14-mount-work-v1-1-fd55922c4af8@kernel.org
Link: https://lore.kernel.org/r/ec6784ed-8722-4695-980a-4400d4e7bd1a@gmx.com [1]
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09 16:58:50 +01:00
..
acpi common: switch back from remove_new() to remove() callback 2024-11-25 17:31:39 -08:00
asm-generic - Fix a case where posix timers with a thread-group-wide target would miss 2024-12-01 12:41:21 -08:00
clocksource
crypto This update includes the following changes: 2024-11-19 10:28:41 -08:00
cxl
drm drm for 6.13-rc1 2024-11-21 14:56:17 -08:00
dt-bindings Char/Misc/IIO/Whatever driver subsystem updates for 6.13-rc1 2024-11-29 11:58:27 -08:00
keys
kunit The core framework gained a clk provider helper, a clk consumer helper, and 2024-11-22 17:02:25 -08:00
kvm KVM: arm64: vgic: Kill VGIC_MAX_PRIVATE definition 2024-11-20 17:21:08 -08:00
linux fs: kill MNT_ONRB 2025-01-09 16:58:50 +01:00
math-emu
media
memory
misc
net Kbuild updates for v6.13 2024-11-30 13:41:50 -08:00
pcmcia
ras
rdma
rv
scsi Random number generator updates for Linux 6.13-rc1. 2024-11-19 10:43:44 -08:00
soc The core framework gained a clk provider helper, a clk consumer helper, and 2024-11-22 17:02:25 -08:00
sound ALSA: hda/tas2781: Add speaker id check for ASUS projects 2024-11-26 08:54:08 +01:00
target
trace NFS client updates for Linux 6.13 2024-11-30 10:17:53 -08:00
uapi io_uring-6.13-20242901 2024-11-30 15:43:02 -08:00
ufs
vdso
video - Improved handling of LCD power states and interactions with the fbdev subsystem. 2024-11-22 16:29:57 -08:00
xen