Search

Search Results (324292 CVEs found)

CVE Vendors Products Updated CVSS v3.1
CVE-2025-68377 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: ns: initialize ns_list_node for initial namespaces Make sure that the list is always initialized for initial namespaces.
CVE-2025-68376 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: coresight: ETR: Fix ETR buffer use-after-free issue When ETR is enabled as CS_MODE_SYSFS, if the buffer size is changed and enabled again, currently sysfs_buf will point to the newly allocated memory(buf_new) and free the old memory(buf_old). But the etr_buf that is being used by the ETR remains pointed to buf_old, not updated to buf_new. In this case, it will result in a memory use-after-free issue. Fix this by checking ETR's mode before updating and releasing buf_old, if the mode is CS_MODE_SYSFS, then skip updating and releasing it.
CVE-2025-68375 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: perf/x86: Fix NULL event access and potential PEBS record loss When intel_pmu_drain_pebs_icl() is called to drain PEBS records, the perf_event_overflow() could be called to process the last PEBS record. While perf_event_overflow() could trigger the interrupt throttle and stop all events of the group, like what the below call-chain shows. perf_event_overflow() -> __perf_event_overflow() ->__perf_event_account_interrupt() -> perf_event_throttle_group() -> perf_event_throttle() -> event->pmu->stop() -> x86_pmu_stop() The side effect of stopping the events is that all corresponding event pointers in cpuc->events[] array are cleared to NULL. Assume there are two PEBS events (event a and event b) in a group. When intel_pmu_drain_pebs_icl() calls perf_event_overflow() to process the last PEBS record of PEBS event a, interrupt throttle is triggered and all pointers of event a and event b are cleared to NULL. Then intel_pmu_drain_pebs_icl() tries to process the last PEBS record of event b and encounters NULL pointer access. To avoid this issue, move cpuc->events[] clearing from x86_pmu_stop() to x86_pmu_del(). It's safe since cpuc->active_mask or cpuc->pebs_enabled is always checked before access the event pointer from cpuc->events[].
CVE-2025-68374 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: md: fix rcu protection in md_wakeup_thread We attempted to use RCU to protect the pointer 'thread', but directly passed the value when calling md_wakeup_thread(). This means that the RCU pointer has been acquired before rcu_read_lock(), which renders rcu_read_lock() ineffective and could lead to a use-after-free.
CVE-2025-68373 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: md: avoid repeated calls to del_gendisk There is a uaf problem which is found by case 23rdev-lifetime: Oops: general protection fault, probably for non-canonical address 0xdead000000000122 RIP: 0010:bdi_unregister+0x4b/0x170 Call Trace: <TASK> __del_gendisk+0x356/0x3e0 mddev_unlock+0x351/0x360 rdev_attr_store+0x217/0x280 kernfs_fop_write_iter+0x14a/0x210 vfs_write+0x29e/0x550 ksys_write+0x74/0xf0 do_syscall_64+0xbb/0x380 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff5250a177e The sequence is: 1. rdev remove path gets reconfig_mutex 2. rdev remove path release reconfig_mutex in mddev_unlock 3. md stop calls do_md_stop and sets MD_DELETED 4. rdev remove path calls del_gendisk because MD_DELETED is set 5. md stop path release reconfig_mutex and calls del_gendisk again So there is a race condition we should resolve. This patch adds a flag MD_DO_DELETE to avoid the race condition.
CVE-2025-68372 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: nbd: defer config put in recv_work There is one uaf issue in recv_work when running NBD_CLEAR_SOCK and NBD_CMD_RECONFIGURE: nbd_genl_connect // conf_ref=2 (connect and recv_work A) nbd_open // conf_ref=3 recv_work A done // conf_ref=2 NBD_CLEAR_SOCK // conf_ref=1 nbd_genl_reconfigure // conf_ref=2 (trigger recv_work B) close nbd // conf_ref=1 recv_work B config_put // conf_ref=0 atomic_dec(&config->recv_threads); -> UAF Or only running NBD_CLEAR_SOCK: nbd_genl_connect // conf_ref=2 nbd_open // conf_ref=3 NBD_CLEAR_SOCK // conf_ref=2 close nbd nbd_release config_put // conf_ref=1 recv_work config_put // conf_ref=0 atomic_dec(&config->recv_threads); -> UAF Commit 87aac3a80af5 ("nbd: call nbd_config_put() before notifying the waiter") moved nbd_config_put() to run before waking up the waiter in recv_work, in order to ensure that nbd_start_device_ioctl() would not be woken up while nbd->task_recv was still uncleared. However, in nbd_start_device_ioctl(), after being woken up it explicitly calls flush_workqueue() to make sure all current works are finished. Therefore, there is no need to move the config put ahead of the wakeup. Move nbd_config_put() to the end of recv_work, so that the reference is held for the whole lifetime of the worker thread. This makes sure the config cannot be freed while recv_work is still running, even if clear + reconfigure interleave. In addition, we don't need to worry about recv_work dropping the last nbd_put (which causes deadlock): path A (netlink with NBD_CFLAG_DESTROY_ON_DISCONNECT): connect // nbd_refs=1 (trigger recv_work) open nbd // nbd_refs=2 NBD_CLEAR_SOCK close nbd nbd_release nbd_disconnect_and_put flush_workqueue // recv_work done nbd_config_put nbd_put // nbd_refs=1 nbd_put // nbd_refs=0 queue_work path B (netlink without NBD_CFLAG_DESTROY_ON_DISCONNECT): connect // nbd_refs=2 (trigger recv_work) open nbd // nbd_refs=3 NBD_CLEAR_SOCK // conf_refs=2 close nbd nbd_release nbd_config_put // conf_refs=1 nbd_put // nbd_refs=2 recv_work done // conf_refs=0, nbd_refs=1 rmmod // nbd_refs=0 Depends-on: e2daec488c57 ("nbd: Fix hungtask when nbd_config_put")
CVE-2025-68371 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: scsi: smartpqi: Fix device resources accessed after device removal Correct possible race conditions during device removal. Previously, a scheduled work item to reset a LUN could still execute after the device was removed, leading to use-after-free and other resource access issues. This race condition occurs because the abort handler may schedule a LUN reset concurrently with device removal via sdev_destroy(), leading to use-after-free and improper access to freed resources. - Check in the device reset handler if the device is still present in the controller's SCSI device list before running; if not, the reset is skipped. - Cancel any pending TMF work that has not started in sdev_destroy(). - Ensure device freeing in sdev_destroy() is done while holding the LUN reset mutex to avoid races with ongoing resets.
CVE-2025-68370 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: coresight: tmc: add the handle of the event to the path The handle is essential for retrieving the AUX_EVENT of each CPU and is required in perf mode. It has been added to the coresight_path so that dependent devices can access it from the path when needed. The existing bug can be reproduced with: perf record -e cs_etm//k -C 0-9 dd if=/dev/zero of=/dev/null Showing an oops as follows: Unable to handle kernel paging request at virtual address 000f6e84934ed19e Call trace: tmc_etr_get_buffer+0x30/0x80 [coresight_tmc] (P) catu_enable_hw+0xbc/0x3d0 [coresight_catu] catu_enable+0x70/0xe0 [coresight_catu] coresight_enable_path+0xb0/0x258 [coresight]
CVE-2025-68369 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: ntfs3: init run lock for extend inode After setting the inode mode of $Extend to a regular file, executing the truncate system call will enter the do_truncate() routine, causing the run_lock uninitialized error reported by syzbot. Prior to patch 4e8011ffec79, if the inode mode of $Extend was not set to a regular file, the do_truncate() routine would not be entered. Add the run_lock initialization when loading $Extend. syzbot reported: INFO: trying to register non-static key. Call Trace: dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120 assign_lock_key+0x133/0x150 kernel/locking/lockdep.c:984 register_lock_class+0x105/0x320 kernel/locking/lockdep.c:1299 __lock_acquire+0x99/0xd20 kernel/locking/lockdep.c:5112 lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5868 down_write+0x96/0x1f0 kernel/locking/rwsem.c:1590 ntfs_set_size+0x140/0x200 fs/ntfs3/inode.c:860 ntfs_extend+0x1d9/0x970 fs/ntfs3/file.c:387 ntfs_setattr+0x2e8/0xbe0 fs/ntfs3/file.c:808
CVE-2025-68368 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: md: init bioset in mddev_init IO operations may be needed before md_run(), such as updating metadata after writing sysfs. Without bioset, this triggers a NULL pointer dereference as below: BUG: kernel NULL pointer dereference, address: 0000000000000020 Call Trace: md_update_sb+0x658/0xe00 new_level_store+0xc5/0x120 md_attr_store+0xc9/0x1e0 sysfs_kf_write+0x6f/0xa0 kernfs_fop_write_iter+0x141/0x2a0 vfs_write+0x1fc/0x5a0 ksys_write+0x79/0x180 __x64_sys_write+0x1d/0x30 x64_sys_call+0x2818/0x2880 do_syscall_64+0xa9/0x580 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Reproducer ``` mdadm -CR /dev/md0 -l1 -n2 /dev/sd[cd] echo inactive > /sys/block/md0/md/array_state echo 10 > /sys/block/md0/md/new_level ``` mddev_init() can only be called once per mddev, no need to test if bioset has been initialized anymore.
CVE-2025-68367 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: macintosh/mac_hid: fix race condition in mac_hid_toggle_emumouse The following warning appears when running syzkaller, and this issue also exists in the mainline code. ------------[ cut here ]------------ list_add double add: new=ffffffffa57eee28, prev=ffffffffa57eee28, next=ffffffffa5e63100. WARNING: CPU: 0 PID: 1491 at lib/list_debug.c:35 __list_add_valid_or_report+0xf7/0x130 Modules linked in: CPU: 0 PID: 1491 Comm: syz.1.28 Not tainted 6.6.0+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:__list_add_valid_or_report+0xf7/0x130 RSP: 0018:ff1100010dfb7b78 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffffffffa57eee18 RCX: ffffffff97fc9817 RDX: 0000000000040000 RSI: ffa0000002383000 RDI: 0000000000000001 RBP: ffffffffa57eee28 R08: 0000000000000001 R09: ffe21c0021bf6f2c R10: 0000000000000001 R11: 6464615f7473696c R12: ffffffffa5e63100 R13: ffffffffa57eee28 R14: ffffffffa57eee28 R15: ff1100010dfb7d48 FS: 00007fb14398b640(0000) GS:ff11000119600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000010d096005 CR4: 0000000000773ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 80000000 Call Trace: <TASK> input_register_handler+0xb3/0x210 mac_hid_start_emulation+0x1c5/0x290 mac_hid_toggle_emumouse+0x20a/0x240 proc_sys_call_handler+0x4c2/0x6e0 new_sync_write+0x1b1/0x2d0 vfs_write+0x709/0x950 ksys_write+0x12a/0x250 do_syscall_64+0x5a/0x110 entry_SYSCALL_64_after_hwframe+0x78/0xe2 The WARNING occurs when two processes concurrently write to the mac-hid emulation sysctl, causing a race condition in mac_hid_toggle_emumouse(). Both processes read old_val=0, then both try to register the input handler, leading to a double list_add of the same handler. CPU0 CPU1 ------------------------- ------------------------- vfs_write() //write 1 vfs_write() //write 1 proc_sys_write() proc_sys_write() mac_hid_toggle_emumouse() mac_hid_toggle_emumouse() old_val = *valp // old_val=0 old_val = *valp // old_val=0 mutex_lock_killable() proc_dointvec() // *valp=1 mac_hid_start_emulation() input_register_handler() mutex_unlock() mutex_lock_killable() proc_dointvec() mac_hid_start_emulation() input_register_handler() //Trigger Warning mutex_unlock() Fix this by moving the old_val read inside the mutex lock region.
CVE-2025-68366 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: nbd: defer config unlock in nbd_genl_connect There is one use-after-free warning when running NBD_CMD_CONNECT and NBD_CLEAR_SOCK: nbd_genl_connect nbd_alloc_and_init_config // config_refs=1 nbd_start_device // config_refs=2 set NBD_RT_HAS_CONFIG_REF open nbd // config_refs=3 recv_work done // config_refs=2 NBD_CLEAR_SOCK // config_refs=1 close nbd // config_refs=0 refcount_inc -> uaf ------------[ cut here ]------------ refcount_t: addition on 0; use-after-free. WARNING: CPU: 24 PID: 1014 at lib/refcount.c:25 refcount_warn_saturate+0x12e/0x290 nbd_genl_connect+0x16d0/0x1ab0 genl_family_rcv_msg_doit+0x1f3/0x310 genl_rcv_msg+0x44a/0x790 The issue can be easily reproduced by adding a small delay before refcount_inc(&nbd->config_refs) in nbd_genl_connect(): mutex_unlock(&nbd->config_lock); if (!ret) { set_bit(NBD_RT_HAS_CONFIG_REF, &config->runtime_flags); + printk("before sleep\n"); + mdelay(5 * 1000); + printk("after sleep\n"); refcount_inc(&nbd->config_refs); nbd_connect_reply(info, nbd->index); }
CVE-2025-68365 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: fs/ntfs3: Initialize allocated memory before use KMSAN reports: Multiple uninitialized values detected: - KMSAN: uninit-value in ntfs_read_hdr (3) - KMSAN: uninit-value in bcmp (3) Memory is allocated by __getname(), which is a wrapper for kmem_cache_alloc(). This memory is used before being properly cleared. Change kmem_cache_alloc() to kmem_cache_zalloc() to properly allocate and clear memory before use.
CVE-2025-68364 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: ocfs2: relax BUG() to ocfs2_error() in __ocfs2_move_extent() In '__ocfs2_move_extent()', relax 'BUG()' to 'ocfs2_error()' just to avoid crashing the whole kernel due to a filesystem corruption.
CVE-2025-68363 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: bpf: Check skb->transport_header is set in bpf_skb_check_mtu The bpf_skb_check_mtu helper needs to use skb->transport_header when the BPF_MTU_CHK_SEGS flag is used: bpf_skb_check_mtu(skb, ifindex, &mtu_len, 0, BPF_MTU_CHK_SEGS) The transport_header is not always set. There is a WARN_ON_ONCE report when CONFIG_DEBUG_NET is enabled + skb->gso_size is set + bpf_prog_test_run is used: WARNING: CPU: 1 PID: 2216 at ./include/linux/skbuff.h:3071 skb_gso_validate_network_len bpf_skb_check_mtu bpf_prog_3920e25740a41171_tc_chk_segs_flag # A test in the next patch bpf_test_run bpf_prog_test_run_skb For a normal ingress skb (not test_run), skb_reset_transport_header is performed but there is plan to avoid setting it as described in commit 2170a1f09148 ("net: no longer reset transport_header in __netif_receive_skb_core()"). This patch fixes the bpf helper by checking skb_transport_header_was_set(). The check is done just before skb->transport_header is used, to avoid breaking the existing bpf prog. The WARN_ON_ONCE is limited to bpf_prog_test_run, so targeting bpf-next.
CVE-2025-68362 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: wifi: rtl818x: rtl8187: Fix potential buffer underflow in rtl8187_rx_cb() The rtl8187_rx_cb() calculates the rx descriptor header address by subtracting its size from the skb tail pointer. However, it does not validate if the received packet (skb->len from urb->actual_length) is large enough to contain this header. If a truncated packet is received, this will lead to a buffer underflow, reading memory before the start of the skb data area, and causing a kernel panic. Add length checks for both rtl8187 and rtl8187b descriptor headers before attempting to access them, dropping the packet cleanly if the check fails.
CVE-2025-68361 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: erofs: limit the level of fs stacking for file-backed mounts Otherwise, it could cause potential kernel stack overflow (e.g., EROFS mounting itself).
CVE-2025-68360 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: wifi: mt76: wed: use proper wed reference in mt76 wed driver callabacks MT7996 driver can use both wed and wed_hif2 devices to offload traffic from/to the wireless NIC. In the current codebase we assume to always use the primary wed device in wed callbacks resulting in the following crash if the hw runs wed_hif2 (e.g. 6GHz link). [ 297.455876] Unable to handle kernel read from unreadable memory at virtual address 000000000000080a [ 297.464928] Mem abort info: [ 297.467722] ESR = 0x0000000096000005 [ 297.471461] EC = 0x25: DABT (current EL), IL = 32 bits [ 297.476766] SET = 0, FnV = 0 [ 297.479809] EA = 0, S1PTW = 0 [ 297.482940] FSC = 0x05: level 1 translation fault [ 297.487809] Data abort info: [ 297.490679] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000 [ 297.496156] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 297.501196] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 297.506500] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000107480000 [ 297.512927] [000000000000080a] pgd=08000001097fb003, p4d=08000001097fb003, pud=08000001097fb003, pmd=0000000000000000 [ 297.523532] Internal error: Oops: 0000000096000005 [#1] SMP [ 297.715393] CPU: 2 UID: 0 PID: 45 Comm: kworker/u16:2 Tainted: G O 6.12.50 #0 [ 297.723908] Tainted: [O]=OOT_MODULE [ 297.727384] Hardware name: Banana Pi BPI-R4 (2x SFP+) (DT) [ 297.732857] Workqueue: nf_ft_offload_del nf_flow_rule_route_ipv6 [nf_flow_table] [ 297.740254] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 297.747205] pc : mt76_wed_offload_disable+0x64/0xa0 [mt76] [ 297.752688] lr : mtk_wed_flow_remove+0x58/0x80 [ 297.757126] sp : ffffffc080fe3ae0 [ 297.760430] x29: ffffffc080fe3ae0 x28: ffffffc080fe3be0 x27: 00000000deadbef7 [ 297.767557] x26: ffffff80c5ebca00 x25: 0000000000000001 x24: ffffff80c85f4c00 [ 297.774683] x23: ffffff80c1875b78 x22: ffffffc080d42cd0 x21: ffffffc080660018 [ 297.781809] x20: ffffff80c6a076d0 x19: ffffff80c6a043c8 x18: 0000000000000000 [ 297.788935] x17: 0000000000000000 x16: 0000000000000001 x15: 0000000000000000 [ 297.796060] x14: 0000000000000019 x13: ffffff80c0ad8ec0 x12: 00000000fa83b2da [ 297.803185] x11: ffffff80c02700c0 x10: ffffff80c0ad8ec0 x9 : ffffff81fef96200 [ 297.810311] x8 : ffffff80c02700c0 x7 : ffffff80c02700d0 x6 : 0000000000000002 [ 297.817435] x5 : 0000000000000400 x4 : 0000000000000000 x3 : 0000000000000000 [ 297.824561] x2 : 0000000000000001 x1 : 0000000000000800 x0 : ffffff80c6a063c8 [ 297.831686] Call trace: [ 297.834123] mt76_wed_offload_disable+0x64/0xa0 [mt76] [ 297.839254] mtk_wed_flow_remove+0x58/0x80 [ 297.843342] mtk_flow_offload_cmd+0x434/0x574 [ 297.847689] mtk_wed_setup_tc_block_cb+0x30/0x40 [ 297.852295] nf_flow_offload_ipv6_hook+0x7f4/0x964 [nf_flow_table] [ 297.858466] nf_flow_rule_route_ipv6+0x438/0x4a4 [nf_flow_table] [ 297.864463] process_one_work+0x174/0x300 [ 297.868465] worker_thread+0x278/0x430 [ 297.872204] kthread+0xd8/0xdc [ 297.875251] ret_from_fork+0x10/0x20 [ 297.878820] Code: 928b5ae0 8b000273 91400a60 f943fa61 (79401421) [ 297.884901] ---[ end trace 0000000000000000 ]--- Fix the issue detecting the proper wed reference to use running wed callabacks.
CVE-2025-68359 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: btrfs: fix double free of qgroup record after failure to add delayed ref head In the previous code it was possible to incur into a double kfree() scenario when calling add_delayed_ref_head(). This could happen if the record was reported to already exist in the btrfs_qgroup_trace_extent_nolock() call, but then there was an error later on add_delayed_ref_head(). In this case, since add_delayed_ref_head() returned an error, the caller went to free the record. Since add_delayed_ref_head() couldn't set this kfree'd pointer to NULL, then kfree() would have acted on a non-NULL 'record' object which was pointing to memory already freed by the callee. The problem comes from the fact that the responsibility to kfree the object is on both the caller and the callee at the same time. Hence, the fix for this is to shift the ownership of the 'qrecord' object out of the add_delayed_ref_head(). That is, we will never attempt to kfree() the given object inside of this function, and will expect the caller to act on the 'qrecord' object on its own. The only exception where the 'qrecord' object cannot be kfree'd is if it was inserted into the tracing logic, for which we already have the 'qrecord_inserted_ret' boolean to account for this. Hence, the caller has to kfree the object only if add_delayed_ref_head() reports not to have inserted it on the tracing logic. As a side-effect of the above, we must guarantee that 'qrecord_inserted_ret' is properly initialized at the start of the function, not at the end, and then set when an actual insert happens. This way we avoid 'qrecord_inserted_ret' having an invalid value on an early exit. The documentation from the add_delayed_ref_head() has also been updated to reflect on the exact ownership of the 'qrecord' object.
CVE-2025-68358 1 Linux 1 Linux Kernel 2025-12-24 N/A
In the Linux kernel, the following vulnerability has been resolved: btrfs: fix racy bitfield write in btrfs_clear_space_info_full() From the memory-barriers.txt document regarding memory barrier ordering guarantees: (*) These guarantees do not apply to bitfields, because compilers often generate code to modify these using non-atomic read-modify-write sequences. Do not attempt to use bitfields to synchronize parallel algorithms. (*) Even in cases where bitfields are protected by locks, all fields in a given bitfield must be protected by one lock. If two fields in a given bitfield are protected by different locks, the compiler's non-atomic read-modify-write sequences can cause an update to one field to corrupt the value of an adjacent field. btrfs_space_info has a bitfield sharing an underlying word consisting of the fields full, chunk_alloc, and flush: struct btrfs_space_info { struct btrfs_fs_info * fs_info; /* 0 8 */ struct btrfs_space_info * parent; /* 8 8 */ ... int clamp; /* 172 4 */ unsigned int full:1; /* 176: 0 4 */ unsigned int chunk_alloc:1; /* 176: 1 4 */ unsigned int flush:1; /* 176: 2 4 */ ... Therefore, to be safe from parallel read-modify-writes losing a write to one of the bitfield members protected by a lock, all writes to all the bitfields must use the lock. They almost universally do, except for btrfs_clear_space_info_full() which iterates over the space_infos and writes out found->full = 0 without a lock. Imagine that we have one thread completing a transaction in which we finished deleting a block_group and are thus calling btrfs_clear_space_info_full() while simultaneously the data reclaim ticket infrastructure is running do_async_reclaim_data_space(): T1 T2 btrfs_commit_transaction btrfs_clear_space_info_full data_sinfo->full = 0 READ: full:0, chunk_alloc:0, flush:1 do_async_reclaim_data_space(data_sinfo) spin_lock(&space_info->lock); if(list_empty(tickets)) space_info->flush = 0; READ: full: 0, chunk_alloc:0, flush:1 MOD/WRITE: full: 0, chunk_alloc:0, flush:0 spin_unlock(&space_info->lock); return; MOD/WRITE: full:0, chunk_alloc:0, flush:1 and now data_sinfo->flush is 1 but the reclaim worker has exited. This breaks the invariant that flush is 0 iff there is no work queued or running. Once this invariant is violated, future allocations that go into __reserve_bytes() will add tickets to space_info->tickets but will see space_info->flush is set to 1 and not queue the work. After this, they will block forever on the resulting ticket, as it is now impossible to kick the worker again. I also confirmed by looking at the assembly of the affected kernel that it is doing RMW operations. For example, to set the flush (3rd) bit to 0, the assembly is: andb $0xfb,0x60(%rbx) and similarly for setting the full (1st) bit to 0: andb $0xfe,-0x20(%rax) So I think this is really a bug on practical systems. I have observed a number of systems in this exact state, but am currently unable to reproduce it. Rather than leaving this footgun lying around for the future, take advantage of the fact that there is room in the struct anyway, and that it is already quite large and simply change the three bitfield members to bools. This avoids writes to space_info->full having any effect on ---truncated---