| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
md: avoid repeated calls to del_gendisk
There is a uaf problem which is found by case 23rdev-lifetime:
Oops: general protection fault, probably for non-canonical address 0xdead000000000122
RIP: 0010:bdi_unregister+0x4b/0x170
Call Trace:
<TASK>
__del_gendisk+0x356/0x3e0
mddev_unlock+0x351/0x360
rdev_attr_store+0x217/0x280
kernfs_fop_write_iter+0x14a/0x210
vfs_write+0x29e/0x550
ksys_write+0x74/0xf0
do_syscall_64+0xbb/0x380
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff5250a177e
The sequence is:
1. rdev remove path gets reconfig_mutex
2. rdev remove path release reconfig_mutex in mddev_unlock
3. md stop calls do_md_stop and sets MD_DELETED
4. rdev remove path calls del_gendisk because MD_DELETED is set
5. md stop path release reconfig_mutex and calls del_gendisk again
So there is a race condition we should resolve. This patch adds a
flag MD_DO_DELETE to avoid the race condition. |
| In the Linux kernel, the following vulnerability has been resolved:
nbd: defer config put in recv_work
There is one uaf issue in recv_work when running NBD_CLEAR_SOCK and
NBD_CMD_RECONFIGURE:
nbd_genl_connect // conf_ref=2 (connect and recv_work A)
nbd_open // conf_ref=3
recv_work A done // conf_ref=2
NBD_CLEAR_SOCK // conf_ref=1
nbd_genl_reconfigure // conf_ref=2 (trigger recv_work B)
close nbd // conf_ref=1
recv_work B
config_put // conf_ref=0
atomic_dec(&config->recv_threads); -> UAF
Or only running NBD_CLEAR_SOCK:
nbd_genl_connect // conf_ref=2
nbd_open // conf_ref=3
NBD_CLEAR_SOCK // conf_ref=2
close nbd
nbd_release
config_put // conf_ref=1
recv_work
config_put // conf_ref=0
atomic_dec(&config->recv_threads); -> UAF
Commit 87aac3a80af5 ("nbd: call nbd_config_put() before notifying the
waiter") moved nbd_config_put() to run before waking up the waiter in
recv_work, in order to ensure that nbd_start_device_ioctl() would not
be woken up while nbd->task_recv was still uncleared.
However, in nbd_start_device_ioctl(), after being woken up it explicitly
calls flush_workqueue() to make sure all current works are finished.
Therefore, there is no need to move the config put ahead of the wakeup.
Move nbd_config_put() to the end of recv_work, so that the reference is
held for the whole lifetime of the worker thread. This makes sure the
config cannot be freed while recv_work is still running, even if clear
+ reconfigure interleave.
In addition, we don't need to worry about recv_work dropping the last
nbd_put (which causes deadlock):
path A (netlink with NBD_CFLAG_DESTROY_ON_DISCONNECT):
connect // nbd_refs=1 (trigger recv_work)
open nbd // nbd_refs=2
NBD_CLEAR_SOCK
close nbd
nbd_release
nbd_disconnect_and_put
flush_workqueue // recv_work done
nbd_config_put
nbd_put // nbd_refs=1
nbd_put // nbd_refs=0
queue_work
path B (netlink without NBD_CFLAG_DESTROY_ON_DISCONNECT):
connect // nbd_refs=2 (trigger recv_work)
open nbd // nbd_refs=3
NBD_CLEAR_SOCK // conf_refs=2
close nbd
nbd_release
nbd_config_put // conf_refs=1
nbd_put // nbd_refs=2
recv_work done // conf_refs=0, nbd_refs=1
rmmod // nbd_refs=0
Depends-on: e2daec488c57 ("nbd: Fix hungtask when nbd_config_put") |
| In the Linux kernel, the following vulnerability has been resolved:
scsi: smartpqi: Fix device resources accessed after device removal
Correct possible race conditions during device removal.
Previously, a scheduled work item to reset a LUN could still execute
after the device was removed, leading to use-after-free and other
resource access issues.
This race condition occurs because the abort handler may schedule a LUN
reset concurrently with device removal via sdev_destroy(), leading to
use-after-free and improper access to freed resources.
- Check in the device reset handler if the device is still present in
the controller's SCSI device list before running; if not, the reset
is skipped.
- Cancel any pending TMF work that has not started in sdev_destroy().
- Ensure device freeing in sdev_destroy() is done while holding the
LUN reset mutex to avoid races with ongoing resets. |
| In the Linux kernel, the following vulnerability has been resolved:
coresight: tmc: add the handle of the event to the path
The handle is essential for retrieving the AUX_EVENT of each CPU and is
required in perf mode. It has been added to the coresight_path so that
dependent devices can access it from the path when needed.
The existing bug can be reproduced with:
perf record -e cs_etm//k -C 0-9 dd if=/dev/zero of=/dev/null
Showing an oops as follows:
Unable to handle kernel paging request at virtual address 000f6e84934ed19e
Call trace:
tmc_etr_get_buffer+0x30/0x80 [coresight_tmc] (P)
catu_enable_hw+0xbc/0x3d0 [coresight_catu]
catu_enable+0x70/0xe0 [coresight_catu]
coresight_enable_path+0xb0/0x258 [coresight] |
| In the Linux kernel, the following vulnerability has been resolved:
ntfs3: init run lock for extend inode
After setting the inode mode of $Extend to a regular file, executing the
truncate system call will enter the do_truncate() routine, causing the
run_lock uninitialized error reported by syzbot.
Prior to patch 4e8011ffec79, if the inode mode of $Extend was not set to
a regular file, the do_truncate() routine would not be entered.
Add the run_lock initialization when loading $Extend.
syzbot reported:
INFO: trying to register non-static key.
Call Trace:
dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
assign_lock_key+0x133/0x150 kernel/locking/lockdep.c:984
register_lock_class+0x105/0x320 kernel/locking/lockdep.c:1299
__lock_acquire+0x99/0xd20 kernel/locking/lockdep.c:5112
lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5868
down_write+0x96/0x1f0 kernel/locking/rwsem.c:1590
ntfs_set_size+0x140/0x200 fs/ntfs3/inode.c:860
ntfs_extend+0x1d9/0x970 fs/ntfs3/file.c:387
ntfs_setattr+0x2e8/0xbe0 fs/ntfs3/file.c:808 |
| In the Linux kernel, the following vulnerability has been resolved:
md: init bioset in mddev_init
IO operations may be needed before md_run(), such as updating metadata
after writing sysfs. Without bioset, this triggers a NULL pointer
dereference as below:
BUG: kernel NULL pointer dereference, address: 0000000000000020
Call Trace:
md_update_sb+0x658/0xe00
new_level_store+0xc5/0x120
md_attr_store+0xc9/0x1e0
sysfs_kf_write+0x6f/0xa0
kernfs_fop_write_iter+0x141/0x2a0
vfs_write+0x1fc/0x5a0
ksys_write+0x79/0x180
__x64_sys_write+0x1d/0x30
x64_sys_call+0x2818/0x2880
do_syscall_64+0xa9/0x580
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Reproducer
```
mdadm -CR /dev/md0 -l1 -n2 /dev/sd[cd]
echo inactive > /sys/block/md0/md/array_state
echo 10 > /sys/block/md0/md/new_level
```
mddev_init() can only be called once per mddev, no need to test if bioset
has been initialized anymore. |
| In the Linux kernel, the following vulnerability has been resolved:
macintosh/mac_hid: fix race condition in mac_hid_toggle_emumouse
The following warning appears when running syzkaller, and this issue also
exists in the mainline code.
------------[ cut here ]------------
list_add double add: new=ffffffffa57eee28, prev=ffffffffa57eee28, next=ffffffffa5e63100.
WARNING: CPU: 0 PID: 1491 at lib/list_debug.c:35 __list_add_valid_or_report+0xf7/0x130
Modules linked in:
CPU: 0 PID: 1491 Comm: syz.1.28 Not tainted 6.6.0+ #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:__list_add_valid_or_report+0xf7/0x130
RSP: 0018:ff1100010dfb7b78 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffffffffa57eee18 RCX: ffffffff97fc9817
RDX: 0000000000040000 RSI: ffa0000002383000 RDI: 0000000000000001
RBP: ffffffffa57eee28 R08: 0000000000000001 R09: ffe21c0021bf6f2c
R10: 0000000000000001 R11: 6464615f7473696c R12: ffffffffa5e63100
R13: ffffffffa57eee28 R14: ffffffffa57eee28 R15: ff1100010dfb7d48
FS: 00007fb14398b640(0000) GS:ff11000119600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000010d096005 CR4: 0000000000773ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 80000000
Call Trace:
<TASK>
input_register_handler+0xb3/0x210
mac_hid_start_emulation+0x1c5/0x290
mac_hid_toggle_emumouse+0x20a/0x240
proc_sys_call_handler+0x4c2/0x6e0
new_sync_write+0x1b1/0x2d0
vfs_write+0x709/0x950
ksys_write+0x12a/0x250
do_syscall_64+0x5a/0x110
entry_SYSCALL_64_after_hwframe+0x78/0xe2
The WARNING occurs when two processes concurrently write to the mac-hid
emulation sysctl, causing a race condition in mac_hid_toggle_emumouse().
Both processes read old_val=0, then both try to register the input handler,
leading to a double list_add of the same handler.
CPU0 CPU1
------------------------- -------------------------
vfs_write() //write 1 vfs_write() //write 1
proc_sys_write() proc_sys_write()
mac_hid_toggle_emumouse() mac_hid_toggle_emumouse()
old_val = *valp // old_val=0
old_val = *valp // old_val=0
mutex_lock_killable()
proc_dointvec() // *valp=1
mac_hid_start_emulation()
input_register_handler()
mutex_unlock()
mutex_lock_killable()
proc_dointvec()
mac_hid_start_emulation()
input_register_handler() //Trigger Warning
mutex_unlock()
Fix this by moving the old_val read inside the mutex lock region. |
| In the Linux kernel, the following vulnerability has been resolved:
nbd: defer config unlock in nbd_genl_connect
There is one use-after-free warning when running NBD_CMD_CONNECT and
NBD_CLEAR_SOCK:
nbd_genl_connect
nbd_alloc_and_init_config // config_refs=1
nbd_start_device // config_refs=2
set NBD_RT_HAS_CONFIG_REF open nbd // config_refs=3
recv_work done // config_refs=2
NBD_CLEAR_SOCK // config_refs=1
close nbd // config_refs=0
refcount_inc -> uaf
------------[ cut here ]------------
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 24 PID: 1014 at lib/refcount.c:25 refcount_warn_saturate+0x12e/0x290
nbd_genl_connect+0x16d0/0x1ab0
genl_family_rcv_msg_doit+0x1f3/0x310
genl_rcv_msg+0x44a/0x790
The issue can be easily reproduced by adding a small delay before
refcount_inc(&nbd->config_refs) in nbd_genl_connect():
mutex_unlock(&nbd->config_lock);
if (!ret) {
set_bit(NBD_RT_HAS_CONFIG_REF, &config->runtime_flags);
+ printk("before sleep\n");
+ mdelay(5 * 1000);
+ printk("after sleep\n");
refcount_inc(&nbd->config_refs);
nbd_connect_reply(info, nbd->index);
} |
| In the Linux kernel, the following vulnerability has been resolved:
fs/ntfs3: Initialize allocated memory before use
KMSAN reports: Multiple uninitialized values detected:
- KMSAN: uninit-value in ntfs_read_hdr (3)
- KMSAN: uninit-value in bcmp (3)
Memory is allocated by __getname(), which is a wrapper for
kmem_cache_alloc(). This memory is used before being properly
cleared. Change kmem_cache_alloc() to kmem_cache_zalloc() to
properly allocate and clear memory before use. |
| In the Linux kernel, the following vulnerability has been resolved:
ocfs2: relax BUG() to ocfs2_error() in __ocfs2_move_extent()
In '__ocfs2_move_extent()', relax 'BUG()' to 'ocfs2_error()' just
to avoid crashing the whole kernel due to a filesystem corruption. |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: Check skb->transport_header is set in bpf_skb_check_mtu
The bpf_skb_check_mtu helper needs to use skb->transport_header when
the BPF_MTU_CHK_SEGS flag is used:
bpf_skb_check_mtu(skb, ifindex, &mtu_len, 0, BPF_MTU_CHK_SEGS)
The transport_header is not always set. There is a WARN_ON_ONCE
report when CONFIG_DEBUG_NET is enabled + skb->gso_size is set +
bpf_prog_test_run is used:
WARNING: CPU: 1 PID: 2216 at ./include/linux/skbuff.h:3071
skb_gso_validate_network_len
bpf_skb_check_mtu
bpf_prog_3920e25740a41171_tc_chk_segs_flag # A test in the next patch
bpf_test_run
bpf_prog_test_run_skb
For a normal ingress skb (not test_run), skb_reset_transport_header
is performed but there is plan to avoid setting it as described in
commit 2170a1f09148 ("net: no longer reset transport_header in __netif_receive_skb_core()").
This patch fixes the bpf helper by checking
skb_transport_header_was_set(). The check is done just before
skb->transport_header is used, to avoid breaking the existing bpf prog.
The WARN_ON_ONCE is limited to bpf_prog_test_run, so targeting bpf-next. |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: rtl818x: rtl8187: Fix potential buffer underflow in rtl8187_rx_cb()
The rtl8187_rx_cb() calculates the rx descriptor header address
by subtracting its size from the skb tail pointer.
However, it does not validate if the received packet
(skb->len from urb->actual_length) is large enough to contain this
header.
If a truncated packet is received, this will lead to a buffer
underflow, reading memory before the start of the skb data area,
and causing a kernel panic.
Add length checks for both rtl8187 and rtl8187b descriptor headers
before attempting to access them, dropping the packet cleanly if the
check fails. |
| In the Linux kernel, the following vulnerability has been resolved:
erofs: limit the level of fs stacking for file-backed mounts
Otherwise, it could cause potential kernel stack overflow (e.g., EROFS
mounting itself). |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: mt76: wed: use proper wed reference in mt76 wed driver callabacks
MT7996 driver can use both wed and wed_hif2 devices to offload traffic
from/to the wireless NIC. In the current codebase we assume to always
use the primary wed device in wed callbacks resulting in the following
crash if the hw runs wed_hif2 (e.g. 6GHz link).
[ 297.455876] Unable to handle kernel read from unreadable memory at virtual address 000000000000080a
[ 297.464928] Mem abort info:
[ 297.467722] ESR = 0x0000000096000005
[ 297.471461] EC = 0x25: DABT (current EL), IL = 32 bits
[ 297.476766] SET = 0, FnV = 0
[ 297.479809] EA = 0, S1PTW = 0
[ 297.482940] FSC = 0x05: level 1 translation fault
[ 297.487809] Data abort info:
[ 297.490679] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[ 297.496156] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 297.501196] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 297.506500] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000107480000
[ 297.512927] [000000000000080a] pgd=08000001097fb003, p4d=08000001097fb003, pud=08000001097fb003, pmd=0000000000000000
[ 297.523532] Internal error: Oops: 0000000096000005 [#1] SMP
[ 297.715393] CPU: 2 UID: 0 PID: 45 Comm: kworker/u16:2 Tainted: G O 6.12.50 #0
[ 297.723908] Tainted: [O]=OOT_MODULE
[ 297.727384] Hardware name: Banana Pi BPI-R4 (2x SFP+) (DT)
[ 297.732857] Workqueue: nf_ft_offload_del nf_flow_rule_route_ipv6 [nf_flow_table]
[ 297.740254] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 297.747205] pc : mt76_wed_offload_disable+0x64/0xa0 [mt76]
[ 297.752688] lr : mtk_wed_flow_remove+0x58/0x80
[ 297.757126] sp : ffffffc080fe3ae0
[ 297.760430] x29: ffffffc080fe3ae0 x28: ffffffc080fe3be0 x27: 00000000deadbef7
[ 297.767557] x26: ffffff80c5ebca00 x25: 0000000000000001 x24: ffffff80c85f4c00
[ 297.774683] x23: ffffff80c1875b78 x22: ffffffc080d42cd0 x21: ffffffc080660018
[ 297.781809] x20: ffffff80c6a076d0 x19: ffffff80c6a043c8 x18: 0000000000000000
[ 297.788935] x17: 0000000000000000 x16: 0000000000000001 x15: 0000000000000000
[ 297.796060] x14: 0000000000000019 x13: ffffff80c0ad8ec0 x12: 00000000fa83b2da
[ 297.803185] x11: ffffff80c02700c0 x10: ffffff80c0ad8ec0 x9 : ffffff81fef96200
[ 297.810311] x8 : ffffff80c02700c0 x7 : ffffff80c02700d0 x6 : 0000000000000002
[ 297.817435] x5 : 0000000000000400 x4 : 0000000000000000 x3 : 0000000000000000
[ 297.824561] x2 : 0000000000000001 x1 : 0000000000000800 x0 : ffffff80c6a063c8
[ 297.831686] Call trace:
[ 297.834123] mt76_wed_offload_disable+0x64/0xa0 [mt76]
[ 297.839254] mtk_wed_flow_remove+0x58/0x80
[ 297.843342] mtk_flow_offload_cmd+0x434/0x574
[ 297.847689] mtk_wed_setup_tc_block_cb+0x30/0x40
[ 297.852295] nf_flow_offload_ipv6_hook+0x7f4/0x964 [nf_flow_table]
[ 297.858466] nf_flow_rule_route_ipv6+0x438/0x4a4 [nf_flow_table]
[ 297.864463] process_one_work+0x174/0x300
[ 297.868465] worker_thread+0x278/0x430
[ 297.872204] kthread+0xd8/0xdc
[ 297.875251] ret_from_fork+0x10/0x20
[ 297.878820] Code: 928b5ae0 8b000273 91400a60 f943fa61 (79401421)
[ 297.884901] ---[ end trace 0000000000000000 ]---
Fix the issue detecting the proper wed reference to use running wed
callabacks. |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix double free of qgroup record after failure to add delayed ref head
In the previous code it was possible to incur into a double kfree()
scenario when calling add_delayed_ref_head(). This could happen if the
record was reported to already exist in the
btrfs_qgroup_trace_extent_nolock() call, but then there was an error
later on add_delayed_ref_head(). In this case, since
add_delayed_ref_head() returned an error, the caller went to free the
record. Since add_delayed_ref_head() couldn't set this kfree'd pointer
to NULL, then kfree() would have acted on a non-NULL 'record' object
which was pointing to memory already freed by the callee.
The problem comes from the fact that the responsibility to kfree the
object is on both the caller and the callee at the same time. Hence, the
fix for this is to shift the ownership of the 'qrecord' object out of
the add_delayed_ref_head(). That is, we will never attempt to kfree()
the given object inside of this function, and will expect the caller to
act on the 'qrecord' object on its own. The only exception where the
'qrecord' object cannot be kfree'd is if it was inserted into the
tracing logic, for which we already have the 'qrecord_inserted_ret'
boolean to account for this. Hence, the caller has to kfree the object
only if add_delayed_ref_head() reports not to have inserted it on the
tracing logic.
As a side-effect of the above, we must guarantee that
'qrecord_inserted_ret' is properly initialized at the start of the
function, not at the end, and then set when an actual insert
happens. This way we avoid 'qrecord_inserted_ret' having an invalid
value on an early exit.
The documentation from the add_delayed_ref_head() has also been updated
to reflect on the exact ownership of the 'qrecord' object. |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix racy bitfield write in btrfs_clear_space_info_full()
From the memory-barriers.txt document regarding memory barrier ordering
guarantees:
(*) These guarantees do not apply to bitfields, because compilers often
generate code to modify these using non-atomic read-modify-write
sequences. Do not attempt to use bitfields to synchronize parallel
algorithms.
(*) Even in cases where bitfields are protected by locks, all fields
in a given bitfield must be protected by one lock. If two fields
in a given bitfield are protected by different locks, the compiler's
non-atomic read-modify-write sequences can cause an update to one
field to corrupt the value of an adjacent field.
btrfs_space_info has a bitfield sharing an underlying word consisting of
the fields full, chunk_alloc, and flush:
struct btrfs_space_info {
struct btrfs_fs_info * fs_info; /* 0 8 */
struct btrfs_space_info * parent; /* 8 8 */
...
int clamp; /* 172 4 */
unsigned int full:1; /* 176: 0 4 */
unsigned int chunk_alloc:1; /* 176: 1 4 */
unsigned int flush:1; /* 176: 2 4 */
...
Therefore, to be safe from parallel read-modify-writes losing a write to
one of the bitfield members protected by a lock, all writes to all the
bitfields must use the lock. They almost universally do, except for
btrfs_clear_space_info_full() which iterates over the space_infos and
writes out found->full = 0 without a lock.
Imagine that we have one thread completing a transaction in which we
finished deleting a block_group and are thus calling
btrfs_clear_space_info_full() while simultaneously the data reclaim
ticket infrastructure is running do_async_reclaim_data_space():
T1 T2
btrfs_commit_transaction
btrfs_clear_space_info_full
data_sinfo->full = 0
READ: full:0, chunk_alloc:0, flush:1
do_async_reclaim_data_space(data_sinfo)
spin_lock(&space_info->lock);
if(list_empty(tickets))
space_info->flush = 0;
READ: full: 0, chunk_alloc:0, flush:1
MOD/WRITE: full: 0, chunk_alloc:0, flush:0
spin_unlock(&space_info->lock);
return;
MOD/WRITE: full:0, chunk_alloc:0, flush:1
and now data_sinfo->flush is 1 but the reclaim worker has exited. This
breaks the invariant that flush is 0 iff there is no work queued or
running. Once this invariant is violated, future allocations that go
into __reserve_bytes() will add tickets to space_info->tickets but will
see space_info->flush is set to 1 and not queue the work. After this,
they will block forever on the resulting ticket, as it is now impossible
to kick the worker again.
I also confirmed by looking at the assembly of the affected kernel that
it is doing RMW operations. For example, to set the flush (3rd) bit to 0,
the assembly is:
andb $0xfb,0x60(%rbx)
and similarly for setting the full (1st) bit to 0:
andb $0xfe,-0x20(%rax)
So I think this is really a bug on practical systems. I have observed
a number of systems in this exact state, but am currently unable to
reproduce it.
Rather than leaving this footgun lying around for the future, take
advantage of the fact that there is room in the struct anyway, and that
it is already quite large and simply change the three bitfield members to
bools. This avoids writes to space_info->full having any effect on
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
iomap: allocate s_dio_done_wq for async reads as well
Since commit 222f2c7c6d14 ("iomap: always run error completions in user
context"), read error completions are deferred to s_dio_done_wq. This
means the workqueue also needs to be allocated for async reads. |
| In the Linux kernel, the following vulnerability has been resolved:
gfs2: Prevent recursive memory reclaim
Function new_inode() returns a new inode with inode->i_mapping->gfp_mask
set to GFP_HIGHUSER_MOVABLE. This value includes the __GFP_FS flag, so
allocations in that address space can recurse into filesystem memory
reclaim. We don't want that to happen because it can consume a
significant amount of stack memory.
Worse than that is that it can also deadlock: for example, in several
places, gfs2_unstuff_dinode() is called inside filesystem transactions.
This calls filemap_grab_folio(), which can allocate a new folio, which
can trigger memory reclaim. If memory reclaim recurses into the
filesystem and starts another transaction, a deadlock will ensue.
To fix these kinds of problems, prevent memory reclaim from recursing
into filesystem code by making sure that the gfp_mask of inode address
spaces doesn't include __GFP_FS.
The "meta" and resource group address spaces were already using GFP_NOFS
as their gfp_mask (which doesn't include __GFP_FS). The default value
of GFP_HIGHUSER_MOVABLE is less restrictive than GFP_NOFS, though. To
avoid being overly limiting, use the default value and only knock off
the __GFP_FS flag. I'm not sure if this will actually make a
difference, but it also shouldn't hurt.
This patch is loosely based on commit ad22c7a043c2 ("xfs: prevent stack
overflows from page cache allocation").
Fixes xfstest generic/273. |
| In the Linux kernel, the following vulnerability has been resolved:
bpf: Fix exclusive map memory leak
When excl_prog_hash is 0 and excl_prog_hash_size is non-zero, the map also
needs to be freed. Otherwise, the map memory will not be reclaimed, just
like the memory leak problem reported by syzbot [1].
syzbot reported:
BUG: memory leak
backtrace (crc 7b9fb9b4):
map_create+0x322/0x11e0 kernel/bpf/syscall.c:1512
__sys_bpf+0x3556/0x3610 kernel/bpf/syscall.c:6131 |
| In the Linux kernel, the following vulnerability has been resolved:
regulator: core: Protect regulator_supply_alias_list with regulator_list_mutex
regulator_supply_alias_list was accessed without any locking in
regulator_supply_alias(), regulator_register_supply_alias(), and
regulator_unregister_supply_alias(). Concurrent registration,
unregistration and lookups can race, leading to:
1 use-after-free if an alias entry is removed while being read,
2 duplicate entries when two threads register the same alias,
3 inconsistent alias mappings observed by consumers.
Protect all traversals, insertions and deletions on
regulator_supply_alias_list with the existing regulator_list_mutex. |