| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
bus: fsl-mc: fix use-after-free in driver_override_show()
The driver_override_show() function reads the driver_override string
without holding the device_lock. However, driver_override_store() uses
driver_set_override(), which modifies and frees the string while holding
the device_lock.
This can result in a concurrent use-after-free if the string is freed
by the store function while being read by the show function.
Fix this by holding the device_lock around the read operation. |
| In the Linux kernel, the following vulnerability has been resolved:
ksmbd: fix infinite loop caused by next_smb2_rcv_hdr_off reset in error paths
The problem occurs when a signed request fails smb2 signature verification
check. In __process_request(), if check_sign_req() returns an error,
set_smb2_rsp_status(work, STATUS_ACCESS_DENIED) is called.
set_smb2_rsp_status() set work->next_smb2_rcv_hdr_off as zero. By resetting
next_smb2_rcv_hdr_off to zero, the pointer to the next command in the chain
is lost. Consequently, is_chained_smb2_message() continues to point to
the same request header instead of advancing. If the header's NextCommand
field is non-zero, the function returns true, causing __handle_ksmbd_work()
to repeatedly process the same failed request in an infinite loop.
This results in the kernel log being flooded with "bad smb2 signature"
messages and high CPU usage.
This patch fixes the issue by changing the return value from
SERVER_HANDLER_CONTINUE to SERVER_HANDLER_ABORT. This ensures that
the processing loop terminates immediately rather than attempting to
continue from an invalidated offset. |
| In the Linux kernel, the following vulnerability has been resolved:
mptcp: fix race in mptcp_pm_nl_flush_addrs_doit()
syzbot and Eulgyu Kim reported crashes in mptcp_pm_nl_get_local_id()
and/or mptcp_pm_nl_is_backup()
Root cause is list_splice_init() in mptcp_pm_nl_flush_addrs_doit()
which is not RCU ready.
list_splice_init_rcu() can not be called here while holding pernet->lock
spinlock.
Many thanks to Eulgyu Kim for providing a repro and testing our patches. |
| In the Linux kernel, the following vulnerability has been resolved:
mm/hugetlb: fix hugetlb_pmd_shared()
Patch series "mm/hugetlb: fixes for PMD table sharing (incl. using
mmu_gather)", v3.
One functional fix, one performance regression fix, and two related
comment fixes.
I cleaned up my prototype I recently shared [1] for the performance fix,
deferring most of the cleanups I had in the prototype to a later point.
While doing that I identified the other things.
The goal of this patch set is to be backported to stable trees "fairly"
easily. At least patch #1 and #4.
Patch #1 fixes hugetlb_pmd_shared() not detecting any sharing
Patch #2 + #3 are simple comment fixes that patch #4 interacts with.
Patch #4 is a fix for the reported performance regression due to excessive
IPI broadcasts during fork()+exit().
The last patch is all about TLB flushes, IPIs and mmu_gather.
Read: complicated
There are plenty of cleanups in the future to be had + one reasonable
optimization on x86. But that's all out of scope for this series.
Runtime tested, with a focus on fixing the performance regression using
the original reproducer [2] on x86.
This patch (of 4):
We switched from (wrongly) using the page count to an independent shared
count. Now, shared page tables have a refcount of 1 (excluding
speculative references) and instead use ptdesc->pt_share_count to identify
sharing.
We didn't convert hugetlb_pmd_shared(), so right now, we would never
detect a shared PMD table as such, because sharing/unsharing no longer
touches the refcount of a PMD table.
Page migration, like mbind() or migrate_pages() would allow for migrating
folios mapped into such shared PMD tables, even though the folios are not
exclusive. In smaps we would account them as "private" although they are
"shared", and we would be wrongly setting the PM_MMAP_EXCLUSIVE in the
pagemap interface.
Fix it by properly using ptdesc_pmd_is_shared() in hugetlb_pmd_shared(). |
| This CVE ID has been rejected or withdrawn by its CVE Numbering Authority. |
| This CVE ID has been rejected or withdrawn by its CVE Numbering Authority. |
| SPIP before 4.3.6, 4.2.17, and 4.1.20 allows unauthorized content disclosure in the private area. The application does not properly check authorization when displaying content of articles and sections (rubriques) in AJAX-loaded fragments, allowing an authenticated attacker to access restricted content. This vulnerability is not mitigated by the SPIP security screen. |
| SPIP before 4.3.6, 4.2.17, and 4.1.20 allows Cross-Site Scripting (XSS) in the private area. The content of the error message displayed by the 'transmettre' API is not properly sanitized, allowing an attacker to inject malicious scripts. This vulnerability is mitigated by the SPIP security screen. |
| In the Linux kernel, the following vulnerability has been resolved:
nilfs2: Fix potential block overflow that cause system hang
When a user executes the FITRIM command, an underflow can occur when
calculating nblocks if end_block is too small. Since nblocks is of
type sector_t, which is u64, a negative nblocks value will become a
very large positive integer. This ultimately leads to the block layer
function __blkdev_issue_discard() taking an excessively long time to
process the bio chain, and the ns_segctor_sem lock remains held for a
long period. This prevents other tasks from acquiring the ns_segctor_sem
lock, resulting in the hang reported by syzbot in [1].
If the ending block is too small, typically if it is smaller than 4KiB
range, depending on the usage of the segment 0, it may be possible to
attempt a discard request beyond the device size causing the hang.
Exiting successfully and assign the discarded size (0 in this case)
to range->len.
Although the start and len values in the user input range are too small,
a conservative strategy is adopted here to safely ignore them, which is
equivalent to a no-op; it will not perform any trimming and will not
throw an error.
[1]
task:segctord state:D stack:28968 pid:6093 tgid:6093 ppid:2 task_flags:0x200040 flags:0x00080000
Call Trace:
rwbase_write_lock+0x3dd/0x750 kernel/locking/rwbase_rt.c:272
nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357
nilfs_segctor_thread_construct fs/nilfs2/segment.c:2569 [inline]
nilfs_segctor_thread+0x6ec/0xe00 fs/nilfs2/segment.c:2684
[ryusuke: corrected part of the commit message about the consequences] |
| In the Linux kernel, the following vulnerability has been resolved:
scsi: qla2xxx: Validate sp before freeing associated memory
System crash with the following signature
[154563.214890] nvme nvme2: NVME-FC{1}: controller connect complete
[154564.169363] qla2xxx [0000:b0:00.1]-3002:2: nvme: Sched: Set ZIO exchange threshold to 3.
[154564.169405] qla2xxx [0000:b0:00.1]-ffffff:2: SET ZIO Activity exchange threshold to 5.
[154565.539974] qla2xxx [0000:b0:00.1]-5013:2: RSCN database changed – 0078 0080 0000.
[154565.545744] qla2xxx [0000:b0:00.1]-5013:2: RSCN database changed – 0078 00a0 0000.
[154565.545857] qla2xxx [0000:b0:00.1]-11a2:2: FEC=enabled (data rate).
[154565.552760] qla2xxx [0000:b0:00.1]-11a2:2: FEC=enabled (data rate).
[154565.553079] BUG: kernel NULL pointer dereference, address: 00000000000000f8
[154565.553080] #PF: supervisor read access in kernel mode
[154565.553082] #PF: error_code(0x0000) - not-present page
[154565.553084] PGD 80000010488ab067 P4D 80000010488ab067 PUD 104978a067 PMD 0
[154565.553089] Oops: 0000 1 PREEMPT SMP PTI
[154565.553092] CPU: 10 PID: 858 Comm: qla2xxx_2_dpc Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.11.1.el9_5.x86_64 #1
[154565.553096] Hardware name: HPE Synergy 660 Gen10/Synergy 660 Gen10 Compute Module, BIOS I43 09/30/2024
[154565.553097] RIP: 0010:qla_fab_async_scan.part.0+0x40b/0x870 [qla2xxx]
[154565.553141] Code: 00 00 e8 58 a3 ec d4 49 89 e9 ba 12 20 00 00 4c 89 e6 49 c7 c0 00 ee a8 c0 48 c7 c1 66 c0 a9 c0 bf 00 80 00 10 e8 15 69 00 00 <4c> 8b 8d f8 00 00 00 4d 85 c9 74 35 49 8b 84 24 00 19 00 00 48 8b
[154565.553143] RSP: 0018:ffffb4dbc8aebdd0 EFLAGS: 00010286
[154565.553145] RAX: 0000000000000000 RBX: ffff8ec2cf0908d0 RCX: 0000000000000002
[154565.553147] RDX: 0000000000000000 RSI: ffffffffc0a9c896 RDI: ffffb4dbc8aebd47
[154565.553148] RBP: 0000000000000000 R08: ffffb4dbc8aebd45 R09: 0000000000ffff0a
[154565.553150] R10: 0000000000000000 R11: 000000000000000f R12: ffff8ec2cf0908d0
[154565.553151] R13: ffff8ec2cf090900 R14: 0000000000000102 R15: ffff8ec2cf084000
[154565.553152] FS: 0000000000000000(0000) GS:ffff8ed27f800000(0000) knlGS:0000000000000000
[154565.553154] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[154565.553155] CR2: 00000000000000f8 CR3: 000000113ae0a005 CR4: 00000000007706f0
[154565.553157] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[154565.553158] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[154565.553159] PKRU: 55555554
[154565.553160] Call Trace:
[154565.553162] <TASK>
[154565.553165] ? show_trace_log_lvl+0x1c4/0x2df
[154565.553172] ? show_trace_log_lvl+0x1c4/0x2df
[154565.553177] ? qla_fab_async_scan.part.0+0x40b/0x870 [qla2xxx]
[154565.553215] ? __die_body.cold+0x8/0xd
[154565.553218] ? page_fault_oops+0x134/0x170
[154565.553223] ? snprintf+0x49/0x70
[154565.553229] ? exc_page_fault+0x62/0x150
[154565.553238] ? asm_exc_page_fault+0x22/0x30
Check for sp being non NULL before freeing any associated memory |
| In the Linux kernel, the following vulnerability has been resolved:
scsi: qla2xxx: Delay module unload while fabric scan in progress
System crash seen during load/unload test in a loop.
[105954.384919] RBP: ffff914589838dc0 R08: 0000000000000000 R09: 0000000000000086
[105954.384920] R10: 000000000000000f R11: ffffa31240904be5 R12: ffff914605f868e0
[105954.384921] R13: ffff914605f86910 R14: 0000000000008010 R15: 00000000ddb7c000
[105954.384923] FS: 0000000000000000(0000) GS:ffff9163fec40000(0000) knlGS:0000000000000000
[105954.384925] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[105954.384926] CR2: 000055d31ce1d6a0 CR3: 0000000119f5e001 CR4: 0000000000770ee0
[105954.384928] PKRU: 55555554
[105954.384929] Call Trace:
[105954.384931] <IRQ>
[105954.384934] qla24xx_sp_unmap+0x1f3/0x2a0 [qla2xxx]
[105954.384962] ? qla_async_scan_sp_done+0x114/0x1f0 [qla2xxx]
[105954.384980] ? qla24xx_els_ct_entry+0x4de/0x760 [qla2xxx]
[105954.384999] ? __wake_up_common+0x80/0x190
[105954.385004] ? qla24xx_process_response_queue+0xc2/0xaa0 [qla2xxx]
[105954.385023] ? qla24xx_msix_rsp_q+0x44/0xb0 [qla2xxx]
[105954.385040] ? __handle_irq_event_percpu+0x3d/0x190
[105954.385044] ? handle_irq_event+0x58/0xb0
[105954.385046] ? handle_edge_irq+0x93/0x240
[105954.385050] ? __common_interrupt+0x41/0xa0
[105954.385055] ? common_interrupt+0x3e/0xa0
[105954.385060] ? asm_common_interrupt+0x22/0x40
The root cause of this was that there was a free (dma_free_attrs) in the
interrupt context. There was a device discovery/fabric scan in
progress. A module unload was issued which set the UNLOADING flag. As
part of the discovery, after receiving an interrupt a work queue was
scheduled (which involved a work to be queued). Since the UNLOADING
flag is set, the work item was not allocated and the mapped memory had
to be freed. The free occurred in interrupt context leading to system
crash. Delay the driver unload until the fabric scan is complete to
avoid the crash. |
| In the Linux kernel, the following vulnerability has been resolved:
PCI: endpoint: Avoid creating sub-groups asynchronously
The asynchronous creation of sub-groups by a delayed work could lead to a
NULL pointer dereference when the driver directory is removed before the
work completes.
The crash can be easily reproduced with the following commands:
# cd /sys/kernel/config/pci_ep/functions/pci_epf_test
# for i in {1..20}; do mkdir test && rmdir test; done
BUG: kernel NULL pointer dereference, address: 0000000000000088
...
Call Trace:
configfs_register_group+0x3d/0x190
pci_epf_cfs_work+0x41/0x110
process_one_work+0x18f/0x350
worker_thread+0x25a/0x3a0
Fix this issue by using configfs_add_default_group() API which does not
have the deadlock problem as configfs_register_group() and does not require
the delayed work handler.
[mani: slightly reworded the description and added stable list] |
| In the Linux kernel, the following vulnerability has been resolved:
scsi: qla2xxx: Free sp in error path to fix system crash
System crash seen during load/unload test in a loop,
[61110.449331] qla2xxx [0000:27:00.0]-0042:0: Disabled MSI-X.
[61110.467494] =============================================================================
[61110.467498] BUG qla2xxx_srbs (Tainted: G OE -------- --- ): Objects remaining in qla2xxx_srbs on __kmem_cache_shutdown()
[61110.467501] -----------------------------------------------------------------------------
[61110.467502] Slab 0x000000000ffc8162 objects=51 used=1 fp=0x00000000e25d3d85 flags=0x57ffffc0010200(slab|head|node=1|zone=2|lastcpupid=0x1fffff)
[61110.467509] CPU: 53 PID: 455206 Comm: rmmod Kdump: loaded Tainted: G OE -------- --- 5.14.0-284.11.1.el9_2.x86_64 #1
[61110.467513] Hardware name: HPE ProLiant DL385 Gen10 Plus v2/ProLiant DL385 Gen10 Plus v2, BIOS A42 08/17/2023
[61110.467515] Call Trace:
[61110.467516] <TASK>
[61110.467519] dump_stack_lvl+0x34/0x48
[61110.467526] slab_err.cold+0x53/0x67
[61110.467534] __kmem_cache_shutdown+0x16e/0x320
[61110.467540] kmem_cache_destroy+0x51/0x160
[61110.467544] qla2x00_module_exit+0x93/0x99 [qla2xxx]
[61110.467607] ? __do_sys_delete_module.constprop.0+0x178/0x280
[61110.467613] ? syscall_trace_enter.constprop.0+0x145/0x1d0
[61110.467616] ? do_syscall_64+0x5c/0x90
[61110.467619] ? exc_page_fault+0x62/0x150
[61110.467622] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[61110.467626] </TASK>
[61110.467627] Disabling lock debugging due to kernel taint
[61110.467635] Object 0x0000000026f7e6e6 @offset=16000
[61110.467639] ------------[ cut here ]------------
[61110.467639] kmem_cache_destroy qla2xxx_srbs: Slab cache still has objects when called from qla2x00_module_exit+0x93/0x99 [qla2xxx]
[61110.467659] WARNING: CPU: 53 PID: 455206 at mm/slab_common.c:520 kmem_cache_destroy+0x14d/0x160
[61110.467718] CPU: 53 PID: 455206 Comm: rmmod Kdump: loaded Tainted: G B OE -------- --- 5.14.0-284.11.1.el9_2.x86_64 #1
[61110.467720] Hardware name: HPE ProLiant DL385 Gen10 Plus v2/ProLiant DL385 Gen10 Plus v2, BIOS A42 08/17/2023
[61110.467721] RIP: 0010:kmem_cache_destroy+0x14d/0x160
[61110.467724] Code: 99 7d 07 00 48 89 ef e8 e1 6a 07 00 eb b3 48 8b 55 60 48 8b 4c 24 20 48 c7 c6 70 fc 66 90 48 c7 c7 f8 ef a1 90 e8 e1 ed 7c 00 <0f> 0b eb 93 c3 cc cc cc cc 66 2e 0f 1f 84 00 00 00 00 00 55 48 89
[61110.467725] RSP: 0018:ffffa304e489fe80 EFLAGS: 00010282
[61110.467727] RAX: 0000000000000000 RBX: ffffffffc0d9a860 RCX: 0000000000000027
[61110.467729] RDX: ffff8fd5ff9598a8 RSI: 0000000000000001 RDI: ffff8fd5ff9598a0
[61110.467730] RBP: ffff8fb6aaf78700 R08: 0000000000000000 R09: 0000000100d863b7
[61110.467731] R10: ffffa304e489fd20 R11: ffffffff913bef48 R12: 0000000040002000
[61110.467731] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[61110.467733] FS: 00007f64c89fb740(0000) GS:ffff8fd5ff940000(0000) knlGS:0000000000000000
[61110.467734] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[61110.467735] CR2: 00007f0f02bfe000 CR3: 00000020ad6dc005 CR4: 0000000000770ee0
[61110.467736] PKRU: 55555554
[61110.467737] Call Trace:
[61110.467738] <TASK>
[61110.467739] qla2x00_module_exit+0x93/0x99 [qla2xxx]
[61110.467755] ? __do_sys_delete_module.constprop.0+0x178/0x280
Free sp in the error path to fix the crash. |
| In the Linux kernel, the following vulnerability has been resolved:
mptcp: ensure context reset on disconnect()
After the blamed commit below, if the MPC subflow is already in TCP_CLOSE
status or has fallback to TCP at mptcp_disconnect() time,
mptcp_do_fastclose() skips setting the `send_fastclose flag` and the later
__mptcp_close_ssk() does not reset anymore the related subflow context.
Any later connection will be created with both the `request_mptcp` flag
and the msk-level fallback status off (it is unconditionally cleared at
MPTCP disconnect time), leading to a warning in subflow_data_ready():
WARNING: CPU: 26 PID: 8996 at net/mptcp/subflow.c:1519 subflow_data_ready (net/mptcp/subflow.c:1519 (discriminator 13))
Modules linked in:
CPU: 26 UID: 0 PID: 8996 Comm: syz.22.39 Not tainted 6.18.0-rc7-05427-g11fc074f6c36 #1 PREEMPT(voluntary)
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
RIP: 0010:subflow_data_ready (net/mptcp/subflow.c:1519 (discriminator 13))
Code: 90 0f 0b 90 90 e9 04 fe ff ff e8 b7 1e f5 fe 89 ee bf 07 00 00 00 e8 db 19 f5 fe 83 fd 07 0f 84 35 ff ff ff e8 9d 1e f5 fe 90 <0f> 0b 90 e9 27 ff ff ff e8 8f 1e f5 fe 4c 89 e7 48 89 de e8 14 09
RSP: 0018:ffffc9002646fb30 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff88813b218000 RCX: ffffffff825c8435
RDX: ffff8881300b3580 RSI: ffffffff825c8443 RDI: 0000000000000005
RBP: 000000000000000b R08: ffffffff825c8435 R09: 000000000000000b
R10: 0000000000000005 R11: 0000000000000007 R12: ffff888131ac0000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 00007f88330af6c0(0000) GS:ffff888a93dd2000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f88330aefe8 CR3: 000000010ff59000 CR4: 0000000000350ef0
Call Trace:
<TASK>
tcp_data_ready (net/ipv4/tcp_input.c:5356)
tcp_data_queue (net/ipv4/tcp_input.c:5445)
tcp_rcv_state_process (net/ipv4/tcp_input.c:7165)
tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1955)
__release_sock (include/net/sock.h:1158 (discriminator 6) net/core/sock.c:3180 (discriminator 6))
release_sock (net/core/sock.c:3737)
mptcp_sendmsg (net/mptcp/protocol.c:1763 net/mptcp/protocol.c:1857)
inet_sendmsg (net/ipv4/af_inet.c:853 (discriminator 7))
__sys_sendto (net/socket.c:727 (discriminator 15) net/socket.c:742 (discriminator 15) net/socket.c:2244 (discriminator 15))
__x64_sys_sendto (net/socket.c:2247)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
RIP: 0033:0x7f883326702d
Address the issue setting an explicit `fastclosing` flag at fastclose
time, and checking such flag after mptcp_do_fastclose(). |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix racy bitfield write in btrfs_clear_space_info_full()
From the memory-barriers.txt document regarding memory barrier ordering
guarantees:
(*) These guarantees do not apply to bitfields, because compilers often
generate code to modify these using non-atomic read-modify-write
sequences. Do not attempt to use bitfields to synchronize parallel
algorithms.
(*) Even in cases where bitfields are protected by locks, all fields
in a given bitfield must be protected by one lock. If two fields
in a given bitfield are protected by different locks, the compiler's
non-atomic read-modify-write sequences can cause an update to one
field to corrupt the value of an adjacent field.
btrfs_space_info has a bitfield sharing an underlying word consisting of
the fields full, chunk_alloc, and flush:
struct btrfs_space_info {
struct btrfs_fs_info * fs_info; /* 0 8 */
struct btrfs_space_info * parent; /* 8 8 */
...
int clamp; /* 172 4 */
unsigned int full:1; /* 176: 0 4 */
unsigned int chunk_alloc:1; /* 176: 1 4 */
unsigned int flush:1; /* 176: 2 4 */
...
Therefore, to be safe from parallel read-modify-writes losing a write to
one of the bitfield members protected by a lock, all writes to all the
bitfields must use the lock. They almost universally do, except for
btrfs_clear_space_info_full() which iterates over the space_infos and
writes out found->full = 0 without a lock.
Imagine that we have one thread completing a transaction in which we
finished deleting a block_group and are thus calling
btrfs_clear_space_info_full() while simultaneously the data reclaim
ticket infrastructure is running do_async_reclaim_data_space():
T1 T2
btrfs_commit_transaction
btrfs_clear_space_info_full
data_sinfo->full = 0
READ: full:0, chunk_alloc:0, flush:1
do_async_reclaim_data_space(data_sinfo)
spin_lock(&space_info->lock);
if(list_empty(tickets))
space_info->flush = 0;
READ: full: 0, chunk_alloc:0, flush:1
MOD/WRITE: full: 0, chunk_alloc:0, flush:0
spin_unlock(&space_info->lock);
return;
MOD/WRITE: full:0, chunk_alloc:0, flush:1
and now data_sinfo->flush is 1 but the reclaim worker has exited. This
breaks the invariant that flush is 0 iff there is no work queued or
running. Once this invariant is violated, future allocations that go
into __reserve_bytes() will add tickets to space_info->tickets but will
see space_info->flush is set to 1 and not queue the work. After this,
they will block forever on the resulting ticket, as it is now impossible
to kick the worker again.
I also confirmed by looking at the assembly of the affected kernel that
it is doing RMW operations. For example, to set the flush (3rd) bit to 0,
the assembly is:
andb $0xfb,0x60(%rbx)
and similarly for setting the full (1st) bit to 0:
andb $0xfe,-0x20(%rax)
So I think this is really a bug on practical systems. I have observed
a number of systems in this exact state, but am currently unable to
reproduce it.
Rather than leaving this footgun lying around for the future, take
advantage of the fact that there is room in the struct anyway, and that
it is already quite large and simply change the three bitfield members to
bools. This avoids writes to space_info->full having any effect on
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
devlink: rate: Unset parent pointer in devl_rate_nodes_destroy
The function devl_rate_nodes_destroy is documented to "Unset parent for
all rate objects". However, it was only calling the driver-specific
`rate_leaf_parent_set` or `rate_node_parent_set` ops and decrementing
the parent's refcount, without actually setting the
`devlink_rate->parent` pointer to NULL.
This leaves a dangling pointer in the `devlink_rate` struct, which cause
refcount error in netdevsim[1] and mlx5[2]. In addition, this is
inconsistent with the behavior of `devlink_nl_rate_parent_node_set`,
where the parent pointer is correctly cleared.
This patch fixes the issue by explicitly setting `devlink_rate->parent`
to NULL after notifying the driver, thus fulfilling the function's
documented behavior for all rate objects.
[1]
repro steps:
echo 1 > /sys/bus/netdevsim/new_device
devlink dev eswitch set netdevsim/netdevsim1 mode switchdev
echo 1 > /sys/bus/netdevsim/devices/netdevsim1/sriov_numvfs
devlink port function rate add netdevsim/netdevsim1/test_node
devlink port function rate set netdevsim/netdevsim1/128 parent test_node
echo 1 > /sys/bus/netdevsim/del_device
dmesg:
refcount_t: decrement hit 0; leaking memory.
WARNING: CPU: 8 PID: 1530 at lib/refcount.c:31 refcount_warn_saturate+0x42/0xe0
CPU: 8 UID: 0 PID: 1530 Comm: bash Not tainted 6.18.0-rc4+ #1 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:refcount_warn_saturate+0x42/0xe0
Call Trace:
<TASK>
devl_rate_leaf_destroy+0x8d/0x90
__nsim_dev_port_del+0x6c/0x70 [netdevsim]
nsim_dev_reload_destroy+0x11c/0x140 [netdevsim]
nsim_drv_remove+0x2b/0xb0 [netdevsim]
device_release_driver_internal+0x194/0x1f0
bus_remove_device+0xc6/0x130
device_del+0x159/0x3c0
device_unregister+0x1a/0x60
del_device_store+0x111/0x170 [netdevsim]
kernfs_fop_write_iter+0x12e/0x1e0
vfs_write+0x215/0x3d0
ksys_write+0x5f/0xd0
do_syscall_64+0x55/0x10f0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
[2]
devlink dev eswitch set pci/0000:08:00.0 mode switchdev
devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 1000
devlink port function rate add pci/0000:08:00.0/group1
devlink port function rate set pci/0000:08:00.0/32768 parent group1
modprobe -r mlx5_ib mlx5_fwctl mlx5_core
dmesg:
refcount_t: decrement hit 0; leaking memory.
WARNING: CPU: 7 PID: 16151 at lib/refcount.c:31 refcount_warn_saturate+0x42/0xe0
CPU: 7 UID: 0 PID: 16151 Comm: bash Not tainted 6.17.0-rc7_for_upstream_min_debug_2025_10_02_12_44 #1 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
RIP: 0010:refcount_warn_saturate+0x42/0xe0
Call Trace:
<TASK>
devl_rate_leaf_destroy+0x8d/0x90
mlx5_esw_offloads_devlink_port_unregister+0x33/0x60 [mlx5_core]
mlx5_esw_offloads_unload_rep+0x3f/0x50 [mlx5_core]
mlx5_eswitch_unload_sf_vport+0x40/0x90 [mlx5_core]
mlx5_sf_esw_event+0xc4/0x120 [mlx5_core]
notifier_call_chain+0x33/0xa0
blocking_notifier_call_chain+0x3b/0x50
mlx5_eswitch_disable_locked+0x50/0x110 [mlx5_core]
mlx5_eswitch_disable+0x63/0x90 [mlx5_core]
mlx5_unload+0x1d/0x170 [mlx5_core]
mlx5_uninit_one+0xa2/0x130 [mlx5_core]
remove_one+0x78/0xd0 [mlx5_core]
pci_device_remove+0x39/0xa0
device_release_driver_internal+0x194/0x1f0
unbind_store+0x99/0xa0
kernfs_fop_write_iter+0x12e/0x1e0
vfs_write+0x215/0x3d0
ksys_write+0x5f/0xd0
do_syscall_64+0x53/0x1f0
entry_SYSCALL_64_after_hwframe+0x4b/0x53 |
| In the Linux kernel, the following vulnerability has been resolved:
ACPI: APEI: send SIGBUS to current task if synchronous memory error not recovered
If a synchronous error is detected as a result of user-space process
triggering a 2-bit uncorrected error, the CPU will take a synchronous
error exception such as Synchronous External Abort (SEA) on Arm64. The
kernel will queue a memory_failure() work which poisons the related
page, unmaps the page, and then sends a SIGBUS to the process, so that
a system wide panic can be avoided.
However, no memory_failure() work will be queued when abnormal
synchronous errors occur. These errors can include situations like
invalid PA, unexpected severity, no memory failure config support,
invalid GUID section, etc. In such a case, the user-space process will
trigger SEA again. This loop can potentially exceed the platform
firmware threshold or even trigger a kernel hard lockup, leading to a
system reboot.
Fix it by performing a force kill if no memory_failure() work is queued
for synchronous errors.
[ rjw: Changelog edits ] |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: cfg80211: Add missing lock in cfg80211_check_and_end_cac()
Callers of wdev_chandef() must hold the wiphy mutex.
But the worker cfg80211_propagate_cac_done_wk() never takes the lock.
Which triggers the warning below with the mesh_peer_connected_dfs
test from hostapd and not (yet) released mac80211 code changes:
WARNING: CPU: 0 PID: 495 at net/wireless/chan.c:1552 wdev_chandef+0x60/0x165
Modules linked in:
CPU: 0 UID: 0 PID: 495 Comm: kworker/u4:2 Not tainted 6.14.0-rc5-wt-g03960e6f9d47 #33 13c287eeabfe1efea01c0bcc863723ab082e17cf
Workqueue: cfg80211 cfg80211_propagate_cac_done_wk
Stack:
00000000 00000001 ffffff00 6093267c
00000000 6002ec30 6d577c50 60037608
00000000 67e8d108 6063717b 00000000
Call Trace:
[<6002ec30>] ? _printk+0x0/0x98
[<6003c2b3>] show_stack+0x10e/0x11a
[<6002ec30>] ? _printk+0x0/0x98
[<60037608>] dump_stack_lvl+0x71/0xb8
[<6063717b>] ? wdev_chandef+0x60/0x165
[<6003766d>] dump_stack+0x1e/0x20
[<6005d1b7>] __warn+0x101/0x20f
[<6005d3a8>] warn_slowpath_fmt+0xe3/0x15d
[<600b0c5c>] ? mark_lock.part.0+0x0/0x4ec
[<60751191>] ? __this_cpu_preempt_check+0x0/0x16
[<600b11a2>] ? mark_held_locks+0x5a/0x6e
[<6005d2c5>] ? warn_slowpath_fmt+0x0/0x15d
[<60052e53>] ? unblock_signals+0x3a/0xe7
[<60052f2d>] ? um_set_signals+0x2d/0x43
[<60751191>] ? __this_cpu_preempt_check+0x0/0x16
[<607508b2>] ? lock_is_held_type+0x207/0x21f
[<6063717b>] wdev_chandef+0x60/0x165
[<605f89b4>] regulatory_propagate_dfs_state+0x247/0x43f
[<60052f00>] ? um_set_signals+0x0/0x43
[<605e6bfd>] cfg80211_propagate_cac_done_wk+0x3a/0x4a
[<6007e460>] process_scheduled_works+0x3bc/0x60e
[<6007d0ec>] ? move_linked_works+0x4d/0x81
[<6007d120>] ? assign_work+0x0/0xaa
[<6007f81f>] worker_thread+0x220/0x2dc
[<600786ef>] ? set_pf_worker+0x0/0x57
[<60087c96>] ? to_kthread+0x0/0x43
[<6008ab3c>] kthread+0x2d3/0x2e2
[<6007f5ff>] ? worker_thread+0x0/0x2dc
[<6006c05b>] ? calculate_sigpending+0x0/0x56
[<6003b37d>] new_thread_handler+0x4a/0x64
irq event stamp: 614611
hardirqs last enabled at (614621): [<00000000600bc96b>] __up_console_sem+0x82/0xaf
hardirqs last disabled at (614630): [<00000000600bc92c>] __up_console_sem+0x43/0xaf
softirqs last enabled at (614268): [<00000000606c55c6>] __ieee80211_wake_queue+0x933/0x985
softirqs last disabled at (614266): [<00000000606c52d6>] __ieee80211_wake_queue+0x643/0x985 |
| In the Linux kernel, the following vulnerability has been resolved:
xsk: Fix race condition in AF_XDP generic RX path
Move rx_lock from xsk_socket to xsk_buff_pool.
Fix synchronization for shared umem mode in
generic RX path where multiple sockets share
single xsk_buff_pool.
RX queue is exclusive to xsk_socket, while FILL
queue can be shared between multiple sockets.
This could result in race condition where two
CPU cores access RX path of two different sockets
sharing the same umem.
Protect both queues by acquiring spinlock in shared
xsk_buff_pool.
Lock contention may be minimized in the future by some
per-thread FQ buffering.
It's safe and necessary to move spin_lock_bh(rx_lock)
after xsk_rcv_check():
* xs->pool and spinlock_init is synchronized by
xsk_bind() -> xsk_is_bound() memory barriers.
* xsk_rcv_check() may return true at the moment
of xsk_release() or xsk_unbind_dev(),
however this will not cause any data races or
race conditions. xsk_unbind_dev() removes xdp
socket from all maps and waits for completion
of all outstanding rx operations. Packets in
RX path will either complete safely or drop. |
| In the Linux kernel, the following vulnerability has been resolved:
net: dsa: free routing table on probe failure
If complete = true in dsa_tree_setup(), it means that we are the last
switch of the tree which is successfully probing, and we should be
setting up all switches from our probe path.
After "complete" becomes true, dsa_tree_setup_cpu_ports() or any
subsequent function may fail. If that happens, the entire tree setup is
in limbo: the first N-1 switches have successfully finished probing
(doing nothing but having allocated persistent memory in the tree's
dst->ports, and maybe dst->rtable), and switch N failed to probe, ending
the tree setup process before anything is tangible from the user's PoV.
If switch N fails to probe, its memory (ports) will be freed and removed
from dst->ports. However, the dst->rtable elements pointing to its ports,
as created by dsa_link_touch(), will remain there, and will lead to
use-after-free if dereferenced.
If dsa_tree_setup_switches() returns -EPROBE_DEFER, which is entirely
possible because that is where ds->ops->setup() is, we get a kasan
report like this:
==================================================================
BUG: KASAN: slab-use-after-free in mv88e6xxx_setup_upstream_port+0x240/0x568
Read of size 8 at addr ffff000004f56020 by task kworker/u8:3/42
Call trace:
__asan_report_load8_noabort+0x20/0x30
mv88e6xxx_setup_upstream_port+0x240/0x568
mv88e6xxx_setup+0xebc/0x1eb0
dsa_register_switch+0x1af4/0x2ae0
mv88e6xxx_register_switch+0x1b8/0x2a8
mv88e6xxx_probe+0xc4c/0xf60
mdio_probe+0x78/0xb8
really_probe+0x2b8/0x5a8
__driver_probe_device+0x164/0x298
driver_probe_device+0x78/0x258
__device_attach_driver+0x274/0x350
Allocated by task 42:
__kasan_kmalloc+0x84/0xa0
__kmalloc_cache_noprof+0x298/0x490
dsa_switch_touch_ports+0x174/0x3d8
dsa_register_switch+0x800/0x2ae0
mv88e6xxx_register_switch+0x1b8/0x2a8
mv88e6xxx_probe+0xc4c/0xf60
mdio_probe+0x78/0xb8
really_probe+0x2b8/0x5a8
__driver_probe_device+0x164/0x298
driver_probe_device+0x78/0x258
__device_attach_driver+0x274/0x350
Freed by task 42:
__kasan_slab_free+0x48/0x68
kfree+0x138/0x418
dsa_register_switch+0x2694/0x2ae0
mv88e6xxx_register_switch+0x1b8/0x2a8
mv88e6xxx_probe+0xc4c/0xf60
mdio_probe+0x78/0xb8
really_probe+0x2b8/0x5a8
__driver_probe_device+0x164/0x298
driver_probe_device+0x78/0x258
__device_attach_driver+0x274/0x350
The simplest way to fix the bug is to delete the routing table in its
entirety. dsa_tree_setup_routing_table() has no problem in regenerating
it even if we deleted links between ports other than those of switch N,
because dsa_link_touch() first checks whether the port pair already
exists in dst->rtable, allocating if not.
The deletion of the routing table in its entirety already exists in
dsa_tree_teardown(), so refactor that into a function that can also be
called from the tree setup error path.
In my analysis of the commit to blame, it is the one which added
dsa_link elements to dst->rtable. Prior to that, each switch had its own
ds->rtable which is freed when the switch fails to probe. But the tree
is potentially persistent memory. |