| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
idpf: fix possible vport_config NULL pointer deref in remove
Attempting to remove the driver will cause a crash in cases where
the vport failed to initialize. Following trace is from an instance where
the driver failed during an attempt to create a VF:
[ 1661.543624] idpf 0000:84:00.7: Device HW Reset initiated
[ 1722.923726] idpf 0000:84:00.7: Transaction timed-out (op:1 cookie:2900 vc_op:1 salt:29 timeout:60000ms)
[ 1723.353263] BUG: kernel NULL pointer dereference, address: 0000000000000028
...
[ 1723.358472] RIP: 0010:idpf_remove+0x11c/0x200 [idpf]
...
[ 1723.364973] Call Trace:
[ 1723.365475] <TASK>
[ 1723.365972] pci_device_remove+0x42/0xb0
[ 1723.366481] device_release_driver_internal+0x1a9/0x210
[ 1723.366987] pci_stop_bus_device+0x6d/0x90
[ 1723.367488] pci_stop_and_remove_bus_device+0x12/0x20
[ 1723.367971] pci_iov_remove_virtfn+0xbd/0x120
[ 1723.368309] sriov_disable+0x34/0xe0
[ 1723.368643] idpf_sriov_configure+0x58/0x140 [idpf]
[ 1723.368982] sriov_numvfs_store+0xda/0x1c0
Avoid the NULL pointer dereference by adding NULL pointer check for
vport_config[i], before freeing user_config.q_coalesce. |
| In the Linux kernel, the following vulnerability has been resolved:
ice: fix PTP cleanup on driver removal in error path
Improve the cleanup on releasing PTP resources in error path.
The error case might happen either at the driver probe and PTP
feature initialization or on PTP restart (errors in reset handling, NVM
update etc). In both cases, calls to PF PTP cleanup (ice_ptp_cleanup_pf
function) and 'ps_lock' mutex deinitialization were missed.
Additionally, ptp clock was not unregistered in the latter case.
Keep PTP state as 'uninitialized' on init to distinguish between error
scenarios and to avoid resource release duplication at driver removal.
The consequence of missing ice_ptp_cleanup_pf call is the following call
trace dumped when ice_adapter object is freed (port list is not empty,
as it is required at this stage):
[ T93022] ------------[ cut here ]------------
[ T93022] WARNING: CPU: 10 PID: 93022 at
ice/ice_adapter.c:67 ice_adapter_put+0xef/0x100 [ice]
...
[ T93022] RIP: 0010:ice_adapter_put+0xef/0x100 [ice]
...
[ T93022] Call Trace:
[ T93022] <TASK>
[ T93022] ? ice_adapter_put+0xef/0x100 [ice
33d2647ad4f6d866d41eefff1806df37c68aef0c]
[ T93022] ? __warn.cold+0xb0/0x10e
[ T93022] ? ice_adapter_put+0xef/0x100 [ice
33d2647ad4f6d866d41eefff1806df37c68aef0c]
[ T93022] ? report_bug+0xd8/0x150
[ T93022] ? handle_bug+0xe9/0x110
[ T93022] ? exc_invalid_op+0x17/0x70
[ T93022] ? asm_exc_invalid_op+0x1a/0x20
[ T93022] ? ice_adapter_put+0xef/0x100 [ice
33d2647ad4f6d866d41eefff1806df37c68aef0c]
[ T93022] pci_device_remove+0x42/0xb0
[ T93022] device_release_driver_internal+0x19f/0x200
[ T93022] driver_detach+0x48/0x90
[ T93022] bus_remove_driver+0x70/0xf0
[ T93022] pci_unregister_driver+0x42/0xb0
[ T93022] ice_module_exit+0x10/0xdb0 [ice
33d2647ad4f6d866d41eefff1806df37c68aef0c]
...
[ T93022] ---[ end trace 0000000000000000 ]---
[ T93022] ice: module unloaded |
| In the Linux kernel, the following vulnerability has been resolved:
LoongArch: BPF: Disable trampoline for kernel module function trace
The current LoongArch BPF trampoline implementation is incompatible
with tracing functions in kernel modules. This causes several severe
and user-visible problems:
* The `bpf_selftests/module_attach` test fails consistently.
* Kernel lockup when a BPF program is attached to a module function [1].
* Critical kernel modules like WireGuard experience traffic disruption
when their functions are traced with fentry [2].
Given the severity and the potential for other unknown side-effects, it
is safest to disable the feature entirely for now. This patch prevents
the BPF subsystem from allowing trampoline attachments to kernel module
functions on LoongArch.
This is a temporary mitigation until the core issues in the trampoline
code for kernel module handling can be identified and fixed.
[root@fedora bpf]# ./test_progs -a module_attach -v
bpf_testmod.ko is already unloaded.
Loading bpf_testmod.ko...
Successfully loaded bpf_testmod.ko.
test_module_attach:PASS:skel_open 0 nsec
test_module_attach:PASS:set_attach_target 0 nsec
test_module_attach:PASS:set_attach_target_explicit 0 nsec
test_module_attach:PASS:skel_load 0 nsec
libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
test_module_attach:FAIL:skel_attach skeleton attach failed: -524
Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
Successfully unloaded bpf_testmod.ko.
[1]: https://lore.kernel.org/loongarch/CAK3+h2wDmpC-hP4u4pJY8T-yfKyk4yRzpu2LMO+C13FMT58oqQ@mail.gmail.com/
[2]: https://lore.kernel.org/loongarch/CAK3+h2wYcpc+OwdLDUBvg2rF9rvvyc5amfHT-KcFaK93uoELPg@mail.gmail.com/ |
| In the Linux kernel, the following vulnerability has been resolved:
Input: pegasus-notetaker - fix potential out-of-bounds access
In the pegasus_notetaker driver, the pegasus_probe() function allocates
the URB transfer buffer using the wMaxPacketSize value from
the endpoint descriptor. An attacker can use a malicious USB descriptor
to force the allocation of a very small buffer.
Subsequently, if the device sends an interrupt packet with a specific
pattern (e.g., where the first byte is 0x80 or 0x42),
the pegasus_parse_packet() function parses the packet without checking
the allocated buffer size. This leads to an out-of-bounds memory access. |
| In the Linux kernel, the following vulnerability has been resolved:
cifs: fix memory leak in smb3_fs_context_parse_param error path
Add proper cleanup of ctx->source and fc->source to the
cifs_parse_mount_err error handler. This ensures that memory allocated
for the source strings is correctly freed on all error paths, matching
the cleanup already performed in the success path by
smb3_cleanup_fs_context_contents().
Pointers are also set to NULL after freeing to prevent potential
double-free issues.
This change fixes a memory leak originally detected by syzbot. The
leak occurred when processing Opt_source mount options if an error
happened after ctx->source and fc->source were successfully
allocated but before the function completed.
The specific leak sequence was:
1. ctx->source = smb3_fs_context_fullpath(ctx, '/') allocates memory
2. fc->source = kstrdup(ctx->source, GFP_KERNEL) allocates more memory
3. A subsequent error jumps to cifs_parse_mount_err
4. The old error handler freed passwords but not the source strings,
causing the memory to leak.
This issue was not addressed by commit e8c73eb7db0a ("cifs: client:
fix memory leak in smb3_fs_context_parse_param"), which only fixed
leaks from repeated fsconfig() calls but not this error path.
Patch updated with minor change suggested by kernel test robot |
| In the Linux kernel, the following vulnerability has been resolved:
net: ethernet: ti: netcp: Standardize knav_dma_open_channel to return NULL on error
Make knav_dma_open_channel consistently return NULL on error instead
of ERR_PTR. Currently the header include/linux/soc/ti/knav_dma.h
returns NULL when the driver is disabled, but the driver
implementation does not even return NULL or ERR_PTR on failure,
causing inconsistency in the users. This results in a crash in
netcp_free_navigator_resources as followed (trimmed):
Unhandled fault: alignment exception (0x221) at 0xfffffff2
[fffffff2] *pgd=80000800207003, *pmd=82ffda003, *pte=00000000
Internal error: : 221 [#1] SMP ARM
Modules linked in:
CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.17.0-rc7 #1 NONE
Hardware name: Keystone
PC is at knav_dma_close_channel+0x30/0x19c
LR is at netcp_free_navigator_resources+0x2c/0x28c
[... TRIM...]
Call trace:
knav_dma_close_channel from netcp_free_navigator_resources+0x2c/0x28c
netcp_free_navigator_resources from netcp_ndo_open+0x430/0x46c
netcp_ndo_open from __dev_open+0x114/0x29c
__dev_open from __dev_change_flags+0x190/0x208
__dev_change_flags from netif_change_flags+0x1c/0x58
netif_change_flags from dev_change_flags+0x38/0xa0
dev_change_flags from ip_auto_config+0x2c4/0x11f0
ip_auto_config from do_one_initcall+0x58/0x200
do_one_initcall from kernel_init_freeable+0x1cc/0x238
kernel_init_freeable from kernel_init+0x1c/0x12c
kernel_init from ret_from_fork+0x14/0x38
[... TRIM...]
Standardize the error handling by making the function return NULL on
all error conditions. The API is used in just the netcp_core.c so the
impact is limited.
Note, this change, in effect reverts commit 5b6cb43b4d62 ("net:
ethernet: ti: netcp_core: return error while dma channel open issue"),
but provides a less error prone implementation. |
| In the Linux kernel, the following vulnerability has been resolved:
mptcp: Fix proto fallback detection with BPF
The sockmap feature allows bpf syscall from userspace, or based
on bpf sockops, replacing the sk_prot of sockets during protocol stack
processing with sockmap's custom read/write interfaces.
'''
tcp_rcv_state_process()
syn_recv_sock()/subflow_syn_recv_sock()
tcp_init_transfer(BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB)
bpf_skops_established <== sockops
bpf_sock_map_update(sk) <== call bpf helper
tcp_bpf_update_proto() <== update sk_prot
'''
When the server has MPTCP enabled but the client sends a TCP SYN
without MPTCP, subflow_syn_recv_sock() performs a fallback on the
subflow, replacing the subflow sk's sk_prot with the native sk_prot.
'''
subflow_syn_recv_sock()
subflow_ulp_fallback()
subflow_drop_ctx()
mptcp_subflow_ops_undo_override()
'''
Then, this subflow can be normally used by sockmap, which replaces the
native sk_prot with sockmap's custom sk_prot. The issue occurs when the
user executes accept::mptcp_stream_accept::mptcp_fallback_tcp_ops().
Here, it uses sk->sk_prot to compare with the native sk_prot, but this
is incorrect when sockmap is used, as we may incorrectly set
sk->sk_socket->ops.
This fix uses the more generic sk_family for the comparison instead.
Additionally, this also prevents a WARNING from occurring:
result from ./scripts/decode_stacktrace.sh:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 337 at net/mptcp/protocol.c:68 mptcp_stream_accept \
(net/mptcp/protocol.c:4005)
Modules linked in:
...
PKRU: 55555554
Call Trace:
<TASK>
do_accept (net/socket.c:1989)
__sys_accept4 (net/socket.c:2028 net/socket.c:2057)
__x64_sys_accept (net/socket.c:2067)
x64_sys_call (arch/x86/entry/syscall_64.c:41)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
RIP: 0033:0x7f87ac92b83d
---[ end trace 0000000000000000 ]--- |
| In the Linux kernel, the following vulnerability has been resolved:
scsi: target: tcm_loop: Fix segfault in tcm_loop_tpg_address_show()
If the allocation of tl_hba->sh fails in tcm_loop_driver_probe() and we
attempt to dereference it in tcm_loop_tpg_address_show() we will get a
segfault, see below for an example. So, check tl_hba->sh before
dereferencing it.
Unable to allocate struct scsi_host
BUG: kernel NULL pointer dereference, address: 0000000000000194
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 8356 Comm: tokio-runtime-w Not tainted 6.6.104.2-4.azl3 #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 09/28/2024
RIP: 0010:tcm_loop_tpg_address_show+0x2e/0x50 [tcm_loop]
...
Call Trace:
<TASK>
configfs_read_iter+0x12d/0x1d0 [configfs]
vfs_read+0x1b5/0x300
ksys_read+0x6f/0xf0
... |
| In the Linux kernel, the following vulnerability has been resolved:
scsi: ufs: ufs-qcom: Fix UFS OCP issue during UFS power down (PC=3)
According to UFS specifications, the power-off sequence for a UFS device
includes:
- Sending an SSU command with Power_Condition=3 and await a response.
- Asserting RST_N low.
- Turning off REF_CLK.
- Turning off VCC.
- Turning off VCCQ/VCCQ2.
As part of ufs shutdown, after the SSU command completion, asserting
hardware reset (HWRST) triggers the device firmware to wake up and
execute its reset routine. This routine initializes hardware blocks and
takes a few milliseconds to complete. During this time, the ICCQ draws a
large current.
This large ICCQ current may cause issues for the regulator which is
supplying power to UFS, because the turn off request from UFS driver to
the regulator framework will be immediately followed by low power
mode(LPM) request by regulator framework. This is done by framework
because UFS which is the only client is requesting for disable. So if
the rail is still in the process of shutting down while ICCQ exceeds LPM
current thresholds, and LPM mode is activated in hardware during this
state, it may trigger an overcurrent protection (OCP) fault in the
regulator.
To prevent this, a 10ms delay is added after asserting HWRST. This
allows the reset operation to complete while power rails remain active
and in high-power mode.
Currently there is no way for Host to query whether the reset is
completed or not and hence this the delay is based on experiments with
Qualcomm UFS controllers across multiple UFS vendors. |
| In the Linux kernel, the following vulnerability has been resolved:
binfmt_misc: restore write access before closing files opened by open_exec()
bm_register_write() opens an executable file using open_exec(), which
internally calls do_open_execat() and denies write access on the file to
avoid modification while it is being executed.
However, when an error occurs, bm_register_write() closes the file using
filp_close() directly. This does not restore the write permission, which
may cause subsequent write operations on the same file to fail.
Fix this by calling exe_file_allow_write_access() before filp_close() to
restore the write permission properly. |
| In the Linux kernel, the following vulnerability has been resolved:
nilfs2: avoid having an active sc_timer before freeing sci
Because kthread_stop did not stop sc_task properly and returned -EINTR,
the sc_timer was not properly closed, ultimately causing the problem [1]
reported by syzbot when freeing sci due to the sc_timer not being closed.
Because the thread sc_task main function nilfs_segctor_thread() returns 0
when it succeeds, when the return value of kthread_stop() is not 0 in
nilfs_segctor_destroy(), we believe that it has not properly closed
sc_timer.
We use timer_shutdown_sync() to sync wait for sc_timer to shutdown, and
set the value of sc_task to NULL under the protection of lock
sc_state_lock, so as to avoid the issue caused by sc_timer not being
properly shutdowned.
[1]
ODEBUG: free active (active state 0) object: 00000000dacb411a object type: timer_list hint: nilfs_construction_timeout
Call trace:
nilfs_segctor_destroy fs/nilfs2/segment.c:2811 [inline]
nilfs_detach_log_writer+0x668/0x8cc fs/nilfs2/segment.c:2877
nilfs_put_super+0x4c/0x12c fs/nilfs2/super.c:509 |
| In the Linux kernel, the following vulnerability has been resolved:
ipv4: route: Prevent rt_bind_exception() from rebinding stale fnhe
The sit driver's packet transmission path calls: sit_tunnel_xmit() ->
update_or_create_fnhe(), which lead to fnhe_remove_oldest() being called
to delete entries exceeding FNHE_RECLAIM_DEPTH+random.
The race window is between fnhe_remove_oldest() selecting fnheX for
deletion and the subsequent kfree_rcu(). During this time, the
concurrent path's __mkroute_output() -> find_exception() can fetch the
soon-to-be-deleted fnheX, and rt_bind_exception() then binds it with a
new dst using a dst_hold(). When the original fnheX is freed via RCU,
the dst reference remains permanently leaked.
CPU 0 CPU 1
__mkroute_output()
find_exception() [fnheX]
update_or_create_fnhe()
fnhe_remove_oldest() [fnheX]
rt_bind_exception() [bind dst]
RCU callback [fnheX freed, dst leak]
This issue manifests as a device reference count leak and a warning in
dmesg when unregistering the net device:
unregister_netdevice: waiting for sitX to become free. Usage count = N
Ido Schimmel provided the simple test validation method [1].
The fix clears 'oldest->fnhe_daddr' before calling fnhe_flush_routes().
Since rt_bind_exception() checks this field, setting it to zero prevents
the stale fnhe from being reused and bound to a new dst just before it
is freed.
[1]
ip netns add ns1
ip -n ns1 link set dev lo up
ip -n ns1 address add 192.0.2.1/32 dev lo
ip -n ns1 link add name dummy1 up type dummy
ip -n ns1 route add 192.0.2.2/32 dev dummy1
ip -n ns1 link add name gretap1 up arp off type gretap \
local 192.0.2.1 remote 192.0.2.2
ip -n ns1 route add 198.51.0.0/16 dev gretap1
taskset -c 0 ip netns exec ns1 mausezahn gretap1 \
-A 198.51.100.1 -B 198.51.0.0/16 -t udp -p 1000 -c 0 -q &
taskset -c 2 ip netns exec ns1 mausezahn gretap1 \
-A 198.51.100.1 -B 198.51.0.0/16 -t udp -p 1000 -c 0 -q &
sleep 10
ip netns pids ns1 | xargs kill
ip netns del ns1 |
| In the Linux kernel, the following vulnerability has been resolved:
net: netpoll: fix incorrect refcount handling causing incorrect cleanup
commit efa95b01da18 ("netpoll: fix use after free") incorrectly
ignored the refcount and prematurely set dev->npinfo to NULL during
netpoll cleanup, leading to improper behavior and memory leaks.
Scenario causing lack of proper cleanup:
1) A netpoll is associated with a NIC (e.g., eth0) and netdev->npinfo is
allocated, and refcnt = 1
- Keep in mind that npinfo is shared among all netpoll instances. In
this case, there is just one.
2) Another netpoll is also associated with the same NIC and
npinfo->refcnt += 1.
- Now dev->npinfo->refcnt = 2;
- There is just one npinfo associated to the netdev.
3) When the first netpolls goes to clean up:
- The first cleanup succeeds and clears np->dev->npinfo, ignoring
refcnt.
- It basically calls `RCU_INIT_POINTER(np->dev->npinfo, NULL);`
- Set dev->npinfo = NULL, without proper cleanup
- No ->ndo_netpoll_cleanup() is either called
4) Now the second target tries to clean up
- The second cleanup fails because np->dev->npinfo is already NULL.
* In this case, ops->ndo_netpoll_cleanup() was never called, and
the skb pool is not cleaned as well (for the second netpoll
instance)
- This leaks npinfo and skbpool skbs, which is clearly reported by
kmemleak.
Revert commit efa95b01da18 ("netpoll: fix use after free") and adds
clarifying comments emphasizing that npinfo cleanup should only happen
once the refcount reaches zero, ensuring stable and correct netpoll
behavior. |
| In the Linux kernel, the following vulnerability has been resolved:
posix-timers: Plug potential memory leak in do_timer_create()
When posix timer creation is set to allocate a given timer ID and the
access to the user space value faults, the function terminates without
freeing the already allocated posix timer structure.
Move the allocation after the user space access to cure that.
[ tglx: Massaged change log ] |
| In the Linux kernel, the following vulnerability has been resolved:
vmw_balloon: indicate success when effectively deflating during migration
When migrating a balloon page, we first deflate the old page to then
inflate the new page.
However, if inflating the new page succeeded, we effectively deflated the
old page, reducing the balloon size.
In that case, the migration actually worked: similar to migrating+
immediately deflating the new page. The old page will be freed back to
the buddy.
Right now, the core will leave the page be marked as isolated (as we
returned an error). When later trying to putback that page, we will run
into the WARN_ON_ONCE() in balloon_page_putback().
That handling was changed in commit 3544c4faccb8 ("mm/balloon_compaction:
stop using __ClearPageMovable()"); before that change, we would have
tolerated that way of handling it.
To fix it, let's just return 0 in that case, making the core effectively
just clear the "isolated" flag + freeing it back to the buddy as if the
migration succeeded. Note that the new page will also get freed when the
core puts the last reference.
Note that this also makes it all be more consistent: we will no longer
unisolate the page in the balloon driver while keeping it marked as being
isolated in migration core.
This was found by code inspection. |
| In the Linux kernel, the following vulnerability has been resolved:
mm: don't spin in add_stack_record when gfp flags don't allow
syzbot was able to find the following path:
add_stack_record_to_list mm/page_owner.c:182 [inline]
inc_stack_record_count mm/page_owner.c:214 [inline]
__set_page_owner+0x2c3/0x4a0 mm/page_owner.c:333
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x240/0x2a0 mm/page_alloc.c:1851
prep_new_page mm/page_alloc.c:1859 [inline]
get_page_from_freelist+0x21e4/0x22c0 mm/page_alloc.c:3858
alloc_pages_nolock_noprof+0x94/0x120 mm/page_alloc.c:7554
Don't spin in add_stack_record_to_list() when it is called
from *_nolock() context. |
| In the Linux kernel, the following vulnerability has been resolved:
comedi: multiq3: sanitize config options in multiq3_attach()
Syzbot identified an issue [1] in multiq3_attach() that induces a
task timeout due to open() or COMEDI_DEVCONFIG ioctl operations,
specifically, in the case of multiq3 driver.
This problem arose when syzkaller managed to craft weird configuration
options used to specify the number of channels in encoder subdevice.
If a particularly great number is passed to s->n_chan in
multiq3_attach() via it->options[2], then multiple calls to
multiq3_encoder_reset() at the end of driver-specific attach() method
will be running for minutes, thus blocking tasks and affected devices
as well.
While this issue is most likely not too dangerous for real-life
devices, it still makes sense to sanitize configuration inputs. Enable
a sensible limit on the number of encoder chips (4 chips max, each
with 2 channels) to stop this behaviour from manifesting.
[1] Syzbot crash:
INFO: task syz.2.19:6067 blocked for more than 143 seconds.
...
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5254 [inline]
__schedule+0x17c4/0x4d60 kernel/sched/core.c:6862
__schedule_loop kernel/sched/core.c:6944 [inline]
schedule+0x165/0x360 kernel/sched/core.c:6959
schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:7016
__mutex_lock_common kernel/locking/mutex.c:676 [inline]
__mutex_lock+0x7e6/0x1350 kernel/locking/mutex.c:760
comedi_open+0xc0/0x590 drivers/comedi/comedi_fops.c:2868
chrdev_open+0x4cc/0x5e0 fs/char_dev.c:414
do_dentry_open+0x953/0x13f0 fs/open.c:965
vfs_open+0x3b/0x340 fs/open.c:1097
... |
| In the Linux kernel, the following vulnerability has been resolved:
KVM: SVM: Don't skip unrelated instruction if INT3/INTO is replaced
When re-injecting a soft interrupt from an INT3, INT0, or (select) INTn
instruction, discard the exception and retry the instruction if the code
stream is changed (e.g. by a different vCPU) between when the CPU
executes the instruction and when KVM decodes the instruction to get the
next RIP.
As effectively predicted by commit 6ef88d6e36c2 ("KVM: SVM: Re-inject
INT3/INTO instead of retrying the instruction"), failure to verify that
the correct INTn instruction was decoded can effectively clobber guest
state due to decoding the wrong instruction and thus specifying the
wrong next RIP.
The bug most often manifests as "Oops: int3" panics on static branch
checks in Linux guests. Enabling or disabling a static branch in Linux
uses the kernel's "text poke" code patching mechanism. To modify code
while other CPUs may be executing that code, Linux (temporarily)
replaces the first byte of the original instruction with an int3 (opcode
0xcc), then patches in the new code stream except for the first byte,
and finally replaces the int3 with the first byte of the new code
stream. If a CPU hits the int3, i.e. executes the code while it's being
modified, then the guest kernel must look up the RIP to determine how to
handle the #BP, e.g. by emulating the new instruction. If the RIP is
incorrect, then this lookup fails and the guest kernel panics.
The bug reproduces almost instantly by hacking the guest kernel to
repeatedly check a static branch[1] while running a drgn script[2] on
the host to constantly swap out the memory containing the guest's TSS.
[1]: https://gist.github.com/osandov/44d17c51c28c0ac998ea0334edf90b5a
[2]: https://gist.github.com/osandov/10e45e45afa29b11e0c7209247afc00b |
| In the Linux kernel, the following vulnerability has been resolved:
crypto: zstd - fix double-free in per-CPU stream cleanup
The crypto/zstd module has a double-free bug that occurs when multiple
tfms are allocated and freed.
The issue happens because zstd_streams (per-CPU contexts) are freed in
zstd_exit() during every tfm destruction, rather than being managed at
the module level. When multiple tfms exist, each tfm exit attempts to
free the same shared per-CPU streams, resulting in a double-free.
This leads to a stack trace similar to:
BUG: Bad page state in process kworker/u16:1 pfn:106fd93
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x106fd93
flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
page_type: 0xffffffff()
raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: nonzero entire_mapcount
Modules linked in: ...
CPU: 3 UID: 0 PID: 2506 Comm: kworker/u16:1 Kdump: loaded Tainted: G B
Hardware name: ...
Workqueue: btrfs-delalloc btrfs_work_helper
Call Trace:
<TASK>
dump_stack_lvl+0x5d/0x80
bad_page+0x71/0xd0
free_unref_page_prepare+0x24e/0x490
free_unref_page+0x60/0x170
crypto_acomp_free_streams+0x5d/0xc0
crypto_acomp_exit_tfm+0x23/0x50
crypto_destroy_tfm+0x60/0xc0
...
Change the lifecycle management of zstd_streams to free the streams only
once during module cleanup. |
| In the Linux kernel, the following vulnerability has been resolved:
ext4: refresh inline data size before write operations
The cached ei->i_inline_size can become stale between the initial size
check and when ext4_update_inline_data()/ext4_create_inline_data() use
it. Although ext4_get_max_inline_size() reads the correct value at the
time of the check, concurrent xattr operations can modify i_inline_size
before ext4_write_lock_xattr() is acquired.
This causes ext4_update_inline_data() and ext4_create_inline_data() to
work with stale capacity values, leading to a BUG_ON() crash in
ext4_write_inline_data():
kernel BUG at fs/ext4/inline.c:1331!
BUG_ON(pos + len > EXT4_I(inode)->i_inline_size);
The race window:
1. ext4_get_max_inline_size() reads i_inline_size = 60 (correct)
2. Size check passes for 50-byte write
3. [Another thread adds xattr, i_inline_size changes to 40]
4. ext4_write_lock_xattr() acquires lock
5. ext4_update_inline_data() uses stale i_inline_size = 60
6. Attempts to write 50 bytes but only 40 bytes actually available
7. BUG_ON() triggers
Fix this by recalculating i_inline_size via ext4_find_inline_data_nolock()
immediately after acquiring xattr_sem. This ensures ext4_update_inline_data()
and ext4_create_inline_data() work with current values that are protected
from concurrent modifications.
This is similar to commit a54c4613dac1 ("ext4: fix race writing to an
inline_data file while its xattrs are changing") which fixed i_inline_off
staleness. This patch addresses the related i_inline_size staleness issue. |