| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
libceph: reset sparse-read state in osd_fault()
When a fault occurs, the connection is abandoned, reestablished, and any
pending operations are retried. The OSD client tracks the progress of a
sparse-read reply using a separate state machine, largely independent of
the messenger's state.
If a connection is lost mid-payload or the sparse-read state machine
returns an error, the sparse-read state is not reset. The OSD client
will then interpret the beginning of a new reply as the continuation of
the old one. If this makes the sparse-read machinery enter a failure
state, it may never recover, producing loops like:
libceph: [0] got 0 extents
libceph: data len 142248331 != extent len 0
libceph: osd0 (1)...:6801 socket error on read
libceph: data len 142248331 != extent len 0
libceph: osd0 (1)...:6801 socket error on read
Therefore, reset the sparse-read state in osd_fault(), ensuring retries
start from a clean state. |
| In the Linux kernel, the following vulnerability has been resolved:
of: unittest: Fix memory leak in unittest_data_add()
In unittest_data_add(), if of_resolve_phandles() fails, the allocated
unittest_data is not freed, leading to a memory leak.
Fix this by using scope-based cleanup helper __free(kfree) for automatic
resource cleanup. This ensures unittest_data is automatically freed when
it goes out of scope in error paths.
For the success path, use retain_and_null_ptr() to transfer ownership
of the memory to the device tree and prevent double freeing. |
| In the Linux kernel, the following vulnerability has been resolved:
netfilter: nf_conncount: update last_gc only when GC has been performed
Currently last_gc is being updated everytime a new connection is
tracked, that means that it is updated even if a GC wasn't performed.
With a sufficiently high packet rate, it is possible to always bypass
the GC, causing the list to grow infinitely.
Update the last_gc value only when a GC has been actually performed. |
| In the Linux kernel, the following vulnerability has been resolved:
bpf, test_run: Subtract size of xdp_frame from allowed metadata size
The xdp_frame structure takes up part of the XDP frame headroom,
limiting the size of the metadata. However, in bpf_test_run, we don't
take this into account, which makes it possible for userspace to supply
a metadata size that is too large (taking up the entire headroom).
If userspace supplies such a large metadata size in live packet mode,
the xdp_update_frame_from_buff() call in xdp_test_run_init_page() call
will fail, after which packet transmission proceeds with an
uninitialised frame structure, leading to the usual Bad Stuff.
The commit in the Fixes tag fixed a related bug where the second check
in xdp_update_frame_from_buff() could fail, but did not add any
additional constraints on the metadata size. Complete the fix by adding
an additional check on the metadata size. Reorder the checks slightly to
make the logic clearer and add a comment. |
| In the Linux kernel, the following vulnerability has been resolved:
mm/damon/sysfs-scheme: cleanup access_pattern subdirs on scheme dir setup failure
When a DAMOS-scheme DAMON sysfs directory setup fails after setup of
access_pattern/ directory, subdirectories of access_pattern/ directory are
not cleaned up. As a result, DAMON sysfs interface is nearly broken until
the system reboots, and the memory for the unremoved directory is leaked.
Cleanup the directories under such failures. |
| In the Linux kernel, the following vulnerability has been resolved:
virtio_net: Fix misalignment bug in struct virtnet_info
Use the new TRAILING_OVERLAP() helper to fix a misalignment bug
along with the following warning:
drivers/net/virtio_net.c:429:46: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
This helper creates a union between a flexible-array member (FAM)
and a set of members that would otherwise follow it (in this case
`u8 rss_hash_key_data[VIRTIO_NET_RSS_MAX_KEY_SIZE];`). This
overlays the trailing members (rss_hash_key_data) onto the FAM
(hash_key_data) while keeping the FAM and the start of MEMBERS aligned.
The static_assert() ensures this alignment remains.
Notice that due to tail padding in flexible `struct
virtio_net_rss_config_trailer`, `rss_trailer.hash_key_data`
(at offset 83 in struct virtnet_info) and `rss_hash_key_data` (at
offset 84 in struct virtnet_info) are misaligned by one byte. See
below:
struct virtio_net_rss_config_trailer {
__le16 max_tx_vq; /* 0 2 */
__u8 hash_key_length; /* 2 1 */
__u8 hash_key_data[]; /* 3 0 */
/* size: 4, cachelines: 1, members: 3 */
/* padding: 1 */
/* last cacheline: 4 bytes */
};
struct virtnet_info {
...
struct virtio_net_rss_config_trailer rss_trailer; /* 80 4 */
/* XXX last struct has 1 byte of padding */
u8 rss_hash_key_data[40]; /* 84 40 */
...
/* size: 832, cachelines: 13, members: 48 */
/* sum members: 801, holes: 8, sum holes: 31 */
/* paddings: 2, sum paddings: 5 */
};
After changes, those members are correctly aligned at offset 795:
struct virtnet_info {
...
union {
struct virtio_net_rss_config_trailer rss_trailer; /* 792 4 */
struct {
unsigned char __offset_to_hash_key_data[3]; /* 792 3 */
u8 rss_hash_key_data[40]; /* 795 40 */
}; /* 792 43 */
}; /* 792 44 */
...
/* size: 840, cachelines: 14, members: 47 */
/* sum members: 801, holes: 8, sum holes: 35 */
/* padding: 4 */
/* paddings: 1, sum paddings: 4 */
/* last cacheline: 8 bytes */
};
As a result, the RSS key passed to the device is shifted by 1
byte: the last byte is cut off, and instead a (possibly
uninitialized) byte is added at the beginning.
As a last note `struct virtio_net_rss_config_hdr *rss_hdr;` is also
moved to the end, since it seems those three members should stick
around together. :) |
| In the Linux kernel, the following vulnerability has been resolved:
mm/damon/sysfs: cleanup attrs subdirs on context dir setup failure
When a context DAMON sysfs directory setup is failed after setup of attrs/
directory, subdirectories of attrs/ directory are not cleaned up. As a
result, DAMON sysfs interface is nearly broken until the system reboots,
and the memory for the unremoved directory is leaked.
Cleanup the directories under such failures. |
| In the Linux kernel, the following vulnerability has been resolved:
ext4: fix iloc.bh leak in ext4_xattr_inode_update_ref
The error branch for ext4_xattr_inode_update_ref forget to release the
refcount for iloc.bh. Find this when review code. |
| In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: hci_uart: fix null-ptr-deref in hci_uart_write_work
hci_uart_set_proto() sets HCI_UART_PROTO_INIT before calling
hci_uart_register_dev(), which calls proto->open() to initialize
hu->priv. However, if a TTY write wakeup occurs during this window,
hci_uart_tx_wakeup() may schedule write_work before hu->priv is
initialized, leading to a NULL pointer dereference in
hci_uart_write_work() when proto->dequeue() accesses hu->priv.
The race condition is:
CPU0 CPU1
---- ----
hci_uart_set_proto()
set_bit(HCI_UART_PROTO_INIT)
hci_uart_register_dev()
tty write wakeup
hci_uart_tty_wakeup()
hci_uart_tx_wakeup()
schedule_work(&hu->write_work)
proto->open(hu)
// initializes hu->priv
hci_uart_write_work()
hci_uart_dequeue()
proto->dequeue(hu)
// accesses hu->priv (NULL!)
Fix this by moving set_bit(HCI_UART_PROTO_INIT) after proto->open()
succeeds, ensuring hu->priv is initialized before any work can be
scheduled. |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: zlib: fix the folio leak on S390 hardware acceleration
[BUG]
After commit aa60fe12b4f4 ("btrfs: zlib: refactor S390x HW acceleration
buffer preparation"), we no longer release the folio of the page cache
of folio returned by btrfs_compress_filemap_get_folio() for S390
hardware acceleration path.
[CAUSE]
Before that commit, we call kumap_local() and folio_put() after handling
each folio.
Although the timing is not ideal (it release previous folio at the
beginning of the loop, and rely on some extra cleanup out of the loop),
it at least handles the folio release correctly.
Meanwhile the refactored code is easier to read, it lacks the call to
release the filemap folio.
[FIX]
Add the missing folio_put() for copy_data_into_buffer(). |
| In the Linux kernel, the following vulnerability has been resolved:
nvmet: fix race in nvmet_bio_done() leading to NULL pointer dereference
There is a race condition in nvmet_bio_done() that can cause a NULL
pointer dereference in blk_cgroup_bio_start():
1. nvmet_bio_done() is called when a bio completes
2. nvmet_req_complete() is called, which invokes req->ops->queue_response(req)
3. The queue_response callback can re-queue and re-submit the same request
4. The re-submission reuses the same inline_bio from nvmet_req
5. Meanwhile, nvmet_req_bio_put() (called after nvmet_req_complete)
invokes bio_uninit() for inline_bio, which sets bio->bi_blkg to NULL
6. The re-submitted bio enters submit_bio_noacct_nocheck()
7. blk_cgroup_bio_start() dereferences bio->bi_blkg, causing a crash:
BUG: kernel NULL pointer dereference, address: 0000000000000028
#PF: supervisor read access in kernel mode
RIP: 0010:blk_cgroup_bio_start+0x10/0xd0
Call Trace:
submit_bio_noacct_nocheck+0x44/0x250
nvmet_bdev_execute_rw+0x254/0x370 [nvmet]
process_one_work+0x193/0x3c0
worker_thread+0x281/0x3a0
Fix this by reordering nvmet_bio_done() to call nvmet_req_bio_put()
BEFORE nvmet_req_complete(). This ensures the bio is cleaned up before
the request can be re-submitted, preventing the race condition. |
| In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: MGMT: Fix memory leak in set_ssp_complete
Fix memory leak in set_ssp_complete() where mgmt_pending_cmd structures
are not freed after being removed from the pending list.
Commit 302a1f674c00 ("Bluetooth: MGMT: Fix possible UAFs") replaced
mgmt_pending_foreach() calls with individual command handling but missed
adding mgmt_pending_free() calls in both error and success paths of
set_ssp_complete(). Other completion functions like set_le_complete()
were fixed correctly in the same commit.
This causes a memory leak of the mgmt_pending_cmd structure and its
associated parameter data for each SSP command that completes.
Add the missing mgmt_pending_free(cmd) calls in both code paths to fix
the memory leak. Also fix the same issue in set_advertising_complete(). |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: mac80211: correctly decode TTLM with default link map
TID-To-Link Mapping (TTLM) elements do not contain any link mapping
presence indicator if a default mapping is used and parsing needs to be
skipped.
Note that access points should not explicitly report an advertised TTLM
with a default mapping as that is the implied mapping if the element is
not included, this is even the case when switching back to the default
mapping. However, mac80211 would incorrectly parse the frame and would
also read one byte beyond the end of the element. |
| In the Linux kernel, the following vulnerability has been resolved:
net: fix segmentation of forwarding fraglist GRO
This patch enhances GSO segment handling by properly checking
the SKB_GSO_DODGY flag for frag_list GSO packets, addressing
low throughput issues observed when a station accesses IPv4
servers via hotspots with an IPv6-only upstream interface.
Specifically, it fixes a bug in GSO segmentation when forwarding
GRO packets containing a frag_list. The function skb_segment_list
cannot correctly process GRO skbs that have been converted by XLAT,
since XLAT only translates the header of the head skb. Consequently,
skbs in the frag_list may remain untranslated, resulting in protocol
inconsistencies and reduced throughput.
To address this, the patch explicitly sets the SKB_GSO_DODGY flag
for GSO packets in XLAT's IPv4/IPv6 protocol translation helpers
(bpf_skb_proto_4_to_6 and bpf_skb_proto_6_to_4). This marks GSO
packets as potentially modified after protocol translation. As a
result, GSO segmentation will avoid using skb_segment_list and
instead falls back to skb_segment for packets with the SKB_GSO_DODGY
flag. This ensures that only safe and fully translated frag_list
packets are processed by skb_segment_list, resolving protocol
inconsistencies and improving throughput when forwarding GRO packets
converted by XLAT. |
| In the Linux kernel, the following vulnerability has been resolved:
can: gs_usb: gs_usb_receive_bulk_callback(): fix error message
Sinc commit 79a6d1bfe114 ("can: gs_usb: gs_usb_receive_bulk_callback():
unanchor URL on usb_submit_urb() error") a failing resubmit URB will print
an info message.
In the case of a short read where netdev has not yet been assigned,
initialize as NULL to avoid dereferencing an undefined value. Also report
the error value of the failed resubmit. |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: do not strictly require dirty metadata threshold for metadata writepages
[BUG]
There is an internal report that over 1000 processes are
waiting at the io_schedule_timeout() of balance_dirty_pages(), causing
a system hang and trigger a kernel coredump.
The kernel is v6.4 kernel based, but the root problem still applies to
any upstream kernel before v6.18.
[CAUSE]
From Jan Kara for his wisdom on the dirty page balance behavior first.
This cgroup dirty limit was what was actually playing the role here
because the cgroup had only a small amount of memory and so the dirty
limit for it was something like 16MB.
Dirty throttling is responsible for enforcing that nobody can dirty
(significantly) more dirty memory than there's dirty limit. Thus when
a task is dirtying pages it periodically enters into balance_dirty_pages()
and we let it sleep there to slow down the dirtying.
When the system is over dirty limit already (either globally or within
a cgroup of the running task), we will not let the task exit from
balance_dirty_pages() until the number of dirty pages drops below the
limit.
So in this particular case, as I already mentioned, there was a cgroup
with relatively small amount of memory and as a result with dirty limit
set at 16MB. A task from that cgroup has dirtied about 28MB worth of
pages in btrfs btree inode and these were practically the only dirty
pages in that cgroup.
So that means the only way to reduce the dirty pages of that cgroup is
to writeback the dirty pages of btrfs btree inode, and only after that
those processes can exit balance_dirty_pages().
Now back to the btrfs part, btree_writepages() is responsible for
writing back dirty btree inode pages.
The problem here is, there is a btrfs internal threshold that if the
btree inode's dirty bytes are below the 32M threshold, it will not
do any writeback.
This behavior is to batch as much metadata as possible so we won't write
back those tree blocks and then later re-COW them again for another
modification.
This internal 32MiB is higher than the existing dirty page size (28MiB),
meaning no writeback will happen, causing a deadlock between btrfs and
cgroup:
- Btrfs doesn't want to write back btree inode until more dirty pages
- Cgroup/MM doesn't want more dirty pages for btrfs btree inode
Thus any process touching that btree inode is put into sleep until
the number of dirty pages is reduced.
Thanks Jan Kara a lot for the analysis of the root cause.
[ENHANCEMENT]
Since kernel commit b55102826d7d ("btrfs: set AS_KERNEL_FILE on the
btree_inode"), btrfs btree inode pages will only be charged to the root
cgroup which should have a much larger limit than btrfs' 32MiB
threshold.
So it should not affect newer kernels.
But for all current LTS kernels, they are all affected by this problem,
and backporting the whole AS_KERNEL_FILE may not be a good idea.
Even for newer kernels I still think it's a good idea to get
rid of the internal threshold at btree_writepages(), since for most cases
cgroup/MM has a better view of full system memory usage than btrfs' fixed
threshold.
For internal callers using btrfs_btree_balance_dirty() since that
function is already doing internal threshold check, we don't need to
bother them.
But for external callers of btree_writepages(), just respect their
requests and write back whatever they want, ignoring the internal
btrfs threshold to avoid such deadlock on btree inode dirty page
balancing. |
| In the Linux kernel, the following vulnerability has been resolved:
perf: sched: Fix perf crash with new is_user_task() helper
In order to do a user space stacktrace the current task needs to be a user
task that has executed in user space. It use to be possible to test if a
task is a user task or not by simply checking the task_struct mm field. If
it was non NULL, it was a user task and if not it was a kernel task.
But things have changed over time, and some kernel tasks now have their
own mm field.
An idea was made to instead test PF_KTHREAD and two functions were used to
wrap this check in case it became more complex to test if a task was a
user task or not[1]. But this was rejected and the C code simply checked
the PF_KTHREAD directly.
It was later found that not all kernel threads set PF_KTHREAD. The io-uring
helpers instead set PF_USER_WORKER and this needed to be added as well.
But checking the flags is still not enough. There's a very small window
when a task exits that it frees its mm field and it is set back to NULL.
If perf were to trigger at this moment, the flags test would say its a
user space task but when perf would read the mm field it would crash with
at NULL pointer dereference.
Now there are flags that can be used to test if a task is exiting, but
they are set in areas that perf may still want to profile the user space
task (to see where it exited). The only real test is to check both the
flags and the mm field.
Instead of making this modification in every location, create a new
is_user_task() helper function that does all the tests needed to know if
it is safe to read the user space memory or not.
[1] https://lore.kernel.org/all/20250425204120.639530125@goodmis.org/ |
| In the Linux kernel, the following vulnerability has been resolved:
octeon_ep: Fix memory leak in octep_device_setup()
In octep_device_setup(), if octep_ctrl_net_init() fails, the function
returns directly without unmapping the mapped resources and freeing the
allocated configuration memory.
Fix this by jumping to the unsupported_dev label, which performs the
necessary cleanup. This aligns with the error handling logic of other
paths in this function.
Compile tested only. Issue found using a prototype static analysis tool
and code review. |
| In the Linux kernel, the following vulnerability has been resolved:
mm/shmem, swap: fix race of truncate and swap entry split
The helper for shmem swap freeing is not handling the order of swap
entries correctly. It uses xa_cmpxchg_irq to erase the swap entry, but it
gets the entry order before that using xa_get_order without lock
protection, and it may get an outdated order value if the entry is split
or changed in other ways after the xa_get_order and before the
xa_cmpxchg_irq.
And besides, the order could grow and be larger than expected, and cause
truncation to erase data beyond the end border. For example, if the
target entry and following entries are swapped in or freed, then a large
folio was added in place and swapped out, using the same entry, the
xa_cmpxchg_irq will still succeed, it's very unlikely to happen though.
To fix that, open code the Xarray cmpxchg and put the order retrieval and
value checking in the same critical section. Also, ensure the order won't
exceed the end border, skip it if the entry goes across the border.
Skipping large swap entries crosses the end border is safe here. Shmem
truncate iterates the range twice, in the first iteration,
find_lock_entries already filtered such entries, and shmem will swapin the
entries that cross the end border and partially truncate the folio (split
the folio or at least zero part of it). So in the second loop here, if we
see a swap entry that crosses the end order, it must at least have its
content erased already.
I observed random swapoff hangs and kernel panics when stress testing
ZSWAP with shmem. After applying this patch, all problems are gone. |
| In the Linux kernel, the following vulnerability has been resolved:
drm/xe/nvm: Fix double-free on aux add failure
After a successful auxiliary_device_init(), aux_dev->dev.release
(xe_nvm_release_dev()) is responsible for the kfree(nvm). When
there is failure with auxiliary_device_add(), driver will call
auxiliary_device_uninit(), which call put_device(). So that the
.release callback will be triggered to free the memory associated
with the auxiliary_device.
Move the kfree(nvm) into the auxiliary_device_init() failure path
and remove the err goto path to fix below error.
"
[ 13.232905] ==================================================================
[ 13.232911] BUG: KASAN: double-free in xe_nvm_init+0x751/0xf10 [xe]
[ 13.233112] Free of addr ffff888120635000 by task systemd-udevd/273
[ 13.233120] CPU: 8 UID: 0 PID: 273 Comm: systemd-udevd Not tainted 6.19.0-rc2-lgci-xe-kernel+ #225 PREEMPT(voluntary)
...
[ 13.233125] Call Trace:
[ 13.233126] <TASK>
[ 13.233127] dump_stack_lvl+0x7f/0xc0
[ 13.233132] print_report+0xce/0x610
[ 13.233136] ? kasan_complete_mode_report_info+0x5d/0x1e0
[ 13.233139] ? xe_nvm_init+0x751/0xf10 [xe]
...
"
v2: drop err goto path. (Alexander)
(cherry picked from commit a3187c0c2bbd947ffff97f90d077ac88f9c2a215) |