vLLM is an inference and serving engine for large language models (LLMs). From 0.5.5 until 0.23.1rc0, integer truncation of tensor dimensions in vLLM's GGUF dequantize kernels (csrc/quantization/gguf/gguf_kernel.cu) causes partial tensor processing. The output tensor is allocated at full size via torch::empty (uninitialized memory), but the dequantize CUDA kernel processes only a truncated number of elements. The unfilled portion of the output tensor retains whatever was previously in GPU memory. In multi-tenant inference deployments, this residual GPU memory may contain tensor data from other users' inference requests, constituting information disclosure. This vulnerability is fixed in 0.23.1rc0.
Metrics
Affected Vendors & Products
References
History
Tue, 23 Jun 2026 01:30:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| First Time appeared |
Vllm-project
Vllm-project vllm |
|
| Vendors & Products |
Vllm-project
Vllm-project vllm |
Mon, 22 Jun 2026 22:45:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| Description | vLLM is an inference and serving engine for large language models (LLMs). From 0.5.5 until 0.23.1rc0, integer truncation of tensor dimensions in vLLM's GGUF dequantize kernels (csrc/quantization/gguf/gguf_kernel.cu) causes partial tensor processing. The output tensor is allocated at full size via torch::empty (uninitialized memory), but the dequantize CUDA kernel processes only a truncated number of elements. The unfilled portion of the output tensor retains whatever was previously in GPU memory. In multi-tenant inference deployments, this residual GPU memory may contain tensor data from other users' inference requests, constituting information disclosure. This vulnerability is fixed in 0.23.1rc0. | |
| Title | vLLM GGUF Kernels: int64_t to int truncation of tensor dimensions causes GPU buffer overflow | |
| Weaknesses | CWE-200 CWE-681 |
|
| References |
| |
| Metrics |
cvssV4_0
|
Status: PUBLISHED
Assigner: GitHub_M
Published:
Updated: 2026-06-22T21:55:42.001Z
Reserved: 2026-06-11T15:46:12.316Z
Link: CVE-2026-53923
No data.
No data.
No data.
OpenCVE Enrichment
Updated: 2026-06-23T01:15:16Z