Vulnerability-Lookup

CVE-2025-46560 (GCVE-0-2025-46560)

Vulnerability from cvelistv5 – Published: 2025-04-30 00:24 – Updated: 2025-04-30 13:09

Title

vLLM phi4mm: Quadratic Time Complexity in Input Token Processing leads to denial of service

Summary

vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.8.0 and prior to 0.8.5 are affected by a critical performance vulnerability in the input preprocessing logic of the multimodal tokenizer. The code dynamically replaces placeholder tokens (e.g., <|audio_|>, <|image_|>) with repeated tokens based on precomputed lengths. Due to inefficient list concatenation operations, the algorithm exhibits quadratic time complexity (O(n²)), allowing malicious actors to trigger resource exhaustion via specially crafted inputs. This issue has been patched in version 0.8.5.

Severity ?

6.5 (Medium)


                        
                          CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H

CWE

CWE-1333 - Inefficient Regular Expression Complexity

Assigner

GitHub_M

References

URL

Tags

	https://github.com/vllm-project/vllm/security/adv…	x_refsource_CONFIRM
	https://github.com/vllm-project/vllm/blob/8cac35b…	x_refsource_MISC

Impacted products

	Vendor	Product	Version
	vllm-project	vllm	Affected: >= 0.8.0, < 0.8.5

Show details on NVD website

JSON

To clipboard

{
  "containers": {
    "adp": [
      {
        "metrics": [
          {
            "other": {
              "content": {
                "id": "CVE-2025-46560",
                "options": [
                  {
                    "Exploitation": "poc"
                  },
                  {
                    "Automatable": "no"
                  },
                  {
                    "Technical Impact": "partial"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2025-04-30T13:09:10.349287Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2025-04-30T13:09:13.422Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "references": [
          {
            "tags": [
              "exploit"
            ],
            "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg"
          }
        ],
        "title": "CISA ADP Vulnrichment"
      }
    ],
    "cna": {
      "affected": [
        {
          "product": "vllm",
          "vendor": "vllm-project",
          "versions": [
            {
              "status": "affected",
              "version": "\u003e= 0.8.0, \u003c 0.8.5"
            }
          ]
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "value": "vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.8.0 and prior to 0.8.5 are affected by a critical performance vulnerability in the input preprocessing logic of the multimodal tokenizer. The code dynamically replaces placeholder tokens (e.g., \u003c|audio_|\u003e, \u003c|image_|\u003e) with repeated tokens based on precomputed lengths. Due to \u200b\u200binefficient list concatenation operations\u200b\u200b, the algorithm exhibits \u200b\u200bquadratic time complexity (O(n\u00b2))\u200b\u200b, allowing malicious actors to trigger resource exhaustion via specially crafted inputs. This issue has been patched in version 0.8.5."
        }
      ],
      "metrics": [
        {
          "cvssV3_1": {
            "attackComplexity": "LOW",
            "attackVector": "NETWORK",
            "availabilityImpact": "HIGH",
            "baseScore": 6.5,
            "baseSeverity": "MEDIUM",
            "confidentialityImpact": "NONE",
            "integrityImpact": "NONE",
            "privilegesRequired": "LOW",
            "scope": "UNCHANGED",
            "userInteraction": "NONE",
            "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H",
            "version": "3.1"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-1333",
              "description": "CWE-1333: Inefficient Regular Expression Complexity",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2025-04-30T00:24:53.750Z",
        "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
        "shortName": "GitHub_M"
      },
      "references": [
        {
          "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg",
          "tags": [
            "x_refsource_CONFIRM"
          ],
          "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg"
        },
        {
          "name": "https://github.com/vllm-project/vllm/blob/8cac35ba435906fb7eb07e44fe1a8c26e8744f4e/vllm/model_executor/models/phi4mm.py#L1182-L1197",
          "tags": [
            "x_refsource_MISC"
          ],
          "url": "https://github.com/vllm-project/vllm/blob/8cac35ba435906fb7eb07e44fe1a8c26e8744f4e/vllm/model_executor/models/phi4mm.py#L1182-L1197"
        }
      ],
      "source": {
        "advisory": "GHSA-vc6m-hm49-g9qg",
        "discovery": "UNKNOWN"
      },
      "title": "vLLM phi4mm: Quadratic Time Complexity in Input Token Processing\u200b leads to denial of service"
    }
  },
  "cveMetadata": {
    "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
    "assignerShortName": "GitHub_M",
    "cveId": "CVE-2025-46560",
    "datePublished": "2025-04-30T00:24:53.750Z",
    "dateReserved": "2025-04-24T21:10:48.174Z",
    "dateUpdated": "2025-04-30T13:09:13.422Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.1",
  "vulnerability-lookup:meta": {
    "vulnrichment": {
      "containers": "{\"adp\": [{\"title\": \"CISA ADP Vulnrichment\", \"metrics\": [{\"other\": {\"type\": \"ssvc\", \"content\": {\"id\": \"CVE-2025-46560\", \"role\": \"CISA Coordinator\", \"options\": [{\"Exploitation\": \"poc\"}, {\"Automatable\": \"no\"}, {\"Technical Impact\": \"partial\"}], \"version\": \"2.0.3\", \"timestamp\": \"2025-04-30T13:09:10.349287Z\"}}}], \"references\": [{\"url\": \"https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg\", \"tags\": [\"exploit\"]}], \"providerMetadata\": {\"orgId\": \"134c704f-9b21-4f2e-91b3-4a467353bcc0\", \"shortName\": \"CISA-ADP\", \"dateUpdated\": \"2025-04-30T13:09:04.715Z\"}}], \"cna\": {\"title\": \"vLLM phi4mm: Quadratic Time Complexity in Input Token Processing\\u200b leads to denial of service\", \"source\": {\"advisory\": \"GHSA-vc6m-hm49-g9qg\", \"discovery\": \"UNKNOWN\"}, \"metrics\": [{\"cvssV3_1\": {\"scope\": \"UNCHANGED\", \"version\": \"3.1\", \"baseScore\": 6.5, \"attackVector\": \"NETWORK\", \"baseSeverity\": \"MEDIUM\", \"vectorString\": \"CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H\", \"integrityImpact\": \"NONE\", \"userInteraction\": \"NONE\", \"attackComplexity\": \"LOW\", \"availabilityImpact\": \"HIGH\", \"privilegesRequired\": \"LOW\", \"confidentialityImpact\": \"NONE\"}}], \"affected\": [{\"vendor\": \"vllm-project\", \"product\": \"vllm\", \"versions\": [{\"status\": \"affected\", \"version\": \"\u003e= 0.8.0, \u003c 0.8.5\"}]}], \"references\": [{\"url\": \"https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg\", \"name\": \"https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg\", \"tags\": [\"x_refsource_CONFIRM\"]}, {\"url\": \"https://github.com/vllm-project/vllm/blob/8cac35ba435906fb7eb07e44fe1a8c26e8744f4e/vllm/model_executor/models/phi4mm.py#L1182-L1197\", \"name\": \"https://github.com/vllm-project/vllm/blob/8cac35ba435906fb7eb07e44fe1a8c26e8744f4e/vllm/model_executor/models/phi4mm.py#L1182-L1197\", \"tags\": [\"x_refsource_MISC\"]}], \"descriptions\": [{\"lang\": \"en\", \"value\": \"vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.8.0 and prior to 0.8.5 are affected by a critical performance vulnerability in the input preprocessing logic of the multimodal tokenizer. The code dynamically replaces placeholder tokens (e.g., \u003c|audio_|\u003e, \u003c|image_|\u003e) with repeated tokens based on precomputed lengths. Due to \\u200b\\u200binefficient list concatenation operations\\u200b\\u200b, the algorithm exhibits \\u200b\\u200bquadratic time complexity (O(n\\u00b2))\\u200b\\u200b, allowing malicious actors to trigger resource exhaustion via specially crafted inputs. This issue has been patched in version 0.8.5.\"}], \"problemTypes\": [{\"descriptions\": [{\"lang\": \"en\", \"type\": \"CWE\", \"cweId\": \"CWE-1333\", \"description\": \"CWE-1333: Inefficient Regular Expression Complexity\"}]}], \"providerMetadata\": {\"orgId\": \"a0819718-46f1-4df5-94e2-005712e83aaa\", \"shortName\": \"GitHub_M\", \"dateUpdated\": \"2025-04-30T00:24:53.750Z\"}}}",
      "cveMetadata": "{\"cveId\": \"CVE-2025-46560\", \"state\": \"PUBLISHED\", \"dateUpdated\": \"2025-04-30T13:09:13.422Z\", \"dateReserved\": \"2025-04-24T21:10:48.174Z\", \"assignerOrgId\": \"a0819718-46f1-4df5-94e2-005712e83aaa\", \"datePublished\": \"2025-04-30T00:24:53.750Z\", \"assignerShortName\": \"GitHub_M\"}",
      "dataType": "CVE_RECORD",
      "dataVersion": "5.1"
    }
  }
}

FKIE_CVE-2025-46560

Vulnerability from fkie_nvd - Published: 2025-04-30 01:15 - Updated: 2025-05-28 19:15

Severity ?

6.5 (Medium) - CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
7.5 (High) - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H

Summary

References

URL	Tags
security-advisories@github.com	https://github.com/vllm-project/vllm/blob/8cac35ba435906fb7eb07e44fe1a8c26e8744f4e/vllm/model_executor/models/phi4mm.py#L1182-L1197	Product
security-advisories@github.com	https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg	Exploit, Vendor Advisory
134c704f-9b21-4f2e-91b3-4a467353bcc0	https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg	Exploit, Vendor Advisory

Impacted products

	Vendor	Product	Version
	vllm	vllm	*

JSON

To clipboard

{
  "configurations": [
    {
      "nodes": [
        {
          "cpeMatch": [
            {
              "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*",
              "matchCriteriaId": "19C6D0C7-632B-4AA7-97E5-CCF21EC350E5",
              "versionEndExcluding": "0.8.5",
              "versionStartIncluding": "0.8.0",
              "vulnerable": true
            }
          ],
          "negate": false,
          "operator": "OR"
        }
      ]
    }
  ],
  "cveTags": [],
  "descriptions": [
    {
      "lang": "en",
      "value": "vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.8.0 and prior to 0.8.5 are affected by a critical performance vulnerability in the input preprocessing logic of the multimodal tokenizer. The code dynamically replaces placeholder tokens (e.g., \u003c|audio_|\u003e, \u003c|image_|\u003e) with repeated tokens based on precomputed lengths. Due to \u200b\u200binefficient list concatenation operations\u200b\u200b, the algorithm exhibits \u200b\u200bquadratic time complexity (O(n\u00b2))\u200b\u200b, allowing malicious actors to trigger resource exhaustion via specially crafted inputs. This issue has been patched in version 0.8.5."
    },
    {
      "lang": "es",
      "value": "vLLM es un motor de inferencia y servicio de alto rendimiento y eficiente en memoria para LLM. Las versiones a partir de la 0.8.0 y anteriores a la 0.8.5 se ven afectadas por una vulnerabilidad cr\u00edtica de rendimiento en la l\u00f3gica de preprocesamiento de entrada del tokenizador multimodal. El c\u00f3digo reemplaza din\u00e1micamente los tokens de marcador de posici\u00f3n (p. ej., \u0026lt;|audio_|\u0026gt;, \u0026lt;|image_|\u0026gt;) con tokens repetidos basados ??en longitudes precalculadas. Debido a las ineficientes operaciones de concatenaci\u00f3n de listas, el algoritmo presenta una complejidad temporal cuadr\u00e1tica (O(n\u00b2)), lo que permite a los actores maliciosos activar el agotamiento de recursos mediante entradas especialmente manipuladas. Este problema se ha corregido en la versi\u00f3n 0.8.5."
    }
  ],
  "id": "CVE-2025-46560",
  "lastModified": "2025-05-28T19:15:56.887",
  "metrics": {
    "cvssMetricV31": [
      {
        "cvssData": {
          "attackComplexity": "LOW",
          "attackVector": "NETWORK",
          "availabilityImpact": "HIGH",
          "baseScore": 6.5,
          "baseSeverity": "MEDIUM",
          "confidentialityImpact": "NONE",
          "integrityImpact": "NONE",
          "privilegesRequired": "LOW",
          "scope": "UNCHANGED",
          "userInteraction": "NONE",
          "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H",
          "version": "3.1"
        },
        "exploitabilityScore": 2.8,
        "impactScore": 3.6,
        "source": "security-advisories@github.com",
        "type": "Secondary"
      },
      {
        "cvssData": {
          "attackComplexity": "LOW",
          "attackVector": "NETWORK",
          "availabilityImpact": "HIGH",
          "baseScore": 7.5,
          "baseSeverity": "HIGH",
          "confidentialityImpact": "NONE",
          "integrityImpact": "NONE",
          "privilegesRequired": "NONE",
          "scope": "UNCHANGED",
          "userInteraction": "NONE",
          "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H",
          "version": "3.1"
        },
        "exploitabilityScore": 3.9,
        "impactScore": 3.6,
        "source": "nvd@nist.gov",
        "type": "Primary"
      }
    ]
  },
  "published": "2025-04-30T01:15:52.097",
  "references": [
    {
      "source": "security-advisories@github.com",
      "tags": [
        "Product"
      ],
      "url": "https://github.com/vllm-project/vllm/blob/8cac35ba435906fb7eb07e44fe1a8c26e8744f4e/vllm/model_executor/models/phi4mm.py#L1182-L1197"
    },
    {
      "source": "security-advisories@github.com",
      "tags": [
        "Exploit",
        "Vendor Advisory"
      ],
      "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg"
    },
    {
      "source": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
      "tags": [
        "Exploit",
        "Vendor Advisory"
      ],
      "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg"
    }
  ],
  "sourceIdentifier": "security-advisories@github.com",
  "vulnStatus": "Analyzed",
  "weaknesses": [
    {
      "description": [
        {
          "lang": "en",
          "value": "CWE-1333"
        }
      ],
      "source": "security-advisories@github.com",
      "type": "Secondary"
    }
  ]
}

GHSA-VC6M-HM49-G9QG

Vulnerability from github – Published: 2025-04-29 16:43 – Updated: 2025-04-30 17:27

Summary

phi4mm: Quadratic Time Complexity in Input Token Processing leads to denial of service

Details

Summary

A critical performance vulnerability has been identified in the input preprocessing logic of the multimodal tokenizer. The code dynamically replaces placeholder tokens (e.g., <|audio_|>, <|image_|>) with repeated tokens based on precomputed lengths. Due to inefficient list concatenation operations, the algorithm exhibits quadratic time complexity (O(n²)), allowing malicious actors to trigger resource exhaustion via specially crafted inputs.

Details

Affected Component: input_processor_for_phi4mm function. https://github.com/vllm-project/vllm/blob/8cac35ba435906fb7eb07e44fe1a8c26e8744f4e/vllm/model_executor/models/phi4mm.py#L1182-L1197

The code modifies the input_ids list in-place using input_ids = input_ids[:i] + tokens + input_ids[i+1:]. Each concatenation operation copies the entire list, leading to O(n) operations per replacement. For k placeholders expanding to m tokens, total time becomes O(kmn), approximating O(n²) in worst-case scenarios.

PoC

Test data demonstrates exponential time growth:

test_cases = [100, 200, 400, 800, 1600, 3200, 6400]
run_times = [0.002, 0.007, 0.028, 0.136, 0.616, 2.707, 11.854]  # seconds

Doubling input size increases runtime by ~4x (consistent with O(n²)).

Impact

Denial-of-Service (DoS): An attacker could submit inputs with many placeholders (e.g., 10,000 <|audio_1|> tokens), causing CPU/memory exhaustion. Example: 10,000 placeholders → ~100 million operations.

Remediation Recommendations

Precompute all placeholder positions and expansion lengths upfront. Replace dynamic list concatenation with a single preallocated array.

# Pseudocode for O(n) solution
new_input_ids = []
for token in input_ids:
    if token is placeholder:
        new_input_ids.extend([token] * precomputed_length)
    else:
        new_input_ids.append(token)

Severity ?

6.5 (Medium)


                  
                    CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H

Show details on source website

JSON

To clipboard

{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "vllm"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0.8.0"
            },
            {
              "fixed": "0.8.5"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2025-46560"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-1333"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2025-04-29T16:43:10Z",
    "nvd_published_at": "2025-04-30T01:15:52Z",
    "severity": "MODERATE"
  },
  "details": "### Summary\nA critical performance vulnerability has been identified in the input preprocessing logic of the multimodal tokenizer. The code dynamically replaces placeholder tokens (e.g., \u003c|audio_*|\u003e, \u003c|image_*|\u003e) with repeated tokens based on precomputed lengths. Due to \u200b\u200binefficient list concatenation operations\u200b\u200b, the algorithm exhibits \u200b\u200bquadratic time complexity (O(n\u00b2))\u200b\u200b, allowing malicious actors to trigger resource exhaustion via specially crafted inputs.\n\n### Details\n\u200b\u200bAffected Component\u200b\u200b: input_processor_for_phi4mm function.\nhttps://github.com/vllm-project/vllm/blob/8cac35ba435906fb7eb07e44fe1a8c26e8744f4e/vllm/model_executor/models/phi4mm.py#L1182-L1197\n\nThe code modifies the input_ids list in-place using input_ids = input_ids[:i] + tokens + input_ids[i+1:]. Each concatenation operation copies the entire list, leading to O(n) operations per replacement. For k placeholders expanding to m tokens, total time becomes O(kmn), approximating O(n\u00b2) in worst-case scenarios.\n\n### PoC\nTest data demonstrates exponential time growth:\n```python\ntest_cases = [100, 200, 400, 800, 1600, 3200, 6400]\nrun_times = [0.002, 0.007, 0.028, 0.136, 0.616, 2.707, 11.854]  # seconds\n```\nDoubling input size increases runtime by ~4x (consistent with O(n\u00b2)).\n\n### Impact\n\u200b\u200bDenial-of-Service (DoS):\u200b\u200b An attacker could submit inputs with many placeholders (e.g., 10,000 \u003c|audio_1|\u003e tokens), causing CPU/memory exhaustion.\nExample: 10,000 placeholders \u2192 ~100 million operations.\n\n\n### Remediation Recommendations\u200b\nPrecompute all placeholder positions and expansion lengths upfront.\nReplace dynamic list concatenation with a single preallocated array.\n```python\n# Pseudocode for O(n) solution\nnew_input_ids = []\nfor token in input_ids:\n    if token is placeholder:\n        new_input_ids.extend([token] * precomputed_length)\n    else:\n        new_input_ids.append(token)\n```",
  "id": "GHSA-vc6m-hm49-g9qg",
  "modified": "2025-04-30T17:27:16Z",
  "published": "2025-04-29T16:43:10Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2025-46560"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/vllm-project/vllm"
    },
    {
      "type": "WEB",
      "url": "https://github.com/vllm-project/vllm/blob/8cac35ba435906fb7eb07e44fe1a8c26e8744f4e/vllm/model_executor/models/phi4mm.py#L1182-L1197"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H",
      "type": "CVSS_V3"
    }
  ],
  "summary": "phi4mm: Quadratic Time Complexity in Input Token Processing\u200b leads to denial of service"
}

Sightings

Author	Source	Type	Date

Nomenclature

Seen: The vulnerability was mentioned, discussed, or observed by the user.
Confirmed: The vulnerability has been validated from an analyst's perspective.
Published Proof of Concept: A public proof of concept is available for this vulnerability.
Exploited: The vulnerability was observed as exploited by the user who reported the sighting.
Patched: The vulnerability was observed as successfully patched by the user who reported the sighting.
Not exploited: The vulnerability was not observed as exploited by the user who reported the sighting.
Not confirmed: The user expressed doubt about the validity of the vulnerability.
Not patched: The vulnerability was not observed as successfully patched by the user who reported the sighting.

Detection rules are retrieved from Rulezet.

Action not permitted

CVE-2025-46560 (GCVE-0-2025-46560)

FKIE_CVE-2025-46560

GHSA-VC6M-HM49-G9QG

Summary

Details

PoC

Impact

Remediation Recommendations​

Tags

Sightings

Nomenclature

Remediation Recommendations