Guessing as a service: large language models are not yet ready for vulnerability detection
F. Panebianco; S. Longari; S. Zanero; M. Carminati
2025-01-01
Abstract
The growing number of reported software vulnerabilities underscores the need for efficient detection methods, especially for resource-limited organizations. While traditional techniques like fuzzing and symbolic execution are effective, they require significant manual effort. Recent advances in Large Language Models (LLMs) show promise for zero-shot learning, leveraging pre-training on diverse datasets to detect vulnerabilities without fine-tuning. This study evaluates quantized models (e.g., Mistral v0.3), code-specialized models (e.g., CodeQwen 1.5), and fine-tuned approaches like PDBERT. Zero-shot models perform poorly, with a precision below 0.46, and even PDBERT's high metrics (precision 0.91, specificity 0.99) are undermined by overfitting. These findings emphasize the limitations of current AI solutions and the necessity for approaches tailored to the specific problem.
| File | Description | Version | Size | Format |
|---|---|---|---|---|
| _ITASEC__Survey_LLMs_for_Vulnerability_Detection.pdf (open access) | The paper highlights the challenges of AI-driven vulnerability detection, showing that zero-shot models perform poorly while fine-tuned models like PDBERT suffer from overfitting, emphasizing the need for specialized approaches. | Pre-print (pre-refereeing) | 496.12 kB | Adobe PDF |
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.