AN EXPLORATORY EXAMINATION OF SOFTWARE VULNERABILITY CLASSIFICATION USING LARGE LANGUAGE MODELS
dc.contributor.advisor | Peruma, Anthony | |
dc.contributor.author | Oliveira Araujo, Ana Catarina | |
dc.contributor.department | Computer Science | |
dc.date.accessioned | 2024-07-02T23:41:16Z | |
dc.date.issued | 2024 | |
dc.description.degree | M.S. | |
dc.embargo.liftdate | 2025-06-24 | |
dc.identifier.uri | https://hdl.handle.net/10125/108322 | |
dc.subject | Computer science | |
dc.subject | CVE | |
dc.subject | cybersecurity | |
dc.subject | LLMs | |
dc.subject | software vulnerability | |
dc.subject | VDO | |
dc.subject | vulnerability classification | |
dc.title | AN EXPLORATORY EXAMINATION OF SOFTWARE VULNERABILITY CLASSIFICATION USING LARGE LANGUAGE MODELS | |
dc.type | Thesis | |
dcterms.abstract | Software vulnerabilities are critical weaknesses that can compromise the security of a system. While current research primarily focuses on automating their classification and detection using a range of machine learning models, there remains a notable gap in integrating ontologies such as the Vulnerability Description Ontology (VDO) with Large Language Models (LLMs) for enhanced classification accuracy. Our study uses the National Vulnerability Database (NVD) and the National Institute of Standards and Technology’s Vulnerability Description Ontology framework to enhance the classification of these vulnerabilities. The methodology involves an in-depth analysis of NVD data and an investigation of the effectiveness of various LLMs in analyzing vulnerability descriptions across 27 vulnerability categories in 5 noun groups. Our findings reveal that LLMs, particularly BERT and DistilBERT, demonstrate stronger performance than traditional machine learning models and entropy-based methods. Moreover, while expanding the dataset aims to capture a broader range of vulnerabilities, its effectiveness varies, highlighting the crucial role of annotation quality. This research emphasizes the importance of advanced machine learning techniques and quality data annotation in optimizing vulnerability assessment processes in cybersecurity. | |
dcterms.extent | 64 pages | |
dcterms.language | en | |
dcterms.publisher | University of Hawai'i at Manoa | |
dcterms.rights | All UHM dissertations and theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission from the copyright owner. | |
dcterms.type | Text | |
local.identifier.alturi | http://dissertations.umi.com/hawii:12160 |
Files
Original bundle
- Name: OliveiraAraujo_hawii_0085O_12160.pdf
- Size: 2.14 MB
- Format: Adobe Portable Document Format