Tracing CAPEC Attack Patterns from CVE Vulnerability Information using Natural Language Processing Technique

Date
2021-01-05
Authors
Kanakogi, Kenta
Washizaki, Hironori
Fukazawa, Yoshiaki
Ogata, Shinpei
Okubo, Takao
Kato, Takehisa
Kanuka, Hideyuki
Hazeyama, Atsuo
Yoshioka, Nobukazu
Starting Page
6996
Abstract
To respond effectively to vulnerabilities, it is not enough to collect information efficiently and quickly; the vulnerability and the attack techniques must also be understood. A security knowledge repository can collect such information. The Common Vulnerabilities and Exposures (CVE) provides known vulnerabilities of products, while the Common Attack Pattern Enumeration and Classification (CAPEC) stores attack patterns, which are descriptions of the common attributes and approaches employed by adversaries to exploit known weaknesses. Because the information in these two repositories is not directly linked, identifying the related CAPEC attack information from the CVE vulnerability information is challenging. One proposed method traces some related CAPEC-IDs from a CVE-ID through the Common Weakness Enumeration (CWE). However, it is not applicable to all patterns. Here, we propose a method to automatically trace the related CAPEC-IDs from a CVE-ID using TF-IDF and Doc2Vec. Additionally, we experimentally confirm that TF-IDF is more accurate than Doc2Vec.
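The TF-IDF side of the idea in the abstract can be illustrated with a minimal sketch: treat each CAPEC description and the CVE description as documents, weight their terms with TF-IDF, and rank the attack patterns by cosine similarity to the CVE text. This is not the authors' implementation. The tokenizer, the smoothed IDF formula, the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_capec`), the placeholder pattern keys, and the toy descriptions are all assumptions made for illustration; none of them are real CAPEC or CVE entries.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed preprocessing: lowercase and split on non-alphanumeric characters.
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption): terms found in every document keep weight 1.
    idf = {t: math.log((1 + n) / (1 + d)) + 1 for t, d in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_capec(cve_text, capec_docs):
    """Rank attack-pattern keys by similarity to a CVE description."""
    ids = list(capec_docs)
    docs = [tokenize(capec_docs[i]) for i in ids] + [tokenize(cve_text)]
    vectors = tfidf_vectors(docs)
    query = vectors[-1]
    scored = [(i, cosine(query, v)) for i, v in zip(ids, vectors[:-1])]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: placeholder keys and invented descriptions, not repository content.
toy_patterns = {
    "toy-sql-injection": "attacker injects crafted SQL commands through input "
                         "parameters to manipulate database queries",
    "toy-cross-site-scripting": "attacker injects malicious script into web page "
                                "that runs in the browser of a victim",
    "toy-buffer-overflow": "attacker writes data beyond the bounds of a buffer "
                           "to overwrite adjacent memory",
}
toy_cve = ("SQL injection in login form allows attacker to execute arbitrary "
           "SQL commands via the username parameter")

ranking = rank_capec(toy_cve, toy_patterns)
for pattern_id, score in ranking:
    print(f"{pattern_id}: {score:.3f}")
```

In this sketch the top-ranked keys would be the candidate attack patterns for the vulnerability. A Doc2Vec variant would replace `tfidf_vectors` with learned document embeddings while keeping the same similarity ranking step.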
Keywords
Cybersecurity and Software Assurance, common attack pattern enumeration and classification, common vulnerabilities and exposures, natural language processing, security
Extent
9 pages
Related To
Proceedings of the 54th Hawaii International Conference on System Sciences
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International