BERT-Cuckoo15: A Comprehensive Framework for Malware Detection Using 15 Dynamic Feature Types

Date

2025-01-07

Starting Page

388

Abstract

Malware detection is challenging because features must be selected from diverse data sources, such as system calls and registry keys, and this selection affects model accuracy. Existing techniques often rely on a single feature type to reduce the number of features, or require extensive feature engineering, and may therefore fail to capture intricate relationships between features. Moreover, these methods usually assume that features are independent, which does not hold for complex malware behavior. Although such methods have been successful, their reliance on handcrafted features and their inability to fully leverage contextual information limit their effectiveness against sophisticated malware. To address these constraints, we introduce BERT-Cuckoo15, a malware detection model that uses Bidirectional Encoder Representations from Transformers (BERT) to analyze relationships between diverse features derived from dynamic analysis of samples in the Cuckoo sandbox. The model processes and encodes these features into chunks, allowing contextual information to be aggregated across different system activities. Our evaluation, conducted on a comprehensive and balanced dataset of 36,770 samples across nine malware types, demonstrates the efficacy of our approach: BERT-Cuckoo15 achieves an accuracy of 97.61%, showing that it can capture complex feature interdependencies.
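
The chunk-and-aggregate idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the tokenization, the chunk size, or the aggregation method, so everything here is an assumption for illustration: the function names (`flatten_report`, `chunk_tokens`, `mean_pool`), the `type:value` token format, the chunk size, and the use of mean pooling are all hypothetical, and the per-chunk vectors stand in for the output of a BERT encoder that is not shown.

```python
from typing import Dict, List


def flatten_report(features: Dict[str, List[str]]) -> List[str]:
    """Flatten {feature_type: [values]} into one token sequence.

    Each value is tagged with its feature type (hypothetical format),
    so tokens from different sources remain distinguishable.
    """
    tokens: List[str] = []
    for feature_type, values in features.items():
        for value in values:
            tokens.append(f"{feature_type}:{value}")
    return tokens


def chunk_tokens(tokens: List[str], chunk_size: int, stride: int) -> List[List[str]]:
    """Split a token sequence into windows of at most chunk_size tokens.

    A stride smaller than chunk_size produces overlapping windows.
    """
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the sequence
    return chunks


def mean_pool(chunk_vectors: List[List[float]]) -> List[float]:
    """Average equal-length per-chunk vectors into one sample-level vector."""
    if not chunk_vectors:
        raise ValueError("need at least one chunk vector")
    dim = len(chunk_vectors[0])
    return [sum(vec[i] for vec in chunk_vectors) / len(chunk_vectors) for i in range(dim)]


# Hypothetical two-feature-type report; real sandbox reports are far larger.
report = {"syscall": ["NtCreateFile", "NtWriteFile"], "regkey": ["HKLM\\Run", "HKCU\\Run", "HKLM\\Services"]}
chunks = chunk_tokens(flatten_report(report), chunk_size=2, stride=2)
```

In a full pipeline, each chunk would be passed through the encoder and the resulting vectors combined (here by `mean_pool`) before classification; the abstract does not say which aggregation the authors actually use.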

Keywords

Cybersecurity in the Age of Artificial Intelligence, AI for Cybersecurity, and Cybersecurity for AI; malware detection; BERT; natural language processing; transformers; feature generation; malicious behavior analysis; dynamic analysis

Extent

10

Related To

Proceedings of the 58th Hawaii International Conference on System Sciences

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International
