HoneyCode: Automating Deceptive Software Repositories with Deep Generative Models

Date
2021-01-05
Authors
Nguyen, David
Liebowitz, David
Nepal, Surya
Kanhere, Salil
Contributor
Advisor
Department
Instructor
Depositor
Speaker
Researcher
Consultant
Interviewer
Annotator
Journal Title
Journal ISSN
Volume Title
Publisher
Volume
Number/Issue
Starting Page
6945
Ending Page
Alternative Title
Abstract
We propose HoneyCode, an architecture for the generation of synthetic software repositories for cyber deception. The synthetic repositories have the characteristics of real software, including language features, file names and extensions, but contain no real intellectual property. The fake software can be used as a honeypot or form part of a deceptive environment. Existing approaches to software repository generation lack scalability due to reliance on hand-crafted structures for specific languages. Our approach is language agnostic and learns the underlying representations of repository structures, filenames and file content through a novel Tree Recurrent Network (TRN) and two recurrent networks (RNN) respectively. Each stage of the sequential generation process utilises features from prior steps, which increases the honey repository’s authenticity and consistency. Experiments show TRN generates tree samples that reduce degree mean maximal distance (MMD) by 90-92% and depth MMD by 75-86% to a held out test data set in comparison to recent deep graph generators and a baseline random tree generator. In addition, our RNN models generate convincing filenames with authentic syntax and realistic file content.
Description
Keywords
Cyber Operations, Defence, and Forensics, cyber defence, deception, generative models, honeypot, neural networks
Citation
Extent
10 pages
Format
Geographic Location
Time Period
Related To
Proceedings of the 54th Hawaii International Conference on System Sciences
Table of Contents
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International
Rights Holder
Local Contexts
Email libraryada-l@lists.hawaii.edu if you need this content in ADA-compliant format.