Interpretability of API Call Topic Models: An Exploratory Study

Date

2020-01-07

Abstract

Topic modeling is an unsupervised method for discovering semantically coherent combinations of words, called topics, in unstructured text. However, the human interpretability of topics discovered in non-natural-language corpora, specifically Windows API call logs, is unknown. Our objective is to explore the coherence of such topics and their ability to represent behavioral themes of API calls from the perspective of malware analysts. Three Latent Dirichlet Allocation (LDA) models were fit to a collection of dynamic API call logs, and the resulting topics, or behavioral themes, were manually evaluated by malware analysts. These judgments were then compared to existing automated quality measures. Participants were able to accurately identify API calls that did not belong in the behavioral themes learned by the 20-topic model. Our results agree with topic coherence measures on which topics are most interpretable, but disagree with log-perplexity, which concurs with findings from the topic evaluation literature on natural language corpora.
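
The record does not include the study's code; the following is a minimal sketch, assuming gensim and a handful of toy API call logs, of the workflow the abstract describes: fitting an LDA model to dynamic API call sequences and computing the two automated quality measures it compares (topic coherence and log-perplexity). The call sequences, topic count, and training parameters below are illustrative assumptions, not the study's data or settings.

```python
# Minimal sketch (not the authors' code): fitting LDA to API call logs with
# gensim and comparing UMass topic coherence against log-perplexity.
# The API call sequences below are toy examples, not the study's corpus.
from gensim.corpora import Dictionary
from gensim.models import LdaModel, CoherenceModel

# Each "document" is one dynamic API call log: an ordered list of call names.
api_logs = [
    ["CreateFileW", "WriteFile", "CloseHandle", "CreateFileW", "ReadFile"],
    ["RegOpenKeyExW", "RegSetValueExW", "RegCloseKey"],
    ["InternetOpenW", "InternetConnectW", "HttpSendRequestW", "InternetCloseHandle"],
    ["CreateProcessW", "VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"],
    ["RegOpenKeyExW", "RegQueryValueExW", "RegCloseKey", "CreateFileW", "WriteFile"],
]

# Map each distinct API call to an integer id and build bag-of-words vectors;
# LDA treats each log as a bag of API calls, ignoring call order.
dictionary = Dictionary(api_logs)
corpus = [dictionary.doc2bow(log) for log in api_logs]

# The study fit three LDA models; num_topics=3 is purely illustrative here
# (the paper highlights a 20-topic model on its real corpus).
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=3,
               passes=20, random_state=0)

# Automated quality measures of the kind compared in the study:
# UMass coherence (higher is better) vs. a per-word log-perplexity bound.
coherence = CoherenceModel(model=lda, corpus=corpus, dictionary=dictionary,
                           coherence="u_mass").get_coherence()
log_perplexity = lda.log_perplexity(corpus)

print(f"UMass coherence: {coherence:.3f}")
print(f"Per-word log-perplexity bound: {log_perplexity:.3f}")
for k in range(lda.num_topics):
    top_calls = [call for call, _ in lda.show_topic(k, topn=4)]
    print(f"Behavioral theme {k}: {', '.join(top_calls)}")
```

Note that LDA's bag-of-words assumption discards call ordering within a log; only co-occurrence of API calls in the same log informs the learned behavioral themes.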

Keywords

Machine Learning and Cyber Threat Intelligence and Analytics, API call, malware analysis, malware behaviors, topic model

Extent

10 pages

Related To

Proceedings of the 53rd Hawaii International Conference on System Sciences

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International
