Interpretability of API Call Topic Models: An Exploratory Study

dc.contributor.author: Glendowne, Puntitra
dc.contributor.author: Glendowne, Dae
dc.date.accessioned: 2020-01-04T08:31:47Z
dc.date.available: 2020-01-04T08:31:47Z
dc.date.issued: 2020-01-07
dc.description.abstract: Topic modeling is an unsupervised method for discovering semantically coherent combinations of words, called topics, in unstructured text. However, the human interpretability of topics discovered from non-natural language corpora, specifically Windows API call logs, is unknown. Our objective is to explore the coherence of topics and their ability to represent the themes of API calls from the perspective of malware analysts. Three Latent Dirichlet Allocation (LDA) models were fit to a collection of dynamic API call logs. Topics, or behavioral themes, were manually evaluated by malware analysts. The results were compared to existing automated quality measures. Participants were able to accurately identify API calls that did not belong in behavioral themes learned by the 20-topic model. Our results agree with topic coherence measures on which topics are most interpretable. The results are not compatible with log-perplexity, which concurs with the findings of the topic evaluation literature on natural language corpora.
dc.format.extent: 10 pages
dc.identifier.doi: 10.24251/HICSS.2020.793
dc.identifier.isbn: 978-0-9981331-3-3
dc.identifier.uri: http://hdl.handle.net/10125/64535
dc.language.iso: eng
dc.relation.ispartof: Proceedings of the 53rd Hawaii International Conference on System Sciences
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Machine Learning and Cyber Threat Intelligence and Analytics
dc.subject: api call
dc.subject: malware analysis
dc.subject: malware behaviors
dc.subject: topic model
dc.title: Interpretability of API Call Topic Models: An Exploratory Study
dc.type: Conference Paper
dc.type.dcmi: Text

Files

Original bundle
Name: 0640.pdf
Size: 710.77 KB
Format: Adobe Portable Document Format