Machine Learning and Cyber Threat Intelligence and Analytics

Permanent URI for this collection


Recent Submissions

Now showing 1 - 5 of 9
  • Item
    Attack Modeling and Mitigation Strategies for Risk-Based Analysis of Networked Medical Devices
    ( 2020-01-07) Hodges, Bronwyn ; Mcdonald, Jeffrey ; Glisson, William ; Jacobs, Mike ; Van Devender, Maureen ; Pardue, Harold
    The escalating integration of network-enabled medical devices raises concerns for both practitioners and academics in terms of introducing new vulnerabilities and attack vectors. This prompts the idea that combining medical device data, security vulnerability enumerations, and attack-modeling data into a single database could enable security analysts to proactively identify potential security weaknesses in medical devices and formulate appropriate mitigation and remediation plans. This study introduces a novel extension to a relational database risk assessment framework by using the open-source tool OVAL to capture device states and compare them to security advisories that warn of threats and vulnerabilities, and where threats and vulnerabilities exist provide mitigation recommendations. The contribution of this research is a proof of concept evaluation that demonstrates the integration of OVAL and CAPEC attack patterns for analysis using a database-driven risk assessment framework.
  • Item
    Network Attack Detection Using an Unsupervised Machine Learning Algorithm
    ( 2020-01-07) Kumar, Avinash ; Glisson, William ; Benton, Ryan
    With the increase in network connectivity in today's web-enabled environments, there is an escalation in cyber-related crimes. This increase in illicit activity prompts organizations to address network security risk issues by attempting to detect malicious activity. This research investigates the application of a MeanShift algorithm to detect an attack on a network. The algorithm is validated against the KDD 99 dataset and presents an accuracy of 81.2% and detection rate of 79.1%. The contribution of this research is two-fold. First, it provides an initial application of a MeanShift algorithm on a network traffic dataset to detect an attack. Second, it provides the foundation for future research involving the application of MeanShift algorithm in the area of network attack detection.
  • Item
    Phishing Sites Detection from a Web Developer’s Perspective Using Machine Learning
    ( 2020-01-07) Zhou, Xin ; Verma, Rakesh
    The Internet has enabled unprecedented communication and new technologies. Concomitantly, it has brought the bane of phishing and exacerbated vulnerabilities. In this paper, we propose a model to detect phishing webpages from a web developer’s perspective. From this standpoint, we design 120 novel features based on content from a webpage, four time-based and two search-based novel features, plus we use 34 other content-based and 11 heuristic features to optimize the model. Moreover, we select Random Committee (Base learner: Random Tree) for our framework since it has the best performance after comparing with six other algorithms: Hellinger Distance Decision Tree, SVM, Logistic Regression, J48, Naive Bayes, and Random Forest. In real-time experiments, the model achieved 99.4% precision and 98.3% MCC with 0.1% false positive rate in 5-fold crossvalidation using the realistic scenario of an unbalanced dataset.
  • Item
    Interpretability of API Call Topic Models: An Exploratory Study
    ( 2020-01-07) Glendowne, Puntitra ; Glendowne, Dae
    Topic modeling is an unsupervised method for discovering semantically coherent combinations of words, called topics, in unstructured text. However, the human interpretability of topics discovered from non-natural language corpora, specifically Windows API call logs, is unknown. Our objective is to explore the coherence of topics and their ability to represent the themes of API calls from malware analysts’ perspective. Three Latent Dirichlet Allocation (LDA) models were fit to a collection of dynamic API call logs. Topics, or behavioral themes, were manually evaluated by malware analysts. The results were compared to existing automated quality measures. Participants were able to accurately determine API calls that did not belong in behavioral themes learned by the 20 topic model. Our results agree with topic coherence measures in terms of highest interpretable topics. The results are not compatible with log-perplexity, which concur with the findings of topic evaluation literature on natural language corpora.
  • Item
    An Unsupervised Approach to DDoS Attack Detection and Mitigation in Near-Real Time
    ( 2020-01-07) Mcandrew, Robert ; Hayne, Stephen ; Wang, Haonan
    We present an approach for Distributed Denial of Service (DDoS) attack detection and mitigation in near-real time. The adaptive unsupervised machine learning methodology is based on volumetric thresholding, Functional Principal Component Analysis, and K-means clustering (with tuning parameters for flexibility), which dissects the dataset into categories of outlier source IP addresses. A probabilistic risk assessment technique is used to assign “threat levels” to potential malicious actors. We use our approach to analyze a synthetic DDoS attack with ground truth, as well as the Network Time Protocol (NTP) amplification attack that occurred during January of 2014 at a large mountain-range university. We demonstrate the speed and capabilities of our technique through replay of the NTP attack. We show that we can detect and attenuate the DDoS within two minutes with significantly reduced volume throughout the six waves of the attack.