Machine Learning and Cyber Threat Intelligence and Analytics

Permanent URI for this collection

https://hdl.handle.net/10125/63731

Attack Modeling and Mitigation Strategies for Risk-Based Analysis of Networked Medical Devices
(2020-01-07) Hodges, Bronwyn; Mcdonald, Jeffrey; Glisson, William; Jacobs, Mike; Van Devender, Maureen; Pardue, Harold
The escalating integration of network-enabled medical devices raises concerns for both practitioners and academics in terms of introducing new vulnerabilities and attack vectors. This prompts the idea that combining medical device data, security vulnerability enumerations, and attack-modeling data into a single database could enable security analysts to proactively identify potential security weaknesses in medical devices and formulate appropriate mitigation and remediation plans. This study introduces a novel extension to a relational database risk assessment framework by using the open-source tool OVAL to capture device states, compare them to security advisories that warn of threats and vulnerabilities, and, where threats and vulnerabilities exist, provide mitigation recommendations. The contribution of this research is a proof-of-concept evaluation that demonstrates the integration of OVAL and CAPEC attack patterns for analysis using a database-driven risk assessment framework.
Network Attack Detection Using an Unsupervised Machine Learning Algorithm
(2020-01-07) Kumar, Avinash; Glisson, William; Benton, Ryan
With the increase in network connectivity in today's web-enabled environments, there is an escalation in cyber-related crimes. This increase in illicit activity prompts organizations to address network security risk issues by attempting to detect malicious activity. This research investigates the application of a MeanShift algorithm to detect an attack on a network. The algorithm is validated against the KDD 99 dataset and achieves an accuracy of 81.2% and a detection rate of 79.1%. The contribution of this research is two-fold. First, it provides an initial application of a MeanShift algorithm on a network traffic dataset to detect an attack. Second, it provides the foundation for future research involving the application of the MeanShift algorithm in the area of network attack detection.
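As a rough illustration of the clustering idea named in this abstract, the following is a minimal sketch of flat-kernel mean shift on a single numeric feature. It is not the authors' implementation: the function names, the bandwidth value, and the toy `traffic` values are all hypothetical, and the data is not drawn from KDD 99.

```python
from statistics import mean


def mean_shift_1d(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift: move each point to the mean of the
    data points within `bandwidth` of it until it stops moving."""
    modes = []
    for p in points:
        x = float(p)
        for _ in range(max_iter):
            neighbours = [q for q in points if abs(q - x) <= bandwidth]
            new_x = mean(neighbours)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)
    return modes


def label_by_mode(modes, bandwidth):
    """Merge converged modes closer than bandwidth / 2 into one cluster
    and return one cluster label per input point."""
    centers, labels = [], []
    for m in modes:
        for i, c in enumerate(centers):
            if abs(m - c) <= bandwidth / 2:
                labels.append(i)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels, centers


# Hypothetical toy feature (e.g. connections per second), not KDD 99 data.
traffic = [1.0, 1.2, 0.9, 1.1, 1.3, 9.8, 10.1, 10.0]
modes = mean_shift_1d(traffic, bandwidth=2.0)
labels, centers = label_by_mode(modes, bandwidth=2.0)
print(labels, centers)
```

Unlike K-means, mean shift does not need the number of clusters in advance; the bandwidth controls how many modes emerge. How clusters are then mapped to "attack" or "normal" is a separate design decision that this sketch does not address.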
Phishing Sites Detection from a Web Developer’s Perspective Using Machine Learning
(2020-01-07) Zhou, Xin; Verma, Rakesh
The Internet has enabled unprecedented communication and new technologies. Concomitantly, it has brought the bane of phishing and exacerbated vulnerabilities. In this paper, we propose a model to detect phishing webpages from a web developer’s perspective. From this standpoint, we design 120 novel features based on content from a webpage, along with four time-based and two search-based novel features, and we use 34 other content-based and 11 heuristic features to optimize the model. Moreover, we select Random Committee (base learner: Random Tree) for our framework since it performed best in a comparison with six other algorithms: Hellinger Distance Decision Tree, SVM, Logistic Regression, J48, Naive Bayes, and Random Forest. In real-time experiments, the model achieved 99.4% precision and 98.3% MCC with a 0.1% false positive rate in 5-fold cross-validation using the realistic scenario of an unbalanced dataset.
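For readers unfamiliar with the metrics quoted in this abstract, the sketch below computes precision, false positive rate, and the Matthews correlation coefficient (MCC) from a confusion matrix. The counts are hypothetical and chosen only to show the arithmetic on an unbalanced dataset; they do not reproduce the paper's results.

```python
from math import sqrt


def precision(tp, fp):
    """Fraction of pages flagged as phishing that really are phishing."""
    return tp / (tp + fp)


def false_positive_rate(fp, tn):
    """Fraction of legitimate pages wrongly flagged as phishing."""
    return fp / (fp + tn)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0.0 when any
    row or column of the confusion matrix is empty."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical unbalanced test set: 100 phishing pages, 900 legitimate pages.
tp, fn, fp, tn = 90, 10, 5, 895

p = precision(tp, fp)
fpr = false_positive_rate(fp, tn)
m = mcc(tp, fp, tn, fn)
print(f"precision={p:.3f} fpr={fpr:.4f} mcc={m:.3f}")
```

MCC uses all four cells of the confusion matrix, which is why it is often preferred over accuracy when the classes are unbalanced.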
Interpretability of API Call Topic Models: An Exploratory Study
(2020-01-07) Glendowne, Puntitra; Glendowne, Dae
Topic modeling is an unsupervised method for discovering semantically coherent combinations of words, called topics, in unstructured text. However, the human interpretability of topics discovered from non-natural language corpora, specifically Windows API call logs, is unknown. Our objective is to explore the coherence of topics and their ability to represent the themes of API calls from malware analysts’ perspective. Three Latent Dirichlet Allocation (LDA) models were fit to a collection of dynamic API call logs. Topics, or behavioral themes, were manually evaluated by malware analysts. The results were compared to existing automated quality measures. Participants were able to accurately determine API calls that did not belong in behavioral themes learned by the 20-topic model. Our results agree with topic coherence measures in terms of the most interpretable topics. The results are not compatible with log-perplexity, which concurs with the findings of topic evaluation literature on natural language corpora.
An Unsupervised Approach to DDoS Attack Detection and Mitigation in Near-Real Time
(2020-01-07) Mcandrew, Robert; Hayne, Stephen; Wang, Haonan
We present an approach for Distributed Denial of Service (DDoS) attack detection and mitigation in near-real time. The adaptive unsupervised machine learning methodology is based on volumetric thresholding, Functional Principal Component Analysis, and K-means clustering (with tuning parameters for flexibility), which dissects the dataset into categories of outlier source IP addresses. A probabilistic risk assessment technique is used to assign “threat levels” to potential malicious actors. We use our approach to analyze a synthetic DDoS attack with ground truth, as well as the Network Time Protocol (NTP) amplification attack that occurred during January of 2014 at a large mountain-range university. We demonstrate the speed and capabilities of our technique through replay of the NTP attack. We show that we can detect and attenuate the DDoS within two minutes with significantly reduced volume throughout the six waves of the attack.
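One step of the pipeline described in this abstract, K-means clustering to separate outlier source IP addresses, can be sketched as follows. This is an assumption-laden illustration only: it clusters a single hypothetical feature (packets per source), uses a deterministic initialisation, and omits the volumetric thresholding, Functional Principal Component Analysis, and risk-scoring stages. The IP addresses come from a documentation range and the counts are invented.

```python
def kmeans_1d(values, k=2, iters=50):
    """Plain K-means on scalar values (k >= 2) with deterministic
    initial centroids spread evenly between the min and the max."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda i: abs(v - centroids[i]))
              for v in values]
    return centroids, labels


# Hypothetical packets-per-minute counts keyed by source IP.
packets = {
    "192.0.2.1": 12,
    "192.0.2.2": 15,
    "192.0.2.3": 9,
    "192.0.2.4": 4800,
    "192.0.2.5": 5100,
}

centroids, labels = kmeans_1d(list(packets.values()), k=2)
high = centroids.index(max(centroids))  # cluster with the largest volume
outliers = [ip for ip, lab in zip(packets, labels) if lab == high]
print(outliers)
```

In practice the number of clusters and the features fed to K-means would be tuning choices; the abstract mentions tuning parameters but does not specify them.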
Knock! Knock! Who Is There? Investigating Data Leakage from a Medical Internet of Things Hijacking Attack
(2020-01-07) Flynn, Talon; Grispos, George; Glisson, William; Mahoney, William
The amalgamation of Medical Internet of Things (MIoT) devices into everyday life is influencing the landscape of modern medicine. The implementation of these devices potentially alleviates the pressures and physical demands of healthcare systems through the remote monitoring of patients. However, there are concerns that the emergence of MIoT ecosystems is introducing an assortment of security and privacy challenges. While previous research has shown that multiple vulnerabilities exist within MIoT devices, minimal research investigates potential data leakage from MIoT devices through hijacking attacks. The research contribution of this paper is twofold. First, it provides a proof of concept that certain MIoT devices and their accompanying smartphone applications are vulnerable to hijacking attacks. Second, it highlights the effectiveness of using digital forensics tools as a lens to identify patient and medical device information on a hijacker’s smartphone.
Digit Recognition From Wrist Movements and Security Concerns with Smart Wrist Wearable IOT Devices
(2020-01-07) Leong, Lambert; Wiere, Sean
In this paper, we investigate a potential security vulnerability associated with wrist wearable devices. Hardware components on common wearable devices include an accelerometer and gyroscope, among other sensors. We demonstrate that an accelerometer and gyroscope can pick up enough unique wrist movement information to identify digits being written by a user. With a data set of 400 writing samples, of either the digit zero or the digit one, we constructed a machine learning model to correctly identify the digit being written based on the movements of the wrist. Our model’s performance on an unseen test set resulted in an area under the receiver operating characteristic (AUROC) curve of 1.00. Loading our model onto our fabricated device resulted in 100% accuracy when predicting ten writing samples in real-time. The model’s ability to correctly identify all digits via wrist movement and orientation changes raises security concerns. Our results imply that nefarious individuals may be able to gain sensitive digit-based information such as social security, credit card, and medical record numbers from wrist wearable devices.
A Model for Predicting the Likelihood of Successful Exploitation
(2020-01-07) Holm, Hannes; Rodhe, Ioana
This paper presents a model that estimates the likelihood that a detected vulnerability can be exploited. The data used to produce the model was obtained by carrying out an experiment that involved exploit attempts against 1179 different machines within a cyber range. Three machine learning algorithms were tested: support vector machines, random forests and neural networks. The best results were provided by a random forest model. This model has a mean cross-validation accuracy of 98.2% and an F1 score of 0.73.
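A high accuracy alongside a noticeably lower F1 score, as reported in this abstract, is a pattern that typically arises when one class is rare. The sketch below shows the arithmetic with hypothetical counts; the abstract does not state the class balance of the experiment, so the numbers here are purely illustrative and are not the paper's data.

```python
def accuracy(tp, fp, tn, fn):
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical outcome: 1000 exploit attempts, only 20 of them successful.
tp, fn = 12, 8    # successful exploits predicted correctly / missed
fp, tn = 1, 979   # failed exploits predicted as successful / correctly

acc = accuracy(tp, fp, tn, fn)
f1 = f1_score(tp, fp, fn)
print(f"accuracy={acc:.3f} f1={f1:.3f}")
```

Because true negatives dominate the total, they lift accuracy while having no effect on F1, which depends only on how the rare positive class is handled.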
Introduction to the Minitrack on Machine Learning and Cyber Threat Intelligence and Analytics
(2020-01-07) Choo, Kim-Kwang Raymond; Dehghantanha, Ali
