Cybersecurity in the Age of Artificial Intelligence, AI for Cybersecurity, and Cybersecurity for AI


Recent Submissions

  • Item
    Improving the Adversarial Robustness of Machine Learning-based Phishing Website Detectors: An Autoencoder-based Auxiliary Approach
    (2025-01-07) Gao, Yang; Samtani, Sagar; Shah, Ankit
    Anti-phishing research relies on collaboration between defensive and offensive efforts. The defensive side develops machine learning-based phishing website detectors to protect users from phishing attacks. However, adversaries can manipulate detectable phishing websites into evasive ones as adversarial examples, misleading detectors into classifying them as legitimate. Therefore, offensive efforts are vital to examine the threats posed by adversaries and inform the defensive side to improve the adversarial robustness of detectors. Prevailing approaches to improve adversarial robustness may compromise a detector’s original high performance on clean data (nonadversarial websites) as it becomes more accurate at detecting adversarial examples. To address this, we propose a novel approach using a Graph Convolutional Autoencoder as an auxiliary model to make collaborative decisions with the original detector in distinguishing evasive phishing websites from legitimate ones. We evaluate our approach by enhancing a CNN-based detector against adversarial attacks. Our approach achieves high adversarial robustness while maintaining high performance on clean data compared to retraining and fine-tuning benchmarks.
  • Item
    Depressive Behavior Detection Using Sensor Signal Data: An Attention-based Privacy-Preserving Approach
    (2025-01-07) Yuan, Aijia; Garcia, Edlin; Zhu, Hongyi; Samtani, Sagar
The use of personally identifiable information (PII) in sensor signal-based depression detection introduces notable privacy concerns. In this study, we propose a novel attention-based privacy-preserving model that mitigates these concerns. It assigns greater weights to non-PII-releasing sensors and lower weights to high-privacy-risk sensors, leveraging the principles of differential privacy (DP). We compare the performance of machine learning and deep learning benchmark models with and without PII-releasing sensors. Our results underline a significant performance discrepancy, suggesting potential instability in prediction performance without these sensors. Our proposed model, with recall, precision, and F1 scores of 0.889 and an AUC of 0.9, illustrates that high-quality results are achievable while preserving privacy. This privacy-conscious model holds substantial implications for promoting a more unobtrusive approach to mental healthcare. Furthermore, the model’s potential for secure deployment in wide-reaching digital health applications and collaborative settings enhances its relevance for large-scale mental health monitoring while preserving privacy.
  • Item
    Collecting, Linking, and Assessing Machine Learning Open-Source Software: A Large Scale Collection and Vulnerability Assessment Pipeline
    (2025-01-07) Lazarine, Ben; Pulipaka, Srikar; Samtani, Sagar; Venkataraman, Ramesh
In recent years, Artificial Intelligence (AI) has seen rapid advances in performance and impact, disrupting major industries, including finance and healthcare. Machine learning open-source software (MLOSS) platforms such as GitHub and Hugging Face have contributed significantly to this advancement, enabling AI developers to share, reuse, and collaborate on AI development. While these platforms accelerate AI development, the MLOSS assets they host also contain vulnerabilities that can impact applications that leverage them. To map the MLOSS landscape and understand the vulnerabilities contained within MLOSS on platforms such as GitHub and Hugging Face, we have developed an MLOSS Collection Pipeline. Our pipeline has collected 373,634 models from Hugging Face and 39,115 repositories from GitHub and identified 6,751,739 vulnerabilities. The results of our pipeline offer several promising directions for future research, including vulnerability linking analysis and cross-platform vulnerability propagation identification.
  • Item
    BERT-Cuckoo15: A Comprehensive Framework for Malware Detection Using 15 Dynamic Feature Types
(2025-01-07) Rabadi, Dima; Loo, Jia Y.; Teo, Sin G.
Malware detection presents significant challenges due to the need to select features from diverse data sources, such as system calls and registry keys, impacting model accuracy. Existing techniques often rely on a single feature type to reduce feature numbers or require extensive feature engineering, potentially failing to capture intricate relationships between various features. Moreover, these methods usually assume that features are independent, which is not true for complex malware behavior. Despite their success, the reliance on handcrafted features and the inability to fully leverage contextual information limit their effectiveness against sophisticated malware. To address these constraints, we introduce BERT-Cuckoo15, a malware detection model that leverages Bidirectional Encoder Representations from Transformers (BERT) to analyze relationships between diverse features derived from the dynamic analysis of samples in the Cuckoo sandbox. The model processes and encodes these features into chunks, allowing for the aggregation of contextual information across different system activities. Our evaluation, conducted on a comprehensive and balanced dataset of 36,770 samples across nine malware types, demonstrates the efficacy of our approach. BERT-Cuckoo15 achieves an accuracy of 97.61%, showcasing its ability to capture complex feature interdependencies and improve malware detection accuracy.
  • Item
    Towards Attribution in Network Attacks: A Deep Learning-Based Robust Framework for Intrusion Detection and Adversarial Toolchain Identification
    (2025-01-07) Bakht, Ahtesham; Shah, Ankit; Bastian, Nathaniel
    Network intrusion detection systems (NIDS) are pivotal in cybersecurity operations centers (CSOCs) for detecting malicious activities. While signature-based NIDS rely on predefined rules, anomaly-based NIDS utilize machine learning (ML) and deep learning (DL) to detect anomalies. However, these models face challenges such as susceptibility to evasion attacks and high false positives and negatives. This study proposes a novel defense framework integrating supervised and unsupervised learning paradigms to enhance NIDS capabilities. The framework accurately identifies known attacks, detects adversarial attacks and their toolchains, and distinguishes novel attacks. Experimental evaluations on benchmark network intrusion data sets demonstrate high detection accuracies. Motivated by the need to attribute attacks and understand adversary motivations, the framework includes a toolchain detection component, crucial for developing comprehensive threat intelligence and improving incident response in CSOCs.
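The collaborative-decision idea in the first abstract (an original detector plus an autoencoder auxiliary) can be illustrated with a minimal sketch. The thresholds and the `detector_score` function below are hypothetical, and a generic autoencoder interface stands in for the authors' Graph Convolutional Autoencoder; this is an illustration of the decision logic, not the paper's implementation.

```python
import numpy as np

def autoencoder_reconstruction_error(x, encode, decode):
    """Reconstruction error of a (pre-trained) autoencoder on input x."""
    return float(np.mean((decode(encode(x)) - x) ** 2))

def collaborative_decision(x, detector_score, encode, decode,
                           det_threshold=0.5, err_threshold=0.1):
    """Flag a page as phishing if EITHER the original detector fires OR
    the auxiliary autoencoder (trained on legitimate pages only)
    reconstructs the input poorly -- a possible sign of an
    out-of-distribution adversarial example."""
    if detector_score(x) >= det_threshold:
        return "phishing"
    if autoencoder_reconstruction_error(x, encode, decode) >= err_threshold:
        return "phishing"
    return "legitimate"
```

Because the auxiliary model only ever vetoes a "legitimate" verdict, the original detector's behavior on clean data is left untouched, which is one plausible reading of how the approach preserves clean-data performance.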
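The privacy-weighted attention described in the depression-detection abstract could, under one reading, be implemented by penalizing the attention logits of high-privacy-risk sensors before the softmax. The `penalty` strength and the per-sensor `privacy_risk` scores below are illustrative assumptions, not the authors' actual mechanism.

```python
import numpy as np

def privacy_weighted_attention(logits, privacy_risk, penalty=2.0):
    """Shift attention logits down in proportion to each sensor's
    privacy-risk score, then normalize with a softmax, so that
    low-risk (non-PII-releasing) sensors receive greater weight."""
    adjusted = np.asarray(logits, float) - penalty * np.asarray(privacy_risk, float)
    adjusted -= adjusted.max()  # numerical stability before exponentiation
    w = np.exp(adjusted)
    return w / w.sum()
```

With equal logits, a sensor scored as high-risk ends up with a strictly smaller attention weight than a low-risk one, which matches the abstract's description of down-weighting PII-releasing sensors.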
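The chunked encoding mentioned in the BERT-Cuckoo15 abstract can be sketched as splitting a long dynamic-analysis event sequence into overlapping fixed-size chunks that each fit a transformer's input window. The chunk size, overlap, and event representation below are hypothetical, and the sketch omits the BERT model itself; per-chunk encodings would then be aggregated into a single classification representation.

```python
def chunk_events(events, chunk_size=512, overlap=64):
    """Split a dynamic-analysis event sequence (e.g. system calls,
    registry-key accesses from a Cuckoo report) into overlapping
    chunks so each fits a transformer input window."""
    step = chunk_size - overlap
    return [events[i:i + chunk_size]
            for i in range(0, max(len(events) - overlap, 1), step)]
```

The overlap keeps activity that straddles a chunk boundary visible to at least one chunk, one common way to retain contextual information across system activities.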