Artificial Intelligence Security: Ensuring Safety, Trustworthiness, and Responsibility in AI Systems
Permanent URI for this collection: https://hdl.handle.net/10125/112556
Recent Submissions
Detecting Synthetic Text Profiles: Human Discernment Versus AI Analytics (2026-01-06) Luttrell, Regina; Davis, Jason; Welch, Carrie

This study evaluated human vulnerability to synthetic text profiles generated by artificial intelligence. As large language models become more sophisticated, distinguishing between human- and AI-generated content grows increasingly difficult. Our findings show that humans consistently struggled to detect synthetic profiles, underscoring the need to develop AI tools not only for detection but also for supporting human discernment. Through a direct comparison of human performance and analytic detection approaches, this research establishes a benchmark of current human capabilities: a snapshot in time and a necessary step toward tracking how human judgment evolves alongside advancing AI systems. Without such records, it becomes difficult to help users keep pace with synthetic media and maintain control over how credibility is evaluated in digital environments.

Quantifying True Robustness: Synonymity-Weighted Similarity for Trustworthy XAI Evaluation (2026-01-06) Burger, Christopher

Adversarial attacks challenge the reliability of Explainable AI (XAI) by altering explanations while the model's output remains unchanged. The success of these attacks on text-based XAI is often judged using standard information retrieval metrics. We argue that these measures are poorly suited to evaluating trustworthiness, as they treat all word perturbations equally while ignoring synonymity, which can misrepresent an attack's true impact. To address this, we apply synonymity weighting, a method that amends these measures by incorporating the semantic similarity of perturbed words. This produces more accurate vulnerability assessments and provides an important tool for assessing the robustness of AI systems. Our approach prevents the overestimation of attack success, leading to a more faithful understanding of an XAI system's true resilience against adversarial manipulation.
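The abstract does not give the exact formula, but the idea of synonymity weighting can be illustrated with a minimal sketch: when comparing the top-k features of an explanation before and after an attack, a substituted word earns partial credit proportional to its semantic similarity to the original, rather than counting as a full change. The similarity table, threshold, and scoring function below are illustrative assumptions, not the paper's metric.

```python
# A minimal sketch of synonymity-weighted overlap between two top-k
# explanations (lists of feature words). A perturbed word counts as
# (partially) preserved when a sufficiently similar word appears in
# the other explanation; the threshold of 0.8 is an assumption.

def synonymity_weighted_overlap(expl_a, expl_b, word_sim, threshold=0.8):
    """Return a score in [0, 1]; 1 means the explanations agree."""
    if not expl_a and not expl_b:
        return 1.0
    credit = 0.0
    for word in expl_a:
        # Best semantic match for this word in the other explanation.
        best = max((word_sim(word, other) for other in expl_b), default=0.0)
        credit += best if best >= threshold else 0.0
    return credit / max(len(expl_a), len(expl_b))


# Toy similarity table standing in for real word embeddings (assumption).
SIM = {("movie", "film"): 0.92, ("bad", "poor"): 0.88}

def toy_sim(a, b):
    if a == b:
        return 1.0
    return SIM.get((a, b), SIM.get((b, a), 0.0))


original = ["movie", "bad", "acting"]       # top-3 features before attack
perturbed = ["film", "poor", "soundtrack"]  # top-3 features after attack

# An unweighted overlap metric would report 0.0 (total change); the
# weighted score recognizes the two near-synonym substitutions.
print(synonymity_weighted_overlap(original, perturbed, toy_sim))  # ~0.60
```

An unweighted comparison overstates the attack's success here, which is exactly the overestimation the paper argues against.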
Detecting Data Poisoning Attacks in Image Datasets Using a Vision-Language Hybrid Pipeline (2026-01-06) Perry, Sabrina; Perry, Sam; Jiang, Yili; Walter, Charles

Image classification systems play a crucial role in real-world applications, yet they are vulnerable to data poisoning attacks, in which attackers manipulate training data to degrade model accuracy. Existing defenses often struggle to detect such manipulations. In this work, we build on a prior model, DynaDetect2.0, which uses Convolutional Neural Network (CNN) based feature extraction and statistical outlier detection. We extend this framework by integrating a Vision-Language Model (VLM) to support semantic reasoning during detection. The VLM operates at two stages: it first pre-screens image-label pairs for semantic alignment and later provides natural-language explanations for samples flagged by the statistical methods. This hybrid approach combines statistical rigor with semantic interpretability. We evaluate the proposed method on the Imagenette-160 and CIFAR-100 datasets with both clean and poisoned data. Results show that our VLM-assisted detection reduces false positives and improves the identification of mislabeled or anomalous data compared to statistical models alone.
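A sketch of the pre-screening stage described above might look as follows. CLIP stands in for the paper's VLM (the abstract does not name the model); the prompt template, file names, and flagging threshold are illustrative assumptions.

```python
# A minimal sketch of VLM pre-screening: score each (image, label)
# pair for semantic alignment and flag low-scoring pairs for the
# statistical stage. CLIP is used here as a stand-in vision-language
# model; the 0.5 threshold is an assumption.

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

def prescreen(image_paths, labels, class_names, threshold=0.5):
    """Return indices of pairs whose stated label looks misaligned.

    For each image we compute CLIP's probability over all class
    names; a pair is flagged when its assigned label receives low
    probability, a possible sign of poisoning or mislabeling.
    """
    flagged = []
    prompts = [f"a photo of a {name}" for name in class_names]
    for i, (path, label) in enumerate(zip(image_paths, labels)):
        image = Image.open(path).convert("RGB")
        inputs = processor(text=prompts, images=image,
                           return_tensors="pt", padding=True)
        with torch.no_grad():
            probs = model(**inputs).logits_per_image.softmax(dim=-1)[0]
        if probs[class_names.index(label)].item() < threshold:
            flagged.append(i)
    return flagged

# Hypothetical usage: file names and classes are illustrative only.
suspects = prescreen(["img_001.png", "img_002.png"],
                     labels=["church", "golf ball"],
                     class_names=["church", "golf ball", "parachute"])
print("flagged for statistical/VLM review:", suspects)
```

Flagged samples would then pass to the statistical outlier stage, with the VLM generating explanations only for the samples that stage confirms, keeping the expensive semantic reasoning off the common path.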
Introduction to the Minitrack on Artificial Intelligence Security: Ensuring Safety, Trustworthiness, and Responsibility in AI Systems (2026-01-06) Devendorf, Erich; Brooks, Tyson; Chin, Shiu-Kai; Sarathy, Sriprakash