AI Safety, Cybersecurity, and Inclusion through Advanced Text Analytics
Recent Submissions
Adversarial Natural Language Processing: Overview, Challenges and Future Directions
(2025-01-07) Shaw, Laxmi; Ansari, Mohammed Wasim; Ekin, Tahir
Natural language processing (NLP) has gained wider utilization with the emergence of large language models. However, adversarial attacks threaten their reliability. We present an overview of adversarial NLP with an emphasis on challenges, emerging areas and future directions. First, we review attack methods and evaluate the vulnerabilities of popular NLP models. Then, we review defense strategies including adversarial training. We identify key trends and suggest future directions such as the use of Bayesian methods to improve the security and robustness of NLP systems.

De-Identification of Privacy Sensitive Information in Resumes with GPT-4: An Utility Analysis for Automated Job Role Classification
(2025-01-07) Löbner, Sascha; Tronnier, Frederic; Linke, Daria
As organizations face the challenge of managing large amounts of data, privacy concerns have become increasingly prevalent when sharing sensitive privacy information with machine learning experts. This paper addresses the fundamental issue of privacy-sensitive information de-identification by introducing in-prompt de-identification, an approach that exploits the capabilities of large language models. Existing de-identification techniques often struggle to ensure complete privacy, and methods with higher privacy often result in a loss of data utility. In contrast, in-prompt de-identification is capable of generating synthetic, human-readable data samples from given inputs and bridges the gap between privacy and utility. With this article, we contribute to the de-identification of real-world resume data using in-prompt de-identification based on OpenAI’s GPT-4. Notably, our classification model, trained on GPT-4 generated data, shows no significant loss in performance compared to our baseline model trained on the original data.

Blockchain Based Information Security and Privacy Protection: Challenges and Future Directions using Computational Literature Review
(2025-01-07) Shankar, Gauri; Uddin, Md Raihan; Mukta, Saddam; Kumar, Prabhat; Islam, Shareeful; Islam, A.K.M. Najmul
Blockchain technology is an emerging digital innovation that has gained immense popularity in enhancing individual security and privacy within information systems. This surge in interest is reflected in the exponential increase in research articles published on blockchain technology, highlighting its growing significance in the digital landscape. However, the rapid proliferation of published research presents significant challenges for manual analysis and synthesis due to the vast volume of information. The complexity and breadth of topics, combined with the inherent limitations of human data processing capabilities, make it difficult to comprehensively analyze and draw meaningful insights from the literature. To this end, we adopted the Computational Literature Review (CLR) to analyse pertinent literature’s impact and topic modelling using the Latent Dirichlet Allocation (LDA) technique. We identified 10 topics related to security and privacy and provided a detailed description of each topic. From the critical analysis, we have observed several limitations and several future directions are provided as an outcome of this review.

Introduction to the Minitrack on AI Safety, Cybersecurity, and Inclusion through Advanced Text Analytics
(2025-01-07) Ochieng, Theodore; Cogburn, Derrick; Wong, Haiman