Support Ticket Anonymization: Advancing Data Privacy with Transformer-Based Named Entity Recognition
Date
2025-01-07
Starting Page
1551
Abstract
Organizations increasingly recognize the potential of their existing knowledge base of historical data and are employing data-centric, AI-driven systems to improve their customer support process. However, as is often the case with transformative advancements, this vision poses challenges, chief among them privacy and data protection. Before an AI-driven system can be used, the contents of the knowledge base, which often contain sensitive personally identifiable information (PII), must be thoroughly anonymized. This paper proposes an anonymization solution tailored to the support ticket data of an industrial automation company. The solution was developed by comparatively evaluating machine-learning approaches based on state-of-the-art transformer architectures. According to the evaluations and experiments in this domain-specific context, the best-performing approach is an ensemble that combines multiple transformer-based language models trained to perform Named Entity Recognition with static, pattern-based approaches. This approach, which also involved fine-tuning language models on domain data to further improve performance, achieves an overall recall of PII entities of more than 97%, a satisfactory result for the use case that comes close to state-of-the-art performance in other domains.
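The ensemble idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the pattern set, the `toy_name_detector` stand-in for a fine-tuned transformer NER model, the union-based merging rule, and the placeholder format are all assumptions made for illustration. Each detector returns character spans, the spans are unioned so that an entity found by any detector is masked (a recall-oriented choice), and the masked text is rebuilt.

```python
import re
from typing import Callable, List, Tuple

Span = Tuple[int, int, str]  # (start, end, label), end exclusive

# Hypothetical static patterns; real rules would be tuned to the ticket data.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s/-]{6,}\d"),
}


def pattern_detector(text: str) -> List[Span]:
    """Static, pattern-based detector: one span per regex match."""
    return [(m.start(), m.end(), label)
            for label, rx in PATTERNS.items()
            for m in rx.finditer(text)]


def toy_name_detector(text: str) -> List[Span]:
    """Stand-in for a fine-tuned transformer NER model (hypothetical)."""
    return [(m.start(), m.end(), "PERSON")
            for m in re.finditer(r"Jane Doe", text)]


def merge_spans(spans: List[Span]) -> List[Span]:
    """Union overlapping spans; the earliest-starting span keeps its label."""
    merged: List[Span] = []
    for start, end, label in sorted(spans):
        if merged and start < merged[-1][1]:
            prev_start, prev_end, prev_label = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_label)
        else:
            merged.append((start, end, label))
    return merged


def anonymize(text: str, detectors: List[Callable[[str], List[Span]]]) -> str:
    """Mask every span reported by any detector with a [LABEL] placeholder."""
    spans = merge_spans([s for detect in detectors for s in detect(text)])
    parts, cursor = [], 0
    for start, end, label in spans:
        parts.append(text[cursor:start])
        parts.append(f"[{label}]")
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts)


ticket = "Jane Doe (jane.doe@example.com) called from +49 170 1234567."
masked = anonymize(ticket, [toy_name_detector, pattern_detector])
```

Taking the union of all detectors' spans trades some precision for recall, which matches the recall-focused metric reported in the abstract; whether the paper merges predictions this way is not stated here.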
Keywords
Natural Language Processing and Large Language Models Supporting Data Analytics for System Sciences, anonymization, data privacy, pii detection
Extent
10
Related To
Proceedings of the 58th Hawaii International Conference on System Sciences
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International