Towards Automated Moderation: Enabling Toxic Language Detection with Transfer Learning and Attention-Based Models

Caron, Matthew
Bäumer, Frederik S.
Müller, Oliver
Our world is more connected than ever before. Sadly, this connectivity has also made it easier to bully, insult, and propagate hate speech in cyberspace. Even though researchers and companies alike have started investigating this real-world problem, the question remains as to why users are increasingly being exposed to hate and discrimination online. In fact, the noticeable and persistent increase in harmful language on social media platforms indicates that the situation is only getting worse. Hence, in this work, we show that contemporary machine learning (ML) methods can help tackle this challenge in an accurate and cost-effective manner. Our experiments demonstrate that a universal approach combining transfer learning methods and state-of-the-art Transformer architectures can enable the efficient development of toxic language detection models. Consequently, with this universal approach, we offer platform providers a simple way to enable the automated moderation of user-generated content and, as a result, hope to contribute to making the web a safer place.
Text analytics, hate speech detection, machine learning, natural language processing, toxic language identification