WeSAL: Applying Active Supervision to Find High-quality Labels at Industrial Scale

File Size Format  
0023.pdf 1.45 MB Adobe PDF

Item Summary

Title:WeSAL: Applying Active Supervision to Find High-quality Labels at Industrial Scale
Authors:Nashaat, Mona
Ghosh, Aindrila
Miller, James
Quader, Shaikh
Keywords:Collaboration for Data Science
active learning
machine learning
supervised learning
weak supervision
Date Issued:07 Jan 2020
Abstract:Obtaining hand-labeled training data is one of the most tedious and expensive parts of the machine learning pipeline. Previous approaches, such as active learning, aim to optimize user engagement to acquire accurate labels. Other methods use weak supervision to generate low-quality labels at scale. In this paper, we propose a new hybrid method named WeSAL that incorporates Weak Supervision sources with Active Learning to keep humans in the loop. The method aims to generate large-scale training labels while enhancing their quality by involving domain experience. To evaluate WeSAL, we compare it with two state-of-the-art labeling techniques, Active Learning and Data Programming. The experiments use five publicly available datasets and a real-world dataset of 1.5M records provided by our industrial partner, IBM. The results indicate that WeSAL can generate large-scale, high-quality labels while reducing the labeling cost by up to 68% compared to active learning.
Pages/Duration:10 pages
Rights:Attribution-NonCommercial-NoDerivatives 4.0 International
Appears in Collections: Collaboration for Data Science

This item is licensed under a Creative Commons License.