WeSAL: Applying Active Supervision to Find High-quality Labels at Industrial Scale

dc.contributor.author Nashaat, Mona
dc.contributor.author Ghosh, Aindrila
dc.contributor.author Miller, James
dc.contributor.author Quader, Shaikh
dc.date.accessioned 2020-01-04T07:10:47Z
dc.date.available 2020-01-04T07:10:47Z
dc.date.issued 2020-01-07
dc.description.abstract Obtaining hand-labeled training data is one of the most tedious and expensive parts of the machine learning pipeline. Previous approaches, such as active learning, aim to optimize user engagement to acquire accurate labels. Other methods use weak supervision to generate low-quality labels at scale. In this paper, we propose a new hybrid method named WeSAL that combines weak supervision sources with active learning to keep humans in the loop. The method aims to generate large-scale training labels while enhancing their quality by incorporating domain experience. To evaluate WeSAL, we compare it with two state-of-the-art labeling techniques, Active Learning and Data Programming. The experiments use five publicly available datasets and a real-world dataset of 1.5M records provided by our industrial partner, IBM. The results indicate that WeSAL can generate large-scale, high-quality labels while reducing the labeling cost by up to 68% compared to active learning.
dc.format.extent 10 pages
dc.identifier.doi 10.24251/HICSS.2020.028
dc.identifier.isbn 978-0-9981331-3-3
dc.identifier.uri http://hdl.handle.net/10125/63767
dc.language.iso eng
dc.relation.ispartof Proceedings of the 53rd Hawaii International Conference on System Sciences
dc.rights Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject Collaboration for Data Science
dc.subject active learning
dc.subject human-in-the-loop
dc.subject machine learning
dc.subject supervised learning
dc.subject weak supervision
dc.title WeSAL: Applying Active Supervision to Find High-quality Labels at Industrial Scale
dc.type Conference Paper
dc.type.dcmi Text