Sliding Reservoir Approach for Delayed Labeling in Streaming Data Classification

Date
2017-01-04
Authors
Hu, Hanqing
Kantardzic, Mehmed
Contributor
Advisor
Department
Instructor
Depositor
Speaker
Researcher
Consultant
Interviewer
Annotator
Journal Title
Journal ISSN
Volume Title
Publisher
Volume
Number/Issue
Starting Page
Ending Page
Alternative Title
Abstract
When concept drift occurs within streaming data, a streaming data classification framework needs to update the learning model to maintain its performance. Labeled samples required for training a new model are often unavailable immediately in real world applications. This delay of labels might negatively impact the performance of traditional streaming data classification frameworks. To solve this problem, we propose Sliding Reservoir Approach for Delayed Labeling (SRADL). By combining chunk based semi-supervised learning with a novel approach to manage labeled data, SRADL does not need to wait for the labeling process to finish before updating the learning model. Experiments with two delayed-label scenarios show that SRADL improves prediction performance over the naïve approach by as much as 7.5% in certain cases. The most gain comes from 18-chunk labeling delay time with continuous labeling delivery scenario in real world data experiments.
Description
Keywords
delayed labeling, streaming data mining, concept drift, semi-supervised learning, classification
Citation
Extent
10 pages
Format
Geographic Location
Time Period
Related To
Proceedings of the 50th Hawaii International Conference on System Sciences
Table of Contents
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International
Rights Holder
Local Contexts
Email libraryada-l@lists.hawaii.edu if you need this content in ADA-compliant format.