Reliability of Training Data Sets for ML Classifiers: A Lesson Learned from Mechanical Engineering

File: 0089.pdf (383.12 kB, Adobe PDF)

Item Summary

Title: Reliability of Training Data Sets for ML Classifiers: A Lesson Learned from Mechanical Engineering
Authors: Juric, Radmila; Danilchanka, Natallia; Mousavi, Mehdi Gebreil
Keywords: Accountability and Evaluation of AI Algorithms; data set
Date Issued: 07 Jan 2020
Abstract: The popularity of learning and predictive technologies across many problem domains is unprecedented, and it is often underpinned by the belief that, because we can efficiently compute with vast amounts of data and data types, we should be able to solve problems we could not solve in the past. This view is particularly common among scientists who believe that the excessive amounts of data we generate in real life are ideal for making predictions and training algorithms. However, the truth may be quite different. This paper illustrates the process of preparing a training data set for an ML classifier intended to predict certain conditions in mechanical engineering. The difficulty was not in defining and choosing classifiers in order to secure safe predictions; it was our inability to create a safe, reliable and trustworthy training data set from scientifically proven experiments that created the problem. This casts serious doubt on the way we use learning and predictive technologies today. What the next step should be remains debatable. However, if in ML algorithms, and classifiers in particular, the semantics built into data sets influence a classifier's definition, it will be very difficult to evaluate and rely on such classifiers before we understand data semantics fully. In other words, we still do not know how the semantics, sometimes hidden in a data set, can adversely affect the algorithms trained on it.
Pages/Duration: 10 pages
Rights: Attribution-NonCommercial-NoDerivatives 4.0 International
Appears in Collections: Accountability and Evaluation of AI Algorithms
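The abstract's central claim, that semantics hidden in how a training set was assembled can adversely affect the classifier trained on it, can be sketched with a toy example. The data and the one-feature "majority rule" learner below are entirely hypothetical (not from the paper): an incidental attribute, which test rig a part was measured on, happens to correlate perfectly with the fault label in the training runs, so a naive learner latches onto it and fails in deployment, where the rigs are mixed.

```python
from collections import Counter

# Hypothetical samples: (vibration_is_high, rig_id, label).
# In the training runs, every faulty part happened to be measured on rig 1,
# so rig_id spuriously predicts the label perfectly.
train = [
    (True,  1, "fault"), (True, 1, "fault"), (True, 1, "fault"),
    (False, 0, "ok"),    (False, 0, "ok"),   (True,  0, "ok"),
]
# In deployment the rigs are mixed: rig_id carries no real signal.
test = [
    (True, 0, "fault"), (True, 0, "fault"),
    (False, 1, "ok"),   (False, 1, "ok"),
]

def fit_rule(data, f):
    """One-feature classifier: majority label for each value of feature f."""
    buckets = {}
    for row in data:
        buckets.setdefault(row[f], []).append(row[2])
    return {v: Counter(labels).most_common(1)[0][0] for v, labels in buckets.items()}

def accuracy(rule, f, data):
    return sum(rule.get(row[f]) == row[2] for row in data) / len(data)

# The learner picks whichever single feature best explains the training labels;
# here that is the spurious rig_id (perfect on training data).
best_f = max((0, 1), key=lambda f: accuracy(fit_rule(train, f), f, train))
rule = fit_rule(train, best_f)

print("chosen feature:", ["vibration", "rig_id"][best_f])  # rig_id (spurious)
print("train accuracy:", accuracy(rule, best_f, train))    # 1.0
print("test accuracy:",  accuracy(rule, best_f, test))     # 0.0
```

The classifier looks flawless on the training set yet is worse than chance once the hidden correlation is broken, which is precisely the kind of failure the abstract warns cannot be detected without understanding the data's semantics.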

