Classifying Imbalanced Data: The Relevance of Accuracy and Feature Importance
| dc.contributor.author | Widmann, Torben | |
| dc.date.accessioned | 2023-12-26T18:36:49Z | |
| dc.date.available | 2023-12-26T18:36:49Z | |
| dc.date.issued | 2024-01-03 | |
| dc.identifier.doi | 10.24251/HICSS.2024.144 | |
| dc.identifier.isbn | 978-0-9981331-7-1 | |
| dc.identifier.other | 96b94ac5-1ff0-451a-b914-db4fe36607af | |
| dc.identifier.uri | https://hdl.handle.net/10125/106521 | |
| dc.language.iso | eng | |
| dc.relation.ispartof | Proceedings of the 57th Hawaii International Conference on System Sciences | |
| dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | |
| dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ | |
| dc.subject | Decision Making with Sustainable, Fair and Trustworthy AI | |
| dc.subject | accuracy | |
| dc.subject | data quality | |
| dc.subject | feature importance | |
| dc.subject | imbalanced data classification | |
| dc.title | Classifying Imbalanced Data: The Relevance of Accuracy and Feature Importance | |
| dc.type | Conference Paper | |
| dc.type.dcmi | Text | |
| dcterms.abstract | AI and ML algorithms can only contribute successfully to data-driven decision making if the underlying data is of sufficiently good quality. However, the effort spent on ensuring good data quality (DQ) must be proportionate to the potential impact of poor DQ. In this work, we therefore investigate the impact of DQ defects on the common and challenging task of classifying imbalanced data. We contribute to theory and practice by being the first to investigate the impact of DQ with respect to the specific DQ dimension of accuracy and by examining the relevance of attribute importance for the classification. Underscoring the significance of DQ, our experiments show that even a few inaccuracies can lead to considerably worse classification, that efficient data cleaning can be limited to a few attributes, and that distance-based algorithms are more affected by defects in less important attributes. | |
| dcterms.extent | 10 pages | |
| prism.startingpage | 1190 | |
