Clustering and Topological Data Analysis: Comparison and Application

Date
2023-01-03
Authors
Combs, Kara
Bihl, Trevor
Starting Page
815
Abstract
Clustering is a common technique used to demonstrate relationships between data and information. Of recent interest is topological data analysis (TDA), which can represent and cluster data through persistent homology. The TDA algorithms used include the Topological Mode Analysis Tool (ToMATo) algorithm, Garin and Tauzin’s TDA Pipeline, and the Mapper algorithm. First, TDA is compared to ten other clustering algorithms on artificial 2D data, where it ranked third overall. TDA had the second-highest average accuracy (97.9%); however, its computation time ranked in the middle of the algorithms. TDA ranked fourth on the qualitative “visual trustworthiness” metric. On real-world data, TDA showed promising classification results (accuracy between 80% and 95%). Overall, this paper shows that TDA is competitive in terms of performance, though computationally expensive. When TDA is used for visualization, the Mapper algorithm allows for unique alternative views that are especially effective for visualizing high-dimensional data.
Keywords
Big Data and Analytics: Pathways to Maturity, algorithm trust, big data, clustering, heuristic analysis, topological data analysis
Extent
10
Related To
Proceedings of the 56th Hawaii International Conference on System Sciences
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International