Clustering and Topological Data Analysis: Comparison and Application Combs, Kara Bihl, Trevor 2022-12-27T18:55:26Z 2022-12-27T18:55:26Z 2023-01-03
dc.description.abstract Clustering is a common technique used to demonstrate relationships between data and information. Of recent interest is topological data analysis (TDA), which can represent and cluster data through persistent homology. The TDA algorithms used include the Topological Mode Analysis Tool (ToMATo) algorithm, Garin and Tauzin’s TDA Pipeline, and the Mapper algorithm. First, TDA is compared to ten other clustering algorithms on artificial 2D data, where it ranked third overall. TDA had the second-highest performance in terms of average accuracy (97.9%); however, its computation-time performance ranked in the middle of the algorithms. TDA ranked fourth on the qualitative “visual trustworthiness” metric. On real-world data, TDA showed promising classification results (accuracy between 80% and 95%). Overall, this paper shows that TDA is a competitive algorithm performance-wise, though computationally expensive. When TDA is used for visualization, the Mapper algorithm allows for unique alternative views that are especially effective for visualizing high-dimensional data.
dc.format.extent 10
dc.identifier.doi 10.24251/HICSS.2023.102
dc.identifier.isbn 978-0-9981331-6-4
dc.language.iso eng
dc.relation.ispartof Proceedings of the 56th Hawaii International Conference on System Sciences
dc.rights Attribution-NonCommercial-NoDerivatives 4.0 International
dc.subject Big Data and Analytics: Pathways to Maturity
dc.subject algorithm trust
dc.subject big data
dc.subject clustering
dc.subject heuristic analysis
dc.subject topological data analysis
dc.title Clustering and Topological Data Analysis: Comparison and Application
dc.type.dcmi text
prism.startingpage 815
Original bundle
825.93 KB
Adobe Portable Document Format