Archetype Discovery from Taxonomies: A Method to Cluster Small Datasets of Categorical Data
Starting Page
1223
Abstract
This study investigates the challenges of clustering small categorical datasets, particularly in the context of taxonomy-based archetype formation. Taxonomies, such as the Linnaean system, are vital for organizing knowledge across diverse domains and can be used as code books. Archetypes then represent common patterns across the classified entities. While cluster analysis is a powerful tool for uncovering unknown patterns, traditional clustering methods are predominantly distance-based and optimized for continuous data, which makes them inadequate for categorical data, where similarity is not easily quantifiable. Common distance measures, such as Euclidean and Manhattan distances, fail to capture meaningful relationships in categorical datasets. This work addresses this gap by exploring information-theoretic approaches to develop a novel clustering method, CatRED, tailored for small categorical datasets such as taxonomy data. We evaluate the method by applying it to two taxonomy datasets, demonstrating its effectiveness in generating archetypes.
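The claim about distance measures can be illustrated with a minimal sketch. It is not the CatRED method, which this record does not describe. The category values, both integer codings, and all function names are hypothetical and chosen only for illustration. The sketch shows that the Euclidean distance between two categories depends on an arbitrary integer coding, whereas an information-theoretic quantity such as Shannon entropy does not change when the categories are relabeled.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical single-attribute observations (illustration only).
observations = ["red", "green", "blue", "red"]

# Two equally arbitrary integer codings of the same categories.
coding_a = {"red": 0, "green": 1, "blue": 2}
coding_b = {"red": 0, "green": 2, "blue": 1}

# Distance between "red" and "green" changes with the coding.
d_a = euclidean([coding_a["red"]], [coding_a["green"]])
d_b = euclidean([coding_b["red"]], [coding_b["green"]])

# Entropy depends only on category frequencies, not on the labels.
h_raw = shannon_entropy(observations)
h_a = shannon_entropy([coding_a[v] for v in observations])
h_b = shannon_entropy([coding_b[v] for v in observations])

print(d_a, d_b)         # the two codings disagree on the distance
print(h_raw, h_a, h_b)  # entropy is identical under every labeling
```

Under the first coding "red" and "green" are one unit apart, under the second they are two units apart, although nothing about the data has changed. The entropy values agree in all three cases, which is one reason frequency-based measures are a natural fit for categorical attributes.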
Extent
10
Related To
Proceedings of the 58th Hawaii International Conference on System Sciences
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International
