Please use this identifier to cite or link to this item: http://hdl.handle.net/10125/75925

EXPLORATORY ANALYSIS OF RESEARCH PUBLICATIONS COLLECTIONS WITH HUMAN STEERABLE BLACK-BOX MODELS. TOWARDS GENERALIZING INVERSE COMPUTATIONS FOR SEMANTIC INTERACTION.

File: GonzalezMartinez hawii 0085A 10921.pdf
Size: 4.27 MB
Format: Adobe PDF

Item Summary

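Title: EXPLORATORY ANALYSIS OF RESEARCH PUBLICATIONS COLLECTIONS WITH HUMAN STEERABLE BLACK-BOX MODELS. TOWARDS GENERALIZING INVERSE COMPUTATIONS FOR SEMANTIC INTERACTION.
Análisis de publicaciones de investigación con modelos manipulables. Generalizando computaciones inversas en modelos de Inteligencia Artificial. (Spanish alternative title; in English: Analysis of research publications with steerable models. Generalizing inverse computations in Artificial Intelligence models.)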
Authors: Gonzalez Martinez, Alberto
Contributors: Leigh, Jason (advisor)
Computer Science (department)
Keywords: Computer science
Analytics
Human in the Loop
Machine Learning
Semantic Interaction
Visual Analytics
Visualization
Date Issued: 2021
Publisher: University of Hawai'i at Manoa
Abstract: Understanding high-dimensional data sets is a complex task for many scientists, engineers, and intelligence analysts. Traditionally, this problem has been tackled with linear pipelines that rely on mathematical models and algorithms to summarize relationships and structure, producing a visual representation of the data in a collapsed, low-dimensional form. The main issue with these traditional pipelines is that they are driven solely by algorithms or models; without a human in the loop, they can limit sense-making by masking expected or known structure in the data.
In recent years, Semantic Interaction has emerged as a promising user interaction methodology for model steering in Visual Analytics systems, as it provides mechanisms to adjust the parameter space, explore data, and test hypotheses. Under the paradigm of Semantic Interaction, users can steer model parameters and explore data representations without leaving the visual space, thus combining algorithms and models with expert human judgment. To facilitate this interaction modality, Semantic Interaction systems need to invert the computation of one or more mathematical models so that their pipelines support a bidirectional structure. For example, dimensionality reduction and clustering are frequently used to explore multidimensional data in Visual Analytics systems and are almost always present in Semantic Interaction systems. Since users interact with clustered data in its compressed form, the system needs to link this compressed form to the original high-dimensional representation so that the model and algorithms can be affected from within the visualization. This reverse link from the low-dimensional representation to the high-dimensional input space requires that Semantic Interaction pipelines be bidirectional.

Most examples of Semantic Interaction systems use simple and interpretable linear models for dimensionality reduction and clustering, such as LDA (Latent Dirichlet Allocation) and PCA (Principal Component Analysis), in order to provide a straightforward bidirectional pipeline. By contrast, the state-of-the-art techniques for dimensionality reduction and clustering in visual analytics, such as t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP), are "black-box" models, which are neither linear nor directly interpretable. Furthermore, these techniques are computationally expensive, suffer from out-of-sample stability problems, and are complex to retrain for new instances, requiring precise hyper-parameter tuning.
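
As an illustration of why a linear model gives a straightforward bidirectional pipeline, the following is a minimal sketch, not taken from the thesis, of a one-component PCA-style projection with a forward map (high-dimensional to low-dimensional) and an inverse map (low-dimensional back to high-dimensional). The function names, the toy data, and the use of power iteration are assumptions made for this example only.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def first_component(rows, iters=200):
    """Leading principal direction via power iteration on the sample covariance.

    Assumes the start vector [1, 1, ...] is not orthogonal to that direction.
    """
    mu = mean_vector(rows)
    centered = [[x - m for x, m in zip(r, mu)] for r in rows]
    d = len(mu)
    cov = [[sum(r[i] * r[j] for r in centered) / (len(rows) - 1)
            for j in range(d)] for i in range(d)]
    w = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return mu, w

def forward(x, mu, w):
    """Forward pass: project a high-dimensional point onto the component."""
    return sum((xi - mi) * wi for xi, mi, wi in zip(x, mu, w))

def inverse(z, mu, w):
    """Reverse link: map a low-dimensional coordinate back to the input space."""
    return [mi + z * wi for mi, wi in zip(mu, w)]

# Toy data (hypothetical): 3-D points that lie exactly on one line.
points = [[float(t), 2.0 * t, -1.0 * t] for t in range(5)]
mu, w = first_component(points)

z = [forward(p, mu, w) for p in points]      # what the visualization shows
restored = [inverse(zi, mu, w) for zi in z]  # back to the input space

# A user "drags" the first point by one unit in the low-dimensional view;
# the inverse map gives the corresponding high-dimensional position.
moved = inverse(z[0] + 1.0, mu, w)
```

Because the projection is linear, the inverse is a single multiply-and-add. Methods such as t-SNE and UMAP offer no comparably simple closed-form inverse, which is the gap the abstract describes.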

This thesis proposes a novel Deep Surrogate model approach to perform backward and forward computations within Semantic Interaction pipelines that were previously implemented with "black-box" models. This approach allows for the efficient "merging" of new instances into a previously trained model without retraining. It also provides a reverse link, allowing a trained model's parameters to be affected by user interactions with the visual representation of data. To demonstrate this approach's usefulness, I present the Zexplorer system, a tool for exploring large document collections of research papers with Semantic Interaction, as well as a user study to validate the approach. The Zexplorer system is built as an extension to Zotero, a widely used open-source bibliography system.
Pages/Duration: 113 pages
URI: http://hdl.handle.net/10125/75925
Rights: All UHM dissertations and theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission from the copyright owner.
Appears in Collections: Ph.D. - Computer Science


Please email libraryada-l@lists.hawaii.edu if you need this content in ADA-compliant format.

Items in ScholarSpace are protected by copyright, with all rights reserved, unless otherwise indicated.