Towards Simulating User Behavior for Automating Usability Tests by Employing Large Language Models


Starting Page

4463

Abstract

Large Language Models (LLMs) enable the automation of tasks that typically require substantial manual effort. This work investigates their applicability in the context of usability testing. First, we evaluate whether LLM-agents can navigate and interact with different applications to accomplish given tasks. Second, we compare LLM-generated streams-of-thought with human think-aloud comments collected during usability tests. Results show that LLM-agents based on GPT-4o can successfully interact with websites and perform tasks such as information search. However, they often fail to recognize task completion and tend to engage in actions beyond the intended goals. The comparison further reveals clear differences between LLM-based and human observations: while human users overlook certain issues, LLM-agents identify them. These findings demonstrate the potential of LLMs as a preparatory step in usability testing and outline directions for their further adaptation and improvement.
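The agent behavior described in the abstract — observe a page, let the model pick an action while emitting a think-aloud-style thought, repeat until the task is declared done — can be sketched as a simple loop. This is an illustrative sketch only, not the paper's implementation: all names (`Page`, `choose_action`, `run_agent`) are hypothetical, and the LLM call and browser interaction are replaced by stubs. A real system would query a model such as GPT-4o and drive an actual browser. The step cap guards against the failure mode noted in the abstract, where the agent never recognizes task completion.

```python
# Minimal sketch of an LLM-agent loop for simulated usability testing.
# All identifiers here are hypothetical; choose_action stands in for an
# LLM call, and page transitions stand in for real browser actions.

from dataclasses import dataclass


@dataclass
class Page:
    url: str
    elements: list  # simplified page representation shown to the model


def choose_action(task, page, history):
    """Stub standing in for an LLM call: returns (thought, action)."""
    if "search" in page.elements:
        return ("I see a search field, I will use it.", ("type", "search"))
    return ("The task appears complete.", ("finish", None))


def run_agent(task, start_page, max_steps=10):
    """Drive the agent until it declares 'finish' or hits the step cap
    (a guard against never recognizing task completion)."""
    thoughts, history = [], []
    page = start_page
    for _ in range(max_steps):
        thought, action = choose_action(task, page, history)
        thoughts.append(thought)  # stream-of-thought transcript
        history.append(action)
        if action[0] == "finish":
            break
        # Stub page transition; a real agent would execute the action
        # in a browser and re-observe the resulting page.
        page = Page(url=page.url, elements=[])
    return thoughts, history
```

The collected `thoughts` transcript is what a comparison against human think-aloud comments would operate on.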

Extent

10 pages

Related To

Proceedings of the 59th Hawaii International Conference on System Sciences

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International
