Zero-shot Comparison of Large Language Models (LLMs) Reasoning Abilities on Long-text Analogies

Date

2025-01-07

Starting Page

1610

Abstract

In recent years, large language models (LLMs) have made substantial strides in mimicking human language and presenting information coherently. However, researchers continue to debate the accuracy and robustness of LLMs’ reasoning abilities. The reasoning abilities of thirteen LLMs were tested on two long-text analogy datasets, named Rattermann and Wharton, each of which required the models to rank a series of stories from most to least analogous to a source story. On the Rattermann dataset, GPT-4 obtained the highest accuracy, 70%. Overall, LLMs appear to over-emphasize similar story entities (characters and settings) and to lack awareness of the higher-order relationships between stories. LLMs struggled more with the Wharton dataset: the highest accuracy, 46.4%, was achieved by GPT-4o, and all but nine LLMs performed below random-chance accuracy. Although LLMs are improving, they still struggle with higher-order cognitive tasks such as analogical reasoning.

Keywords

Natural Language Processing and Large Language Models Supporting Data Analytics for System Sciences, analogical reasoning, artificial intelligence, generative AI, large language models, zero-shot learning

Extent

10

Related To

Proceedings of the 58th Hawaii International Conference on System Sciences

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International
