The Visual Analogs of Linguistic Concepts and Their Implications on Generative AI
Date
2025-01-07
Starting Page
713
Abstract
Many visual generative artificial intelligence (AI) models take textual “prompts” as input to guide the generation of the resulting images. Converting text to images depends on pragmatics and semantics, both of which can affect the output. To facilitate more precise prompting, we propose a three-dimensional vector space of textual similarity whose axes are textual representation, auditory representation, and meaning similarity. Next, we show that meaning similarity between two words does not necessarily yield visual similarity between the corresponding AI-generated images of those words. We justify this quantitatively by using eight image generators to generate images for abstract and concrete synonym, antonym, and hypernym-hyponym pairs, and by comparing their image-image CLIPScores to the corresponding text-text CLIPScores. Across all models, average similarity dropped when moving from text-text to image-image comparison: from 92.8% to 70.1% for synonyms, from 89% to 58.9% for antonyms, and from 85.6% to 68.1% for hypernym-hyponym pairs.
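The comparison described in the abstract can be illustrated with a minimal sketch. It assumes that each similarity score is the cosine similarity between two embedding vectors (for example, from a CLIP text or image encoder), clipped at zero and expressed as a percentage; that scaling is an assumption for illustration, not the authors' stated procedure. The function names are hypothetical, and the only data used are the averages reported in the abstract.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_u * norm_v)


def similarity_percent(u, v):
    """Cosine similarity clipped at zero, as a percentage (assumed scaling)."""
    return 100.0 * max(cosine_similarity(u, v), 0.0)


# Averages reported in the abstract: (text-text %, image-image %).
REPORTED = {
    "synonyms": (92.8, 70.1),
    "antonyms": (89.0, 58.9),
    "hypernym-hyponym": (85.6, 68.1),
}


def similarity_drops(reported):
    """Percentage-point drop from text-text to image-image similarity."""
    return {
        relation: round(text_sim - image_sim, 1)
        for relation, (text_sim, image_sim) in reported.items()
    }


if __name__ == "__main__":
    for relation, drop in similarity_drops(REPORTED).items():
        print(f"{relation}: {drop} percentage points lower for images")
```

Under these assumptions, the reported averages correspond to drops of 22.7, 30.1, and 17.5 percentage points for synonyms, antonyms, and hypernym-hyponym pairs respectively, so antonym pairs show the largest gap between textual and visual similarity.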
Keywords
Technological Advancements in Digital Collaboration with Generative AI and Large Language Models, generative AI, human-computer interaction, large language models, prompt engineering, visual analogies
Extent
10
Related To
Proceedings of the 58th Hawaii International Conference on System Sciences
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International