Parallel Cache-Efficient Algorithms on GPUs
Date
2023
Abstract
Graphics Processing Units (GPUs) have emerged as a highly attractive architecture for general-purpose computing due to their numerous programmable cores, low-latency memory units, and efficient thread context switching capabilities. However, theoretical research on parallel algorithms for GPUs is challenging because many interdependent factors influence overall runtime. Computational models are commonly employed to provide simplified abstractions of computing system architectures, yet developing a model that is both simple and accurate, and that encompasses every performance-affecting aspect of GPU algorithms, is a seemingly impossible task. Existing GPU models often incorporate numerous variables to account for specific performance factors, which makes them less accessible to researchers.
This dissertation sidesteps the lack of a widely accepted model of computation for GPUs by instead employing multiple classical parallel models to capture both parallel computational complexity and cache-efficiency. Namely, we leverage existing knowledge and algorithmic techniques from the Parallel Random Access Machine (PRAM), Parallel External Memory (PEM), and Distributed Memory Machine (DMM) models to aid in the design and analysis of GPU algorithms at various levels of detail. We validate and demonstrate our approach through case studies on specific problems (e.g., sorting, searching, and single source shortest paths), providing both theoretical analysis and corresponding empirical results. Our results highlight the applicability of the selected parallel models of computation to GPUs and illustrate how theoretical research can expose valuable insights into the performance of GPU algorithms in practice.
Keywords
Computer science, bank conflicts, cache-efficient, GPU, models of computation, parallel algorithms
Extent
115 pages
Rights
All UHM dissertations and theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission from the copyright owner.