Parallel Cache-Efficient Algorithms on GPUs

Berney, Kyle Mitsuo

Parallel Cache-Efficient Algorithms on GPUs

dc.contributor.advisor	Sitchinava, Nodari S.
dc.contributor.author	Berney, Kyle Mitsuo
dc.contributor.department	Computer Science
dc.date.accessioned	2023-09-28T20:14:56Z
dc.date.available	2023-09-28T20:14:56Z
dc.date.issued	2023
dc.description.degree	Ph.D.
dc.identifier.uri	https://hdl.handle.net/10125/106101
dc.subject	Computer science
dc.subject	bank conflicts
dc.subject	cache-efficent
dc.subject	GPU
dc.subject	models of computation
dc.subject	parallel algorithms
dc.title	Parallel Cache-Efficient Algorithms on GPUs
dc.type	Thesis
dcterms.abstract	Graphics Processing Units (GPUs) have emerged as a highly attractive architecture for general-purpose computing due to their numerous programmable cores, low-latency memory units, and efficient thread context switching capabilities. However, theoretical research on parallel algorithms for GPUs is challenging due to the multitude of interdependent factors influencing overall runtime. Computational models are commonly employed to provide simplified abstractions of computing system architectures. However, developing a computational model that is both simple and accurate, encompassing all performance-affecting aspects of GPU algorithms, is a seemingly impossible task. Existing GPU models often incorporate numerous variables to account for specific performance factors, rendering them less accessible to researchers. This dissertation obviates the lack of a widely accepted model of computation for GPUs by instead employing multiple classical parallel models to capture both parallel computational complexity and cache-efficiency. Namely, we leverage existing knowledge and algorithmic techniques from the Parallel Random Access Machine (PRAM), Parallel External Memory (PEM), and Distributed Memory Machine (DMM) models to aid in the design and analysis of GPU algorithms at various levels of detail. We validate and demonstrate our approach through case studies on specific problems (e.g., sorting, searching, and single source shortest paths), providing both theoretical analysis and corresponding empirical results. Our results highlights the applicability of the selected parallel models of computation to GPUs and illustrates how theoretical research can expose valuable insights into the performance of GPU algorithms in practice.
dcterms.extent	115 pages
dcterms.language	en
dcterms.publisher	University of Hawai'i at Manoa
dcterms.rights	All UHM dissertations and theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission from the copyright owner.
dcterms.type	Text
local.identifier.alturi	http://dissertations.umi.com/hawii:11861

Files

Original bundle

Now showing 1 - 1 of 1

Name:: Berney_hawii_0085A_11861.pdf
Size:: 2.03 MB
Format:: Adobe Portable Document Format

Download

Collections

Ph.D. - Computer Science