Scheduling Heuristics For Executing Scientific Workflows On Homogeneous Clusters With Globally and Locally-Accessible Persistent Storage

Date

2018-08

Publisher

University of Hawaii at Manoa

Abstract

Many applications in science and engineering today are structured as scientific workflows, i.e., task graphs with data dependencies between tasks, where tasks are implemented as standalone executables and data dependencies take the form of files read from and written to stable storage. For many relevant application domains, these workflows are both large and data-intensive. Therefore, optimizing data accesses is crucial for efficient scientific workflow executions. Typical HPC (High Performance Computing) platforms used to run scientific workflows are commodity clusters, in which each compute node has access to private, small, high-bandwidth "local" storage, and to shared, large, low-bandwidth "global" storage. To date, production Workflow Management Systems (WMs), the software infrastructures used to execute workflows in practice, only use global storage. There is thus an opportunity to improve workflow performance by exploiting local storage. The difficulty, however, is twofold. First, the capacity of local storage is limited and often allows holding only a few workflow files. Second, storing data in local storage reduces parallelism because that storage is private to a single node. In this thesis, we design scheduling heuristics to orchestrate workflow execution in this context, with the objective of minimizing workflow execution time. These heuristics decide which files should be stored in which level of storage (local or global) and replicate tasks so as to increase the availability of data across compute nodes and thus maintain parallelism. We implement a simulation framework to evaluate and drive the design of these heuristics using both real-world and synthetic workflow configurations. We also implement a software prototype for using these heuristics on HPC platforms. From experimental results obtained in simulation and on an actual HPC cluster, we are able to evaluate the relative merit of our heuristics and draw conclusions about the most promising approaches and remaining challenges.
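
To make the first difficulty concrete (limited local storage capacity), the following is a minimal illustrative sketch of a greedy file-placement rule. It is not one of the heuristics developed in the thesis. The names `WorkflowFile`, `place_files`, the `reads` field, and the sample sizes are all assumptions introduced here purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowFile:
    # Hypothetical record; field names are assumptions for this sketch.
    name: str
    size: int   # size in arbitrary units
    reads: int  # number of tasks on this node that read the file


def place_files(files, local_capacity):
    """Greedy sketch: prefer local storage for the most frequently read files."""
    placement = {}
    free = local_capacity
    # A file read by more tasks avoids more global-storage traffic per unit
    # of local space, so consider files in decreasing order of read count.
    for f in sorted(files, key=lambda f: (-f.reads, f.size, f.name)):
        if f.size <= free:
            placement[f.name] = "local"
            free -= f.size
        else:
            # Does not fit in the remaining local space: fall back to global.
            placement[f.name] = "global"
    return placement


files = [
    WorkflowFile("a.dat", size=40, reads=3),
    WorkflowFile("b.dat", size=70, reads=2),
    WorkflowFile("c.dat", size=50, reads=1),
]
placement = place_files(files, local_capacity=100)
print(placement)
```

Note that this sketch ignores the second difficulty entirely: a file placed in one node's local storage is invisible to tasks on other nodes, which is what the task replication described in the abstract is meant to address.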

Keywords

Scientific Workflows, DAG, Task Replication, Local/Global Storages, Heuristics
