Parallel XPath query evaluation on multi-core processors

Date

2012-05

Contributor

Advisor

Department

Instructor

Depositor

Speaker

Researcher

Consultant

Interviewer

Narrator

Transcriber

Annotator

Journal Title

Journal ISSN

Volume Title

Publisher

University of Hawaii at Manoa

Volume

Number/Issue

Starting Page

Ending Page

Alternative Title

Abstract

XML and the XPath querying language are global standards that are used in industrial settings. The high latency of queries over large XML databases remains a problem for many applications. While this latency could be reduced by parallel execution, issues such as work partitioning, memory contention, and load imbalance may diminish these benefits. To our knowledge, no previous attempt has been made to analyze the performance of multiple parallelization techniques on XPath queries. In this paper, we model the behavior of XPath queries and analyze the benefits of several parallel algorithms. Using synthetic XML databases, we model the query execution time using three quantifiable machine-dependent parameters. Our performance model allows us to estimate a lower-bounds of parallel execution time for arbitrary queries and databases on a given hardware environment. We propose five distinct XPath "query engines", including four parallel engines, and compare their performance on synthetic and realworld XML databases. These engines attempt to solve the issue of load-balancing while minimizing sequential execution time. We find that load-balancing is easily achieved for most synthetic data sets, though results on real-world data sets are inconsistent. Our two dynamic query engines (Dynamic Work Queue and Producer-Consumer) achieve the best results on all tests, with Producer-Consumer achieving near-ideal speedup on some queries on real-world data sets. Across two distinct multi-core hardware environments, we find that our models accurately estimate both sequential and parallel execution time of synthetic data sets, though our parameter estimation is not sufficiently precise to prove accurate estimates for real-world data sets.

Description

Keywords

XPath, XML, Computer science

Citation

Extent

Format

Geographic Location

Time Period

Related To

Theses for the degree of Master of Science (University of Hawaii at Manoa). Computer Science.

Related To (URI)

Table of Contents

Rights

Rights Holder

Local Contexts

Email libraryada-l@lists.hawaii.edu if you need this content in ADA-compliant format.