Closeness of Factor Analysis and Principal Component Analysis in Semi-High-Dimensional Conditions

Date

2016-05

Publisher

[Honolulu] : [University of Hawaii at Manoa], [May 2016]

Abstract

Factor analysis (FA) and principal component analysis (PCA) are routinely employed in social science research. Guttman (1956) first showed that when the number of variables (p) increases without limit while the ratio of the number of factors (m) to p goes to zero, FA and PCA loadings converge. Schneeweiss and Mathes (1995) and Schneeweiss (1997) later derived two conditions for closeness between FA and PCA loadings that can be considered generalizations of the Guttman condition. The Schneeweiss-Mathes and Schneeweiss conditions were derived under the usual assumption that sample size (N) exceeds p. However, as fast computing has become available, analysis of semi-high-dimensional data, in which p is large but does not exceed N, is becoming more common. The first goal of the current study was to examine whether the three conditions remain valid as measures of closeness between FA and PCA loadings with such semi-high-dimensional data. Because the computation for PCA is simpler than that for FA, PCA can be used as an approximation to FA when p is large. However, as p increases, inconsistency of the estimates might become an issue. Therefore, it is necessary to consider simultaneously the closeness between the estimated FA and the estimated PCA loadings, the closeness between the estimated PCA and the population FA loadings, and the closeness between the estimated and the population FA loadings. The second goal of the current study was to examine the behavior of these three kinds of closeness in semi-high-dimensional conditions. To deal with semi-high-dimensionality, a ridge method proposed by Yuan and Chan (2008) was employed. As a measure of closeness, the average canonical correlation (CC) between two loading matrices and its Fisher’s z-transformation were employed.
The results indicate that, in semi-high-dimensional conditions, (i) the estimates of loadings by FA and PCA are rather close for all the conditions considered, and (ii) closeness between the estimated FA and PCA loadings is easier and faster to achieve than closeness between the estimated and the population FA loadings or between the estimated PCA and the population FA loadings.
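
The closeness measure named above can be illustrated with a minimal sketch. It assumes that the canonical correlations between two p × m loading matrices are taken as the cosines of the principal angles between their column spaces (no centering), and that Fisher's z is applied to the average; the dissertation's exact computation may differ. The function names `average_canonical_correlation` and `fisher_z` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def average_canonical_correlation(A, B):
    """Average canonical correlation between the column spaces of A and B.

    Assumption: A and B are p x m loading matrices of full column rank, and
    the canonical correlations are the cosines of the principal angles
    between their column spaces (uncentered).
    """
    # Orthonormal bases for the two column spaces (reduced QR).
    Qa, _ = np.linalg.qr(np.asarray(A, dtype=float))
    Qb, _ = np.linalg.qr(np.asarray(B, dtype=float))
    # Singular values of Qa' Qb are the canonical correlations.
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Guard against rounding slightly outside [0, 1].
    return float(np.clip(s, 0.0, 1.0).mean())


def fisher_z(r, eps=1e-12):
    """Fisher's z-transformation, z = arctanh(r), kept finite near r = 1."""
    r = min(max(r, -1.0 + eps), 1.0 - eps)
    return float(np.arctanh(r))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, m = 20, 3
    # Hypothetical loading matrices: B is a perturbed copy of A.
    A = rng.normal(size=(p, m))
    B = A + 0.1 * rng.normal(size=(p, m))
    cc = average_canonical_correlation(A, B)
    print("average CC:", cc)
    print("Fisher z:  ", fisher_z(cc))
```

Because the measure depends only on the column spaces, it is unaffected by rotation of either loading matrix, which is convenient when FA and PCA solutions are compared without first aligning them.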

Description

Ph.D. University of Hawaii at Manoa 2016.
Includes bibliographical references.

Keywords

High-dimensionality, Large p small N, Regularization, Ridge regression, Shrinkage, Fisher’s z-transformation, Canonical correlation

Related To

Theses for the degree of Doctor of Philosophy (University of Hawaii at Manoa). Psychology
