Closeness of Factor Analysis and Principal Component Analysis in Semi-High-Dimensional Conditions
Date
2016-05
Publisher
[Honolulu] : [University of Hawaii at Manoa], [May 2016]
Abstract
Factor analysis (FA) and principal component analysis (PCA) are routinely employed in social science research. Guttman (1956) first showed that FA and PCA loadings converge when the number of variables (p) increases without limit while the ratio of the number of factors (m) to p goes to zero. Schneeweiss and Mathes (1995) and Schneeweiss (1997) later derived two conditions for closeness between FA and PCA loadings that can be considered generalizations of the Guttman condition. The Schneeweiss-Mathes and Schneeweiss conditions were derived under the usual assumption that sample size (N) exceeds p. However, as fast computing has become available, analysis of semi-high-dimensional data, in which p is large but does not exceed N, is becoming more common. The first goal of the current study was to examine whether the three conditions are still valid as measures of closeness between FA and PCA loadings with such semi-high-dimensional data. Because the computation for PCA is simpler than that for FA, PCA can be used as an approximation for FA when p is large. However, as p increases, inconsistency of the estimates might become an issue. Therefore, it is necessary to consider three kinds of closeness simultaneously: between the estimated FA and the estimated PCA loadings, between the estimated PCA and the population FA loadings, and between the estimated and the population FA loadings. The second goal of the current study was to examine the behavior of these three kinds of closeness in semi-high-dimensional conditions. To deal with semi-high-dimensionality, a ridge method proposed by Yuan and Chan (2008) was employed. As a measure of closeness, the average canonical correlation (CC) between two loading matrices and its Fisher’s z-transformation were employed.
The results indicate that, in semi-high-dimensional conditions, (i) the estimates of loadings by FA and PCA are rather close for all the conditions considered, and (ii) closeness between the estimated FA and PCA loadings is easier and faster to achieve than closeness between the estimated and the population FA loadings or between the estimated PCA and the population FA loadings.
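As an illustration of the closeness measure named in the abstract, the following is a minimal Python sketch, not the dissertation's own code. It assumes that the canonical correlations between two p × m loading matrices are taken as the cosines of the principal angles between their column spaces (no centering), and that Fisher's z is the inverse hyperbolic tangent of the average. The function names, the simulated loadings, and the choices p = 50 and m = 3 are hypothetical and chosen only for the example.

```python
import numpy as np


def average_canonical_correlation(A, B):
    """Average canonical correlation between two loading matrices.

    Assumption: canonical correlations are computed as the cosines of the
    principal angles between the column spaces of A and B (uncentered).
    Both matrices must have full column rank.
    """
    Qa, _ = np.linalg.qr(A)  # orthonormal basis for the columns of A
    Qb, _ = np.linalg.qr(B)  # orthonormal basis for the columns of B
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    s = np.clip(s, 0.0, 1.0)  # guard against rounding slightly above 1
    return float(s.mean())


def fisher_z(r):
    """Fisher's z-transformation, z = arctanh(r)."""
    return float(np.arctanh(r))


# Hypothetical population example: FA loadings L and uniquenesses psi
# define Sigma = L L' + diag(psi); PCA loadings come from its top m
# eigenpairs.
rng = np.random.default_rng(0)
p, m = 50, 3
L = rng.uniform(0.3, 0.8, size=(p, m))
psi = rng.uniform(0.3, 0.7, size=p)
Sigma = L @ L.T + np.diag(psi)

vals, vecs = np.linalg.eigh(Sigma)           # eigenvalues in ascending order
pca = vecs[:, -m:] * np.sqrt(vals[-m:])      # PCA loadings for top m components

cc = average_canonical_correlation(L, pca)
print("average CC:", cc, "Fisher z:", fisher_z(cc))
```

Because the measure depends only on the column spaces, it is unaffected by rotation or rescaling of either loading matrix, which is why FA and PCA loadings can be compared without first aligning them.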
Description
Ph.D. University of Hawaii at Manoa 2016.
Includes bibliographical references.
Keywords
High-dimensionality, Large p small N, Regularization, Ridge regression, Shrinkage, Fisher’s z-transformation, Canonical correlation
Related To
Theses for the degree of Doctor of Philosophy (University of Hawaii at Manoa). Psychology