Please use this identifier to cite or link to this item: http://hdl.handle.net/10125/51387

Closeness of Factor Analysis and Principal Component Analysis in Semi-High-Dimensional Conditions

File | Description | Size | Format
2016-05-phd-liang_r.pdf | Version for non-UH users. Copying/Printing is not permitted | 1.52 MB | Adobe PDF
2016-05-phd-liang_uh.pdf | For UH users only | 1.61 MB | Adobe PDF

Item Summary

Title: Closeness of Factor Analysis and Principal Component Analysis in Semi-High-Dimensional Conditions
Authors: Liang, Lu
Keywords: High-dimensionality
Large p small N
Regularization
Ridge regression
Shrinkage
Fisher’s z-transformation
Canonical correlation
Issue Date: May 2016
Publisher: [Honolulu] : [University of Hawaii at Manoa], [May 2016]
Abstract: Factor analysis (FA) and principal component analysis (PCA) are routinely employed in research in the social sciences. Guttman (1956) first showed that when the number of variables (p) increases without limit while the ratio of number of factors (m) to p goes to zero, FA and PCA loadings converge. Schneeweiss and Mathes (1995) and Schneeweiss (1997) later derived two conditions for closeness between FA and PCA loadings that can be considered generalizations of the Guttman condition. The Schneeweiss-Mathes and Schneeweiss conditions were derived under the regular assumption that sample size (N) exceeds p. However, as fast computing has become available, analysis of semi-high-dimensional data in which p is large but does not exceed N is becoming more common. The first goal of the current study was to examine whether the three conditions are still valid as measures of closeness between FA and PCA loadings with such semi-high-dimensional data. Because the computation for PCA is simpler than that for FA, PCA can be used as an approximation for FA when p is large. However, as p increases, inconsistency might become an issue. Therefore, it is necessary to simultaneously consider the closeness between the estimated FA and the estimated PCA loadings, the closeness between the estimated PCA and the population FA loadings, and the closeness between the estimated and the population FA loadings. The second goal of the current study was to examine the behavior of these kinds of closeness in semi-high-dimensional conditions. To deal with semi-high-dimensionality, a ridge method proposed by Yuan and Chan (2008) was employed. As a measure for closeness, the average canonical correlation (CC) between two loading matrices and its Fisher’s z-transformation were employed.
The results indicate that, in semi-high-dimensional conditions, (i) the estimates of loadings by FA and PCA are rather close for all the conditions considered, and (ii) closeness between the estimated FA and PCA loadings is easier and faster to achieve than closeness between the estimated and the population FA loadings or between the estimated PCA and the population FA loadings.
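As an illustration of the closeness measure named in the abstract, the following is a minimal sketch of an average canonical correlation between two loading matrices and its Fisher's z-transformation. It is not taken from the thesis. It assumes that the canonical correlations are computed as the cosines of the principal angles between the column spaces of two p x m loading matrices of full column rank, and the function names (canonical_correlations, average_cc, fisher_z) are hypothetical. The thesis may define or compute the measure differently.

```python
import numpy as np


def canonical_correlations(A, B):
    """Cosines of the principal angles between the column spaces of A and B.

    Assumption: A and B are p x m loading matrices with full column rank.
    """
    Qa, _ = np.linalg.qr(A)  # orthonormal basis for the columns of A
    Qb, _ = np.linalg.qr(B)  # orthonormal basis for the columns of B
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.clip(s, 0.0, 1.0)  # guard against rounding slightly above 1


def average_cc(A, B):
    """Mean of the canonical correlations between two loading matrices."""
    return float(np.mean(canonical_correlations(A, B)))


def fisher_z(r, eps=1e-12):
    """Fisher's z-transformation, arctanh(r), clipped so that r = 1 stays finite."""
    r = np.clip(r, -1.0 + eps, 1.0 - eps)
    return float(np.arctanh(r))


# Illustrative data only: random "loadings", not results from the thesis.
rng = np.random.default_rng(0)
L_fa = rng.normal(size=(50, 3))                      # stand-in for FA loadings
L_pca = L_fa + 0.1 * rng.normal(size=L_fa.shape)     # perturbed stand-in for PCA loadings

cc = average_cc(L_fa, L_pca)
print("average CC:", cc)
print("Fisher z:", fisher_z(cc))
```

Under this definition the measure does not change when either loading matrix is rotated, because it depends only on the column spaces. The z-transformation stretches values near 1, which makes small differences between very close solutions easier to compare.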
Description: Ph.D. University of Hawaii at Manoa 2016.
Includes bibliographical references.
URI/DOI: http://hdl.handle.net/10125/51387
Appears in Collections: Ph.D. - Psychology



Items in ScholarSpace are protected by copyright, with all rights reserved, unless otherwise indicated.