PCA & Factor Analysis Guide
Learning Objectives
After reading this chapter, you should understand:
- The basics of principal component and factor analysis.
- The principles of exploratory and confirmatory factor analysis.
- Key terms, such as communality, eigenvalues, factor loadings, and factor scores.
- What factor rotation is.
- How to determine whether data are suitable for carrying out an exploratory factor analysis.
- How to interpret SPSS principal component analysis output.
- The principles of reliability analysis and its execution in SPSS.
- The concept of structural equation modeling.
Keywords
Anti-image • Bartlett method • Bartlett's test of sphericity • Communality • Components • Confirmatory factor analysis • Correlation residuals • Covariance-based structural equation modeling • Cronbach's Alpha • Direct oblimin rotation • Eigenvalue • Eigenvectors • Exploratory factor analysis • Factor analysis • Factor loading • Factor rotation • Factor scores • Factor weights • Factors • Internal consistency reliability • Kaiser criterion • Kaiser–Meyer–Olkin criterion • Latent root criterion • Measure of sampling adequacy • Oblique rotation • Orthogonal rotation • Parallel analysis • Partial least squares structural equation modeling • Path diagram • Principal axis factoring • Principal component analysis • Principal components • Principal factor analysis • Promax rotation • Quartimax rotation • Regression method • Reliability analysis • Scree plot • Split-half reliability • Structural equation modeling • Test-retest reliability • Varimax rotation
8.1 Introduction
Principal component analysis (PCA) and factor analysis (also called principal factor analysis
or principal axis factoring) are two methods for identifying structure within a set of vari-
ables. Many analyses involve large numbers of variables that are difficult to interpret. Using
PCA or factor analysis helps find interrelationships between variables (usually called items)
to identify a smaller number of unifying variables called factors. Consider the example of a
soccer club whose management wants to measure the satisfaction of the fans. The manage-
ment could, for instance, measure fan satisfaction by asking how satisfied the fans are with
the (1) assortment of merchandise, (2) quality of merchandise, and (3) prices of merchan-
dise. It is likely that these three items together measure satisfaction with the merchandise.
Through the application of PCA or factor analysis, we can determine whether a single factor
represents the three satisfaction items well. In practice, PCA and factor analysis are applied to much larger sets of variables, tens or even hundreds, when simply reading the variables' descriptions does not suggest an obvious number of factors.
PCA and factor analysis both explain patterns of correlations within a set of observed
variables. That is, they identify sets of highly correlated variables and infer an underlying
factor structure. While PCA and factor analysis are very similar in the way they arrive at a
solution, they differ fundamentally in their assumptions of the variables’ nature and their
treatment in the analysis. Due to these differences, the methods follow different research
objectives, which dictate their areas of application. While the PCA's objective is to reproduce a data structure as well as possible using only a few factors, factor analysis aims to explain
the variables’ correlations using factors (e.g., Hair et al. 2010; Matsunaga 2010; Mulaik
2009).1 We will discuss these differences and their implications in this chapter.
Both PCA and factor analysis can be used for exploratory or confirmatory purposes.
What are exploratory and confirmatory factor analyses? Comparing the left and right panels
of . Fig. 8.1 shows us the difference. Exploratory factor analysis, often simply referred to as
EFA, does not rely on previous ideas on the factor structure we may find. That is, there may
be relationships (indicated by the arrows) between each factor (indicated by ovals) and each
item. While some of these relationships may be weak (indicated by the dotted arrows), others
are more pronounced, suggesting that these items represent an underlying factor well. The
left panel of . Fig. 8.1 illustrates this point. Thus, an exploratory factor analysis reveals the
number of factors and the items belonging to a specific factor. In a confirmatory factor anal-
ysis, usually simply referred to as CFA, there may only be relationships between a factor and
specific items. In the right panel of . Fig. 8.1, the first three items relate to factor 1, whereas
the last two items relate to factor 2. Different from the exploratory factor analysis, in a con-
firmatory factor analysis, we have clear expectations of the factor structure (e.g., because
researchers have proposed a scale that we want to adapt for our study) and we want to test
for the expected structure.
In this chapter, we primarily deal with exploratory factor analysis, as it conveys the
principles that underlie all factor analytic procedures and because the two techniques are
(almost) identical from a statistical point of view. Nevertheless, we will also discuss an
important aspect of confirmatory factor analysis, namely reliability analysis, which tests
the consistency of a measurement scale (see 7 Chap. 3). We will also briefly introduce a
specific confirmatory factor analysis approach called structural equation modeling (often
simply referred to as SEM). Structural equation modeling differs statistically and practi-
cally from PCA and factor analysis. It is not only used to evaluate how well observed vari-
ables relate to factors but also to analyze hypothesized relationships between factors that
the researcher specifies prior to the analysis based on theory and logic.
[. Fig. 8.1 Exploratory factor analysis (left) and confirmatory factor analysis (right). Both panels relate items such as satisfaction with the outer appearance of the stadium (x2), the interior design of the stadium (x3), the assortment of merchandise (x4), and the quality of merchandise (x5) to the factors satisfaction with the stadium (factor 1) and satisfaction with the merchandise (factor 2).]
1 Other methods for carrying out factor analyses include, for example, unweighted least squares, gen-
eralized least squares, or maximum likelihood but these are statistically complex.
8.2 Understanding Principal Component and Factor Analysis
Researchers often face the problem of large questionnaires comprising many items. For
example, in a survey of a major German soccer club, the management was particularly inter-
ested in identifying and evaluating performance features that relate to soccer fans’ satisfaction
(Sarstedt et al. 2014). Examples of relevant features include the stadium, the team composi-
tion and their success, the trainer, and the management. The club therefore commissioned a questionnaire comprising 99 items that had previously been identified by means of literature databases and focus groups with fans. All the items were measured on scales ranging from 1 (“very dissatisfied”)
to 7 (“very satisfied”). . Table 8.1 shows an overview of some items considered in the study.
Satisfaction with …
- Identification of the players with the club
- Presence of a player with whom fans can identify
- Opening times of the fan-shops
- Behavior of the sales persons in the fan shops
As you can imagine, tackling such a large set of items is problematic, because it pro-
vides quite complex data. Given the task of identifying and evaluating performance fea-
tures that relate to soccer fans’ satisfaction (measured by “Overall, how satisfied are you
with your soccer club”), we cannot simply compare the items on a pairwise basis. It is far
more reasonable to consider the factor structure first. For example, satisfaction with the
condition of the stadium (x1), outer appearance of the stadium (x2), and interior design
of the stadium (x3) cover similar aspects that relate to the respondents’ satisfaction with
the stadium. If a soccer fan is generally very satisfied with the stadium, he/she will most
likely answer all three items positively. Conversely, if a respondent is generally dissatis-
fied with the stadium, he/she is most likely to be rather dissatisfied with all the perfor-
mance aspects of the stadium, such as the outer appearance and interior design. Conse-
quently, these three items are likely to be highly correlated—they cover related aspects
of the respondents’ overall satisfaction with the stadium. More precisely, these items can
be interpreted as manifestations of the factor capturing the “joint meaning” of the items
related to it. The arrows pointing from the factor to the items in . Fig. 8.1 indicate this point.
In our example, the “joint meaning” of the three items could be described as satisfaction
with the stadium, since the items represent somewhat different, yet related, aspects of the
stadium. Likewise, there is a second factor that relates to the two items x4 and x5, which,
like the first factor, shares a common meaning, namely satisfaction with the merchandise.
PCA and factor analysis are two statistical procedures that draw on item correlations
in order to find a small number of factors. Having conducted the analysis, we can make
use of a few (uncorrelated) factors instead of many variables, thus significantly reducing the
analysis’s complexity. For example, if we find six factors, we only need to consider six cor-
relations between the factors and overall satisfaction, which means that the recommen-
dations will rely on six factors.
Like any multivariate analysis method, PCA and factor analysis are subject to certain
requirements, which need to be met for the analysis to be meaningful. A crucial require-
ment is that the variables need to exhibit a certain degree of correlation. In our example in
. Fig. 8.1, this is probably the case, as we expect increased correlations between x1, x2, and
x3, on the one hand, and between x4 and x5 on the other. Other items, such as x1 and x4, are
probably somewhat correlated, but to a lesser degree than the group of items x1, x2, and x3
and the pair x4 and x5. Several methods allow for testing whether the item correlations are
sufficiently high.
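Readers who want to check this requirement outside SPSS can do so with a few lines of Python. The following sketch is illustrative only: it simulates two blocks of correlated items (the item names x1–x5 and the underlying data are assumptions mirroring Fig. 8.1) and inspects their correlation matrix.

```python
import numpy as np
import pandas as pd

# Simulated example: two blocks of correlated items (x1-x3 and x4-x5)
rng = np.random.default_rng(42)
n = 500
stadium = rng.normal(size=n)        # underlying "satisfaction with the stadium"
merchandise = rng.normal(size=n)    # underlying "satisfaction with the merchandise"

df = pd.DataFrame({
    "x1": stadium + rng.normal(scale=0.6, size=n),
    "x2": stadium + rng.normal(scale=0.6, size=n),
    "x3": stadium + rng.normal(scale=0.6, size=n),
    "x4": merchandise + rng.normal(scale=0.6, size=n),
    "x5": merchandise + rng.normal(scale=0.6, size=n),
})

# x1-x3 and x4-x5 should form two blocks of clearly higher correlations
print(df.corr().round(2))
```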
Both PCA and factor analysis strive to reduce the overall item set to a smaller set of
factors. More precisely, PCA extracts factors such that they account for variables’ vari-
ance, whereas factor analysis attempts to explain the correlations between the variables.
Whichever approach you apply, using only a few factors instead of many items reduces the
precision, because the factors cannot represent all the information included in the items.
Consequently, there is a trade-off between simplicity and accuracy. In order to make the
analysis as simple as possible, we want to extract only a few factors. At the same time, we
do not want to lose too much information by having too few factors. This trade-off has to
be addressed in any PCA and factor analysis when deciding how many factors to extract
from the data.
Once the number of factors to retain from the data has been identified, we can proceed
with the interpretation of the factor solution. This step requires us to produce a label for each
factor that best characterizes the joint meaning of all the variables associated with it. This step
is often challenging, but there are ways of facilitating the interpretation of the factor solution.
Finally, we have to assess how well the factors reproduce the data. This is done by examining
the solution’s goodness-of-fit, which completes the standard analysis. However, if we wish to
continue using the results in further analyses, we need to calculate the factor scores. Factor
scores are linear combinations of the items and can be used as variables in follow-up analyses.
. Figure 8.2 illustrates the steps involved in the analysis; we will discuss these in more
detail in the following sections. In doing so, our theoretical descriptions and illustrations
will focus on the PCA, as this method is easier to grasp. However, most of our descriptions
also apply to factor analysis.
Before carrying out a PCA, we have to consider several requirements, which we can test
by answering the following questions:
- Are the measurement scales appropriate?
- Is the sample size sufficiently large?
- Are the observations independent?
- Are the variables sufficiently correlated?
. Table 8.2 Threshold values for the Kaiser–Meyer–Olkin (KMO) statistic (Kaiser 1974)
Below 0.50: Unacceptable
0.50–0.59: Miserable
0.60–0.69: Mediocre
0.70–0.79: Middling
0.80–0.89: Meritorious
0.90 and above: Marvelous
In practice, we do not interpret the anti-image values directly, but use a measure based on the anti-image
concept: The Kaiser–Meyer–Olkin (KMO) criterion. The KMO criterion, also called the
measure of sampling adequacy (MSA), indicates whether the other variables in the dataset
can explain the correlations between variables. Kaiser (1974), who introduced the sta-
tistic, recommends a set of nicely labeled threshold values for KMO and MSA, which
. Table 8.2 presents.
The Bartlett’s test of sphericity can be used to test the null hypothesis that the correla-
tion matrix is a diagonal matrix (i.e., all non-diagonal elements are zero) in the population.
Since we need high correlations for PCA, we want to reject the null hypothesis. A large test
statistic value and a correspondingly small p-value favor rejecting the null hypothesis. In practical applications, it is virtually impossible not to reject this null hypothesis,
as typically there are some correlations, particularly in larger sets of items. In addition,
PCA is typically used with large samples, a situation which favors the rejection of the null
hypothesis. Thus, Bartlett’s test is of rather limited value for assessing whether the vari-
ables are sufficiently correlated.
To summarize, the correlation matrix with the associated significance levels provides a
first insight into the correlation structures. However, the final decision of whether the data
are appropriate for PCA should be primarily based on the KMO statistic. If this measure
indicates sufficiently correlated variables, we can continue the analysis of the results. If
not, we should try to identify items that correlate only weakly with the remaining items
and remove them. In Box 8.1, we discuss how to do this.
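The KMO statistic and Bartlett's test can also be computed directly from the correlation matrix. The following Python sketch implements the standard formulas (the overall KMO based on squared correlations and squared partial correlations; Bartlett's chi-square based on the determinant of the correlation matrix); the function name and its arguments are our own.

```python
import numpy as np
from scipy import stats

def kmo_and_bartlett(R, n):
    """R: item correlation matrix (p x p), n: sample size.
    Returns the overall KMO value and Bartlett's test of sphericity."""
    p = R.shape[0]
    R_inv = np.linalg.inv(R)

    # Partial correlations (the basis of the anti-image correlations)
    d = np.sqrt(np.outer(np.diag(R_inv), np.diag(R_inv)))
    partial = -R_inv / d

    off = ~np.eye(p, dtype=bool)  # off-diagonal elements only
    kmo = np.sum(R[off] ** 2) / (np.sum(R[off] ** 2) + np.sum(partial[off] ** 2))

    # Bartlett's test: H0 = the population correlation matrix is a diagonal matrix
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    p_value = stats.chi2.sf(chi2, df)
    return kmo, chi2, df, p_value
```

A KMO value of 0.50 or higher (see . Table 8.2) and a significant Bartlett's test would support continuing with the analysis.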
[Figure: A variable's total variance split into unique variance, variance extracted by the factors, and variance excluded.]
With 20 or fewer variables and communalities below 0.40—which are clearly undesirable in empirical research—the differences between a PCA and a factor analysis solution are probably pronounced (Stevens 2009).
Apart from these conceptual differences in the variables’ nature, PCA and factor
analysis differ in the aim of their analysis. Whereas the goal of factor analysis is to explain
the correlations between the variables, PCA focuses on explaining the variables’ variances.
That is, the PCA’s objective is to determine the linear combinations of the variables that
retain as much information from the original variables as possible. Strictly speaking, PCA
does not extract factors, but components, which are labeled as such in SPSS.
Despite these differences, which have very little relevance in many common research
settings in practice, PCA and factor analysis have many points in common. For example,
the methods follow very similar ways to arrive at a solution and their interpretations of
statistical measures, such as KMO, eigenvalues, or factor loadings, are (almost) identical.
In fact, SPSS blends these two procedures when running a PCA as the program initially
applies a factor analysis but rescales the estimates such that they conform to a PCA. That
way, the analysis assumes that the entire variance is common but produces (rotated) load-
ings (we will discuss factor rotation in 7 Sect. 8.3.4.1), which facilitate the interpretation
of the factors.
Despite the small differences between PCA and factor analysis in most research settings,
researchers have strong feelings about the choice of PCA or factor analysis. Cliff (1987, p.
349) summarizes this issue well, by noting that proponents of factor analysis “insist that
components analysis is at best a common factor analysis with some error added and at
worst an unrecognizable hodgepodge of things from which nothing can be determined.”
For further discussions on this topic, see also Velicer and Jackson (1990) and Widaman
(1993).2
2 Related discussions have been raised in structural equation modeling, where researchers have
heatedly discussed the strengths and limitations of factor-based and component-based approaches
(e.g. Sarstedt et al. 2016a; Hair et al. 2017b).
[Figure: Factor extraction in a two-dimensional vector space; the second factor F2 is fitted at a 90° angle to the first factor F1.]
. Figure 8.3 represents the variables by five vectors starting at the zero point, with each vector's length standardized to one.
To maximize the variance accounted for, the first factor F1 is fitted into this vector space
in such a way that the sum of all the angles between this factor and the five variables in
the vector space is minimized. We do this because the angle between two vectors can be interpreted in terms of the correlation between them. For example, if the factor's vector and a variable's vector are congruent,
the angle between these two is zero, indicating that the factor and the variable correlate
perfectly. On the other hand, if the factor and the variable are uncorrelated, the angle
between these two is 90°. This correlation between a (unit-scaled) factor and a vari-
able is called the factor loading. Note that factor weights and factor loadings essentially
express the same thing—the relationships between variables and factors—but they are
based on different scales.
After extracting F1, a second factor (F2) is extracted, which maximizes the remain-
ing variance accounted for. The second factor is fitted at a 90° angle into the vector space
(. Fig. 8.4) and is therefore uncorrelated with the first factor.4 If we extract a third factor,
it will explain the maximum amount of variance for which factors 1 and 2 have hitherto
not accounted. This factor will also be fitted at a 90° angle to the first two factors, making it
independent from the first two factors (we don’t illustrate this third factor in . Fig. 8.4, as
this is a three-dimensional space). The fact that the factors are uncorrelated is an import-
ant feature, as we can use them to replace many highly correlated variables in follow-up
analyses. For example, using uncorrelated factors as independent variables in a regression
analysis helps solve potential collinearity issues (7 Chap. 7).
3 Note that . Fig. 8.3 describes a special case, as the five variables are scaled down into a two-dimen-
sional space. In this set-up, it would be possible for the two factors to explain all five items. However,
in real-life, the five items span a five-dimensional vector space.
4 Note that this changes when oblique rotation is used. We will discuss factor rotation later in this
chapter.
The Explained Visually webpage offers an excellent illustration of two- and three-dimensional
factor extraction.
http://setosa.io/ev/principal-component-analysis/
An important PCA feature is that it works with standardized variables (see 7 Chap. 5 for
an explanation of what standardized variables are). Standardizing variables has import-
ant implications for our analysis in two respects. First, we can assess each factor's eigenvalue, which indicates how much of the variables' total variance a specific factor extracts
(see 7 Sect. 8.3.2.3). Second, the standardization of variables allows for assessing each
variable’s communality, which describes how much the factors extracted capture or repro-
duce each variable’s variance (see 7 Sect. 8.3.2.4).
[Figure: Example of eigenvalues and the percentage of variance accounted for by the extracted factors; the first factors extracted account for the largest share of the overall variance.]
Every additional factor extracted increases the variance accounted for until we have extracted as many factors as there are variables. In this case, the factors account for 100 % of the overall variance, which means that the factors reproduce the complete variance.
For readers interested in the statistical principles, the Explained Visually webpage illustrates the
concepts of eigenvalues and eigenvectors.
http://setosa.io/ev/eigenvectors-and-eigenvalues/
Following the PCA approach, we assume that factor extraction can reproduce each vari-
able’s entire variance. In other words, we assume that each variable’s variance is common;
that is, the variance is shared with other variables. This differs in factor analysis, in which
each variable can also have a unique variance.
Determining the number of factors to extract from the data is a crucial and challenging
step in any PCA. Several approaches offer guidance in this respect, but most researchers
do not pick just one method, but use multiple ones. If different methods suggest the same
number of factors, this leads to greater confidence in the results.
The scree plot plots the eigenvalues in descending order against the number of factors (Cattell 1966). The point at which the curve flattens out markedly is a distinct break called the “elbow.” Researchers typically recommend retaining all factors above this break, as they contribute most to the explanation of the variance in the dataset. Thus, we select one factor less than indicated by the elbow.
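A scree plot is easy to produce outside SPSS as well. The sketch below assumes matplotlib is available; the eigenvalues are illustrative values only, patterned on the chapter's later example.

```python
import matplotlib.pyplot as plt

# Illustrative eigenvalues for eight items (they sum to 8, the number of items)
eigenvalues = [5.25, 1.33, 0.40, 0.30, 0.25, 0.20, 0.15, 0.12]

plt.plot(range(1, len(eigenvalues) + 1), eigenvalues, marker="o")
plt.axhline(1.0, linestyle="--")       # reference line for the Kaiser criterion
plt.xlabel("Factor number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.show()
```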
8.3.3.4 Expectations
When, for example, replicating a previous market research study, we might have a priori
information on the number of factors we want to find. For example, if a previous study
suggests that a certain item set comprises five factors, we should extract the same number
of factors, even if statistical criteria, such as the scree plot, suggest a different number.
Similarly, theory might suggest that a certain number of factors should be extracted from
the data.
Strictly speaking, these are confirmatory approaches to PCA, which blur the distinc-
tion between these two factor analysis types. Ultimately however, we should not only rely
on the data, but keep in mind that the research results should be interpretable and action-
able for market research practice. Once we have decided on the number of factors to retain
from the data, we can start interpreting the factor solution.
[. Fig. 8.6 Orthogonal factor rotation (left) and oblique factor rotation (right) of the factors F1 and F2 in the space spanned by the items x1–x5; in the orthogonal case the rotated factors remain at a 90° angle, whereas in the oblique case the angle between them deviates from 90°.]
Varimax rotation, the most popular orthogonal rotation method, aims at maximizing the dispersion of loadings within factors, which means a few variables will have high loadings, while the remaining variables' loadings will be considerably smaller (Kaiser 1958).
Alternatively, we can choose between several oblique rotation techniques. In oblique
rotation, the 90° angle between the factors is not maintained during rotation, and the
resulting factors are therefore correlated. . Figure 8.6 (right side) illustrates an example
of an oblique factor rotation. Promax rotation is a commonly used oblique rotation tech-
nique. The promax rotation allows for setting an exponent (referred to as kappa) that
needs to be greater than 1. Higher values make the loadings even more extreme (i.e.,
high loadings are amplified and weak loadings are reduced even further), which is at
the cost of stronger correlations between the factors and less total variance explained
(Hamilton 2013). A kappa value of 3 works well for most applications. Direct oblimin
rotation is a popular alternative oblique rotation type, which allows specifying the
maximum degree of obliqueness. This degree is the delta, which determines the level
of the correlation allowed between the factors. A delta of zero (the default) ensures
that the factors are—if at all—only moderately correlated, which is acceptable for most
analyses. Oblique rotation is used when factors are possibly related. It is, for example,
very likely that the respondents’ satisfaction with the stadium is related to their satis-
faction with other aspects of the soccer club, such as the number of stars in the team or
the quality of the merchandise. However, relinquishing the initial objective of extract-
ing uncorrelated factors can diminish the factors’ interpretability. We therefore recom-
mend using the varimax rotation to enhance the interpretability of the results. Only if the results are difficult to interpret should an oblique rotation be applied. Among the
oblique rotation methods, researchers generally recommend the promax (Gorsuch
1983) or oblimin (Kim and Mueller 1978) methods but differences between the rota-
tion types are typically marginal (Brown 2009).
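For readers who want to see what an orthogonal rotation does computationally, the following Python sketch implements the varimax criterion along the lines of Kaiser (1958). The function signature is our own; the input is any matrix of unrotated loadings (items in rows, factors in columns).

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Rotate a (p x k) loading matrix with the varimax criterion; returns the rotated loadings."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)          # accumulated rotation matrix (orthogonal, so the factors stay uncorrelated)
    variance = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        # Singular value decomposition step of the standard varimax algorithm
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr @ np.diag(np.sum(Lr ** 2, axis=0)) / p)
        )
        R = u @ vt
        new_variance = np.sum(s)
        if new_variance < variance * (1 + tol):
            break
        variance = new_variance
    return L @ R
```

Because the rotation matrix is orthogonal, the rotated loadings reproduce the same communalities as the unrotated ones; only the distribution of the loadings across factors changes.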
After the rotation and interpretation of the factors, we can compute the factor scores,
another element of the analysis. Factor scores are linear combinations of the items and can
be used as separate variables in subsequent analyses. For example, instead of using many
highly correlated independent variables in a regression analysis, we can use few uncor-
related factors to overcome collinearity problems.
The simplest way to compute factor scores for each observation is to sum all the scores
of the items assigned to a factor. While easy to compute, this approach neglects the poten-
tial differences in each item’s contribution to each factor (Sarstedt et al. 2016).
Drawing on the item weights produced by the PCA is a more elaborate way of com-
puting factor scores (Hershberger 2005). These weights indicate each item’s relative con-
tribution to forming the factor; we simply multiply the standardized variables’ values
with the weights to get the factor scores. Factor scores computed on the basis of item
weights have a zero mean. This means that if a respondent has a value greater than zero
for a certain factor, he/she scores above the average in terms of the characteristic that this
factor describes. Conversely, if a factor score is below zero, then this respondent exhibits
the characteristic below average.
Different from the PCA, a factor analysis does not produce determinate factor scores.
In other words, the factor is indeterminate, which means that part of it remains an arbi-
trary quantity, capable of taking on an infinite range of values (e.g., Grice 2001; Steiger
1979). Thus, we have to rely on other approaches to compute factor scores. The use of these
approaches is, however, not restricted to factor analysis but extends to PCA, because of the
specific way SPSS (and other programs) have implemented the method. The most promi-
nent of these approaches is the regression method. This method takes into account (1) the
correlation between the factors and variables (via the variable loadings), (2) the correla-
tion between the variables, and (3) the correlation between the factors if oblique rotation
has been used (DiStefano et al. 2009). The regression method z-standardizes each factor to
zero mean and unit standard deviation.5 We can therefore interpret an observation’s score
in relation to the mean and in terms of the units of standard deviation from this mean. For
example, an observation’s factor score of 0.79 implies that this observation is 0.79 standard
deviations above the average with regard to the corresponding factor.
Another popular approach is the Bartlett method, which is similar to the regression
method. In SPSS, the method produces factor scores with zero mean and standard devia-
tions larger than one. Owing to the way they are estimated, the factor scores that the Bartlett
method produces are considered more accurate (Hershberger 2005). However, in practical
applications, both methods produce very similar results. Because of the z-standardization
of the scores, which facilitates the comparison of scores across factors, we recommend
using the regression method.
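The regression method itself is compact: the score coefficients are obtained by premultiplying the loadings with the inverse of the correlation matrix, and the scores follow as weighted sums of the standardized items. The following Python function is a minimal sketch under these assumptions; the function and variable names are ours.

```python
import numpy as np

def regression_factor_scores(Z, R, L):
    """Z: (n x p) z-standardized items, R: (p x p) correlation matrix,
    L: (p x k) rotated loadings. Returns (n x k) factor scores (regression method)."""
    W = np.linalg.solve(R, L)   # score coefficients: R^-1 * L
    return Z @ W                # in a PCA, each score column has mean 0 and standard deviation 1
```

If an oblique rotation has been used, the structure matrix (the loadings multiplied by the factor correlations) takes the place of L, in line with DiStefano et al. (2009).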
In . Table 8.3 we summarize the main steps that need to be taken when conducting a
PCA or factor analysis in SPSS.
8.4 Confirmatory Factor Analysis and Reliability Analysis
Many researchers and practitioners acknowledge the prominent role that exploratory
factor analysis plays in exploring data structures. Data can be analyzed without precon-
ceived ideas of the number of factors or how these relate to the variables under consider-
ation. Whereas this approach is, as its name implies, exploratory in nature, the confirma-
tory factor analysis allows for testing hypothesized structures underlying a set of variables.
In a confirmatory factor analysis, the researcher needs to first specify the constructs
and their associations with variables, which should be based on previous measurements
or theory. Instead of allowing the procedure to determine the number of factors, as is
done in an exploratory factor analysis, a confirmatory factor analysis tells us how well
the actual data fit the pre-specified structure. Returning to our introductory example, we
could, for example, assume that the construct satisfaction with the stadium can be mea-
sured by the three items x1 (condition of the stadium), x2 (appearance of the stadium),
and x3 (interior design of the stadium). Likewise, we could hypothesize that satisfaction
with the merchandise can be adequately measured using the items x4 and x5. In a confir-
matory factor analysis, we set up a theoretical model linking the items with the respective
5 Note that this is only the case in PCA. When using factor analysis, the standard deviations are differ-
ent from one (DiStefano et al. 2009).
. Table 8.3 Steps involved in carrying out a PCA or factor analysis in SPSS

Check assumptions and carry out preliminary analyses
- Select the variables that should be reduced to a set of underlying factors (PCA) or that should be used to identify underlying dimensions (factor analysis): ► Analyze ► Dimension Reduction ► Factor. Enter the variables in the Variables box.
- Are the variables interval or ratio scaled? Determine the measurement level of your variables (see 7 Chap. 3). If ordinal variables are used, make sure that the scale steps are equidistant.
- Is the sample size sufficiently large? Check MacCallum et al.'s (1999) guidelines for minimum sample size requirements, which depend on the variables' communality. For example, if all the communalities are above 0.60, small sample sizes of below 100 are adequate. With communalities around 0.50, sample sizes between 100 and 200 are sufficient. Ensure that your dataset meets these thresholds after handling missing values.
- Choose the method of factor analysis. If the goal is to reduce the number of variables to a set of underlying factors (i.e., principal component analysis): ► Analyze ► Dimension Reduction ► Factor ► Extraction ► Principal components. If the goal is to identify underlying dimensions (i.e., factor analysis): ► Analyze ► Dimension Reduction ► Factor ► Extraction ► Principal axis factoring.

Determine the number of factors
- Kaiser criterion: Extract all factors with an eigenvalue greater than 1 (default).
- Scree plot: Create a scree plot and select the number of factors left of the distinctive break (elbow): ► Analyze ► Dimension Reduction ► Factor ► Extraction ► Scree plot.
- Parallel analysis: Download the syntax file Parallel analysis.sps from the book's website (⤓ Web Appendix → Downloads) and open it in SPSS. Specify the number of observations under compute Ncases (line 10) and the number of variables under compute NVars (line 11). Go to ► Run ► All. Extract those factors whose original eigenvalues are greater than those indicated under Prcntyle.
- A priori information: Pre-specify the number of factors based on a priori information: ► Analyze ► Dimension Reduction ► Factor ► Extraction ► Fixed number of factors: Factors to extract. Check the Cumulative % column in the Total Variance Explained table.

Rotate the factors
- Use the varimax procedure or, if necessary, the promax procedure with kappa set to 3: ► Analyze ► Dimension Reduction ► Factor ► Rotation.

Assign variables to factors and interpret the factors
- Use the rotated solution to assign each variable to a certain factor based on its highest absolute loading. To facilitate interpretation, you may also assign a variable to a different factor, but check that the loading lies at an acceptable level (0.50 if only a few factors are extracted, 0.30 if many factors are extracted).
- Find an umbrella term for the cluster of items assigned to each factor.

Compute factor scores
- Save the factor scores as new variables using the regression method: ► Analyze ► Dimension Reduction ► Factor ► Scores ► Save as variables: Regression.

Check how much of each variable's variance the factor extraction reproduces
- Examine the communalities from the reproduced correlation matrix and check whether the reproduced communalities (on the diagonal) are ≥ 0.50.
construct (note that in confirmatory factor analysis, researchers generally use the term
construct rather than factor). This process is also called operationalization (see 7 Chap. 3)
and usually involves drawing a visual representation (called a path diagram) indicating
the expected relationships.
. Figure 8.7 shows a path diagram—you will notice the similarity to the diagram in
. Fig. 8.1. Ovals represent the constructs (e.g., Y1, satisfaction with the stadium) and boxes
represent the items (x1 to x5). Other elements include the relationships between the con-
structs and respective items (the loadings l1 to l5), the error terms (e1 to e5) that capture the
extent to which a construct does not explain a specific item, and the correlations between
the constructs of interest (r12).
Having defined the individual constructs and developed the path diagram, we can
estimate the model. The relationships between the constructs and items (the loadings l1
to l5) and the item correlations (not shown in . Fig. 8.7) are of particular interest, as they
indicate whether the construct has been reliably and validly measured.
Reliability analysis is an important element of a confirmatory factor analysis and essen-
tial when working with measurement scales. The preferred way to evaluate reliability is by
taking two independent measurements (using the same subjects) and comparing these
using correlations. This is also called test-retest reliability (see 7 Chap. 3). However, prac-
ticalities often prevent researchers from surveying their subjects a second time.
An alternative is to estimate the split-half reliability. In the split-half reliability, scale
items are divided into halves and the scores of the halves are correlated to obtain an esti-
mate of reliability. Since all items should be consistent regarding what they indicate about
the construct, the halves can be considered approximations of alternative forms of the
same scale. Consequently, instead of looking at the scale's test-retest reliability, researchers consider the scale's equivalence, thus showing the extent to which two measures of the same general trait agree. We call this type of reliability the internal consistency reliability.

[. Fig. 8.7 Path diagram of a confirmatory factor analysis: the items x1–x5 (boxes) are linked to the constructs satisfaction with the stadium (Y1) and satisfaction with the merchandise (Y2) (ovals) via the loadings l1–l5; each item has an error term (e1–e5), and the two constructs are correlated (r12).]
In the example of satisfaction with the stadium, we compute this scale’s split-half reli-
8 ability manually by, for example, splitting up the scale into x1 on the one side and x2 and
x3 on the other. We then compute the sum of x2 and x3 (or calculate the items’ average)
to form a total score and correlate this score with x1. A high correlation indicates that the
two subsets of items measure related aspects of the same underlying construct and, thus,
suggests a high degree of internal consistency. However, with many items, there are many
different ways to split the variables into two groups.
Cronbach (1951) proposed calculating the average of all possible split-half coefficients
resulting from different ways of splitting the scale items. The Cronbach’s Alpha coefficient
has become by far the most popular measure of internal consistency. In the example above,
this would comprise calculating the average of the correlations between (1) x1 and x2 + x3,
(2) x2 and x1 + x3, as well as (3) x3 and x1 + x2. The Cronbach’s Alpha coefficient gener-
ally varies from 0 to 1, with a generally agreed lower limit of 0.70 for the coefficient.
However, in exploratory studies, a value of 0.60 is acceptable, while values of 0.80 or higher
are regarded as satisfactory in the more advanced stages of research (Hair et al. 2011). In
Box 8.2, we provide more advice on the use of Cronbach’s Alpha. We will illustrate a reli-
ability analysis using the standard SPSS module in the example at the end of this chapter.
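Cronbach's Alpha can also be computed directly from the item data with its standard variance-based formula. The following Python function is a minimal sketch; the function name is ours.

```python
import numpy as np

def cronbach_alpha(items):
    """items: (n_respondents x k_items) array of item scores. Returns Cronbach's Alpha."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)       # variance of each single item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the summed scale
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)
```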
8.5 Structural Equation Modeling
Whereas a confirmatory factor analysis involves testing if and how items relate to specific
constructs, structural equation modeling involves the estimation of relations between these
constructs. It has become one of the most important methods in social sciences, includ-
ing marketing research.
There are broadly two approaches to structural equation modeling: Covariance-based
structural equation modeling (e.g., Jöreskog 1971) and partial least squares structural equa-
tion modeling (e.g., Wold 1982), simply referred to as CB-SEM and PLS-SEM. Both estima-
tion methods are based on the idea of an underlying model that allows the researcher to
test relationships between multiple items and constructs.
. Figure 8.8 shows an example path diagram with four constructs (represented by circles
or ovals) and their respective items (represented by boxes).6 A path model incorporates
two types of constructs: (1) exogenous constructs (here, satisfaction with the stadium (Y1)
and satisfaction with the merchandise (Y2)) that do not depend on other constructs, and
(2) endogenous constructs (here, overall satisfaction (Y3) and loyalty (Y4)) that depend on
one or more exogenous (or other endogenous) constructs. The relations between the con-
structs (indicated with p) are called path coefficients, while the relations between the con-
structs and their respective items (indicated with l) are the item loadings. One can distin-
guish between the structural model that incorporates the relations between the constructs
and the (exogenous and endogenous) measurement models that represent the relationships
between the constructs and their related items. Items that measure constructs are labeled x.
In the model in . Fig. 8.8, we assume that the two exogenous constructs satisfaction with
the stadium and satisfaction with the merchandise relate to the endogenous construct overall
satisfaction and that overall satisfaction relates to loyalty. Depending on the research question,
we could of course incorporate additional exogenous and endogenous constructs. Using
empirical data, we could then test this model and, thus, evaluate the relationships between
all the constructs and between each construct and its items. We could, for example, assess
which of the two constructs, Y1 or Y2, relates more strongly to Y3. The result helps us when
developing marketing plans to increase overall satisfaction and, ultimately, loyalty.
The evaluation of a path model analysis requires several steps that include the assess-
ment of both measurement models and the structural model. Diamantopoulos and Siguaw
(2000) and Hair et al. (2019) provide thorough descriptions of the covariance-based struc-
tural equation modeling approach and its application. Hair et al. (2017a, 2018) provide a
step-by-step introduction on how to set up and test path models using partial least squares
structural equation modeling.
[. Fig. 8.8 Example of a path model: the measurement models of the exogenous constructs link the items x1–x5 (loadings l1–l5) to satisfaction with the stadium (Y1) and satisfaction with the merchandise (Y2); the measurement models of the endogenous constructs link the items x6–x10 (loadings l6–l10) to overall satisfaction (Y3) and loyalty (Y4); the structural model contains the path coefficients relating Y1 and Y2 to Y3, and Y3 to Y4.]
8.6 Example
In this example, we take a closer look at some of the items from the Oddjob Airways dataset
(⤓ Web Appendix → Downloads). This dataset contains eight items that relate to the cus-
tomers’ experience when flying with Oddjob Airways. For each of the following items, the
respondents had to rate their degree of agreement from 1 (“completely disagree”) to 100
(“completely agree”). The variable names are included below:
- with Oddjob Airways you will arrive on time (s1),
- the entire journey with Oddjob Airways will occur as booked (s2),
- in case something does not work out as planned, Oddjob Airways will find a good solution (s3),
- the flight schedules of Oddjob Airways are reliable (s4),
- Oddjob Airways provides you with a very pleasant travel experience (s5),
- Oddjob Airways's on board facilities are of high quality (s6),
- Oddjob Airways's cabin seats are comfortable (s7), and
- Oddjob Airways offers a comfortable on-board experience (s8).
Our aim is to reduce the complexity of this item set by extracting several factors. Hence,
we use these items to run a PCA.
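If you prefer to replicate the core computations outside SPSS, pandas and numpy suffice. In the sketch below, the file name oddjob.csv and the assumption that it contains the columns s1–s8 are ours; adapt them to wherever you have stored the Oddjob Airways data.

```python
import numpy as np
import pandas as pd

# Load the eight experience items (file name and location are assumptions)
items = pd.read_csv("oddjob.csv")[["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"]].dropna()

Z = (items - items.mean()) / items.std(ddof=1)       # z-standardize the items
R = np.corrcoef(Z.to_numpy(), rowvar=False)          # correlation matrix

eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]       # eigenvalues in descending order
n_components = int(np.sum(eigvals > 1))              # Kaiser criterion

print("Eigenvalues:", eigvals.round(3))
print("Components with eigenvalue > 1:", n_components)
```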
Under Rotation, you can choose between several orthogonal and oblique rotation
methods. Select the Varimax procedure and click on Continue. Finally, under Options, you
can decide how missing values should be handled and specify the display format of the
coefficients in the component matrix. Select Exclude cases listwise to eliminate observations
that have missing values in any of the variables used in any of the analyses. Avoid replacing
missing values with the mean as this will diminish the variation in the data, especially if
there are many missing values in your dataset. You should always check the option Sorted
by size under Coefficient Display Format, as this greatly increases the clarity of the display
of results. If you wish, you can suppress low loadings of say less than 0.20 by selecting the
option Suppress small coefficients. Particularly when analyzing many items, this option
makes the output much easier to interpret. After having specified all the options, you can
initiate the analysis by clicking on Continue, followed by OK.
The descriptive statistics in . Table 8.4 reveal that there are several observations with
missing values in the dataset. However, with 921 valid observations, the sample size
requirements are clearly met, even if the analysis produces very low communality values.
The correlation matrix in the upper part of . Table 8.5 indicates that there are several
pairs of highly correlated variables. The values in the diagonal are all 1.000, which is logical,
as this is the correlation between a variable and itself! The off-diagonal cells correspond
to the pairwise correlations. For example, the pairwise correlation between s1 and s2 is
.754. The corresponding value in the lower part of . Table 8.5 shows that all correlations
are significant (.000). As an absolute minimum standard, we need at least one correlation
in the off-diagonal cells to be significant, and we clearly meet this minimum. The correlation matrix in . Table 8.5 also shows that there are several pairs of highly correlated variables. For example, not only is s1 highly correlated with s2 (correlation = .754), but s3 is also highly correlated with s1 (correlation = .622), just like s4 (correlation = .733). As these variables' correlations with the remaining variables are much lower, it is likely that these
four variables form one factor. As you can see from the correlation matrix, we can already
identify a likely factor structure.
The results in . Table 8.6 indicate a KMO value of .907, which is “marvelous”
(. Table 8.2). Correspondingly, all MSA values shown on the diagonal in the lower part of
the Anti-image Matrices output (. Table 8.7) are high. For example, s1 has an MSA value
of .914. Not surprisingly, Bartlett's test shown in . Table 8.6 is significant (Sig. = .000),
which means that we can reject the null hypothesis of uncorrelated variables. Summariz-
ing these results, we conclude that the data are appropriate for PCA. Hence, we can con-
tinue with the interpretation of the PCA.
[. Table 8.4 Descriptive statistics and . Table 8.5 Correlation matrix of the items s1–s8 (SPSS output, not fully reproduced here); . Table 8.6 KMO and Bartlett's test (df = 28, Sig. = .000)]
[. Table 8.7 Anti-image matrices for s1–s8 (excerpt): e.g., the anti-image covariance of s1 is .354 and its anti-image correlation (MSA, shown on the diagonal) is .917]
[. Table 8.8 Total Variance Explained (excerpt): component 1 has an initial eigenvalue of 5.249 (65.611 % of variance, cumulative 65.611 %); extraction sums of squared loadings: 5.249 (65.611 %); rotation sums of squared loadings: 3.411 (42.633 %, cumulative 42.633 %)]
[Scree plot: eigenvalues plotted against the component number (1–8), with the elbow indicated]
[. Fig. 8.14 SPSS parallel analysis output: the specifications for the run (Ncases, Nvars, Ndatsets, Percent) followed by the random data eigenvalues (Root, Means, and Prcntyle columns)]
In the syntax file, specify the number of observations under Ncases and the number of variables used in the analysis, which is 8. Next, initiate the analysis by going to ► Run
► All and SPSS will show an output similar to . Fig. 8.14.
The column labeled Prcntyle shows the 95th percentile for each factor’s eigenvalue
resulting from the randomly generated data. Note that because of this random process,
your numbers are going to look different. However, deviations typically occur at the third
decimal place. We can now compare the original eigenvalues from . Table 8.8 with the ran-
domly generated eigenvalues from . Fig. 8.14. We learn that the first two factors produce
eigenvalues larger than the randomly generated eigenvalues. Whereas the first original
eigenvalue is clearly higher (5.249; . Table 8.8) than the randomly generated one (1.185;
. Fig. 8.14), the difference is much less pronounced for the second factor (1.328 vs. 1.122).
The third factor’s original eigenvalue (.397) is clearly lower than its randomly generated
counterpart (1.076). Hence, based on the parallel analysis results, we would also opt for
a two-factor solution.
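Parallel analysis does not require the SPSS syntax file; it can be replicated with a short Python sketch like the one below, which draws uncorrelated random data of the same dimensions as the example (921 observations, 8 items) and compares the 95th percentiles of the resulting eigenvalues with the original ones. The number of random datasets (1,000) is an assumption.

```python
import numpy as np

rng = np.random.default_rng(12345)
n, p, n_datasets = 921, 8, 1000          # sample size and number of items as in the example

random_eigs = np.empty((n_datasets, p))
for i in range(n_datasets):
    X = rng.standard_normal((n, p))                       # uncorrelated random data
    R_rand = np.corrcoef(X, rowvar=False)
    random_eigs[i] = np.sort(np.linalg.eigvalsh(R_rand))[::-1]

thresholds = np.percentile(random_eigs, 95, axis=0)       # 95th percentile per factor
# Retain only those factors whose original eigenvalues exceed the corresponding threshold
print(thresholds.round(3))
```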
Rotated Component Matrix

        Component 1   Component 2
s6      .903          .272
s7      .902          .208
s8      .856          .352
s5      .852          .347
s4      .198          .885
s2      .282          .871
s1      .299          .829
s3      .340          .759
The rotated solution shows that the first set of variables (s1–s4) relates to reliability aspects of the journey and related
processes, such as the booking. We could therefore label this factor (i.e., factor 2) reli-
ability. The second set of variables (s5–s8) relate to different aspects of the onboard facil-
ities and the travel experience. Hence, we could label this factor (i.e., factor 1) onboard
experience. The labeling of factors is subjective and you could provide different labels.
> In case the analysis indicates a poor goodness-of-fit, you should reconsider the
set-up by eliminating items that have low communalities and MSA values.
[Reproduced Correlations (SPSS output, excerpt): for s1, the reproduced communality (diagonal) is .777 and the reproduced correlations with s2–s8 are .806, .731, .793, .542, .495, .442, and .548]
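The goodness-of-fit check can be reproduced from the rotated loadings: the reproduced correlation matrix is the product of the loading matrix and its transpose, its diagonal contains the reproduced communalities, and the residuals are the differences from the observed correlations. A minimal sketch follows; the function and variable names, as well as the 0.05 residual cut-off, are our own.

```python
import numpy as np

def reproduction_check(R, L):
    """R: observed (p x p) correlation matrix, L: (p x k) rotated loadings."""
    reproduced = L @ L.T                          # reproduced correlation matrix
    communalities = np.diag(reproduced)           # reproduced communalities on the diagonal
    residuals = R - reproduced                    # correlation residuals
    upper = residuals[np.triu_indices_from(residuals, k=1)]
    share_large = np.mean(np.abs(upper) > 0.05)   # share of residuals larger than 0.05
    return communalities, share_large
```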
Communalities

        Initial   Extraction
s1      1.000     .777
s2      1.000     .838
s3      1.000     .691
s4      1.000     .822
s5      1.000     .847
s6      1.000     .890
s7      1.000     .856
s8      1.000     .856
! SPSS can only calculate factor scores if it has information on all the variables
included in the analysis. If SPSS does not have all the information, it only shows a “.”
(dot) in the data view window, indicating a system-missing value.
To illustrate its usage, let’s carry out a reliability analysis of the first factor onboard experi-
ence by calculating Cronbach’s Alpha as a function of variables s5 to s8. To run the reliabil-
ity analysis, click on ► Analyze ► Scale ► Reliability Analysis. Next, enter variables s5,
s6, s7, and s8 into the Items box (again, you may have to right-click on the items and select
Display Variable Names to show the names instead of the variable labels) and type in a label for the scale. Under Statistics, you can request the item-total statistics by selecting Scale if item deleted; then click on Continue, followed by OK.

[. Table 8.12 Reliability Statistics (SPSS output) showing the overall Cronbach's Alpha for the scale]

Note that, strictly speaking, you would carry out a reliability analysis to test a scale using a different sample—this example is only
for illustration purposes! The rightmost column of . Table 8.13 indicates what the Cron-
bach’s Alpha would be if we deleted the item indicated in that row. When we compare
each of the values with the overall Cronbach’s Alpha value, we can see that any change in
the scale’s set-up would reduce the Cronbach’s Alpha value. For example, by removing s5
from the scale, the Cronbach’s Alpha of the new scale comprising only s6, s7, and s8 would
be reduced to .928. In the column labeled Corrected Item-Total Correlation of . Table 8.13,
SPSS indicates the correlation between the item and the scale that is composed of other
items. This information is useful for determining whether reverse-coded items were also
identified as such. Reverse-coded items should have a minus sign.
[. Table 8.13 Item-Total Statistics (SPSS output), including the Corrected Item-Total Correlation and Cronbach's Alpha if Item Deleted columns]
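The Corrected Item-Total Correlation and Cronbach's Alpha if Item Deleted columns can likewise be reproduced outside SPSS. The following Python function is a sketch under the assumption that items is a two-dimensional array holding the scores for s5–s8 (or any other set of items); the function name is ours.

```python
import numpy as np

def item_total_statistics(items):
    """items: (n x k) array of item scores. Returns, per item, the corrected item-total
    correlation and Cronbach's Alpha if that item is deleted."""
    items = np.asarray(items, dtype=float)
    n, k = items.shape

    def alpha(data):
        m = data.shape[1]
        return m / (m - 1) * (1 - data.var(axis=0, ddof=1).sum() / data.sum(axis=1).var(ddof=1))

    results = []
    for j in range(k):
        rest = np.delete(items, j, axis=1)                    # scale without item j
        r_item_total = np.corrcoef(items[:, j], rest.sum(axis=1))[0, 1]
        results.append((r_item_total, alpha(rest)))           # corrected item-total r, alpha if deleted
    return results
```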
8.7 Case Study

Haver & Boecker (http://www.haverboecker.com) is one of the world's leading and most renowned machine producers in the fields of mineral processing, as well as the storing, conveying, packing, and loading of bulk material. The family-owned group operates through its global network of facilities, with manufacturing units, among others, in Germany, the UK, Belgium, US, Canada, Brazil, China, and India.
The company's relationships with its customers are usually long-term oriented and complex. Since the company's philosophy is to help customers and business partners solve their challenges or problems, they often customize their products and services to meet the buyers' needs. Therefore, the customer is not a passive buyer, but an active partner. Given this background, the customers' satisfaction plays an important role in establishing, developing, and maintaining successful customer relationships.
Early on, the company's management realized the importance of customer satisfaction and decided to commission a market research project in order to identify marketing activities that can positively contribute to the business's overall success. Based on a thorough literature review, as well as interviews with experts, the company developed a short survey to explore their customers' satisfaction with specific performance features and their overall satisfaction. All the items were measured on seven-point scales, with higher scores denoting higher levels of satisfaction. A standardized survey was mailed to customers in 12 countries worldwide, which resulted in 281 fully completed questionnaires. The following items (names in parentheses) were listed in the survey:
- Reliability of the machines and systems. (s1)
- Life-time of the machines and systems. (s2)
- Functionality and user-friendliness of the operation of the machines and systems. (s3)
- Appearance of the machines and systems. (s4)
- Accuracy of the machines and systems. (s5)
- Timely availability of the after-sales service. (s6)
- Local availability of the after-sales service. (s7)
- Fast processing of complaints. (s8)
- Composition of quotations. (s9)
- Transparency of quotations. (s10)
- Fixed product price for the machines and systems. (s11)
- Cost/performance ratio of the machines and systems. (s12)
- Overall, how satisfied are you with the supplier? (overall)

Your task is to analyze the dataset to provide the management of Haver & Boecker with advice for effective customer satisfaction management. The dataset is labeled Haver and Boecker.sav (⤓ Web Appendix → Downloads).
1. Using regression analysis, locate those variables that best explain the customers' overall satisfaction (overall). Evaluate the model fit and assess the impact of each variable on the dependent variable. Remember to consider collinearity diagnostics.
2. Determine the factors that characterize the respondents using factor analysis. Use items s1–s12 for this. Run a PCA with varimax rotation to help interpretation. Consider the following aspects:
(a) Are all assumptions for carrying out a PCA met? Are the data sufficiently correlated?
(b) How many factors would you extract? Base your decision on the Kaiser criterion, the scree plot, and parallel analysis. Do these three methods suggest the same number of factors?
(c) Find suitable labels for the extracted factors.
(d) Evaluate the factor solution's goodness-of-fit.
3. Use the factor scores and regress the customers' overall satisfaction (overall) on these. Evaluate the strength of the model and compare it with the initial regression. What should Haver & Boecker's management do to increase their customers' satisfaction?
4. Calculate the Cronbach's Alpha over items s1–s5 and interpret the results.

For further information on the dataset and the study, see Festge and Schwaiger (2007), as well as Sarstedt et al. (2009).

8.8 Review Questions
1. What is factor analysis? Try to explain what factor analysis is in your own words.
2. What is the difference between exploratory factor analysis and confirmatory factor
analysis?
3. What is the difference between PCA and factor analysis?
4. Describe the terms communality, eigenvalue, and factor loading. How do these
concepts relate to one another?
5. Describe the Kaiser criterion, the scree plot, and parallel analysis to determine
the number of factors. What are the similarities and differences between these
methods?
References
Brown, J. D. (2009). Choosing the right type of rotation in PCA and EFA. JALT Testing & Evaluation SIG News-
letter, 13(3), 20–25.
Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1(2),
245–276.
Cliff, N. (1987). Analyzing multivariate data. New York, NY: Harcourt Brace Jovanovich.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.
Diamantopoulos, A., & Siguaw, J. A. (2000). Introducing LISREL: A guide for the uninitiated. London: Sage.
Dinno, A. (2009). Exploring the sensitivity of Horn’s parallel analysis to the distributional form of random
data. Multivariate Behavioral Research, 44(3), 362–388.
DiStefano, C., Zhu, M., & Mîndrilă, D. (2009). Understanding and using factor scores: Considerations for the applied researcher. Practical Assessment, Research & Evaluation, 14(20), 1–11.
Festge, F., & Schwaiger, M. (2007). The drivers of customer satisfaction with industrial goods: An interna-
tional study. Advances in International Marketing, 18, 179–207.
Gorsuch, R. L. (1983). Factor analysis (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
Graffelman, J. (2013). Linear-angle correlation plots: New graphs for revealing correlation structure. Jour-
nal of Computational and Graphical Statistics, 22(1), 92–106.
Grice, J. W. (2001). Computing and evaluating factor scores. Psychological Methods, 6(4), 430–450.
Hair, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2019). Multivariate data analysis. A global perspective
(8th ed.). Boston, MA: Cengage.
Hair, J. F., Ringle, C. M., & Sarstedt, M. (2011). PLS-SEM: Indeed a silver bullet. Journal of Marketing Theory
and Practice, 19(2), 139–151.
Hair, J. F., Hult, G. T. M., Ringle, C. M., & Sarstedt, M. (2017a). A primer on partial least squares structural equa-
tion modeling (PLS-SEM) (2nd ed.). Thousand Oaks, CA: Sage.
Hair, J. F., Hult, G. T. M., Ringle, C. M., Sarstedt, M., & Thiele, K. O. (2017b). Mirror, mirror on the wall. A com-
parative evaluation of composite-based structural equation modeling methods. Journal of the Acad-
emy of Marketing Science, 45(5), 616–632.
Hair, J. F., Sarstedt, M., Ringle, C. M., & Gudergan, S. P. (2018). Advanced issues in partial least squares struc-
tural equation modeling (PLS-SEM). Thousand Oaks, CA: Sage.
Hamilton, L. C. (2013). Statistics with Stata: Version 12. Cengage Learning.
Hayton, J. C., Allen, D. G., & Scarpello, V. (2004). Factor retention decisions in exploratory factor analysis:
A tutorial on parallel analysis. Organizational Research Methods, 7(2), 191–205.
Henson, R. K., & Roberts, J. K. (2006). Use of exploratory factor analysis in published research: Common
errors and some comment on improved practice. Educational and Psychological Measurement, 66(3),
393–416.
Hershberger, S. L. (2005). Factor scores. In: B. S. Everitt & D. C. Howell (Eds.), Encyclopedia of statistics in
behavioral science (pp. 636–644). New York, NY: John Wiley.
Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30(2),
179–185.
Jöreskog, K. G. (1971). Simultaneous factor analysis in several populations. Psychometrika, 36(4), 409–426.
Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23(3), 187–200.
Kaiser, H. F. (1974). An index of factorial simplicity. Psychometrika, 39(1), 31–36.
Kim, J. O., & Mueller, C. W. (1978). Introduction to factor analysis: What it is and how to do it. Thousand Oaks,
CA: Sage.
Longman, R. S., Cota, A. A., Holden, R. R., & Fekken, G. C. (1989). A regression equation for the parallel anal-
ysis criterion in principal components analysis: Mean and 95th percentile Eigenvalues. Multivariate
Behavioral Research, 24(1), 59–69.
MacCallum, R. C., Widaman, K. F., Zhang, S., & Hong, S. (1999). Sample size in factor analysis. Psychological
Methods, 4(1), 84–99.
Matsunaga, M. (2010). How to factor-analyze your data right: Do's and don'ts and how to's. International
Journal of Psychological Research, 3(1), 97–110.
Mulaik, S. A. (2009). Foundations of factor analysis (2nd ed.). London: Chapman & Hall.
O'Connor, B. P. (2000). SPSS and SAS programs for determining the number of components using parallel
analysis and Velicer's MAP test. Behavior Research Methods, Instruments, & Computers, 32(3), 396–402.
Preacher, K. J., & MacCallum, R. C. (2003). Repairing Tom Swift’s electric factor analysis machine. Under-
standing Statistics, 2(1), 13–43.
Russell, D. W. (2002). In search of underlying dimensions: The use (and abuse) of factor analysis in Person-
ality and Social Psychology Bulletin. Personality and Social Psychology Bulletin, 28(12), 1629–1646.
Sarstedt, M., Schwaiger, M., & Ringle, C. M. (2009). Do we fully understand the critical success factors of
customer satisfaction with industrial goods? Extending Festge and Schwaiger’s model to account for
unobserved heterogeneity. Journal of Business Market Management, 3(3), 185–206.
Sarstedt, M., Ringle, C. M., Raithel, S., & Gudergan, S. (2014). In pursuit of understanding what drives fan
satisfaction. Journal of Leisure Research, 46(4), 419–447.
Sarstedt, M., Hair, J. F., Ringle, C. M., Thiele, K. O., & Gudergan, S. P. (2016). Estimation issues with PLS and
CBSEM: Where the bias lies!. Journal of Business Research, 69(10), 3998–4010.
Steiger, J. H. (1979). Factor indeterminacy in the 1930's and the 1970's: Some interesting parallels. Psychometrika, 44(2), 157–167.
Stevens, J. P. (2009). Applied multivariate statistics for the social sciences (5th ed.). Hillsdale: Erlbaum.
Velicer, W. F., & Jackson, D. N. (1990). Component analysis versus common factor analysis: Some issues in
selecting an appropriate procedure. Multivariate Behavioral Research, 25(1), 1–28.
Widaman, K. F. (1993). Common factor analysis versus principal component analysis: Differential bias in
representing model parameters?. Multivariate Behavioral Research, 28(3), 263–311.
Wold, H. O. A. (1982). Soft modeling: The basic design and some extensions. In: K. G. Jöreskog & H. O. A.
Wold (Eds.), Systems under indirect observations: Part II (pp. 1–54). Amsterdam: North-Holland.
Zwick, W. R., & Velicer, W. F. (1986). Comparison of five rules for determining the number of components
to retain. Psychological Bulletin, 99(3), 432–442.
Further Reading
Nunnally, J. C., & Bernstein, I. H. (1993). Psychometric theory (3rd ed.). New York: McGraw-Hill.
Stewart, D. W. (1981). The application and misapplication of factor analysis in marketing research. Journal
of Marketing Research, 18(1), 51–62.