
Understanding Factor Analysis in Statistics

Factor analysis is a statistical technique used to analyze relationships among variables and reduce a large set of variables down to a smaller set of underlying dimensions called factors. It involves identifying common factors that explain the correlations between variables. The document provides an example of how factor analysis could be used to validate a conceptual model of leadership as comprising two factors of task skills and people skills. It also outlines the basic steps in conducting a factor analysis, including extracting an initial factor solution, rotating the factors, interpreting the results, and naming the identified factors.

Uploaded by

veerashah85
Copyright
© Attribution Non-Commercial (BY-NC)

Multivariate Statistics: Factor Analysis

Factor Analysis can be seen as the granddaddy of all the multivariate techniques we are
looking at here. Of the three, it is the most frequently used and has the largest body of
literature devoted to it. (See references for some places to start.)

Definition and an example

Factor analysis is:

a statistical approach that can be used to analyze interrelationships among a large number
of variables and to explain these variables in terms of their common underlying
dimensions (factors). The statistical approach involving finding a way of condensing the
information contained in a number of original variables into a smaller set of dimensions
(factors) with a minimum loss of information (Hair et al., 1992).

Factor analysis could be used to verify your conceptualization of a construct of interest.


For example, in many studies, the construct of "leadership" has been observed to be
composed of "task skills" and "people skills." Let's say that, for some reason, you are
developing a new questionnaire about leadership and you create 20 items. You think 10
will reflect "task" elements and 10 "people" elements, but since your items are new, you
want to test your conceptualization.

Before you use the questionnaire on your sample, you decide to pretest it (always wise!)
on a group of people who are like those who will be completing your survey. When you
analyze your data, you do a factor analysis to see if there are really two factors, and if
those factors represent the dimensions of task and people skills. If they do, you will be
able to create two separate scales, by summing the items on each dimension. If they don't,
well it's back to the drawing board.

What you need in order to do a factor analysis

Remember, factor analysis requires that you have data in the form of correlations, so all
of the assumptions that apply to correlations are relevant here.
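As a minimal sketch of that starting point, the snippet below builds a correlation matrix from raw questionnaire responses with NumPy. The data are simulated, and the sample size, item count, and noise level are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

n_respondents, n_items = 200, 6  # hypothetical pretest size and item count

# Simulate two latent traits ("task" and "people"), then build
# three items from each trait and add random noise to every item.
task = rng.normal(size=n_respondents)
people = rng.normal(size=n_respondents)
noise = rng.normal(scale=0.6, size=(n_respondents, n_items))
responses = np.column_stack([task, task, task, people, people, people]) + noise

# Pearson correlation matrix of the items (columns are the variables).
R = np.corrcoef(responses, rowvar=False)

print(R.shape)
print(np.round(R, 2))
```

The resulting matrix is symmetric with ones on the diagonal, and it is this matrix, not the raw responses, that the extraction step works from.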

Types of factor analysis: Two main types:

• Principal component analysis -- this method provides a unique solution, so that
the original data can be reconstructed from the results. It looks at the total
variance among the variables, so the solution generated will include as many
factors as there are variables, although it is unlikely that they will all meet the
criteria for retention. There is only one method for completing a principal
components analysis; this is not true of any of the other multidimensional
methods described here.
• Common factor analysis -- this is what people generally mean when they say
"factor analysis." This family of techniques uses an estimate of common variance
among the original variables to generate the factor solution. Because of this, the
number of factors will always be less than the number of original variables. So,
choosing the number of factors to keep for further analysis is more problematic
using common factor analysis than in principal components.
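The principal component bullet can be illustrated with a short NumPy sketch: an eigendecomposition of a correlation matrix yields as many components as there are variables, and keeping all of them reproduces the matrix exactly. The 4-variable correlation matrix here is hypothetical, and this shows the principal component case only, not common factor analysis.

```python
import numpy as np

# Hypothetical correlation matrix for four variables.
R = np.array([
    [1.0, 0.6, 0.2, 0.1],
    [0.6, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.5],
    [0.1, 0.2, 0.5, 1.0],
])

# Eigendecomposition of the symmetric matrix; sort components largest first.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Loadings: each eigenvector scaled by the square root of its eigenvalue.
loadings = eigenvectors * np.sqrt(eigenvalues)

# Using every component, the loadings rebuild the correlation matrix.
R_rebuilt = loadings @ loadings.T

print(len(eigenvalues))             # one component per variable
print(np.round(eigenvalues, 3))     # they sum to the number of variables
print(np.allclose(R, R_rebuilt))
```

Because the eigenvalues add up to the number of variables, each one can be read as a share of the total variance.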

Steps in conducting a factor analysis

There are four basic factor analysis steps:

• data collection and generation of the correlation matrix
• extraction of initial factor solution
• rotation and interpretation
• construction of scales or factor scores to use in further analyses

Extraction of an initial solution

The output of a factor analysis will give you several things. The table below shows how
output helps to determine the number of components/factors to be retained for further
analysis. One good rule of thumb for determining the number of factors is the
"eigenvalue greater than 1" criterion. For the moment, let's not worry about the meaning of
eigenvalues; however, this criterion allows us to be fairly sure that any factors we keep will
account for at least as much variance as one of the variables used in the analysis. However,
when applying this rule, keep in mind that when the number of variables is small, the
analysis may result in fewer factors than "really" exist in the data, while a large number
of variables may produce more factors meeting the criterion than are meaningful. There
are other criteria for selecting the number of factors to keep, but this is the easiest to
apply, since it is the default of most statistical computer programs.

Note that the factors will all be orthogonal to one another, meaning that they will be
uncorrelated.
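A quick way to see this is to extract principal components from some data and correlate the resulting component scores with one another. The sketch below does that with simulated responses; the sample size, noise level, and two-trait structure are assumptions made for illustration, and it covers the unrotated principal component case.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: 300 respondents, 6 items driven by two latent traits plus noise.
n = 300
task = rng.normal(size=n)
people = rng.normal(size=n)
X = np.column_stack([task, task, task, people, people, people])
X = X + rng.normal(scale=0.7, size=X.shape)

# Standardize each item, then take the eigenvectors of the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(X, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(R)

# Component scores for every respondent (one column per component).
scores = Z @ eigenvectors

# Correlations among the components: the identity matrix, up to rounding error.
score_corr = np.corrcoef(scores, rowvar=False)
print(np.round(score_corr, 6))
```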

Remember that in our hypothetical leadership example, we expected to find two factors,
representing task and people skills. The first output is the results of the extraction of
components/factors, which will look something like this:

Table #1: Sample extraction of components/factors

Factor   Eigenvalue   % of variance   Cumulative % of variance
1        2.6379       44.5            44.5
2        1.9890       39.3            83.8
3        0.8065       8.4             92.2
4        0.6783       7.8             100.0
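The "eigenvalue greater than 1" rule and the cumulative column can both be checked in a few lines. This sketch copies the eigenvalues and the "% of variance" column from Table #1; the cumulative column should be the running total of the "% of variance" column. The variable names are illustrative only.

```python
import numpy as np

# Values copied from Table #1.
eigenvalues = np.array([2.6379, 1.9890, 0.8065, 0.6783])
pct_variance = np.array([44.5, 39.3, 8.4, 7.8])

# "Eigenvalue greater than 1" rule: keep factors whose eigenvalue exceeds 1.
retained = (np.flatnonzero(eigenvalues > 1.0) + 1).tolist()  # 1-based factor numbers

# Cumulative % of variance is the running sum of the % of variance column.
cumulative = np.round(np.cumsum(pct_variance), 1)

print("Retained factors:", retained)
print("Cumulative %:", cumulative.tolist())
print("Variance represented by retained factors:", cumulative[len(retained) - 1])
```

Under this rule only factors 1 and 2 are kept, and together they account for the 83.8% figure used in the interpretation that follows.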

Interpreting your results


Since the first two factors were the only ones that had eigenvalues > 1, the final
two-factor solution will only represent 83.8% of the variance in the data. In the factor
matrix, the loadings listed under the "Factor" headings represent the correlation between
each item and that factor. Like Pearson correlations, they range from -1 to 1. The next
panel of factor analysis output might look something like this:

Table #2: Unrotated Factor Matrix

Variables                               Factor 1   Factor 2   Communality
Ability to define problems              .81        -.45       .87
Ability to supervise others             .84        -.31       .79
Ability to make decisions               .80        -.29       .90
Ability to build consensus              .89        .37        .88
Ability to facilitate decision-making   .79        .51        .67
Ability to work on a team               .45        .43        .72

This table shows the difficulty of interpreting an unrotated factor solution. All of the
largest loadings are on Factor #1. This is a common pattern. One way to
obtain more interpretable results is to rotate your solution. Most computer packages use
varimax rotation, although there are other techniques.

Below is an example of what the factors might look like if we rotated them. Notice that
the loadings are distributed between the factors, and that the results are easier to interpret.

Table #3: Rotated Factor Matrix

Variables                               Factor 1   Factor 2   Communality
Ability to define problems              .68        .17        .87
Ability to supervise others             .87        .24        .79
Ability to make decisions               .65        .07        .90
Ability to build consensus              .16        .76        .88
Ability to facilitate decision-making   .30        .83        .67
Ability to work on a team               .19        .69        .72
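For readers curious about what a varimax rotation does mechanically, here is a minimal NumPy sketch of one commonly used SVD-based iteration, applied to the loadings from Table #2. It is an illustration under stated assumptions, not the routine any particular statistics package uses: the function name `varimax` and its parameters are hypothetical, and no Kaiser normalization is applied. The tables in this section appear to be illustrative mock-ups, so the output is not expected to match Table #3 digit for digit.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loading matrix toward simple structure (varimax)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                # nearest orthogonal matrix
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break                        # no meaningful improvement
        criterion = new_criterion
    return loadings @ rotation, rotation

# Unrotated loadings copied from Table #2.
unrotated = np.array([
    [0.81, -0.45],
    [0.84, -0.31],
    [0.80, -0.29],
    [0.89,  0.37],
    [0.79,  0.51],
    [0.45,  0.43],
])

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))

# An orthogonal rotation leaves each item's communality
# (its row sum of squared loadings) unchanged.
print(np.round((unrotated**2).sum(axis=1), 2))
print(np.round((rotated**2).sum(axis=1), 2))
```

The rotation only redistributes each item's variance between the factors, which is why the Communality column is the same in the unrotated and rotated tables.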

Naming the factors

Now we have a highly interpretable solution, which represents 83.8% of the variance in
the data. The next step is to name the factors. There are a few rules suggested by methodologists:

Factor names should

• be brief, one or two words
• communicate the nature of the underlying construct

Look for patterns of similarity between items that load on a factor. If you are seeking to
validate a theoretical structure, you may want to use the factor names that already exist in
the literature. Otherwise, use names that will communicate your conceptual structure to
others. In addition, you can try looking at what items do not load on a factor, to
determine what that factor isn't. Also, try reversing loadings to get a better interpretation.
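One simple way to start looking for those patterns is to group each item under the factor on which it has its largest absolute loading. The sketch below does this for the rotated loadings in Table #3; the grouping rule and the variable names are illustrative assumptions, and the naming itself remains a judgment call.

```python
import numpy as np

items = [
    "Ability to define problems",
    "Ability to supervise others",
    "Ability to make decisions",
    "Ability to build consensus",
    "Ability to facilitate decision-making",
    "Ability to work on a team",
]

# Rotated loadings copied from Table #3 (columns: Factor 1, Factor 2).
rotated = np.array([
    [0.68, 0.17],
    [0.87, 0.24],
    [0.65, 0.07],
    [0.16, 0.76],
    [0.30, 0.83],
    [0.19, 0.69],
])

# Index of the factor with the largest absolute loading for each item.
strongest = np.abs(rotated).argmax(axis=1)

# Collect the items that belong to each factor.
groups = {
    f"Factor {j + 1}": [item for item, f in zip(items, strongest) if f == j]
    for j in range(rotated.shape[1])
}

for factor, members in groups.items():
    print(factor, "->", members)
```

Here the first three items fall under Factor 1 and the last three under Factor 2, which fits the "task skills" and "people skills" labels from the leadership example.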

Using the factor scores

It is possible to do several things with factor analysis results, but the most common are to
use factor scores, or to make summated scales based on the factor structure.

Because the results of a factor analysis can be strongly influenced by the presence of
error in the original data, Hair et al. recommend using factor scores if the scales used to
collect the original data are "well-constructed, valid, and reliable" instruments.
Otherwise, they suggest that if the scales are "untested and exploratory, with little or no
evidence of reliability or validity," summated scores should be constructed. An added
benefit of summated scores is that if they are to be used in further analysis, they preserve
the variation in the data.
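A summated scale is simply the sum of a respondent's answers to the items that load on one factor. The sketch below assumes six items answered on a 1-to-5 scale, with the first three assigned to the task factor and the last three to the people factor; the responses are randomly generated and the item-to-factor assignment is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical responses: 5 respondents x 6 items, each answered on a 1-5 scale.
responses = rng.integers(1, 6, size=(5, 6))

# Column indices of the items assigned to each factor (an assumption for this sketch).
task_items = [0, 1, 2]
people_items = [3, 4, 5]

# Summated scales: add up each respondent's answers within a factor.
task_scale = responses[:, task_items].sum(axis=1)
people_scale = responses[:, people_items].sum(axis=1)

print(responses)
print("Task scale:  ", task_scale)
print("People scale:", people_scale)
```

Each respondent ends up with one score per factor, and those scores can then be carried into further analyses.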

Other links

Phillip Ingram, of the School of Earth Sciences, Macquarie University, Sydney,
Australia, has a Statistics Page, which includes separate pages for Multivariable
Statistics, including principal components and factor analysis. The material is
more advanced than that presented here, but very useful for those who will be
employing these analysis techniques.
