Statistical Terms and Tests Overview

This document serves as a refresher on statistical terms and tests, covering descriptive and inferential statistics, measurement scales, and hypothesis testing. It includes examples of SPSS output, questionnaires for data analysis, and various statistical tests such as t-tests and ANOVA. The document also outlines the process for statistical hypothesis testing and provides insights into regression analysis.

Uploaded by

goayo23
Copyright
© All Rights Reserved

Appendix I

A Refresher on some
Statistical Terms and Tests
Chapter Objectives
• Provide a ‘refresher’ of some statistical terms and tests
• Explain what types of analysis are appropriate, under
what conditions and for what objectives
• Give examples of SPSS computer output
• Explain descriptive statistics, including frequencies,
means, standard deviations, and variance
• Present a process for statistical hypothesis testing using
a computer package
• Demonstrate how inferential statistics can be used to test
hypotheses
Statistics
• Descriptive Statistics
– Help to describe the phenomena of interest
• Inferential Statistics
– Help to draw conclusions from the analysis of
data
• Parametric
– Assumes sample drawn from normal population
• Non-parametric
– Does not assume that the sample is drawn from a normal population
Properties of the Four Measurement Scales
          ------------- Highlights --------------
Scale     Difference  Order  Distance  Unique    Measures of      Measures of                    Some tests of
                                       origin    central tendency dispersion                     significance

Nominal   Yes         No     No        No        Mode             -                              χ²

Ordinal   Yes         Yes    No        No        Median           Semi-interquartile range       Rank-order correlations

Interval  Yes         Yes    Yes       No        Arithmetic mean  Standard deviation, variance,  t, F
                                                                  coefficient of variation

Ratio     Yes         Yes    Yes       Yes       Arithmetic or    Standard deviation, variance   t, F
                                                 geometric mean   or coefficient of variation
Note: The interval scale has 1 as an arbitrary starting point. The ratio scale has the natural origin 0, which is meaningful.
Sample Questionnaire for Data Analysis

Business Research Class Questionnaire


The purpose of this short questionnaire is to collect some nominal, ordinal, interval
and ratio data that can be used to demonstrate some of the basic statistical methods
for analysing quantitative data. The individual responses will be anonymous and
the data collected will be used only for class exercises.
Please tick the appropriate box, provide the data requested or circle a number,
where appropriate.
1. What is your gender?
Female Male
2. Please indicate your height to the nearest centimetre (cm)
3. Please indicate your weight to the nearest kilogram (kg)
4. What is the colour of your eyes? (just tick one box please!)
Blue Brown Other
5. Please indicate the extent to which you disagree or agree with the following
statements:
                                             Strongly                               Strongly
                                             disagree   Disagree   Neutral   Agree  agree
Statistics is interesting.                      1          2          3        4      5
Statistics knowledge is useful in business.     1          2          3        4      5
Many thanks for your time and assistance with completing this questionnaire.
We will now proceed to analyse the data collected!
Variable Names, Labels and Values for
Sample Data Set
Variable name   Label (variable definition)      Values for variable
id              Identifier for each respondent
gender          Female or male                   0 = Female
                                                 1 = Male
height          Height in centimetres
weight          Weight in kilograms
eye_col         Eye colour                       1 = Blue
                                                 2 = Brown
                                                 3 = Other
stat_int        Statistics is interesting        1 = Strongly disagree
                                                 2 = Disagree
                                                 3 = Neutral
                                                 4 = Agree
                                                 5 = Strongly agree
stat_use        Statistics is useful

Example of SPSS Data Editor Input Data –
Data View
Example of SPSS Data Editor Input Data –
Variable View
Descriptive Statistics
• Frequencies
• Measures of central tendency
– mean, median, mode
• Measures of dispersion
– range, variance, standard deviation
– other measures
– standard error of the mean
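The measures listed above can be sketched in a few lines of Python using only the standard library. The response values below are hypothetical (they are not the class data set), and the variable names are illustrative.

```python
import math
import statistics
from collections import Counter

# Hypothetical responses to 'Statistics is interesting'
# (1 = Strongly disagree ... 5 = Strongly agree); not the class data.
responses = [1, 2, 2, 3, 3, 3, 4, 4, 5, 3]

# Frequencies: how often each response value occurs
frequencies = Counter(responses)

# Measures of central tendency
mean = statistics.mean(responses)
median = statistics.median(responses)
mode = statistics.mode(responses)

# Measures of dispersion
value_range = max(responses) - min(responses)
variance = statistics.variance(responses)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(responses)       # square root of the sample variance
std_error = std_dev / math.sqrt(len(responses))

print(sorted(frequencies.items()))
print(mean, median, mode, value_range)
print(round(variance, 3), round(std_dev, 3), round(std_error, 3))
```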
Example – Responses to the statement
‘Statistics is interesting’
The Mean

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
Range
Represents the difference between the
highest and lowest values of a variable of
interest.
• Eg, max = 50, min = 30, range = 20
Variance Formula

Note: This formula is correct. The formula in the book is incorrect
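The formula on this slide did not survive extraction, so it cannot be quoted here. As an assumption, the usual sample variance, which divides by n - 1 rather than n, is:

```latex
s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1},
\qquad
s = \sqrt{s^2}
```

Here $X_i$ are the observations, $\bar{X}$ is the sample mean, $n$ is the sample size and $s$ is the standard deviation.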


Area under the Normal Curve
Box and Whisker Plots
Normal, Skewed and Sampling Distributions

Source: Adapted from Zikmund (2000:381).


Standard Error of the Mean
• When a number of samples are taken from
the population, the sample means form a
distribution
• The standard deviation of these sample
means is called the standard error of the
mean
• As the sample size increases, the standard
error gets smaller
Standard Error of the Mean - Formula
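The formula image is missing from this slide. Assuming the standard definition, the standard error of the mean is the sample standard deviation divided by the square root of the sample size:

```latex
S_{\bar{X}} = \frac{S}{\sqrt{n}}
```

Because $n$ is in the denominator, the standard error shrinks as the sample size grows, which matches the last bullet of the previous slide.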
Example of SPSS output of
Descriptive Statistics
                           N      Min.   Max.   Mean             Std. Dev  Skewness         Kurtosis
                           Stat.  Stat.  Stat.  Stat.   Std.     Stat.     Stat.   Std.     Stat.   Std.
                                                        Error                      Error            Error
Height in centimetres      34     160    188    173.74  1.30     7.59      -.088   .403     -.831   .788
Weight in kilograms        34     54     95     74.15   1.84     10.72     -.133   .403     -.318   .788
Statistics is interesting  34     1      5      3.03    .21      1.24      .243    .403     -.901   .788
Statistics is useful       34     1      5      2.15    .15      .89       1.329   .403     2.579   .788
Valid N (listwise)         34
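As a quick check on how the columns of this output relate, the sketch below recomputes the standard error of each mean as the standard deviation divided by the square root of N, using the figures reported above. The function name is illustrative.

```python
import math

# (variable, N, std. deviation, reported std. error of the mean),
# copied from the SPSS descriptive statistics output above
rows = [
    ("Height in centimetres", 34, 7.59, 1.30),
    ("Weight in kilograms", 34, 10.72, 1.84),
    ("Statistics is interesting", 34, 1.24, 0.21),
    ("Statistics is useful", 34, 0.89, 0.15),
]


def standard_error(std_dev, n):
    """Standard error of the mean: standard deviation / sqrt(n)."""
    return std_dev / math.sqrt(n)


for label, n, std_dev, reported in rows:
    computed = standard_error(std_dev, n)
    print(f"{label}: computed {computed:.2f}, reported {reported:.2f}")
```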
Inferential Statistics
Helps to draw inferences or conclusions
from the analysis of the data, such as:
• The relationships between two variables
• Differences in variables among different
subgroups
• How several independent variables might
explain the variance in a dependent
variable
Inferential Statistics
• Statistical hypothesis testing
– The null and alternate hypotheses
– Choosing a statistical test
– Significance level
• Correlations
Process for Statistical Hypothesis Testing
using a Computer
The null and alternate hypotheses
• Null hypothesis
– the conjecture that postulates no differences or
no relationship between or among variables
• Alternate hypothesis
– an educated conjecture that sets the parameters
one expects to find
Choosing a Statistical Test
• Parametric tests can be applied to interval
and ratio data (and also ordinal data where
they are expressed in numeric form and
‘interval’ features are present).
• Non-parametric tests are applied to
categorical data — ie, nominal and most
ordinal data
Classification of Statistical Tests
                                     Dependent variable
Independent
variables          Nominal                   Ordinal                      Interval/Ratio

One
  Nominal          Chi-square test for       Sign test                    Analysis of variance (ANOVA)
                   independence              Median test                  T-test
                   Cochran Q test            Mann-Whitney U test
                   Fisher exact              Kruskal-Wallis one-way
                   probability               analysis of variance

  Ordinal                                    Spearman's rank correlation  Analysis of variance
                                             Kendall's rank correlation   with trend analysis

  Interval/ratio   Analysis of variance                                   Simple regression analysis
                   (ANOVA)                                                Pearson correlation

Two or more
  Nominal                                    Friedman two-way analysis    Analysis of variance
                                             of variance                  (factorial design)

  Ordinal

  Interval/ratio   Multiple discriminant                                  Multiple regression
                   analysis                                               analysis

Source: Adapted from R. L. Baker and R. E. Schultz (eds) 1972, Instructional Product Research.
New York: Van Nostrand Co.
Significance level
• the probability of rejecting the null
hypothesis when it is true
• it fixes the critical value of the test statistic
beyond which the null hypothesis is rejected
• this probability is denoted α (alpha)
• Significance level = 1 – confidence level
• Eg, significance level α = 0.05 indicates
confidence level = 0.95 (or 95%)
Hypothesis Testing and
Statistical Decision Making

                               Statistical decision
True state of the situation    Accept H0                 Reject H0
H0 is true                     Correct                   Type I error
                               (Probability = 1 - α)     (Probability = α)
H0 is false                    Type II error             Correct
                               (Probability = β)         (Probability = 1 - β)
Relationship between Type I and II Errors

Source: D. A. Aaker, V. Kumar and G. S. Day 1995, Marketing Research, 5th edition.
New York: John Wiley & Sons, p. 473
Pearson Correlation
• indicates the direction, strength and
significance of the bivariate relationships
between interval or ratio variables, eg:
H0: Role overload and performance are not related
to each other. [r = 0]
HA: the two are significantly negatively correlated.
[r < 0]

r = -0.1735 p = 0.083
r = -0.29 p = 0.054
r = -0.33 p = 0.049
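The three results above can be read against a significance level with a simple decision rule. The sketch below assumes α = 0.05 and the common convention of rejecting the null hypothesis when p is less than α; neither is stated on the slide, and the function name is illustrative.

```python
ALPHA = 0.05  # assumed significance level (95% confidence level)

# (r, p) pairs copied from the slide
results = [(-0.1735, 0.083), (-0.29, 0.054), (-0.33, 0.049)]


def reject_null(p_value, alpha=ALPHA):
    """Reject H0 when the p-value falls below the significance level."""
    return p_value < alpha


decisions = [reject_null(p) for _, p in results]

for (r, p), rejected in zip(results, decisions):
    verdict = "reject H0" if rejected else "do not reject H0"
    print(f"r = {r}, p = {p}: {verdict}")
```

Under these assumptions only the third result (r = -0.33, p = 0.049) leads to rejecting H0, even though all three correlations are negative.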
Scatter Diagrams of two Variables with
different Correlation Coefficients
Procedure for Chi-square Test with SPSS

Step 1: Formulating the hypotheses


Step 2: Decision criterion
Step 3: Analyse data with computer package
Step 4: Make a statistical decision
Step 5: Interpret the decision
Example of SPSS Output for Crosstabs and
Chi-square tests

Female or male * eye colour cross-tabulation

                                          Eye colour
                                          Blue    Brown   Other   Total
Female or male  Female  Count             3       3       2       8
                        Expected Count    3.1     3.1     1.9     8.0
                Male    Count             10      10      6       26
                        Expected Count    9.9     9.9     6.1     26.0
Total                   Count             13      13      8       34
                        Expected Count    13.0    13.0    8.0     34.0
Example of SPSS Output for Crosstabs and
Chi-square tests (cont)

Chi-square tests
                      Value     df   Asymp. Sig. (2-sided)
Pearson chi-square    .013(a)   2    .994
Likelihood ratio      .012      2    .994
No. of valid cases    34
a. 3 cells (50.0%) have expected count less than 5. The minimum expected count is 1.88.
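To show where these figures come from, the following sketch recomputes the expected counts, the Pearson chi-square statistic and the degrees of freedom from the observed counts in the cross-tabulation. The p-value line relies on the fact that, for exactly 2 degrees of freedom, the upper-tail probability of the chi-square distribution is exp(-x/2); for other degrees of freedom a statistics library or table would be needed.

```python
import math

# Observed counts from the cross-tabulation: rows = Female, Male;
# columns = Blue, Brown, Other
observed = [[3, 3, 2],
            [10, 10, 6]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence = row total * column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square = sum of (observed - expected)^2 / expected
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Closed form valid only when df == 2
p_value = math.exp(-chi_square / 2) if df == 2 else None

# Cells with an expected count below 5 (the SPSS footnote reports these)
small_cells = sum(e < 5 for row in expected for e in row)
min_expected = min(e for row in expected for e in row)

print(round(chi_square, 3), df, round(p_value, 3))
print(small_cells, round(min_expected, 2))
```

With a p-value this close to 1, the null hypothesis of no association between gender and eye colour is not rejected. The footnote also matters: with half the cells below an expected count of 5, the chi-square approximation is weak for this sample.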
t distribution
• is suitable for analysing the means of small
samples
• drawn from a population that is normally
distributed
• shape of the t distribution depends on the
degrees of freedom (df )
t distribution formula
Comparison of t distribution & normal curve
Example of SPSS output for single
sample tests
One-sample statistics
                            N    Mean   Std. Deviation   Std. Error Mean
Statistics is interesting   34   3.03   1.24             .21
Statistics is useful        34   2.15   .89              .15

One-sample test (Test Value = 3)
                                                                     95% Confidence Interval
                                         Sig.         Mean           of the Difference
                            t       df   (2-tailed)   Difference     Lower     Upper
Statistics is interesting   0.138   33   0.891        2.94E-02       -0.40     0.46
Statistics is useful        -5.575  33   0.000        -0.85          -1.16     -0.54
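The t values in this output can be approximated from the summary figures: t is the mean difference from the test value divided by the standard error of the mean, with n - 1 degrees of freedom. The sketch below uses the rounded values printed above, so it agrees with SPSS only to about two decimal places. The function name is illustrative.

```python
import math


def one_sample_t(mean_difference, std_dev, n):
    """Return (t, df) for a one-sample t-test from summary statistics."""
    std_error = std_dev / math.sqrt(n)
    return mean_difference / std_error, n - 1


# Mean differences from the test value of 3, and standard deviations,
# as reported in the output above
t_interesting, df = one_sample_t(0.0294, 1.24, 34)
t_useful, _ = one_sample_t(-0.85, 0.89, 34)

print(round(t_interesting, 2), round(t_useful, 2), df)
```

The first mean does not differ significantly from the neutral value 3 (p = 0.891), while the second is significantly below it.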
Example of SPSS output for two
independent samples t-tests
Group statistics
                        Female or male   N    Mean     Std. Deviation   Std. Error Mean
Height in centimetres   Male             26   176.38   5.97             1.17
                        Female           8    165.13   5.74             2.03

Independent samples test
                                                 Levene's Test
                                                 for Equality
                                                 of Variances    t-Test for Equality of Means
                                                                                                               90% Confidence
                                                                                  Sig.        Mean    Std.     Interval of the
                                                                                  (2-         Diff.   Error    Difference
                                                 F      Sig.     t       df       tailed)             Diff.    Lower    Upper
Height in     Equal variances assumed            .747   .394     4.701   32       .000        11.26   2.40     7.20     15.32
centimetres   Equal variances not assumed                        4.803   12.062   .000        11.26   2.34     7.08     15.44
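Both rows of the independent samples test can be approximated from the group statistics alone. The sketch below computes the pooled-variance t (equal variances assumed) and the Welch t with Welch-Satterthwaite degrees of freedom (equal variances not assumed). It uses the rounded means and standard deviations printed above, so small differences from the SPSS figures are expected; the function names are illustrative.

```python
import math


def pooled_t(m1, s1, n1, m2, s2, n2):
    """t and df assuming equal population variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / std_error, df


def welch_t(m1, s1, n1, m2, s2, n2):
    """t and df without assuming equal variances (Welch-Satterthwaite df)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    std_error = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (m1 - m2) / std_error, df


# Group statistics for height: male first, then female
male = (176.38, 5.97, 26)
female = (165.13, 5.74, 8)

t_equal, df_equal = pooled_t(*male, *female)
t_welch, df_welch = welch_t(*male, *female)

print(round(t_equal, 2), df_equal)
print(round(t_welch, 2), round(df_welch, 2))
```

Levene's test (Sig. = .394) gives no reason to reject equal variances here, so the first row of the SPSS table is the one normally read.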
Example of SPSS output for one-way
between groups ANOVA

ANOVA
                                             Sum of
                                             Squares    df   Mean Square   F       Sig.
Weight in kilograms         Between groups   2350.495   5    470.099       9.117   .000
                            Within groups    1443.770   28   51.563
                            Total            3794.265   33

Statistics is interesting   Between groups   11.313     5    2.263         1.598   .193
                            Within groups    39.657     28   1.416
                            Total            50.971     33
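The Mean Square and F columns follow directly from the sums of squares and degrees of freedom: each mean square is a sum of squares divided by its df, and F is the between-groups mean square divided by the within-groups mean square. A minimal sketch using the figures above (the function name is illustrative):

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) for a one-way ANOVA table."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Sums of squares and df copied from the ANOVA output above
weight = f_ratio(2350.495, 5, 1443.770, 28)
interest = f_ratio(11.313, 5, 39.657, 28)

for label, (ms_b, ms_w, f) in (("Weight", weight), ("Interesting", interest)):
    print(f"{label}: MS between = {ms_b:.3f}, MS within = {ms_w:.3f}, F = {f:.3f}")
```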
Regression Analysis
• Explains the variance in the dependent variable by
a set of predictors
• R-square (R²) is the proportion of variance in the
dependent variable explained by the predictors
• Step-wise regression will indicate the order of
importance of the significant predictors in the
regression model
• The Beta weights of the predictors and their
significance indicate the weight each predictor
(independent variable) exerts in explaining the
variance in the dependent variable.
A Simple Regression Model
General Form of Simple Regression Line

Y = a + bX
where:
Y is the dependent variable
X is the independent variable
a is the intercept of the regression line on the
Y (vertical) axis
b is the slope of the regression line
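One common way to estimate a and b is ordinary least squares, where b is the sum of cross-products of deviations divided by the sum of squared deviations of X, and a follows from the two means. The sketch below is a minimal illustration on hypothetical height and weight values (not the class data); the function name is illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares estimates (a, b) for Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b


# Hypothetical data: heights in cm (X) and weights in kg (Y)
heights = [160, 165, 170, 175, 180]
weights = [57, 64, 68, 72, 79]

a, b = fit_line(heights, weights)
print(round(a, 2), round(b, 2))  # intercept and slope
```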
Assumptions of Regression Analysis

Regression analysis — assumptions


1 Errors are normally distributed.
2 Errors have a zero mean (or expected value).
3 Errors have a constant variance over all values of the dependent variable
and in each time period.
4 Errors are not correlated (that is, errors in one time period are not related
to errors in another).
Assumptions (1) to (4) are required to obtain unbiased estimates and to use
probability theory to test the reliability of estimates.
5 Other assumptions for multiple regression analysis include: The number of
independent or explanatory variables in the regression must be smaller than
the number of observations.
6 There must not be perfect linear correlation among the independent
variables.
Example of SPSS output for simple regression
analysis
Variables entered/removed(b)
Model   Variables Entered          Variables Removed   Method
1       Height in centimetres(a)   .                   Enter
a. All requested variables entered.
b. Dependent variable: Weight in kilograms

Model summary
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .702(a)   .492       .476                7.76
a. Predictors: (constant), Height in centimetres

ANOVA(b)
Model            Sum of Squares   df   Mean Square   F        Sig.
1   Regression   1868.153         1    1868.153      31.037   .000(a)
    Residual     1926.112         32   60.191
    Total        3794.265         33
a. Predictors: (constant), Height in centimetres
b. Dependent variable: Weight in kilograms
Example of SPSS output for simple regression
analysis (cont.)

Coefficients(a)
                            Unstandardised          Standardised
                            Coefficients            Coefficients
Model                       B         Std. Error    Beta           t        Sig.
1   (Constant)              -98.189   30.963                       -3.171   .003
    Height in centimetres   .992      .178          .702           5.571    .000
a. Dependent variable: Weight in kilograms
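The pieces of this regression output are linked by simple arithmetic, which the sketch below reproduces from the reported sums of squares and coefficients: R² is the regression sum of squares over the total, F is the ratio of the mean squares, and the standard error of the estimate is the square root of the residual mean square. The last function applies the fitted line Y = a + bX with the reported coefficients; its name is illustrative.

```python
import math

# Sums of squares and df from the ANOVA table of the regression output
ss_regression, df_regression = 1868.153, 1
ss_residual, df_residual = 1926.112, 32

ss_total = ss_regression + ss_residual
n = df_regression + df_residual + 1   # number of observations

r_square = ss_regression / ss_total
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / df_residual
ms_residual = ss_residual / df_residual
f_value = (ss_regression / df_regression) / ms_residual
std_error_estimate = math.sqrt(ms_residual)


def predict_weight(height_cm, a=-98.189, b=0.992):
    """Predicted weight in kg from the fitted line Y = a + bX."""
    return a + b * height_cm


print(round(r_square, 3), round(adjusted_r_square, 3))
print(round(f_value, 3), round(std_error_estimate, 2))
print(round(predict_weight(173.74), 2))  # prediction at the mean height
```

Evaluating the line at the sample mean height (173.74 cm) returns approximately the sample mean weight (74.15 kg), as expected for a least squares fit.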
Factor Analysis
helps to reduce a vast number of variables
(for example all the questions tapping
several variables of interest in a
questionnaire) to a meaningful,
interpretable and manageable set of factors
Output of a Factor Analysis for the Evaluation
Questionnaire
Item Factor A Factor B Factor C
01 0.22346 -0.07300 0.56741
02 0.22173 0.11086 0.72553
03 -0.11021 -0.13279 0.81159
04 0.61291 -0.18332 -0.11947
05 0.66958 0.02347 0.20299
06 0.07750 -0.61141 0.20597
07 0.72003 -0.00131 0.00424
08 0.77667 0.19312 0.12914
09 0.01284 -0.24331 0.58230
10 0.62759 -0.26720 -0.10879
11 0.43661 -0.15479 0.39083
12 0.48110 -0.44660 0.03393
13 0.03974 -0.70698 0.17139
14 -0.00829 -0.81637 0.12066
15 -0.06771 -0.93732 -0.08814
Items under each Factor for the Evaluation
Questionnaire
FACTOR A
4. The trainer has provided adequate feedback on student
performance.
5. The trainer has provided an adequate flow of ideas in this course.
7. The trainer seems to know when trainees did not understand the
material.
8. The trainer made a major contribution to the value of this course.
10. The trainer has been creative in developing materials for this
course.

FACTOR B
6. This course has been adapted to trainees’ needs.
13. I learned a lot in this course.
14. This course generally fulfilled my goals.
15. This course stimulated me to want to take more work in the same
or related area.
Items under each Factor for the Evaluation
Questionnaire (cont.)

FACTOR C
1. Class time was well spent.
2. The course was well organised.
3. There was considerable agreement between announced objectives
and what was taught.
9. The goals and objectives for this course have been clearly stated.

THE LEAKING ITEMS


11. The teaching methods used in this course have been suitable.
12. This course stimulated interest in the subject content.
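The grouping above can be reproduced from the loading table with a simple rule: assign each item to the factor on which its absolute loading is largest, and treat it as a leaking item if that loading is too small. The cut-off of 0.5 used below is an assumption chosen for illustration; the slides do not state the rule that was applied.

```python
# Factor loadings (A, B, C) for items 1 to 15, copied from the output table
loadings = {
    1: (0.22346, -0.07300, 0.56741),
    2: (0.22173, 0.11086, 0.72553),
    3: (-0.11021, -0.13279, 0.81159),
    4: (0.61291, -0.18332, -0.11947),
    5: (0.66958, 0.02347, 0.20299),
    6: (0.07750, -0.61141, 0.20597),
    7: (0.72003, -0.00131, 0.00424),
    8: (0.77667, 0.19312, 0.12914),
    9: (0.01284, -0.24331, 0.58230),
    10: (0.62759, -0.26720, -0.10879),
    11: (0.43661, -0.15479, 0.39083),
    12: (0.48110, -0.44660, 0.03393),
    13: (0.03974, -0.70698, 0.17139),
    14: (-0.00829, -0.81637, 0.12066),
    15: (-0.06771, -0.93732, -0.08814),
}

THRESHOLD = 0.5  # assumed cut-off; not stated in the slides


def assign_items(loadings, threshold=THRESHOLD):
    """Group items by their strongest absolute loading."""
    groups = {"A": [], "B": [], "C": [], "leaking": []}
    for item, values in loadings.items():
        strongest = max(range(len(values)), key=lambda i: abs(values[i]))
        if abs(values[strongest]) >= threshold:
            groups["ABC"[strongest]].append(item)
        else:
            groups["leaking"].append(item)
    return groups


groups = assign_items(loadings)
print(groups)
```

With this cut-off, items 11 and 12 fall out as leaking items because neither loads at 0.5 or above on any factor, which agrees with the lists on the slides.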
Multivariate Analysis
• examines several variables and their relationships
simultaneously
• Multivariate techniques include:
– MANOVA
– Discriminant analysis
– Canonical correlation
– Factor analysis
– Cluster analysis
– Multidimensional scaling
