Measurement Levels and Statistical Concepts

The document outlines basic measurement and scaling concepts in psychology, detailing levels of measurement, types of scales, and statistical measures. It emphasizes the importance of scaling for accurate psychological assessment and discusses measurement errors, including random and systematic errors. Additionally, it covers norms and standards in psychological testing, highlighting their significance in evaluating individual performance relative to a population.

Uploaded by

pambler.1357

BASIC MEASUREMENT AND SCALING CONCEPTS

(Taken from Foxcroft & Roodt, Chapter 3)


Learning Outcomes

Describe 3 distinguishing properties of the different levels of measurement
Describe & give an example of 4 types of measurement levels
Describe and give examples for each of the different measurement scales
Describe the type of data each scaling method generates
Define & describe 3 basic categories of statistical measures of location, variability & association
Name and describe the different types of test norms.
Consider This…

We understand that scaling is the process of assigning quantitative values to qualitative data

Why is scaling (developing measuring tools) important for psychology?

What could be considered a core limitation of measuring tools?


Levels of measurement
What is measurement?

Measurement is the transformation of psychological attributes into numbers
Properties of scales

• Identity: e.g. Male/Female, Yes/No
• Magnitude: ‘Moreness’ (<, >, =)
• Equal Intervals: The difference between all points is the same
• Absolute Zero: True zero where absolutely nothing can be measured
Categories of measurement levels: Categorical vs. Continuous

Measurement Levels
• Categorical: Nominal, Ordinal
• Continuous: Interval, Ratio

• Nominal: Data sorted into categories
• Ordinal: Data is ranked (spaces between points are not equal)
• Interval: Equal spaces between points
• Ratio: There are an infinite number of points between each data point (more accurate measurement)
Measurement scaling options

• Nominal. Property: Identity. Qualitative. E.g. Gender
• Ordinal. Properties: Identity, Magnitude. E.g. Horse Race
• Interval. Properties: Identity, Magnitude, Equal Intervals. Quantitative & Qualitative. E.g. 12hr clock
• Ratio. Properties: all. Intervals can be divided infinitely. Quantitative & Qualitative. E.g. Kelvin Temp, 24hr clock, weight
PREFERENCE FOR SCALES

Preference for Interval / Ratio
• This is more statistically rigorous so gives more ‘accurate measurements’. But as psychological data is not mathematical in nature, our measurements can’t be this exact.

Psych data is usually ordinal / nominal
• We often treat ordinal data as if it is interval data: we assume equal differences between intervals/measurements. The issue with this can be seen most clearly in cross cultural contexts of assessment where the differences are more marked.
Measurement Errors

Random sampling error
• Source of error & impact of error due to chance factors

Systematic Error
• Built into the test
• Consistently influences scores of specific groups
RANDOM SAMPLING ERROR

Randomly affects measurement across the sample

Effect is not consistent across the sample

Effects tend to cancel each other out

Adds variability but does not affect the average

Random error is considered ‘noise’

Sample size is important – greater size = less error


SYSTEMATIC ERROR

Any factor(s) that systematically affect(s) measurement of the variable across the sample

Consistently positive or consistently negative across the sample

Assessment results tend to be a function of group membership & not a true measure of the construct being measured.

Result of effect = BIAS
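
The contrast between the two kinds of error can be sketched with a small simulation. This is only an illustration under stated assumptions: the true score, the size of the bias, the noise level and the sample size are all invented values, not figures from the chapter.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch gives the same numbers each run

TRUE_SCORE = 100.0  # hypothetical true level of the construct
BIAS = -8.0         # hypothetical systematic error built into the test
N = 10_000          # hypothetical sample size

# Random error: chance noise that is positive for some people and negative for others.
random_only = [TRUE_SCORE + random.gauss(0, 5) for _ in range(N)]

# Systematic error: the same kind of noise plus a constant shift for the whole group.
with_bias = [TRUE_SCORE + BIAS + random.gauss(0, 5) for _ in range(N)]

mean_random = statistics.mean(random_only)
mean_biased = statistics.mean(with_bias)

print(f"mean with random error only: {mean_random:.2f}")
print(f"mean with systematic error:  {mean_biased:.2f}")
```

With random error only, the average stays close to the true score because the errors tend to cancel out. With the added bias, the whole group's average shifts by roughly the size of the bias.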


REDUCING ERRORS

• Pilot test instruments
• Double check data
• Statistical procedures
• Multiple measures of the same construct
Application: Random vs. Systematic Error

• Differentiate between systematic and random error in psychological assessment tools.
• Use this figure to substantiate your answers
Random vs. Systematic Errors
Systematic error is an error inherent in the tool itself. This will result in a
consistent error which is measurable and can be anticipated if known.
This means that the tool would consistently treat groups differently
advantaging the one and disadvantaging the other. The main idea here
is that the results are dependent on group membership not on what is
being tested. Another form of systematic error could be validity issues
with the test. This would also impact negatively on the learner’s results.

Random error refers to errors that cannot be controlled as they are not
consistent from one administration of the tool to another. The young boy
in the figure is clearly distressed; he may doubt his ability or may be struggling with
performance anxiety or a sense of learned helplessness. This would
impact on his ability to concentrate which would impact on his
performance.
Types of Scales
Different scaling options

• Category: Responses are defined categories
• Likert: Indicate degree to which you agree with a statement
• Semantic Differential: Opposite poles anchored. Indicate where you fall between the poles
• Intensity: Similar to semantic differential. Two opposite anchor points
• Constant Sum: Ipsative. Ranking through weighting
Different scaling options

• Paired Comparison: Ipsative. Ranking by weight. 2 options only
• Forced Choice: Must choose an option. Can only choose one option
• Graphic Rating: Ranking using visual display e.g. emoticons for service or pain scale
• Guttman: Used for attitude or prejudice research. Score depends on cumulative score of sub-questions
Application: Identifying different types of scales

Use one or more of the links to access a free test online. Have a look at the types of questions and see if you can identify some different scale options.

What did you use to help you make up your mind?

Links to free tests
• Career Test
• Work Values Test
• Team Roles Test
Considerations when choosing a scale format

• Single vs. composite dimensions
• Questions vs. statements
• Labelled/Unlabelled responses
• Single attribute vs comparative rating
• Even vs. odd numbered ratings
• Ipsative vs. normative scales

These are important choices to make. Can you think of contextual factors that would guide your thinking here?
BASIC STATISTICAL CONCEPTS
What is DATA?

Observations you collect from your sample but more specifically the observed values of the variable you are looking at

We make sense of data using statistics

Statistics can be descriptive: give us information about the data

Statistics can be inferential: we use the statistics to draw conclusions about the sample which we then apply to the population
Raw data is not very useful. What would you say is the benefit of visual data?

Visual data allows us to instantly make sense of the data just by looking at it. When presented
visually, it facilitates the interpretation of the data. We can quickly see groupings and patterns and
can start to ask some questions about what we are seeing
We ‘measure’ data to be more rigorous in our analysis

Just by looking at the data we get a feeling for it, but a ‘feeling’ is subjective

Instead of just looking at the data we ‘measure’ it

After presenting data visually we then analyse it.

Two main types of measures exist:
• Measures of Centre
• Measures of Variance
Measures of Centre / Central Tendency

• Purpose: Look at where data values lie
• There are 3 measures of centre
• Mean = mathematical average (most rigorous as it is sensitive to all the data)
• Median = once ranked, this is the value that is in the middle
• Mode = most frequently occurring value
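
The three measures of centre can be tried out on a small data set. The scores below are invented for illustration only; the sketch uses Python's standard `statistics` module.

```python
import statistics

# Hypothetical test scores, invented for illustration only.
scores = [4, 7, 7, 8, 9, 10, 18]

mean_score = statistics.mean(scores)      # sum of all values divided by how many there are
median_score = statistics.median(scores)  # middle value once the scores are ranked
mode_score = statistics.mode(scores)      # most frequently occurring value

print(mean_score, median_score, mode_score)
```

Because the mean is sensitive to every value, the single high score of 18 pulls it above the median in this example.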
Normal vs. Skewed Distributions
Measures of Variation

• Variation tells us how spread out data values are
• The more spread out values are the less consistent the data is so fewer inferences can be made reliably i.e. data varies more from sample to sample
• Range = crudest measurement of variation = maximum value - minimum value
Measures of Variance Cont.

More complex measures of variation are variance and standard deviation.

These are complex because they take into account all the values within the data set

Variance = average of the squared distances from each point (data value) to the middle (mean)

Standard Deviation is the most frequently used measure of variation.
Calculating variance and standard deviation:

Variance
• Work out the mean
• For each number subtract the mean & square the result
• Work out the average of the squared differences

Standard Deviation
• Square root of variance
Example

• You and your friends have just measured the heights of your dogs (in millimeters)
• The heights at the shoulders are: 600mm, 470mm, 170mm, 430mm and 300mm.
• Find out the Mean, the Variance and the Standard Deviation
Answer: The Mean

• (600 + 470 + 170 + 430 + 300) / 5 = 1970/5 = 394

• This means the average height is 394mm.

• It is plotted on the graph (green line)


Answer: Variance

• To calculate the Variance, take each difference from the mean, square it, and then average the result

• [206² + 76² + (-224)² + 36² + (-94)²] / 5 = 108,520 / 5 = 21,704


Answer: Standard Deviation

• The Standard Deviation is the square root of the Variance

• Standard Deviation = √21,704 ≈ 147.32, or about 147mm
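
The worked answer can be checked with a few lines of code. This sketch follows the same steps as the example and, like the example, divides by the number of values (a population variance). A sample variance would divide by n - 1 instead and give a different result.

```python
import math

heights_mm = [600, 470, 170, 430, 300]  # dog heights from the example

# Step 1: work out the mean.
mean = sum(heights_mm) / len(heights_mm)

# Step 2: subtract the mean from each height and square the result.
squared_diffs = [(h - mean) ** 2 for h in heights_mm]

# Step 3: average the squared differences (dividing by n, as in the example).
variance = sum(squared_diffs) / len(heights_mm)

# Step 4: the standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(mean, variance, round(std_dev, 2))
```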


Why is the Standard Deviation important/useful?

• We can use this to identify which of the dogs are within one standard deviation of the
mean.
• Identify which dogs are ‘normal’, ‘extra large’ or ‘extra small’
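
One way to sketch this idea is to label each dog by whether it falls within one standard deviation of the mean. The cut-off of exactly one standard deviation and the function name `classify` are assumptions made for this illustration.

```python
import statistics

heights_mm = [600, 470, 170, 430, 300]  # dog heights from the example

mean = statistics.mean(heights_mm)
sd = statistics.pstdev(heights_mm)  # population standard deviation, as in the example


def classify(height):
    """Label a height relative to one standard deviation around the mean."""
    if height > mean + sd:
        return "extra large"
    if height < mean - sd:
        return "extra small"
    return "normal"


labels = {h: classify(h) for h in heights_mm}
for height, label in labels.items():
    print(height, label)
```

Here the band runs from roughly 247mm to 541mm, so only the tallest and the shortest dog fall outside it.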
Measures of Association

Correlation Coefficient: relationship between 2 variables

Ranges from -1 to +1 where zero shows no correlation

Formula = Pearson Product-Moment Correlation
[Figure: scatterplots showing strong positive, moderate positive, no, moderate negative and strong negative correlation, and a curvilinear relationship]
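
The Pearson product-moment correlation can be sketched in a few lines. The function name `pearson_r` and all of the data below are invented for illustration; the calculation is the standard one (the co-variation of the two variables divided by the product of their spreads).

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables vary together around their means.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How each variable varies on its own.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(ss_x * ss_y)


# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # goes up perfectly with x
falling = [10, 8, 6, 4, 2]  # goes down perfectly with x
mixed = [2, 1, 4, 3, 5]     # generally goes up, but not perfectly

print(pearson_r(x, rising), pearson_r(x, falling), pearson_r(x, mixed))
```

Note that a coefficient near zero only rules out a straight-line relationship; a curvilinear relationship can still be present.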
What is regression analysis?

The comparison that gives the best prediction of the relationship between the dependent variable and independent variable(s).

E.g. Aptitude tests being used to predict job performance of machine operators
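
A minimal sketch of the aptitude test example, assuming a simple straight-line (least squares) regression with one independent variable. The scores are invented and deliberately fall exactly on a line so the result is easy to check by hand; real data would scatter around the line. The name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: aptitude test scores (independent variable) and
# job performance ratings (dependent variable) for five machine operators.
aptitude = [10, 20, 30, 40, 50]
performance = [25, 45, 65, 85, 105]

slope, intercept = fit_line(aptitude, performance)

# Predict the performance rating of a new applicant who scores 35.
predicted = slope * 35 + intercept
print(slope, intercept, predicted)
```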
Norms
Why are norms important?

Standards are important.

A standard normal distribution has a mean of zero and a SD of 1

Comparison with peers and/or the greater population.

The norm group is a representative sample of the population.

Establishing norms for a test is the final stage of standardisation
Establishing Norms

Identify Population (applicant pool)

Take a Random Representative Sample (Incumbent population)

Test/Research the sample

Calculate norms

Co-Norming – use one sample group to establish normative data on 2 or more related measures
Types of norms usually found in psychological tests

Developmental Norms
• Mental Age
• Grade Equivalent

Percentiles

Standard Scores
• Z-Scores
• T-Scores
• Sten
• Stanine

Deviation IQ
Standard Scores (Z Scores):

Used to ID probability of a score occurring within our normal distribution

Enables comparison of two scores from different normal distributions.

Z Score is derived by subtracting the mean from the direct (raw) score and dividing by the standard deviation: Z = (raw score - mean) / standard deviation

Arithmetic mean of a standard score is zero and standard deviation is one

Standard Scores are dimensionless
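
A short sketch of how Z scores make two scores from different distributions comparable. The two tests, their means and their standard deviations are invented for illustration.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies above (+) or below (-) the mean."""
    return (raw - mean) / sd


# Hypothetical: one person's scores on two tests with different scales.
z_verbal = z_score(65, mean=50, sd=10)
z_numerical = z_score(120, mean=100, sd=20)

print(z_verbal, z_numerical)
```

Although 120 is the larger raw score, the verbal score sits further above its own mean once both are expressed in standard deviation units.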


T scores

T Scores eliminate negative values from Z Scores

T Score = (Z Score × 10) + 50

Z-scores and T-scores both express distance from the mean in standard deviation units, but T-scores use a mean of 50 and a standard deviation of 10, while z-scores use a mean of 0 and a standard deviation of 1.

A T-score of over 50 means above average, below 50 means below average.
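
The conversion is simple enough to sketch directly from the formula above. The Z scores used here are invented examples.

```python
def t_score(z):
    """Convert a Z score to a T score (mean 50, standard deviation 10)."""
    return z * 10 + 50


# Hypothetical Z scores: below average, average and above average.
for z in (-1.5, 0, 2):
    print(z, t_score(z))
```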


Stanines

Divides the score scale into nine statistical units, numbered 1 to 9

Stanines are used to indicate performance level as a single-digit score

Scores are ranked lowest to highest and then assigned to the corresponding stanine

Two scores in the same stanine can be further apart than two scores in adjacent stanines, which reduces the value of the scale
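
The assignment of ranked scores to stanines can be sketched as follows. The slides do not give the band sizes, so this sketch assumes the conventional split of 4%, 7%, 12%, 17%, 20%, 17%, 12%, 7% and 4% of the norm group; the function name and the way scores exactly on a boundary are handled are also assumptions.

```python
import bisect

# Cumulative percentage at the top of stanines 1 to 8, assuming the
# conventional 4-7-12-17-20-17-12-7-4 split (not stated on the slides).
UPPER_LIMITS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine_from_percentile(percentile_rank):
    """Return the stanine (1 to 9) for a percentile rank between 0 and 100."""
    return bisect.bisect_right(UPPER_LIMITS, percentile_rank) + 1


for pr in (1, 41, 59, 61, 99):
    print(pr, stanine_from_percentile(pr))
```

Percentile ranks 41 and 59 land in the same stanine even though they are 18 points apart, while 59 and 61 land in different stanines. This is the limitation described in the last point above.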
Stens

The Sten (standard ten) is a standard score system commonly used with personality questionnaires.

Stens divide the score scale into ten units.

Each unit has a band width of half a standard deviation except the lowest and highest units (Sten 1 and Sten 10), which are open-ended.

Sten scores can be calculated from Z-scores using the formula: Sten = (Z × 2) + 5.5.

Stens have the advantage that they enable results to be thought of in terms of bands of scores, rather than absolute scores.

These bands are narrow enough to distinguish statistically significant differences between candidates, but wide enough not to
over emphasize minor differences between candidates.
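
A minimal sketch of the Sten formula. The formula itself comes from the slide; rounding to the nearest whole band and limiting the result to the range 1 to 10 are assumptions added here so that every Z score maps to one of the ten units.

```python
import math


def sten_from_z(z):
    """Convert a Z score to a Sten band between 1 and 10."""
    raw_sten = z * 2 + 5.5
    # Round to the nearest whole number (halves round up): an assumption.
    rounded = math.floor(raw_sten + 0.5)
    # Very low and very high scores fall into the end bands.
    return max(1, min(10, rounded))


for z in (-3, -0.25, 0.25, 1.2, 3):
    print(z, sten_from_z(z))
```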
Percentiles versus Percentages

Percentage

• A percentage is the number out of every hundred with a particular attribute. For example, if 120 of 150 candidates pass an assessment, the percentage pass rate is 80% (80 out of every hundred).

Percentile

• A percentile is a position in a rank ordering expressed as the percentage who are lower in the rank ordering. For example, a student at the 70th percentile performed better than 70% of other candidates.
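
The difference between the two can be sketched in code. The pass rate uses the numbers from the example above; the list of candidate scores is invented, and the function names are hypothetical.

```python
def percentage(part, whole):
    """How many out of every hundred have the attribute."""
    return 100 * part / whole


def percentile_rank(score, all_scores):
    """Percentage of scores in the group that are lower than the given score."""
    lower = sum(1 for s in all_scores if s < score)
    return 100 * lower / len(all_scores)


# Percentage: 120 of 150 candidates pass (numbers from the example).
pass_rate = percentage(120, 150)

# Percentile: hypothetical scores for a group of ten candidates.
group_scores = [35, 42, 48, 51, 55, 58, 63, 67, 72, 80]
rank_of_67 = percentile_rank(67, group_scores)

print(pass_rate, rank_of_67)
```

A percentage describes the group as a whole, while a percentile describes where one person stands within the group.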
Deviation IQ Scales
• Normal Standard Score with Mean of 100 and standard deviation of 15
• Not easily comparable to other norm scores because of the different standard deviation.
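
Given a mean of 100 and a standard deviation of 15, a deviation IQ can be sketched as a rescaled Z score. The function name and the example Z scores are assumptions for illustration.

```python
def deviation_iq(z):
    """Convert a Z score to a deviation IQ (mean 100, standard deviation 15)."""
    return 100 + 15 * z


# Hypothetical Z scores: two SDs below the mean, at the mean, one SD above.
for z in (-2, 0, 1):
    print(z, deviation_iq(z))
```

One standard deviation above the mean is 115 on this scale but 60 on the T score scale, which is why the two cannot be compared without converting back to Z scores first.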
Norms versus Standards

• Norms describe what a given population can do
• Standards represent what a given population should be able to do
• Tests can be norm-referenced or criterion-referenced

Norm-referenced assessment
• shows how students are achieving compared with a statistical sample of others of an equivalent group at a given point in time. Such tests often provide results in percentiles or stanines.

Criterion-referenced assessment
• shows what students can or can’t do in relation to a specific list of tasks or skills. Teachers’ judgments are about whether the student has achieved each individual skill or task. When writing, for example, a student may be able to succeed at each task or skill but still not be able to write a compelling piece which meets the needs of an audience.
Application: Thinking about fair and ethical assessment
Numeracy & Literacy assessments are often done with learners. These assessments have both age norms and grade norms. Grade norms (when aligned to the curriculum) are an example of a criterion-referenced assessment. Why do you think using these norms would be fairer and/or more ethical within the South African context where we have a history of unequal access to education?
Using Grade Norms
We use grade norms as these are aligned with what each learner needs to achieve in order to move to the next grade. Understanding performance relative to these criteria is meaningful as it informs where support is needed.
The age range in any grade can be quite broad. Not all learners have had the same level of stimulation and facilitation, so results against age norms would vary greatly and would not necessarily inform meaningful intervention strategies.
