Test Utility in Psychological Assessment

Utility refers to the practical value and usefulness of a test for decision making. Factors that affect a test's utility include its psychometric soundness (reliability and validity), costs of administering the test, and potential benefits. A utility analysis weighs the costs and benefits of using a test to determine if it provides value. Methods for conducting a utility analysis include expectancy tables, formulas to calculate utility and productivity gains, and decision theory. Practical considerations include the applicant pool size and job complexity. Methods for setting cut scores on tests include expert panels rating likelihood of passing and comparing scores of known groups.

Uploaded by

Adam Vida
Copyright: © All Rights Reserved

UTILITY

- Refers to how useful a test is

- Refers to the practical value of using a test to aid in decision making

Factors That Affect a Test’s Utility

- Psychometric Soundness: reliability and validity of a test


o Test is said to be psychometrically sound for a particular purpose if reliability and
validity coefficients are acceptably high
o Index of utility: the practical value of the information derived from scores on the
test
o Test scores are said to have utility if their use in a particular situation helps us to
make better decisions—better, that is, in the sense of being more cost-effective
o The higher the criterion-related validity of test scores for making a particular
decision, the higher the utility of the test is likely to be
- Costs: disadvantages, losses, or expenses in both economic and noneconomic terms
o Economic costs include funds allocated to purchase (1) a particular test, (2) a
supply of blank test protocols, and (3) computerized test processing, scoring,
and interpretation from the test publisher or some independent service
o Costs of testing may come in the form of (1) payment to professional personnel
and staff associated with test administration, scoring, and interpretation, (2)
facility rental, mortgage, and/or other charges related to the usage of the test
facility, and (3) insurance, legal, accounting, licensing, and other routine costs of
doing business
o Costs may be offset by revenue, such as fees paid by testtakers (in a clinic) or
funding from government grants or private donations (in a research organization)
o Noneconomic costs are more subtle than economic ones and can be more
significant, even when the economic costs of testing have been reduced
- Benefits: profits, gains, or advantages
o Financial returns in dollars and cents a successful testing program can yield
o Good working environment
o Beneficial to society at large

Utility Analysis
- Family of techniques that entail a cost–benefit analysis designed to yield information
relevant to a decision about the usefulness and/or practical value of a tool of
assessment
- Umbrella term covering various possible methods, each requiring various kinds of data
to be inputted and yielding various kinds of output
- May be undertaken for the purpose of evaluating whether the benefits of using a test
(or training program or intervention) outweigh the costs
- Endpoint of a utility analysis is typically an educated decision about which of many
possible courses of action is optimal
- Other, less definable elements, such as prudence, vision, and (for lack of a better or
more technical term) common sense, must be ever-present in the process
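- Decisions about tests may concern:
o Which of several tests is preferable
o Which of several assessment tools is preferable
o Whether adding one or more tests to those already in use is preferable
o Whether no testing at all is preferable
- Decisions about training programs or interventions may concern:
o Which training program is preferable
o Which intervention is preferable
o Whether to add or subtract elements of an existing training program
o Whether no training at all is preferable
o Whether no intervention at all is preferable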
Hits and Misses
- Hit: a correct classification

- Miss: an incorrect classification

- Hit Rate: The proportion of people that an assessment tool accurately identifies as
possessing or exhibiting a particular trait, ability, behavior, or attribute
- Miss Rate: The proportion of people that an assessment tool fails to identify
accurately as possessing, or as not possessing, a particular trait, ability, behavior,
or attribute
- False Positive: A specific type of miss whereby an assessment tool falsely indicates that
the testtaker possesses or exhibits a particular trait, ability, behavior, or attribute
- False Negative: A specific type of miss whereby an assessment tool falsely indicates that
the testtaker does not possess or exhibit a particular trait, ability, behavior, or attribute
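The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the hit rate is treated as the share of all classifications that are correct, and the miss rate counts both false positives and false negatives.

```python
from collections import Counter


def classify_outcomes(predicted, actual):
    """Count hits, false positives, and false negatives.

    predicted[i] is True when the tool says person i has the attribute;
    actual[i] is True when person i really has it.
    """
    counts = Counter()
    for said_yes, is_yes in zip(predicted, actual):
        if said_yes == is_yes:
            counts["hit"] += 1
        elif said_yes:
            # Tool said "has it", but the person does not.
            counts["false_positive"] += 1
        else:
            # Tool said "does not have it", but the person does.
            counts["false_negative"] += 1
    return counts


def hit_and_miss_rates(predicted, actual):
    """Return (hit rate, miss rate) as proportions of everyone assessed."""
    counts = classify_outcomes(predicted, actual)
    total = len(predicted)
    misses = counts["false_positive"] + counts["false_negative"]
    return counts["hit"] / total, misses / total
```

With this reading, the hit rate and miss rate always sum to 1, and the split of misses into false positives and false negatives shows which kind of error a change in cut score would trade against the other.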

How Is a Utility Analysis Conducted?


- Expectancy data: provide an indication of the likelihood that a testtaker will score within
some interval of scores on a criterion measure
o Example intervals: passing, acceptable, failing
o Taylor-Russell tables: provide an estimate of the extent to which inclusion of a
particular test in the selection system will improve selection
▪ Provide an estimate of the percentage of employees hired by the use of a
particular test who will be successful at their jobs, given different
combinations of three variables: the test’s validity, the selection ratio
(numerical value that reflects the relationship between the number of
people to be hired and the number of people available to be hired) used,
and the base rate
▪ Used to determine the increase over current procedures
▪ Limitation: the relationship between the predictor (test) and the criterion
must be linear
▪ Limitation: it can be difficult to identify a criterion score that separates
“successful” from “unsuccessful” employees
o Naylor-Shine tables: entail obtaining the difference between the means of the
selected and unselected groups to derive an index of what the test (or some
other tool of assessment) is adding to already established procedures
▪ Used to determine the increase in average score on some criterion measure
- Brogden-Cronbach-Gleser formula: calculates the dollar amount of a utility gain resulting
from the use of a particular selection instrument under specified conditions
o Utility gain: estimate of the benefit (monetary or otherwise) of using a particular
test or selection method

U.G. = (N)(T)(r_xy)(SD_y)(Z_m) − (N)(C)

o N = number of applicants selected per year
o T = average length of time in the position (tenure)
o r_xy = (criterion-related) validity coefficient for the given predictor and criterion
o SD_y = standard deviation of performance (in dollars) of employees
o Z_m = mean (standardized) score on the test for selected applicants
o The second part of the formula, (N)(C), represents the cost of testing
o C = cost of testing each applicant
o One recommended way to estimate SD_y is by setting it equal to 40% of the mean
salary for the job
o Productivity gain: estimated increase in work output
o To estimate productivity gain, replace SD_y with SD_p, the standard deviation of
productivity

P.G. = (N)(T)(r_xy)(SD_p)(Z_m) − (N)(C)

- Decision theory and test utility


o Decision theory: provides guidelines for setting optimal cutoff scores
o Employers are reluctant to use decision-theory-based strategies in their hiring
practices because of the complexity of their application and the threat of legal
challenges
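
The Brogden-Cronbach-Gleser utility gain formula above can be worked through as a short calculation. This is a minimal sketch, not a definitive implementation: the function and argument names are hypothetical. The notes use one symbol N in both terms; the optional `n_tested` argument is an added assumption that lets the cost term count every applicant tested, and it defaults to the number selected so that the formula is reproduced exactly as written.

```python
def estimate_sd_y(mean_salary, fraction=0.40):
    """Estimate SD_y as a fraction (40% by default) of the mean salary."""
    return fraction * mean_salary


def utility_gain(n_selected, tenure, validity, sd_y, mean_z_selected,
                 cost_per_applicant, n_tested=None):
    """Benefit term (N)(T)(r_xy)(SD_y)(Z_m) minus cost term (N)(C).

    n_tested is an assumption of this sketch: if omitted, the cost term
    uses n_selected, matching the formula as given in the notes.
    """
    if n_tested is None:
        n_tested = n_selected
    benefit = n_selected * tenure * validity * sd_y * mean_z_selected
    cost = n_tested * cost_per_applicant
    return benefit - cost


# Hypothetical figures, chosen only to illustrate the arithmetic.
sd_y = estimate_sd_y(mean_salary=40_000)
gain = utility_gain(n_selected=50, tenure=2, validity=0.30, sd_y=sd_y,
                    mean_z_selected=1.0, cost_per_applicant=100,
                    n_tested=200)
print(f"Estimated utility gain: {gain:,.0f}")
```

Passing SD_p in place of SD_y gives the productivity gain instead, since the two formulas differ only in that term.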

Some Practical Considerations

- The pool of job applicants: many utility models assume that there will be a ready supply
of viable applicants from which to choose and fill positions
o There are certain jobs, however, that require such unique skills or demand such
great sacrifice that there are relatively few people who would even apply, let
alone be selected
o The pool of possible job applicants for a particular type of position may vary with
the economic climate
o Not everyone found to be a qualified candidate will actually accept the
employment position offered to them
- The complexity of the job: the more complex the job, the more people differ on how
well or poorly they do that job
- The cut score in use
o Relative cut score: reference point that is set based on norm-related
considerations rather than on the relationship of test scores to a criterion
o Norm-referenced cut score: another name for a relative cut score, since it is set
with reference to the performance of a group (or some target segment of a group)
o Fixed cut score: reference point—in a distribution of test scores used to divide a
set of data into two or more classifications—that is typically set with reference
to a judgment concerning a minimum level of proficiency required to be included
in a particular classification. Also called absolute cut scores
o Multiple cut scores: use of two or more cut scores with reference to one
predictor for the purpose of categorizing testtakers.
o Compensatory model of selection: assumption made that high scores on one
attribute can, in fact, “balance out” or compensate for low scores on another
attribute
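
The different kinds of cut score listed above can be contrasted in a small sketch. Everything here is illustrative: the function names, the rule that a score equal to the cut counts as meeting it, and the use of a weighted sum for the compensatory model are assumptions, not details given in the notes.

```python
from bisect import bisect_right


def passes_fixed_cut(score, cut_score):
    """Fixed (absolute) cut score: the same minimum applies to everyone."""
    return score >= cut_score


def relative_cut_score(group_scores, top_fraction):
    """Relative (norm-referenced) cut score: the lowest score that still
    falls within the top `top_fraction` of this particular group."""
    ranked = sorted(group_scores, reverse=True)
    n_pass = max(1, int(len(ranked) * top_fraction))
    return ranked[n_pass - 1]


def categorize(score, cut_scores, labels):
    """Multiple cut scores on one predictor.

    cut_scores must be ascending and labels must have one more entry
    than cut_scores; a score equal to a cut falls in the higher category.
    """
    return labels[bisect_right(cut_scores, score)]


def compensatory_score(scores, weights):
    """Compensatory model: a weighted sum lets a high score on one
    attribute balance out a low score on another."""
    return sum(weights[name] * value for name, value in scores.items())
```

For example, `categorize(75, [60, 80], ["failing", "acceptable", "passing"])` places a score of 75 in the middle category. Note that the relative cut score moves whenever the group changes, while the fixed cut score does not.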

Methods for Setting Cut Scores


- Angoff Method: can be applied to personnel selection tasks as well as to questions
regarding the presence or absence of a particular trait, attribute, or ability. For
personnel selection, experts estimate how testtakers with at least minimal competence
for the position would answer the test items; for a trait, attribute, or ability, an expert
panel makes judgments concerning the way a person with that trait, attribute, or ability
would respond to test items. In both cases, the judgments of the experts are averaged
to yield cut scores for the test
- Known Groups Method/Method of contrasting groups: entails collection of data on the
predictor of interest from groups known to possess, and not to possess, a trait,
attribute, or ability of interest
o Determination of where to set the cutoff score is inherently affected by the
composition of the contrasting groups (no standard set of guidelines exist for
choosing contrasting groups)
- IRT Based Methods: each item is associated with a particular level of difficulty. In order
to “pass” the test, the testtaker must answer items that are deemed to be above some
minimum level of difficulty, which is determined by experts and serves as the cut score
o Item-mapping method: entails the arrangement of items in a histogram, with
each column in the histogram containing items deemed to be of equivalent
value. Judges who have been trained regarding minimal competence required
for licensure are presented with sample items from each column and are asked
whether or not a minimally competent licensed individual would answer those
items correctly about half the time. If so, that difficulty level is set as the cut
score; if not, the process continues until the appropriate difficulty level has been
selected
o Bookmark method: Expert places a “bookmark” between the two pages (or, the
two items) that are deemed to separate testtakers who have acquired the
minimal knowledge, skills, and/or abilities from those who have not. The
bookmark serves as the cut score
- Other Methods
o Method of predictive yield: technique for setting cut scores that takes into
account the number of positions to be filled, projections regarding the likelihood
of offer acceptance, and the distribution of applicant scores
o Discriminant analysis (discriminant function analysis): typically used to shed
light on the relationship between identified variables and two naturally occurring
groups
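
Two of the methods above, the Angoff method and the known groups method, lend themselves to a short sketch. Both functions are hypothetical illustrations. The Angoff version assumes each judge rates every item with the probability that a minimally competent testtaker answers it correctly, which is one common way of collecting the judgments that are then averaged. The known groups version assumes the cut score is simply the observed score that misclassifies the fewest members of the two groups; the notes do not prescribe a rule.

```python
from statistics import mean


def angoff_cut_score(ratings_by_judge):
    """Average the judges' implied cut scores.

    ratings_by_judge holds one list per judge; each entry is that judge's
    estimated probability that a minimally competent testtaker answers
    the item correctly. A judge's implied cut score is the sum of those
    probabilities, that is, the expected number of items correct.
    """
    return mean(sum(ratings) for ratings in ratings_by_judge)


def known_groups_cut_score(scores_with_trait, scores_without_trait):
    """Pick the observed score that best separates the two known groups.

    A person at or above the cut is classified as having the trait.
    Ties are resolved in favor of the lowest candidate score.
    """
    candidates = sorted(set(scores_with_trait) | set(scores_without_trait))

    def misses(cut):
        false_negatives = sum(s < cut for s in scores_with_trait)
        false_positives = sum(s >= cut for s in scores_without_trait)
        return false_negatives + false_positives

    return min(candidates, key=misses)
```

Because `known_groups_cut_score` looks only at the scores it is given, changing who is in either group can move the cut score, which is the sensitivity to group composition noted above.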
