Journal of Clinical Epidemiology 59 (2006) 920–925

ORIGINAL ARTICLES

ANCOVA versus change from baseline had more power in randomized studies and more bias in nonrandomized studies
Gerard J.P. Van Breukelen*
Department of Methodology & Statistics, Research Institute Caphri, Maastricht University, P.O. Box 616, 6200 MD Maastricht, The Netherlands
Accepted 13 July 2005

Abstract
Background and Objective: For inferring a treatment effect from the difference between a treated and untreated group on a quantitative outcome measured before and after treatment, current methods are analysis of covariance (ANCOVA) of the outcome with the baseline as covariate, and analysis of variance (ANOVA) of change from baseline. This article compares both methods on power and bias, for randomized and nonrandomized studies.
Methods: The methods are compared by writing both as a regression model and as a repeated measures model, and are applied to a nonrandomized study of preventing depression.
Results: In randomized studies both methods are unbiased, but ANCOVA has more power. If treatment assignment is based on the baseline, only ANCOVA is unbiased. In nonrandomized studies with preexisting groups differing at baseline, the two methods cannot both be unbiased, and may contradict each other. In the study of depression, ANCOVA suggests absence, but ANOVA of change suggests presence, of a treatment effect. The methods differ because ANCOVA assumes absence of a baseline difference.
Conclusion: In randomized studies and studies with treatment assignment depending on the baseline, ANCOVA must be used. In nonrandomized studies of preexisting groups, ANOVA of change seems less biased than ANCOVA, but two control groups and two baseline measurements are recommended. © 2006 Elsevier Inc. All rights reserved.
Keywords: ANCOVA; Change from baseline; Nonrandomized studies; Regression to different means; Regression to the mean; Repeated measures

* Corresponding author: Tel.: 0031-433882274 or 0031-433884001.
E-mail address: [Link]@[Link]
0895-4356/06/$ – see front matter © 2006 Elsevier Inc. All rights reserved.
doi: 10.1016/[Link].2006.02.007

1. Introduction

The effect of a treatment or exposure on a quantitative outcome, like blood pressure or total score on a clinical questionnaire, is usually evaluated with a "pretest–posttest control group design." The outcome is measured before (pretest, baseline) and after (posttest, outcome) treatment in the treated group and in a control group. Usually, treatment assignment is based on (1) randomization, or (2) baseline values, or (3) preexisting communities. The treatment effect is tested by either of two methods for comparing both groups: (1) analysis of covariance (ANCOVA) with the posttest as outcome and pretest as covariate, or (2) analysis of variance (ANOVA) of the change from baseline, defined as posttest minus pretest. Other methods, such as repeated measures and regression analysis, are equivalent to one of these two, as we will see.

There are several publications on the merits and dangers of both methods [1–11], but most researchers use a single method. The aim of this article is to clarify the purposes and limitations of both methods. This is done by writing both methods as a regression model and as a repeated measures model and applying them to a nonrandomized study of psychotherapy. In Section 2, a definition of treatment effect is given, and the role of randomization, control group, and baseline are discussed. Section 3 applies both methods to a nonrandomized study, showing that they may lead to contradictory conclusions. In Section 4, the methods are compared by two regression equations. It is shown which method is best if treatment assignment is based on randomization (Section 5), the baseline (Section 6), or preexisting groups (Section 7). The article ends with practical advice for nonrandomized studies.

2. Treatment effect and the role of control group, pretest, and randomization

Following [2,6,9], the effect of a treatment G (1 = yes, 0 = no) on an outcome Y for person i is defined as the difference $\Delta_i$ between that person's outcome under treatment and under no treatment. The treatment effect for a population

is the average $\Delta$ in the population of interest. Most treatments are evaluated by a parallel groups design in which half of all persons are treated (the experimental group, G = 1) and half are not (control group, G = 0). The mean posttest difference between the groups is used to estimate $\Delta$. In doing so, one assumes that, apart from sampling error, the posttest mean of the control group is equal to the posttest mean of the treated group that would have resulted if that group had not been treated. This assumption is warranted if treatment assignment is based on randomization. However, randomization is not always possible. Exposure studies involve preexisting communities. Mass media interventions can only be implemented at the community level. Treatment contamination may occur if persons within the same school or hospital are allocated to different groups.

But if randomization is impossible, then how can we adjust for a baseline group difference to estimate $\Delta$ unbiasedly? Usually, the outcome is observed before treatment (pretest, X) and after treatment (posttest, Y). If the groups differ significantly at pretest, this invalidates their posttest difference as treatment effect estimator. The next step is adjusting the posttest difference such that $\Delta$ is estimated unbiasedly. ANCOVA with treatment G as a factor, pretest X as a covariate, and posttest Y as an outcome, is one attempt at adjustment. ANOVA of change from baseline, with G as a factor and change (Y − X) as an outcome, is another one. This article compares both methods in terms of power and bias. To prevent misunderstanding, it must be emphasized that if the groups in a nonrandomized study do not differ at pretest, this does not guarantee that the posttest difference unbiasedly estimates $\Delta$. For instance, the groups may differ in age, and this may lead to a posttest difference even if the pretest means are equal and there is no treatment. A more dramatic example is given in Section 7.

For an unbiased effect estimation in nonrandomized studies "strongly ignorable treatment assignment" [8] is needed. Roughly, this means that the actual treatment assignment of person i is independent of $\Delta_i$, and rules out selection of the treatment by each person. Strongly ignorable treatment assignment may hold after correcting for some covariate, which is then called a "complete confounding factor" [11]. For an example, see Section 6. One other assumption is needed to test $\Delta$, that is, "stable unit-treatment value" [8], which comes down to independence between the $\Delta_i$ of person i and the treatment assigned to other persons, and rules out treatment contamination.

3. Example: A nonrandomized study of prevention of depression

Before going into the differences between ANCOVA and ANOVA of change, both methods will be applied to a nonrandomized study of prevention of depression [12], which serves as an example throughout this article. The study aim was to evaluate the effectiveness of a psychotherapeutic course in preventing depression among adolescents. The treated group consisted of 88 students, 14–20 years old, in the medium-sized Dutch town Nijmegen, the control group of 92 students in the equally large neighboring town Arnhem. Assessment of symptoms of depression and skills was done before and after intervention. Persons were included if their pretest Beck's Depression Inventory (BDI) score was between 10 and 25, reflecting mild to moderate depression.

Of all 180 students, 32 dropped out before the posttest: 20 treated and 12 controls. Logistic regression of dropout on treatment, age, gender, schooltype, and pretest of all outcomes, showed dropout to depend on age and schooltype only (all other P > .30). The present analysis is limited to complete cases, postponing the inclusion of dropouts until Section 4. So two analyses were run with SPSS: (1) ANOVA of change (post- minus pretest) and (2) ANCOVA with posttest as outcome and pretest as covariate. Both analyses were repeated with age, gender, and schooltype as covariates. Residual checks showed only mild violation of normality and homogeneity of variance. No treatment by pretest interaction was found. Figure 1 shows the result for Symptoms, and plots for skills were similar.

The treated group had a higher pretest mean than the control group (P = .000), and the group difference was smaller and no longer significant at posttest (two-tailed P = .10). ANOVA of change suggested a treatment effect, because symptoms decreased more in the treated than in the untreated group (effect estimate −0.46, SE = 0.13, P = .000). In contrast, ANCOVA suggested absence of an effect (estimate −0.13, SE = 0.13, P = 0.31). These results were hardly affected either by adjusting for covariates or by including dropouts (for details, see Section 4).

[Figure 1: line plot of the group mean (vertical axis) against time point (pretest, posttest) for the control and treated groups.]
Fig. 1. Change of mean Symptoms score per group in the study of depression prevention.

4. ANCOVA vs. ANOVA of change: A formal comparison

Traditionally, ANCOVA is treated as an extension of ANOVA [7,13]. Here, we present ANCOVA as a regression

model, briefly mentioning the difference with classical ANCOVA. In terms of regression analysis, ANCOVA assumes that:

$$Y_{ij} = \beta_0 + \beta_1 G_{ij} + \beta_2 X_{ij} + \varepsilon_{ij} \tag{1}$$

or equivalently,

$$Y_{ij} - \beta_2 X_{ij} = \beta_0 + \beta_1 G_{ij} + \varepsilon_{ij}$$

where $Y_{ij}$ is the posttest score of person i in group j (e.g., j = 1 for control, j = 2 for treated); $G_{ij}$ is a treatment indicator ($G_{i1} = 0$ for controls, $G_{i2} = 1$ for treated); $X_{ij}$ is the covariate, for example, the pretest score; and $\varepsilon_{ij}$ is normally distributed with zero mean and constant variance. Classical ANCOVA differs from (1) in that $G_{ij}$ is coded (−1,+1) and $X_{ij}$ is "centered" by subtracting its mean [13,14]. This only affects $\beta_0$, called the "grand mean" in ANOVA.

In eq. (1), $\beta_1$ is the group difference on Y adjusted for differences on X. Practical use of ANCOVA requires estimation of $\beta_2$, which is a function of the within-group variances and correlation of pretest and posttest. ANCOVA assumes linearity of the covariate effect and absence of covariate by group interaction. Both assumptions can be relaxed [14], but this article is limited to the classical model to allow a comparison with ANOVA of change (Y − X), which comes down to (1) with the assumption that $\beta_2 = 1$.

The real difference between ANCOVA and ANOVA of change becomes clear, however, by writing both in terms of repeated measures. ANOVA of change is equivalent to testing the group by time interaction in the following model (with $\gamma$ instead of $\beta$ for regression weights to prevent confusion with the ANCOVA model (1)):

$$Y_{ijt} = \gamma_0 + \gamma_1 G_{ij} + \gamma_2 T_{it} + \gamma_3 G_{ij} T_{it} + \varepsilon_{ijt} \tag{2}$$

where $Y_{ijt}$ is the observation of person i in group j at time point t, G is the group (0 = control, 1 = treated), T is the time point (0 = pretest, 1 = posttest), and $\varepsilon_{ijt}$ is a random person by time effect. Filling in G and T shows that $\gamma_0$ is the pretest (population) mean of the control group, $\gamma_1$ is the pretest mean difference between the groups, $\gamma_2$ is the mean change in the control group, and $\gamma_3$ is the difference in mean change between the groups. So testing absence of group by time interaction, that is, of $H_0: \gamma_3 = 0$ in eq. (2), is equivalent to testing the $H_0$ of no group effect on the change (Y − X). Repeated-measures ANOVA differs from (2) only in that it uses (−1,+1) instead of (0,1) coding for G and T.

It is much less known that ANCOVA is equivalent to testing the group by time interaction $\gamma_3$ in the reduced model (2), which is obtained by assuming that $\gamma_1 = 0$. So ANCOVA assumes that there is no group difference at pretest [15]. This assumption is warranted if treatment assignment is based on randomization or on the pretest X. In both cases, there is only one group of persons, and so there can be no group effect at pretest. Groups come into existence after the pretest, by randomization or treatment assignment based on X. For these two designs, ANCOVA is known to be the best method [2,9,10].

Repeated-measures analysis of the psychotherapy example in Section 3 was run with the SPSS procedure Mixed, using model (2) with and without the pretest group effect $\gamma_1 G_{ij}$. These two models gave the same effect estimate, SE and P-value as ANOVA of change and ANCOVA, respectively, confirming that $\gamma_3$ in (2) is equivalent to $\beta_1$ in (1). An advantage of the repeated measures approach is that it allows inclusion of persons with a missing posttest due to dropout. In this example, including dropouts hardly affected the results. In general, it may make a difference (see Section 7).

In summary, in terms of regression (1), ANOVA of change is a special case of ANCOVA in that it assumes a slope $\beta_2 = 1$ for regressing posttest Y on pretest X. In terms of repeated measures (2), ANCOVA is a special case of ANOVA of change in that it assumes a slope $\gamma_1 = 0$ for regressing pretest X on group G. It is this difference that makes ANCOVA superior in randomized studies and questionable in nonrandomized ones.

5. Randomized studies: Power

In a randomized study any pretest group difference is due to sampling error, so any value of $\beta_2$ in (1) gives the same $\beta_1$ (= $\Delta$) apart from sampling error, because $\beta_1$ is the posttest difference minus $\beta_2$ × the pretest difference. ANOVA of the posttest lets $\beta_2 = 0$, ANOVA of change takes $\beta_2 = 1$, and ANCOVA computes $\beta_2$ such that the residual posttest variance is minimized, thereby minimizing the standard error of the treatment effect estimate. So ANCOVA gives the largest power and the smallest confidence interval. If pretest and posttest have the same within-group variance and $\rho_{XY}$ denotes the pretest–posttest correlation within groups, then ANCOVA needs a sample size only $(1 + \rho_{XY})/2$ as large as that for ANOVA of change to have the same standard error, for instance, only 75% if $\rho_{XY} = 0.50$.

In terms of repeated measures (2), the superiority of ANCOVA in randomized studies is due to the fact that, because there is no group effect at pretest, ANCOVA is more parsimonious than ANOVA of change, which contains a superfluous parameter $\gamma_1$.

In nonrandomized studies the group indicator G in (1) correlates with the pretest X, thereby inflating the SE of the ANCOVA estimator [14]. This explains why both methods gave the same SE in Section 3. But in nonrandomized studies bias, not power, is the issue.

6. Treatment assignment based on the pretest: Bias

Suppose that treatment assignment is based on the pretest X such that the groups have different pretest means.

An example is randomized assignment where the probability of assignment to the treated group increases with X because a high X indicates a strong need for treatment. An extreme case is the "regression discontinuity design" [3], where all persons with X above some cutoff are treated and all persons below it are controls. In these cases the methods cannot both be unbiased, because $\beta_1$ in (1) is the posttest difference minus $\beta_2$ × the pretest difference. So $\beta_1$ depends on $\beta_2$, which is 1 for ANOVA of change, but less than 1 for ANCOVA unless posttest variance is much larger than pretest variance. If pre- and posttest have the same within-group variance, then $\beta_2 = \rho_{XY}$, the within-group correlation. With treatment assignment based on X, ANOVA of change is biased due to regression to the mean while ANCOVA is unbiased [1–3,10,11]. Stratified on X there is random assignment, and so by including X as a covariate the treatment effect $\Delta$ is estimated unbiasedly. Compared with pure randomization power is lost, as G in (1) correlates with X.

Regression to the mean may best be understood by taking the case of one group, with pretest and posttest having the same variance. Regression of Y on X then simplifies into: (predicted Y − $\mu_Y$) = $\rho_{XY}$ × (observed X − $\mu_X$), where $\rho_{XY}$ is the within-group correlation of X and Y, which is less than 1. So the predicted posttest Y is closer to its mean than the observed pretest X used as predictor [16], hence "regression to the mean." This effect may also be understood by noting that if pretest and posttest have the same variance, then it can be shown mathematically that change (Y − X) correlates negatively with pretest X. In particular, if the mean change is zero, then high pretest values are on average followed by a decrease, and low pretest values are on average followed by an increase.

That this is not just a mathematical trick, but a real-life phenomenon, can be shown with a simple example. The pretest X of N = 20 persons on a symptoms checklist varies from 1 (healthy) to 20 (unhealthy), each person having a different score, which gives a mean of 10.5 and SD of 5.9. A clinician decides to give all persons with X > 10.5 treatment and use all other persons as controls. Unknown to the clinician, no treatment is given at all. Posttest Y is measured 1 year later, giving again a mean of 10.5 and an SD of 5.9, and a pre–post correlation of 0.52. Figure 2 plots Y against X. Of all 10 persons allocated to treatment (those with X > mean), 6 are below the line Y = X. Of all 10 controls (with X < mean), 8 are above the line Y = X. So Y is closer to the mean than X for 14 of 20 persons and the posttest group difference is only 4.6 against a pretest difference of 10. ANOVA of change ignores regression to the mean and takes the pretest difference too seriously by subtracting this whole difference from the posttest difference, giving a treatment effect of −5.4 (P = .03), where no treatment was given at all. ANCOVA takes regression to the mean into account and subtracts only part of the pretest difference from the posttest difference, leading to the correct conclusion of no effect (P = 0.60).

[Figure 2: scatter plot of posttest (vertical axis) against pretest (horizontal axis).]
Fig. 2. Regression to the mean effect in the regression discontinuity design. If X > mean, then Y < X (downward regression), and if X < mean, then Y > X (upward regression), for the majority. Reference lines: X = mean (10.5), Y = mean (10.5), Y = X.

7. Treatment assignment of preexisting groups: Bias

In nonrandomized studies of preexisting groups these groups often have different pretest means. So $\beta_1$ in (1) depends on $\beta_2$, and ANOVA of change and ANCOVA cannot both be unbiased and may give contradictory results, a phenomenon known as Lord's ANCOVA paradox [2,4,5]. The difference between the methods can again be shown by the case of no treatment so that $\Delta = 0$, and therefore, $\beta_1 = 0$ must hold for (1) to be unbiased. For ANOVA of change to be unbiased, filling in $\beta_2 = 1$ in (1) shows that the posttest group difference must equal the pretest difference, apart from sampling error. In contrast, ANCOVA gives $\beta_2 < 1$ and so $\beta_1 = 0$ can hold only if the posttest group difference is smaller than the pretest difference. So, whereas ANOVA of change predicts equal change, ANCOVA predicts convergence between groups if there is no treatment.

The reason for this behavior of ANCOVA is its assumption of no group effect at pretest [$\gamma_1 = 0$ in (2)], which leads to regression of both groups to a common mean. If treatment assignment is based on randomization or on the pretest, this assumption is valid, because at pretest no assignment has yet been made and there is only one group. But in a nonrandomized study with preexisting groups it is not obvious toward what population mean the individuals of a group regress [11]. If the two groups are random samples from their populations, and if these populations have different means, then regression of individual scores to the mean of their own population will not change group means apart from sampling error. As a result, the posttest difference equals the pretest difference and ANOVA of change rather than ANCOVA is unbiased, at least if each population has a stable mean or if this mean changes in the same way in both populations. If the two groups are nonrandom samples, then both methods may be biased.

That ANCOVA is biased for preexisting groups that are random samples from their populations is shown in Fig. 3. The sample of Fig. 2 is now the control group (solid circles) and the experimental group is obtained by adding +10 to each X and Y in the control group (clear circles). So the group difference is 10 at both time points. There is no treatment in either group and so $\Delta = 0$. ANOVA of change correctly estimates $\beta_1$ to be zero (P = 1.00), but ANCOVA estimates $\beta_1$ to be 4.8 (P = .03). Almost the same result (effect = 5.4, P = .03) is obtained by first matching on X, which leads to the exclusion of all persons with X < 11 or X > 20 (vertical lines in Fig. 3), and then applying ANOVA to the posttest Y or the change Y − X of included persons only. Figure 3 shows the cause of this bias. Matching leads to selection of the upper half of control group I and the lower half of experimental group II. At posttest there is regression, not to a common mean as ANCOVA assumes, but to the mean that would have been observed without selection, that is, to 10.5 in the control group and 20.5 in the experimental group. Of all 10 included controls, 6 are below the Y = X line (downward regression). Of all 10 included experimentals, 8 are above it (upward regression). The opposite trends occur in the excluded subgroups. ANCOVA is a mathematical method of matching and shares its bias in nonrandomized studies.

[Figure 3: scatter plot of posttest (vertical axis) against pretest (horizontal axis) for group 1 and group 2.]
Fig. 3. Bias introduced by matching on the pretest X due to regression to different means, in a nonrandomized study of preexisting groups with a fixed mean each. Reference lines: X = 10.5 and X = 20.5 (inclusion criterion: 10 < X < 21) and Y = X. Solid circles (group 1): Included: X > 10, result: Y < X; excluded: X < 10, result: Y > X, for a majority. Clear circles (group 2): Included: X < 20, result: Y > X; excluded: X > 20, result: Y < X, for a majority.

In this example the bias is clear, because there is no treatment and we have posttest data of all persons. In practice, there are no posttest data of excluded persons and ANOVA of change on the included (matched) persons suffers from the same differential regression effect as ANCOVA on the total sample. This is a problem in nonrandomized studies where the inclusion of persons is based on cutoffs for the pretest, which act like a mild matching. But there is a simple solution. If exclusion is based on the pretest data, then posttest data of the excluded persons are "missing at random" [17]. This type of missingness can be handled by repeated-measures analysis (2), including the pretest data of excluded persons, which is not possible with (1). In the present example repeated-measures analysis of all 40 persons, using only the pretest data of excluded persons, gave an effect estimate of 2.55, with a two-tailed P = .30, leading to the correct conclusion of no effect, while ANCOVA of all data and ANOVA of change without excluded persons led to the wrong conclusion.

What do these results imply for the example in Section 3? Given that the groups were recruited from different towns and had different pretest means, ANOVA of change seems more reasonable than ANCOVA, and one might conclude that there was a treatment effect. But there are two complications. First, the BDI score was used both as inclusion criterion (10 < BDI < 25) and as part of the outcome Symptoms, implying some matching on the pretest. As Fig. 3 shows, this may lead to differential regression if the two populations (before exclusion based on BDI) have different BDI means. This not only threatens the unbiasedness of ANCOVA, but also that of ANOVA of change when applied to the included persons only. But because no data are available from the excluded persons, no further analysis is possible. A second complication is that the BDI score was measured twice before the intervention period in the control group. The first was used as inclusion criterion and the second was 1 month later on the pretest of all outcomes ([12], p. 142). Given that a BDI score > 10 is well above the population mean ([12], p. 72), it is likely that regression to the mean had already occurred at pretest in the control group. So the pretest difference in Fig. 1 may be artificially large, casting doubt on the treatment effect. Unfortunately, no data from that first BDI measurement in the control group are available.

8. Discussion

Based on literature, we saw that (1) the difference between ANCOVA and ANOVA of change is that between assuming absence or presence of a baseline group difference, and (2) the choice between both methods depends on the treatment assignment procedure. If treatment assignment is by randomization, both methods are unbiased but ANCOVA has more power. If treatment assignment is based on the pretest, ANCOVA is unbiased but ANOVA of change is not, due to regression to the mean. Both designs imply treatment assignment after the pretest and so at pretest there is one group, justifying the ANCOVA assumption of no group effect at pretest. In contrast, if preexisting groups are assigned to treatment, the unbiasedness of both methods

depends on strong assumptions about trends in the absence of treatment. The ANOVA of change assumption (equal change) is more plausible than the ANCOVA assumption (regression to a common mean), at least if each group is a random sample from its population. The larger the pretest difference between preexisting groups, the worse ANCOVA is on bias (Section 7) and efficiency (Section 5). This bias has to do with measurement error (intraindividual variability) in the covariate, which leads to underestimation of $\beta_2$ in (1) and a greater discrepancy between ANCOVA and ANOVA of change. Statistical corrections for this underestimation exist [7,18], but are beyond the present scope. Instead, measurement error can be reduced by repeated pretesting and taking a person's average as covariate [19].

The present results lead to practical advice for nonrandomized studies, assuming that person and cluster randomization are impossible. The design is enhanced by having (1) more than one control group [20], and (2) more than one pretest [7,11], and (3) more than one outcome, including some that are known to be unaffected by treatment [21]. Having more than one control group or pretest allows estimation of group trend in the absence of treatment. Equal change of two control groups provides support for ANOVA of change, especially if the treated group is in-between both control groups at pretest. Likewise, equal change between repeated pretests of treated and control group suggests the use of ANOVA of change rather than ANCOVA. Moreover, a person's average pretest is less subject to measurement error than a single pretest, leading to larger power for both methods and less disagreement. Finally, including outcomes known to be unaffected by treatment allows checks on hidden bias in methods of analysis. For instance, finding a treatment effect on intelligence in the study of depression would cast doubt on the method of analysis. In nonrandomized studies with one preexisting control group and one pretest, ANOVA of change may be better than ANCOVA, but running both methods may be even better. If both methods lead to the same conclusion, differing only in effect size, this increases one's confidence in that conclusion [3].

Additional problems, illustrated by the study of depression, are dropout and bias due to inclusion criteria in nonrandomized studies. In randomized and nonrandomized studies, dropouts must be included by using proper methods for missing data [17]. In nonrandomized studies, further bias arises from the exclusion of persons based on pretest data, as Fig. 3 showed. So the best analysis of preexisting groups may be a repeated-measures analysis including all available data of dropouts and of excluded persons.

References

[1] Campbell DT, Kenny DA. A primer on regression artifacts. New York: Guilford Press; 1999.
[2] Holland PW, Rubin DB. On Lord's paradox. In: Wainer H, Messick S, eds. Principals of modern psychological measurement. Hillsdale, NJ: Erlbaum; 1983. p. 3–25.
[3] Kenny DA. A quasi-experimental approach to assessing treatment effects in the nonequivalent control group design. Psychol Bull 1975;82:345–62.
[4] Lord FM. A paradox in the interpretation of group comparisons. Psychol Bull 1967;68:304–5.
[5] Lord FM. Statistical adjustments when comparing pre-existing groups. Psychol Bull 1969;72:336–7.
[6] Maris E. Covariance adjustment versus gain scores revisited. Psychol Methods 1998;3:309–27.
[7] Porter AC, Raudenbush SW. Analysis of covariance: its model and use in psychological research. J Counseling Psychol 1987;34:383–92.
[8] Rosenbaum PR, Rubin DB. Estimating the effects caused by treatments. J Am Stat Assoc 1984;79:26–8.
[9] Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Ed Psychol 1974;66:688–701.
[10] Rubin DB. Assignment to treatment group on the basis of a covariate. J Ed Stat 1977;2:1–26.
[11] Weisberg HI. Statistical adjustments and uncontrolled studies. Psychol Bull 1979;86:1149–64.
[12] Ruiter M. Preventie van depressie bij jongeren (Prevention of depression among adolescents). Doctoral dissertation, Nijmegen University, The Netherlands; 1997.
[13] Maxwell SE, Delaney HD. Designing experiments and analyzing data: a model comparison perspective. Pacific Grove, CA: Brooks/Cole; 1990.
[14] Kleinbaum DG, Kupper LL, Muller KE, Nizam A. Applied regression analysis and other multivariable methods. Pacific Grove, CA: Brooks/Cole; 1998.
[15] Laird NM, Wang F. Estimating rates of change in randomized clinical trials. Controlled Clin Trials 1990;11:405–19.
[16] Stigler SM. Regression towards the mean, historically considered. Stat Methods Med Res 1997;6:103–14.
[17] Schafer JL, Graham JW. Missing data: our view of the state of the art. Psychol Methods 2002;7:147–77.
[18] Carroll RJ. Covariance analysis in general linear measurement error models. Stat Med 1989;8:1075–93.
[19] Senn SJ. Covariance analysis in generalized linear measurement error models. Stat Med 1990;9:583–6.
[20] Rosenbaum PR. The role of a second control group in an observational study. Stat Sci 1987;2:292–316.
[21] Rosenbaum PR. The role of known effects in observational studies. Biometrics 1987;45:557–69.
