Logic of Scientific Inference/ What is Causality?
Richard Williams, University of Notre Dame, [Link]
Last revised February 15, 2015
[NOTE: Toolbook files will be used when presenting this material] (Adapted from Ch. 2 of
Stinchcombe, Constructing Social Theories; and Ch. 1 of Cook and Campbell, Quasi-
Experimentation)
I. Fundamental forms of scientific inference
Scientific inference starts with a theoretical statement, an element of a theory, which
states that one class of phenomena will be connected in a certain way with another class of
phenomena. From this theoretical statement, we derive, by logical deduction and by operational
definitions of the concepts, an empirical statement (or hypothesis). We then compare the
different kinds of consequences of the theory with the facts.
a. If a consequence of a theory turns out to be true, the theory becomes more
credible.
b. If several consequences of the theory turn out to be true, the theory becomes even
more credible (especially the more dissimilar the consequences).
c. However, the credibility of a theory is enhanced very little if chance alone could
have produced the observed results. Hence, statistical significance of a set of observations
(evidence that they are unlikely to be explained by random variation) is usually a minimum
requirement for regarding the observations as substantial support for a theory.
d. Note that we never claim that we have proven a theory; we just say that we have
not disproven it.
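Point (c) can be made concrete with a small permutation test. The sketch below is only an illustration: the two groups and their scores are hypothetical, and the function name permutation_p_value is ours, not Stinchcombe's. It asks how often a difference in group means at least as large as the observed one would arise if group labels were assigned purely by chance.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    extreme = 0
    splits = 0
    # Enumerate every way of relabelling n_a of the pooled scores as "group A".
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        splits += 1
    return extreme / splits

# Hypothetical scores for two groups (not real data).
group_a = [12, 14, 15, 16, 18]
group_b = [8, 9, 10, 11, 13]
p_value = permutation_p_value(group_a, group_b)
print(f"p = {p_value:.4f}")
```

A small p-value says only that chance alone is an unlikely explanation; in the spirit of point (d), it does not prove the theory.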
Some of the consequences implied by Theory A may also be implied by Theory B; that is,
there can be alternative explanations for any given phenomenon. We wish to eliminate alternative
theories. Crucial experiments are a means of doing this. A crucial experiment is a set of
observations which will give one result if one of the main alternative theories is true, and a
different result if another is true. *** Illustrate via Venn diagrams ***
EX: Some believed that suicide was a result of mental illness. Durkheim showed that
the correlation between mental illness rates and suicide rates was insignificant.
Logic of Scientific inference; What is causality Page 1
II. The structure of causal theories.
A variable is a concept which can have two or more values, and which is defined in such
a way that one can tell what those values are (i.e. the concept can be measured).
a. The simplest kind of variable is one which has two values, which can be
represented as 1 and 0.
1. There can be natural dichotomies (sex, employed/ unemployed);
2. conceptual dichotomies that are created by the investigator (left wing/
right wing; totalitarian/ nontotalitarian).
3. Some dichotomies are simplifications of variables which have more than
two categories (e.g. collapse age into minor/ adult; collapse income into high/ low).
b. Variables might also have a finite number of categories greater than two
1. natural - country of citizenship, brands of automobiles
2. conceptual - classify buildings as single-family or apartment or factory;
classify countries as communist, democratic, socialist
3. simplifications - income into high, medium, low
4. combinations of simpler variables - married woman, unmarried man
(combination of gender and marital status)
c. A third kind of variable has exactly as many values as there are observations - e.g.
rank in class at graduation
d. And, variables can be continuous, or nearly so (e.g. income, age). For each of
these types of variables we have a concept in terms of which we make observations, and we
classify or order these observations in some way so that each observation is connected with a
single value of the variable.
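The simplifications and combinations described above amount to recoding rules. A minimal sketch follows; the cutoffs (age 18, incomes of 30,000 and 80,000) and the three example people are hypothetical, chosen only to show each observation being connected with a single value of each derived variable.

```python
# Hypothetical cutoffs; the notes do not specify any.
ADULT_AGE = 18
INCOME_CUTS = (30_000, 80_000)  # boundaries between low / medium / high

def age_group(age):
    """Collapse age into a dichotomy coded 0 (minor) or 1 (adult)."""
    return 1 if age >= ADULT_AGE else 0

def income_level(income):
    """Collapse income into three ordered categories."""
    low_cut, high_cut = INCOME_CUTS
    if income < low_cut:
        return "low"
    if income < high_cut:
        return "medium"
    return "high"

def combine(sex, married):
    """Combine two simpler variables into one multi-category variable."""
    return ("married" if married else "unmarried") + " " + sex

# Hypothetical observations.
people = [
    {"age": 15, "income": 0, "sex": "man", "married": False},
    {"age": 42, "income": 55_000, "sex": "woman", "married": True},
    {"age": 67, "income": 120_000, "sex": "man", "married": True},
]
coded = [
    (age_group(p["age"]), income_level(p["income"]), combine(p["sex"], p["married"]))
    for p in people
]
print(coded)
```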
Skip to handout on variable naming.
A causal law is a statement that a change in the value of one variable is sufficient to
produce a change in the value of another, without the operation of intermediate causes. In order
for observations to support a causal theory, it must be the case that:
1. We observe different values of the causal variable. We need to observe at least
two values for every variable.
EX: If we want to look at the effect of gender on vote, both men and women have to be
included in the sample.
2. There must be covariation between the dependent and independent variables -
variations in the values of the dependent variable must be associated with different values of the
causal variable.
EX: If both men and women are equally likely to support the Democratic nominee for
president, we cannot say gender affects party vote.
Covariation is usually established via
a. experimentation (where the investigator changes the value of the
causal variable directly)
b. observations of natural variations in the two variables.
3. Causal direction must be established - we must show that it is possible to change
the value of the dependent variable by changing the causal variable (note that this does not rule
out the possibility of reciprocal effects).
EX: Demographers debate whether a woman's education affects the age at which she first
has children, or whether the age at which a woman first has children affects her level of
education.
Causal direction can be established via (*** Note: Stinchcombe lists 5 ways altogether
***)
a. experimentation (we know that we produced the change in the causal
variable; the dependent variable was not responsible).
b. Or, we can observe that change in the causal variable precedes change in
the dependent variable
4. We must show the relationship is nonspurious - must show there are not other
variables in the environment which might cause changes in the dependent variable at the same
time as the independent variable changes.
EX: Does smoking cause cancer, or does "lifestyle" affect both smoking and cancer (e.g.
smokers have a lot of bad habits, and these other bad habits make them more prone to cancer)?
We can do this via (*** Note: Stinchcombe gives 4 ways *** )
a. setting the values of all other variables equal (not usually practical in the
social sciences)
b. control via randomization - randomly assign subjects to treatment
conditions - so their values on other variables will likely be equal (within statistical limits)
c. measure other variables and compare covariation between our causal
variable and the dependent variable only among observations where the third variable has
identical values (or use statistical techniques which partial out the effect of the third variable).
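Requirements 2 and 4 can be illustrated together. In the sketch below all counts are invented for illustration (they are not real smoking or cancer data), and "lifestyle" is treated as a single two-category third variable. The crude comparison shows covariation between smoking and cancer, but the covariation disappears once we compare only observations with identical values of the third variable, which is what a spurious relationship looks like.

```python
# Hypothetical counts: (lifestyle, smoker) -> (number of people, number with cancer)
counts = {
    ("healthy", True): (100, 10),
    ("healthy", False): (400, 40),
    ("unhealthy", True): (400, 120),
    ("unhealthy", False): (100, 30),
}

def rate(cells, smoker):
    """Cancer rate among smokers (smoker=True) or nonsmokers (smoker=False)."""
    people = sum(n for (_, s), (n, _) in cells.items() if s == smoker)
    cases = sum(c for (_, s), (_, c) in cells.items() if s == smoker)
    return cases / people

def risk_difference(cells):
    """Covariation measure: smokers' cancer rate minus nonsmokers' rate."""
    return rate(cells, True) - rate(cells, False)

# Crude comparison, ignoring lifestyle.
crude = risk_difference(counts)

# Same comparison within each value of the third variable.
by_stratum = {
    z: risk_difference({k: v for k, v in counts.items() if k[0] == z})
    for z in ("healthy", "unhealthy")
}

print(f"crude difference: {crude:.2f}")
for z, diff in by_stratum.items():
    print(f"within {z} lifestyle: {diff:.2f}")
```

Had the within-stratum differences stayed close to the crude difference, lifestyle could not account for the covariation, although other unmeasured variables still might.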
Other key ideas from Stinchcombe:
1. Variables may be measured either by their causes or effects. (p. 42)
a. Cause - experimenter tells one group "you will probably like each other", and
another group that "you will probably not get along too well," in order to study the effects of
social solidarity. Clearly, he is measuring social solidarity by its presumed causes (his statements
to the subjects.)
b. Effect - intelligence presumably has the effect that a person is able to answer
more questions correctly the more intelligent s/he is. So, we use a series of these effects (answers
to questions) to locate the underlying variable.
2. Measurement is affected by concepts; our theories determine how and what we measure.
Improvement in measurement is usually due to the advance of theory. (p. 43)
3. Analysis of a theory can be done at different levels of generality. Refuting a hypothesis at
one level does not necessarily refute the theory at another level. (pp. 47-53)
III. What is causality?
There are several different conceptions of cause.
Positivists - Stress the observation of regularities. Say that high correlations
demonstrate, or are synonymous with, causation. Actually deny causation, or say it is a useless
concept - don't waste your time with unobserved entities.
Essentialist theories of causation - Argue that cause should only be used to refer to
variables that explain a phenomenon in the sense that these variables, when taken together, are
both necessary and sufficient for the effect to occur. This position equates cause with a
constellation of variables that necessarily, inevitably, and infallibly results in the effect. Does not
accept probabilistic relationships.
John Stuart Mill - Said causal inference depends on 3 factors: (a) cause has to precede the
effect; (b) the cause and effect have to be related; (c) Other explanations of the cause-effect
relationship have to be eliminated (i.e. must rule out spuriousness). Method of Agreement states
an effect will be present when the cause is present. (The cause is sufficient for the effect to
occur.) Method of Difference states the effect will be absent when the cause is absent (The cause
is necessary for the effect to occur). The Method of Concomitant Variation implies that when
both of the above relationships are observed, causal inference will be all the stronger since
certain other interpretations of the covariation between the cause and effect can be ruled out.
Mill preferred experimental manipulation to observation of natural variation.
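The two checks described above, as characterized in these notes, can be written as simple tests over a set of cases. This is a sketch only: the function names and the four observations are hypothetical, and each observation is reduced to a pair (cause present?, effect present?).

```python
def is_sufficient(observations):
    """Effect is present in every case where the cause is present."""
    return all(effect for cause, effect in observations if cause)

def is_necessary(observations):
    """Effect is absent in every case where the cause is absent."""
    return all(not effect for cause, effect in observations if not cause)

# Hypothetical observations as (cause_present, effect_present) pairs.
observations = [
    (True, True),
    (True, True),
    (False, False),
    (False, True),  # effect occurred without the cause
]

sufficient = is_sufficient(observations)
necessary = is_necessary(observations)
print("sufficient:", sufficient)
print("necessary:", necessary)
```

Here the cause passes the sufficiency check but fails the necessity check, so only one of the two relationships is observed and the causal inference is correspondingly weaker.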
Popper and falsification - stresses the ambiguity of confirmation. At best, any theory is
"not yet disconfirmed". Alternative explanations are always possible. Popper stresses
competition between "grand theories"; Cook and Campbell stress ruling out theoretical "nuisance
factors".
Cook and Campbell - (a) Agree with positivists that causation and concomitance are
closely related but say causes have a real nature, albeit one that can only be imperfectly grasped
(b) Agree with Mill's criteria for cause (c) Agree with Popper that you should proceed less by
seeking to confirm theoretical predictions about causal connections than by seeking to falsify
them; but C & C stress pitting causal hypotheses against mundane nuisance factors rather than
grand theories. (d) The causal laws of greatest practical significance are those involving
manipulable causes.
C & C further say that
(1) Causal assertions are meaningful at the molar level (i.e. causal laws stated in terms
of large and often complex objects) even when the ultimate micromediation is not known. EX:
Flipping a switch causes a light to go on. Such causal assertions are helpful primarily because
they imply knowledge about how to control the environment; and they are meaningful because
they can be tested and are subject to verification as largely right or wrong.
(2) Molar causal laws, because they are contingent on many other conditions and
causal laws, are fallible and hence probabilistic. The light bulb may be burned out, and the cause
may not produce the effect. In the social sciences, we work with molar laws, hence we are
talking about probabilistic rather than deterministic relationships.
(3) The effects in molar causal laws can be the result of multiple causes - that is,
different causes can produce the same effect.
(4) The manipulation of a cause will result in the manipulation of an effect.
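Points (2) and (4) can be illustrated with the light-switch example. The sketch below is hypothetical throughout: the two background conditions, their probabilities, and the assumption that they are independent are ours, not Cook and Campbell's. It shows that flipping the switch changes the chance of light without guaranteeing it.

```python
from itertools import product

# Hypothetical probabilities for the background conditions (assumed independent).
P_BULB_OK = 0.95
P_POWER_ON = 0.99

def light_is_on(switch_flipped, bulb_ok, power_on):
    """Molar law: the switch produces light only if other conditions hold."""
    return switch_flipped and bulb_ok and power_on

def prob_light(switch_flipped):
    """Probability of light, summing over all states of the background conditions."""
    total = 0.0
    for bulb_ok, power_on in product([True, False], repeat=2):
        weight = (P_BULB_OK if bulb_ok else 1 - P_BULB_OK) * (
            P_POWER_ON if power_on else 1 - P_POWER_ON
        )
        if light_is_on(switch_flipped, bulb_ok, power_on):
            total += weight
    return total

p_on_given_flip = prob_light(True)
p_on_given_no_flip = prob_light(False)
print(f"P(light | switch flipped)     = {p_on_given_flip:.4f}")
print(f"P(light | switch not flipped) = {p_on_given_no_flip:.4f}")
```

The relationship is probabilistic rather than deterministic, yet manipulating the cause still manipulates the effect.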