26/12/2025, 10:31 Covariance | Brilliant Math & Science Wiki
Covariance
The covariance generalizes the concept of variance to multiple random variables. Instead of measuring the fluctuation of a
single random variable, the covariance measures the fluctuation of two variables with each other.
Contents
Definition
Calculation of the Covariance
Covariance - Properties
Covariance Matrix
References
Definition
Recall that the variance is the mean squared deviation from the mean for a single random variable X :
Var(X) = E[(X − E[X])^2].
The covariance adopts an analogous functional form.
DEFINITION
The covariance Cov(X, Y ) of random variables X and Y is defined as
Cov(X, Y ) = E [(X − E[X])(Y − E[Y ])] .
Now, instead of measuring the fluctuation of a single variable, the covariance measures how two variables fluctuate together.
For the covariance to be large, both X − E[X] and Y − E[Y] must be large at the same time or, in other words, change
together.
Calculation of the Covariance
It is generally simpler to find the covariance by expanding the product and applying linearity of expectation:
Cov(X, Y) = E[XY − E[X]Y − XE[Y] + E[X]E[Y]]
= E[XY] − E[X]E[Y] − E[X]E[Y] + E[X]E[Y]
= E[XY] − E[X]E[Y].
In other words, to compute the covariance, one can equivalently find E[XY] (in addition to the means of X and Y).
TRY IT YOURSELF
Let X and Y be random variables such that
P (X = 0, Y = −1) = 1/5
P (X = 0, Y = 1) = 1/5
P (X = 1, Y = −1) = 1/2
P (X = 1, Y = 1) = 1/10.
Find Cov(X, Y ).
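
One way to check an answer to this exercise is to evaluate both the definition and the shortcut formula E[XY] − E[X]E[Y] directly on the joint distribution. The sketch below does that with exact fractions; the helper name `expect` and the dictionary layout are illustrative choices, not part of the original page.

```python
from fractions import Fraction as F

# Joint distribution from the exercise: (x, y) -> P(X = x, Y = y).
pmf = {
    (0, -1): F(1, 5),
    (0, 1): F(1, 5),
    (1, -1): F(1, 2),
    (1, 1): F(1, 10),
}

def expect(g, pmf):
    """Expected value of g(X, Y) under a finite joint pmf (hypothetical helper)."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

ex = expect(lambda x, y: x, pmf)        # E[X]
ey = expect(lambda x, y: y, pmf)        # E[Y]
exy = expect(lambda x, y: x * y, pmf)   # E[XY]

# Shortcut formula and the original definition should agree exactly.
cov_shortcut = exy - ex * ey
cov_definition = expect(lambda x, y: (x - ex) * (y - ey), pmf)

print("E[X] =", ex, " E[Y] =", ey, " E[XY] =", exy)
print("Cov(X, Y) =", cov_shortcut, "(definition gives", cov_definition, ")")
```

Using `Fraction` rather than floats keeps the comparison between the two formulas exact.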
Similarly, one can find an expression in terms of variances:
Var(X + Y) = E[(X + Y − E[X] − E[Y])^2]
= E[(X − E[X])^2] + E[(Y − E[Y])^2] + 2E[(X − E[X])(Y − E[Y])]
= Var(X) + Var(Y) + 2Cov(X, Y).
A generalized statement of this result is as follows.
THEOREM
Variance of a sum. Given random variables X_i, each with finite variance,
Var(∑_i X_i) = ∑_i Var(X_i) + 2 ∑_{i<j} Cov(X_i, X_j).
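
The identity can be checked numerically on a small example. The sketch below assumes a made-up sample space of four equally likely outcomes and three random variables (the numbers are arbitrary, chosen only for illustration) and compares both sides of the theorem exactly.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical sample space: four equally likely outcomes; each row lists
# the values (X_1, X_2, X_3) taken on that outcome.
outcomes = [
    (1, 0, 2),
    (2, 1, 0),
    (0, 3, 1),
    (4, 1, 1),
]
prob = F(1, len(outcomes))

def mean(values):
    """Expectation over the equally likely outcomes."""
    return sum(prob * v for v in values)

def cov(a, b):
    """Cov(A, B) = E[AB] - E[A]E[B]."""
    return mean([x * y for x, y in zip(a, b)]) - mean(a) * mean(b)

variables = list(zip(*outcomes))          # one tuple of values per X_i
total = [sum(row) for row in outcomes]    # values of X_1 + X_2 + X_3

lhs = cov(total, total)                   # Var(sum of X_i)
rhs = sum(cov(x, x) for x in variables) + 2 * sum(
    cov(a, b) for a, b in combinations(variables, 2)
)
print(lhs, rhs)
```

`combinations(variables, 2)` enumerates each unordered pair once, which matches the sum over i < j in the theorem.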
Covariance - Properties
The covariance inherits many of the same properties as the inner product from linear algebra. The proof involves
straightforward algebra and is left as an exercise for the reader.
THEOREM
Given a constant a and random variables X , Y , and Z , the following properties hold:
Cov(X, X) = Var(X) ≥ 0
Cov(X, Y ) = Cov(Y , X)
Cov(aX, Y ) = aCov(X, Y )
Cov(X, a) = 0
Cov(X + Y , Z) = Cov(X, Z) + Cov(Y , Z).
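
As a sanity check rather than a proof, the five properties can be verified on concrete data. The sketch below assumes three made-up random variables defined on four equally likely outcomes and an arbitrary constant a = 3; all of these values are illustrative.

```python
from fractions import Fraction as F

# Hypothetical data: values of X, Y and Z on four equally likely outcomes.
X = [1, 2, 0, 4]
Y = [0, 1, 3, 1]
Z = [2, 0, 1, 1]
a = 3
p = F(1, len(X))

def mean(v):
    return sum(p * t for t in v)

def cov(u, v):
    """Cov(U, V) = E[UV] - E[U]E[V]."""
    return mean([s * t for s, t in zip(u, v)]) - mean(u) * mean(v)

aX = [a * t for t in X]                       # the variable aX
const = [a] * len(X)                          # the constant a as a random variable
X_plus_Y = [s + t for s, t in zip(X, Y)]      # the variable X + Y

checks = {
    "Cov(X, X) = Var(X) >= 0": cov(X, X) >= 0,
    "Cov(X, Y) = Cov(Y, X)": cov(X, Y) == cov(Y, X),
    "Cov(aX, Y) = a Cov(X, Y)": cov(aX, Y) == a * cov(X, Y),
    "Cov(X, a) = 0": cov(X, const) == 0,
    "Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z)": cov(X_plus_Y, Z) == cov(X, Z) + cov(Y, Z),
}
for name, ok in checks.items():
    print(name, ok)
```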
TRY IT YOURSELF
Given knowledge of Cov(W, Y), Cov(W, Z), Cov(X, Y), and Cov(X, Z), which of the
following can necessarily be computed?
I. Cov(W + X, Y + Z)
II. Cov(Y + Z, W + X)
III. Cov(W, X + Y + Z)
IV. Cov(W, X + Y + Z), if it is known that W and X are independent
Answer choices:
I, II, III, and IV
I, III, and IV only
I only
I and II only
I, II, and IV only
EXAMPLE
Let X and Y be random variables such that Var(X) = σ^2 and Y = aX, where σ and a are constants. Determine
Cov(X, Y).
The inner product properties yield
Cov(X, Y) = Cov(X, aX) = Cov(aX, X) = aCov(X, X) = aσ^2.
TRY IT YOURSELF
If X is a standard normal random variable and Y = 3X, what is Cov(X, Y)?
Because the covariance shares these inner product properties, the Cauchy-Schwarz inequality holds for covariances.
THEOREM
Cauchy-Schwarz inequality. Given random variables X and Y ,
[Cov(X, Y)]^2 ≤ Var(X)Var(Y).
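
The inequality can be spot-checked on random data. The sketch below assumes variables defined on six equally likely outcomes with small integer values; the seed, sizes, and ranges are arbitrary. It also checks one case where Y is a linear function of X, for which the two sides coincide.

```python
import random
from fractions import Fraction as F

def cov(u, v):
    """Cov(U, V) = E[UV] - E[U]E[V] over equally likely outcomes, computed exactly."""
    n = len(u)
    mean_u, mean_v = F(sum(u), n), F(sum(v), n)
    return F(sum(s * t for s, t in zip(u, v)), n) - mean_u * mean_v

rng = random.Random(0)   # fixed seed, an arbitrary choice
holds = True
for _ in range(1000):
    x = [rng.randint(-5, 5) for _ in range(6)]
    y = [rng.randint(-5, 5) for _ in range(6)]
    holds = holds and cov(x, y) ** 2 <= cov(x, x) * cov(y, y)

# When Y is a linear function of X, the inequality becomes an equality.
x = [1, 2, 0, 4]
y = [3 * t + 1 for t in x]
equality = cov(x, y) ** 2 == cov(x, x) * cov(y, y)

print(holds, equality)   # prints: True True
```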
One of the key properties of the covariance is the fact that independent random variables have zero covariance.
THEOREM
Covariance of independent variables. If X and Y are independent random variables, then Cov(X, Y ) = 0.
PROOF
If X and Y are independent, then E[XY ] = E[X]E[Y ] and therefore Cov(X, Y ) = 0. (Recall that E[XY ] =
E[X]E[Y ] is a simple consequence of the fact that P (X∣Y ) = P (X).)
EXAMPLE
Dependent variables with zero covariance. However, the converse is not in general true. As a simple example, suppose
that X is a standard normal random variable and that Y = X^2. Notice that knowledge of X completely determines Y, in
which case X and Y are very clearly dependent. However, by symmetry E[X] = 0 and E[XY] = E[X^3] = 0, so it holds that
Cov(X, Y) = E[XY] − E[X]E[Y] = 0.
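
A quick simulation illustrates this example. The sketch below is a Monte Carlo estimate, so the covariance it prints is only approximately zero; the seed, the sample size, and the conditioning event |X| > 1 are arbitrary choices made for illustration.

```python
import random

rng = random.Random(0)          # fixed seed, an arbitrary choice
n = 200_000
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]   # samples of a standard normal X
ys = [x * x for x in xs]                       # Y = X^2 is fully determined by X

mean_x = sum(xs) / n
mean_y = sum(ys) / n
cov_xy = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y

# Dependence shows up when conditioning on X: the average of Y over
# samples with |X| > 1 differs noticeably from the overall average of Y.
tail = [y for x, y in zip(xs, ys) if abs(x) > 1]
mean_y_tail = sum(tail) / len(tail)

print("estimated Cov(X, Y):", cov_xy)
print("E[Y] estimate:", mean_y, " E[Y | |X| > 1] estimate:", mean_y_tail)
```

The estimated covariance should be close to zero while the conditional mean of Y is clearly larger than its overall mean, which is what zero covariance without independence looks like in data.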
A simple corollary is as follows.
THEOREM
Variance of the sum of independent variables. Given independent random variables X_i, each with finite variance,
Var(∑_i X_i) = ∑_i Var(X_i).
PROOF
Since the X_i are independent, it must be the case that Cov(X_i, X_j) = 0 for all i ≠ j, and the result follows directly from
the variance of a sum theorem.
Covariance Matrix
When dealing with a large number of random variables X_i, it makes sense to consider a covariance matrix whose (m, n)
entry is Cov(X_m, X_n).
Since Cov(X, Y ) = Cov(Y , X), the covariance matrix is symmetric.
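
A minimal sketch of building such a matrix is shown below. It assumes three made-up random variables on four equally likely outcomes (the numbers are illustrative only) and uses E[X_m X_n] − E[X_m]E[X_n] for each entry; the names `outcomes`, `cov_entry`, and `sigma` are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical data: each row is one equally likely outcome of (X_1, X_2, X_3).
outcomes = [
    (1, 0, 2),
    (2, 1, 0),
    (0, 3, 1),
    (4, 1, 1),
]
n = len(outcomes)
columns = list(zip(*outcomes))                 # values of each X_i
means = [F(sum(col), n) for col in columns]    # E[X_i]

def cov_entry(m, k):
    """Cov(X_m, X_k) = E[X_m X_k] - E[X_m]E[X_k] (0-based indices)."""
    e_prod = F(sum(a * b for a, b in zip(columns[m], columns[k])), n)
    return e_prod - means[m] * means[k]

size = len(columns)
sigma = [[cov_entry(i, j) for j in range(size)] for i in range(size)]

for row in sigma:
    print([str(entry) for entry in row])

# Symmetry follows from Cov(X, Y) = Cov(Y, X).
is_symmetric = all(
    sigma[i][j] == sigma[j][i] for i in range(size) for j in range(size)
)
print("symmetric:", is_symmetric)
```

The diagonal entries are the variances Var(X_i), since Cov(X, X) = Var(X).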
References
[1] DeGroot, Morris H. Probability and Statistics. Second edition. Addison-Wesley, 1985.
Cite as: Covariance. [Link]. Retrieved 10:05, December 8, 2025, from [Link]