Introduction to Uncertainty Quantification
Module 2.2: Random Variables
1 Random Variables
In the previous lesson, we defined a probability space (Ω, F, P ). In this lesson, we operate in this probability
space and specifically define random variables that allow us to operate mathematically on random events
by assigning real numerical values (or ranges of real numbers) to outcomes in the sample space, Ω.
1.1 Definition
A random variable, X, is a function, or mapping, from the sample space of events (Ω) to the real num-
bers such that for every real number x there exists a probability P (X(ω) ≤ x), where ω indexes Ω.
By convention, random variables are almost always denoted with a capital letter while an associated real
number value is denoted with a lower case letter. Hence, we define X as a random variable and x as a
specified numerical value. We further note that the index on the sample space (i.e. the random event
ω ∈ Ω) is typically ignored in standard notation. That is, the random variable is typically expressed as
X rather than X(ω) and the associated probability is given by P (X ≤ x) where the dependence on ω is
implied.
1.2 Types of Random Variables
It is common to categorize random variables based on the nature of their sample space. Recall that we
defined three types of sample spaces: discrete and finite, discrete and infinite, and continuous. Here, we
will distinguish between two types of random variables – discrete random variables and continuous random
variables – where both finite and infinite discrete sample spaces are considered together. To summarize we
have:
• Discrete Random Variables: The sample space is discrete and therefore probabilities are associated
with discrete events. The probability measure is discontinuous.
• Continuous Random Variables: The sample space is continuous and probabilities are associated with
ranges of values with a continuous probability measure.
We will formally define the probability measure in each case soon. But first let’s look at some simple
examples.
1.3 Examples
Discrete Random Variables:
Two concrete cylinders are loaded in a testing machine. Each cylinder can either fail (F) or pass (P) the
test. Here, our sample space is discrete and comprised of the following possible outcomes:
O1 = P P
O2 = P F
O3 = F P
O4 = F F
Let us define the following random variable:
X = Number of failures observed from two cylinder tests.
The random variable X maps each possible outcome to a discrete integer value (i.e. 0, 1, or 2) correspond-
ing to the number of cylinders that fail. Looking at each outcome, we have:
X(O1 ) = 0
X(O2 ) = 1
X(O3 ) = 1
X(O4 ) = 2
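The mapping above can be sketched in a few lines of Python. This is only an illustration: the string encoding of outcomes ("PP", "PF", ...) and the function name `num_failures` are choices made here, not notation from the lesson.

```python
from itertools import product

# Enumerate the sample space: each of the two cylinders passes ("P") or fails ("F").
outcomes = ["".join(pair) for pair in product("PF", repeat=2)]

def num_failures(outcome):
    """Random variable X: maps an outcome to the number of failed cylinders."""
    return outcome.count("F")

# Print the value of X for every outcome in the sample space.
for outcome in outcomes:
    print(outcome, "->", num_failures(outcome))
```

The function plays the role of X(ω): it takes an element of the sample space and returns a real number.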
Continuous Random Variables:
Rather than simply determining pass or failure for each test, consider that the strength of each cylinder is
measured. Here, each element of the sample space, ω ∈ Ω, (i.e. tested cylinders) can be mapped through
the random variable Y (ω) to a real number (strength) in the continuous range (0, ∞). That is, we can
define the following random variable:
Y = Strength of a given cylinder.
The random variable Y maps each possible outcome in the sample space (each possible tested cylinder)
to a value in the range (0, ∞). That is:
Y (ω) ∈ (0, ∞)
We will discuss the measure of probability for both discrete and continuous random variables in the next
section.
2 Probability Distributions
In this section, we will introduce the measure of probability for both discrete and continuous random
variables through their respective distribution functions.
2.1 Discrete Random Variables
For discrete random variables, probabilities are assigned to each possible discrete value, xi , i = 1, . . . , n,
of the random variable through its Probability Mass Function (PMF). The probability mass function is
defined, for each value xi, as:
fX (xi ) = P (X = xi ) (1)
where P is the associated probability measure. According to the Axioms of Probability, the probability
mass function must possess the following properties:
\[
f_X(x_i) \ge 0 \quad \forall\, i = 1, \dots, n, \qquad \sum_{i=1}^{n} f_X(x_i) = 1 \tag{2}
\]
The Cumulative Distribution Function (CDF) is defined as:
FX (x) = P (X ≤ x) (3)
and can be determined from the probability mass function by:
\[
F_X(x) = \sum_{x_i \le x} P(X = x_i) \tag{4}
\]
2.2 Continuous Random Variables
For continuous random variables, the Cumulative Distribution Function is defined by Eq. (3), with
the difference being that x now belongs to a continuous set of real numbers. Therefore, the CDF cannot
be defined through a discrete summation over the probabilities of individual events as in Eq. (4). Instead,
the CDF for a continuous random variable is a continuous function having the following properties:
• FX (b) ≥ FX (a) if b ≥ a.
• FX (−∞) = 0, FX (∞) = 1
• P (a < X ≤ b) = FX (b) − FX (a)
The first property states that the CDF is non-decreasing, while the second property, combined with the
first, implies that it is bounded on the interval [0, 1]. The CDF therefore defines a valid probability measure.
For continuous random variables, the Probability Density Function (PDF) is defined by
\[
f_X(x) = \frac{dF_X(x)}{dx} \tag{5}
\]
This implies that the CDF can be obtained from the PDF by:
\[
F_X(x) = \int_{-\infty}^{x} f_X(\xi)\, d\xi \tag{6}
\]
The PDF need not be continuous everywhere, but it must have the following properties:
• fX (x) ≥ 0, ∀x
• \( F_X(\infty) = \int_{-\infty}^{\infty} f_X(\xi)\, d\xi = 1 \)
• \( P(a < X \le b) = F_X(b) - F_X(a) = \int_{-\infty}^{b} f_X(\xi)\, d\xi - \int_{-\infty}^{a} f_X(\xi)\, d\xi = \int_{a}^{b} f_X(\xi)\, d\xi \)
The PDF and CDF for a continuous random variable are illustrated graphically in Figure 1.
Figure 1: Cumulative Distribution Function (top) and Probability Density Function (bottom)
We conclude this introduction to random variables with a few important notes:
1. FX (x) and fX (x) are non-negative functions. This has previously been stated, but is reiterated here.
2. Although 0 ≤ FX (x) ≤ 1, the PDF can take any non-negative value. That is, the PDF is not bounded
on [0, 1] and is therefore not a probability measure.
3. For continuous random variables, P (X = x) = 0. This may seem counterintuitive, but can be proven
quite easily by considering P (a < X ≤ b) = FX (b) − FX (a) where b = a + ∆a and letting ∆a → 0.
More importantly, we make the distinction that the probability density function does not have the
same interpretation as the probability mass function, i.e. that of representing the probability of a
specific outcome.
4. P (X > x) = 1 − P (X ≤ x) = 1 − FX (x). Again, this can be shown quite simply using the
mathematical relations above. But, intuitively we can make sense of this based on what we’ve
seen before because {X > x} and {X ≤ x} are complementary events and we’ve already seen that
P (E) = 1 − P (E ∗ ).
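Notes 2 to 4 can be illustrated with a small sketch. It assumes a uniform distribution on the interval [0, 0.5], a hypothetical choice made here only because its density is easy to write down.

```python
width = 0.5  # hypothetical support [0, width] of the illustrative uniform distribution

def pdf(x):
    """Uniform PDF on [0, width]: constant height 1 / width inside the support."""
    return 1.0 / width if 0.0 <= x <= width else 0.0

def cdf(x):
    """Uniform CDF on [0, width]."""
    if x < 0.0:
        return 0.0
    if x > width:
        return 1.0
    return x / width

# Note 2: the density exceeds 1 even though the CDF never does.
print("f_X(0.25) =", pdf(0.25), " F_X(0.25) =", cdf(0.25))

# Note 3: P(a < X <= a + delta) shrinks toward zero as delta -> 0.
a = 0.25
for delta in (0.1, 0.01, 0.001):
    print("delta =", delta, " P =", cdf(a + delta) - cdf(a))

# Note 4: P(X > x) = 1 - F_X(x).
print("P(X > 0.25) =", 1.0 - cdf(0.25))
```

The density takes the value 2 on the support, which is why it cannot be read as a probability. Only its integral over a range of values is one.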
2.3 Examples
In this section, we will return to the examples from Section 1.3 and discuss the associated discrete and
continuous probability distributions.
Discrete Random Variables:
Consider again that we test two concrete cylinders. Further consider that the probability of failure of
any single cylinder is 0.2 (P (F ) = 0.2). What is the probability that at most one cylinder will fail, i.e.
P (X ≤ 1)? What is the probability that at most x cylinders will fail, i.e. P (X ≤ x)?
Let’s begin by assessing the probability of each outcome, assuming the two tests are independent:
P (O1 ) = 0.8 × 0.8 = 0.64
P (O2 ) = 0.8 × 0.2 = 0.16
P (O3 ) = 0.2 × 0.8 = 0.16
P (O4 ) = 0.2 × 0.2 = 0.04
Next, we can see that
\[
\begin{aligned}
P(X \le 1) &= P(X = 0) + P(X = 1) \\
&= P(O_1) + P(O_2 \cup O_3) \\
&= P(O_1) + P(O_2) + P(O_3) \\
&= 0.64 + 0.16 + 0.16 = 0.96
\end{aligned} \tag{7}
\]
We can further generalize this to any value of x and define the cumulative distribution function as:
\[
F_X(x) = P(X \le x) =
\begin{cases}
0 & x < 0 \\
0.64 & 0 \le x < 1 \\
0.96 & 1 \le x < 2 \\
1.0 & x \ge 2
\end{cases} \tag{8}
\]
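The same calculation can be reproduced programmatically. The sketch below builds the PMF by enumerating the four outcomes and then evaluates the CDF as the sum in Eq. (4). It assumes the two tests are independent, which is what multiplying the single-cylinder probabilities implies. The names `pmf` and `cdf` are illustrative.

```python
from itertools import product

p_fail = 0.2  # P(F) for a single cylinder, as given in the example

# Build the PMF of X by summing the probabilities of outcomes with the same
# number of failures (assumes the two tests are independent).
pmf = {}
for outcome in product("PF", repeat=2):
    prob = 1.0
    for result in outcome:
        prob *= p_fail if result == "F" else 1.0 - p_fail
    x = outcome.count("F")
    pmf[x] = pmf.get(x, 0.0) + prob

def cdf(x):
    """F_X(x) = sum of P(X = x_i) over all x_i <= x, as in Eq. (4)."""
    return sum(p for xi, p in pmf.items() if xi <= x)

# Evaluate the CDF on each branch of the piecewise definition.
for x in (-1, 0, 0.5, 1, 1.5, 2, 3):
    print(x, round(cdf(x), 4))
```

Evaluating `cdf` at non-integer points such as 0.5 or 1.5 shows the step behaviour of the discrete CDF: its value only changes at x = 0, 1, and 2.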
Continuous Random Variables:
In the continuous case, we can determine the probability that the strength does not exceed some value y by
evaluating the cumulative distribution function FY (y) = P (Y ≤ y) at this value. Here, we need to know
the form of the CDF, which is a continuous non-decreasing function bounded on [0, 1], or equivalently the
form of the PDF. We will explore common forms for the CDF later in this lesson.
3 Moments of Random Variables
Random variables have probabilities defined through their cumulative distribution function. The distri-
bution of probability over the range of values of x, as illustrated in Figure 1, can be further described by
the moments of the distribution, which we will define here. We note that the moments of a distribution
are simple descriptors of the distribution and do not define the distribution in general. To this end, we
explain the meaning, or interpretation, of the first several moments. Moreover, we will focus moving
forward on continuous random variables and note that, very often, we can simply replace integration with
summation to obtain equivalent expressions for discrete random variables.
3.1 Expectation
To define the moments of a random variable, we first need to introduce the Expectation Operator. The
Expectation or Expected Value of a random variable X having PDF fX (x) is given by the following
integral:
\[
E[X] = \int_{-\infty}^{\infty} x f_X(x)\, dx \tag{9}
\]
The expectation is a linear operator with the following properties, where a, b, and c are constants:
\[
\begin{aligned}
E[c] &= c \\
E[cX] &= c\,E[X] \\
E[a + bX] &= a + b\,E[X] \\
E[g_1(X) + g_2(X)] &= E[g_1(X)] + E[g_2(X)]
\end{aligned} \tag{10}
\]
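These properties can be checked numerically on the two-cylinder example, replacing the integral with a sum over the PMF as noted at the start of this section. The helper `expect` and the constants `a` and `b` below are illustrative choices, not part of the lesson.

```python
# PMF of X = number of failures in two cylinder tests with P(F) = 0.2
pmf = {0: 0.64, 1: 0.32, 2: 0.04}

def expect(g, pmf):
    """Discrete expectation E[g(X)] = sum of g(x_i) * P(X = x_i)."""
    return sum(g(x) * p for x, p in pmf.items())

a, b = 3.0, 2.0  # arbitrary constants chosen for the check

mean = expect(lambda x: x, pmf)             # E[X]
lhs = expect(lambda x: a + b * x, pmf)      # E[a + bX] computed directly
rhs = a + b * mean                          # a + b E[X]

# Additivity: E[g1(X) + g2(X)] = E[g1(X)] + E[g2(X)] with g1(x) = x, g2(x) = x^2
sum_inside = expect(lambda x: x + x**2, pmf)
sum_outside = expect(lambda x: x, pmf) + expect(lambda x: x**2, pmf)

print(mean, lhs, rhs, sum_inside, sum_outside)
```

Up to floating-point rounding, `lhs` matches `rhs` and `sum_inside` matches `sum_outside`.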
3.2 Moments about the Origin
The nth moment about the origin of the random variable X having PDF fX (x) is given by the expectation
of X^n. That is
\[
E[X^n] = \int_{-\infty}^{\infty} x^n f_X(x)\, dx \tag{11}
\]
Moments can be of arbitrary order, n, and describe different properties of the distribution. Here, we will
specifically explore moments about the origin of first and second order.
Mean Value
The mean value (or expected value) is the first moment, given by
\[
\mu_X = E[X] = \int_{-\infty}^{\infty} x f_X(x)\, dx \tag{12}
\]
The mean value is mathematically identical to the centroid of the probability density function fX (x).
Mean Square
The mean square is the second moment about the origin, given by
\[
E[X^2] = \int_{-\infty}^{\infty} x^2 f_X(x)\, dx \tag{13}
\]
The mean square is mathematically equivalent to the moment of inertia of the probability density function
about the origin.
3.3 Moments about the Mean (Central Moments)
The nth moment about the mean (central moment) of the random variable X having PDF fX (x) is given
by the expectation of (X − E[X])^n. That is
\[
E[(X - E[X])^n] = \int_{-\infty}^{\infty} (x - E[X])^n f_X(x)\, dx \tag{14}
\]
Central moments can be of arbitrary order, n, and describe different properties of the distribution. Here, we
will specifically explore the first four central moments and their interpretations.
First Moment
The first central moment of a random variable is always zero, since by linearity of the expectation
E[X − E[X]] = E[X] − E[X] = 0.
Variance
The variance is the second moment about the mean, given by
\[
\mathrm{Var}(X) = E[(X - E[X])^2] = \int_{-\infty}^{\infty} (x - E[X])^2 f_X(x)\, dx \tag{15}
\]
Variance is a measure of scatter around the mean value. It is mathematically equivalent to the moment of
inertia of the distribution about the mean value.
Using the properties of the expectation, we can show that
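\[
\begin{aligned}
\mathrm{Var}(X) = \sigma_X^2 &= E[(X - E[X])^2] = \int_{-\infty}^{\infty} (x - E[X])^2 f_X(x)\, dx \\
&= \int_{-\infty}^{\infty} x^2 f_X(x)\, dx - 2E[X] \int_{-\infty}^{\infty} x f_X(x)\, dx + E[X]^2 \int_{-\infty}^{\infty} f_X(x)\, dx \\
&= E[X^2] - 2E[X]^2 + E[X]^2 \\
&= E[X^2] - E[X]^2
\end{aligned} \tag{16}
\]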
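As a quick check of this identity, the sketch below evaluates the variance of the two-cylinder random variable both ways: from the definition and from E[X^2] − E[X]^2. It uses the discrete form of the expectation, with sums in place of integrals, and the helper name `expect` is an illustrative choice.

```python
# PMF of X = number of failures in two cylinder tests with P(F) = 0.2
pmf = {0: 0.64, 1: 0.32, 2: 0.04}

def expect(g, pmf):
    """Discrete expectation E[g(X)] = sum of g(x_i) * P(X = x_i)."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expect(lambda x: x, pmf)              # first moment about the origin
mean_square = expect(lambda x: x**2, pmf)    # second moment about the origin

var_definition = expect(lambda x: (x - mean) ** 2, pmf)  # E[(X - E[X])^2]
var_shortcut = mean_square - mean**2                     # E[X^2] - E[X]^2

print(mean, mean_square, var_definition, var_shortcut)
```

Both routes give a variance of about 0.32 for this example. The shortcut form needs only the first two moments about the origin.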
Standard Deviation
The standard deviation (σX ) is defined by:
\[
\sigma_X = \sqrt{\mathrm{Var}(X)} \tag{17}
\]
Like the variance, the standard deviation gives a measure of scatter around the mean, but unlike the
variance it has the same dimensions (units) as the random variable X.
Coefficient of Variation
The coefficient of variation (COV) is given by
\[
\nu_X = \frac{\sigma_X}{\mu_X}, \quad \mu_X \neq 0 \tag{18}
\]
The COV provides a dimensionless measure of scatter of the distribution.
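To make the role of units concrete, the sketch below computes the mean, variance, standard deviation, and COV for a continuous random variable by numerical integration. It assumes, purely for illustration, a cylinder strength that is uniformly distributed between two hypothetical bounds given in MPa. Neither the distribution nor the numbers come from the lesson.

```python
import math

lo, hi = 20.0, 40.0  # hypothetical strength bounds in MPa (illustrative only)

def pdf(x):
    """Uniform PDF on [lo, hi]."""
    return 1.0 / (hi - lo) if lo <= x <= hi else 0.0

def expect(g, n=20_000):
    """Midpoint-rule approximation of the integral of g(x) * pdf(x) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += g(x) * pdf(x)
    return total * h

mean = expect(lambda x: x)                      # units: MPa
variance = expect(lambda x: (x - mean) ** 2)    # units: MPa^2
std = math.sqrt(variance)                       # units: MPa
cov = std / mean                                # dimensionless

print(mean, variance, std, cov)
```

For these bounds the closed-form values are a mean of 30 MPa and a variance of (hi − lo)^2 / 12 ≈ 33.3 MPa^2, which the numerical results should match closely. The COV would be unchanged if the strengths were expressed in different units.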
Skewness
The skewness is the normalized third central moment, given by
\[
\gamma_1 = \frac{1}{\sigma_X^3} \int_{-\infty}^{\infty} (x - E[X])^3 f_X(x)\, dx \tag{19}
\]
Skewness gives a measure of the asymmetry of the distribution. The skewness may take values that are:
• Positive – The distribution has a heavier right tail and a peak that lies to the left of the mean value.
The distribution’s density is more concentrated on the left and more dispersed on the right. It may
be referred to as right-skewed, right-tailed, or skewed to the right. This can be confusing because
the distribution appears to lean to the left.
• Negative – The distribution has a heavier left tail and a peak that lies to the right of the mean value.
The distribution’s density is more concentrated on the right and more dispersed on the left. It may
be referred to as left-skewed, left-tailed, or skewed to the left. This can be confusing because the
distribution appears to lean to the right.
• Zero – The two tails balance one another. In particular, every symmetric distribution has zero skewness.
This is illustrated in Figure 2.
Figure 2: Illustration of skewness. Note that we have not yet defined the median and mode.
Figure 3: A helpful mnemonic to remember positive and negative skew is the saying “Negative skew skis
left”.
Kurtosis
The kurtosis is the normalized fourth central moment, given by
\[
\mathrm{Kurt}[X] = \frac{1}{\sigma_X^4} \int_{-\infty}^{\infty} (x - E[X])^4 f_X(x)\, dx \tag{20}
\]
Kurtosis is a measure of the heaviness of the tails of the distribution.
Kurtosis is often computed relative to the normal distribution, which always has a kurtosis of 3. We define
excess kurtosis simply by γ2 = Kurt[X] − 3, so that the normal distribution has γ2 = 0. Excess kurtosis is
often used as a measure of how non-Gaussian the distribution is:
• γ2 < 0 (Kurt[X] < 3): platykurtic. Tails approach zero faster than the normal. The distribution produces
fewer extreme values than the normal.
• γ2 > 0 (Kurt[X] > 3): leptokurtic. Tails approach zero slower than the normal. The distribution produces
more extreme values than the normal.
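The normalized third and fourth central moments can be computed in the same way as the variance. The sketch below does so for the two-cylinder random variable, using sums over the PMF in place of the integrals in Eqs. (19) and (20). The helper name `central_moment` is an illustrative choice.

```python
import math

# PMF of X = number of failures in two cylinder tests with P(F) = 0.2
pmf = {0: 0.64, 1: 0.32, 2: 0.04}

mean = sum(x * p for x, p in pmf.items())

def central_moment(n):
    """nth central moment E[(X - E[X])^n] for the discrete PMF above."""
    return sum((x - mean) ** n * p for x, p in pmf.items())

std = math.sqrt(central_moment(2))

skewness = central_moment(3) / std**3       # gamma_1
kurtosis = central_moment(4) / std**4       # Kurt[X]
excess_kurtosis = kurtosis - 3.0            # gamma_2

print(skewness, kurtosis, excess_kurtosis)
```

For this example the skewness is positive (about 1.06), which agrees with the PMF: most of the probability sits at X = 0 and a thin tail extends to the right toward X = 2. The excess kurtosis is slightly positive (about 0.125).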
We will discuss the normal, or Gaussian, distribution next.
Nomenclature
Functions
P (·) Probability measure
fX (·) Probability density function (PDF) of a continuous random variable X, or probability mass function (PMF) of a discrete random variable X
FX (·) Cumulative distribution function (CDF) of a random variable X
E[·] Expected value of a random variable. Also denoted µX ≜ E[X]
E[X n ] The nth moment about the origin of a random variable X
Var(·) Variance of the random variable. Also denoted σX^2 ≜ Var(X)
Kurt[·] Kurtosis of a random variable
Variables
O Outcome from an experiment
E Event
Ω Sample space
ω A random event from the sample space Ω
F Event space
(Ω, F, P ) Probability space, also known as a probability triple
X A random variable
σX Standard deviation of the random variable X
νX Coefficient of Variation (COV) of the random variable X
γ1 Skewness of the random variable X
γ2 Excess kurtosis of the random variable X