STATS 2B03 - Statistical Methods for Science
Instructor:
Eman Alamer, PhD.
Department of Mathematics and Statistics
McMaster University
Introduction
Today, we will look at:
Chapter 5, Sections 5.1, 5.2, 5.3
Chapter 6, Section 6.1
Concepts: probability distributions, binomial distribution, Poisson
distribution, standard normal distribution.
STATS 2B03 (Lecture 3) 1 / 30 1
5.1: Probability Distributions
A random variable (r.v.) is a variable that has a single numerical
value, determined by chance, for each outcome of an experiment.
A r.v. is discrete if we can list all of its possible values.
A probability distribution is a description that gives the probability
for each value of the random variable. It is often expressed in the
format of a table, formula, or graph.
Example
If we flip a coin 3 times, identify
1 A discrete random variable
2 The probability distribution
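One way to work this example is to list the sample space by brute force. The sketch below assumes a fair coin and takes X to be the number of heads; both choices are assumptions made for illustration, since the slide leaves them open.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 2^3 equally likely sequences of H/T (assumes a fair coin).
outcomes = list(product("HT", repeat=3))

# Discrete random variable X = number of heads in the three flips (assumed choice).
counts = Counter(seq.count("H") for seq in outcomes)

# Probability distribution of X, kept as exact fractions.
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
```

The table printed here is the probability distribution in "table" form: each possible value of X paired with its probability.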
Mean, Variance of Discrete Random Variable
If a discrete random variable X can take values x_1, x_2, ..., x_n with
probabilities p_1, p_2, ..., p_n, then:
The probabilities form a discrete probability distribution:
$$\sum_{\text{all } x} p(x) = 1, \qquad 0 \le p(x) \le 1$$
The mean µ, or expected value E(X):
$$\mu = E(X) = \sum_{\text{all } x} x \, p(x)$$
The variance σ², or Var(X):
$$\sigma^2 = \mathrm{Var}(X) = \sum_{\text{all } x} (x - \mu)^2 \, p(x) = \sum_{\text{all } x} x^2 \, p(x) - \mu^2$$
The standard deviation σ:
$$\sigma = \sqrt{\sigma^2}$$
Example
If 3 couples each have one child, let X be the number of girls.
1 Find the discrete probability distribution.
2 Find the mean of X.
3 Find the variance of X.
4 Find the standard deviation of X.
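A possible way to check the four parts numerically is sketched below. It assumes each child is a girl with probability 1/2, independently of the others (the slide does not state this), which gives the probabilities 1/8, 3/8, 3/8, 1/8 for X = 0, 1, 2, 3.

```python
from fractions import Fraction
from math import sqrt

# Assumed distribution of X = number of girls among 3 births, with P(girl) = 1/2.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# Mean: sum of x * p(x).
mean = sum(x * p for x, p in pmf.items())

# Variance, computed with both forms of the formula.
var_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())
var_shortcut = sum(x**2 * p for x, p in pmf.items()) - mean**2

# Standard deviation: square root of the variance.
sd = sqrt(var_definition)

print(mean, var_definition, var_shortcut, round(sd, 4))
```

Both variance formulas give the same value, which is a quick sanity check on the arithmetic.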
Identifying Significant Values
Range Rule of Thumb (for identifying significant values):
Values larger than µ + 2σ are called significantly high.
Values smaller than µ − 2σ are called significantly low.
Significant Results with Probabilities:
X successes among n trials is significantly high if the probability of X
or more successes is 0.05 or less, i.e., P(X or more successes) ≤ 0.05.
X successes among n trials is significantly low if the probability of X
or fewer successes is 0.05 or less, i.e., P(X or fewer successes) ≤ 0.05.
Note: If the probability of a single outcome is small, it does not make
that single outcome significantly low or high.
Example
Chapter Problem: Before its clinical trials were discontinued, the
Genetics & IVF Institute conducted a clinical trial of the XSORT
method designed to increase the probability of conceiving a female
and, among the 945 babies born to parents using the XSORT
method, there were 879 females....
Suppose 945 couples each have one child and P(girl) = 1/2, so that µ = 472.5,
σ² = 236.25, and σ = 15.37.
1 Is 879 girls among those 945 births significantly high? (Use the range rule
of thumb.)
2 Is 879 girls significantly high, given that
P(879 or more girls) = 0.00000? (Use significant results with
probabilities.)
3 Is 474 girls significantly high, given that P(474) = 0.0258? (Use significant
results with probabilities.)
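The two significance rules can be compared on this example with a short sketch. It assumes the number of girls follows a binomial count with n = 945 and p = 1/2, so that P(X = j) is C(945, j) / 2^945; the helper name `prob_at_least` is made up for illustration.

```python
from math import comb, sqrt

n, p = 945, 0.5  # from the example: 945 births, P(girl) = 1/2

# Range rule of thumb: values above mu + 2*sigma are significantly high.
mu = n * p
sigma = sqrt(n * p * (1 - p))
high_cutoff = mu + 2 * sigma

def prob_at_least(k: int, n: int) -> float:
    """P(X >= k) when each of the 2^n boy/girl sequences is equally likely."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2**n

# Part 1: compare 879 with the cutoff.
print(round(high_cutoff, 2), 879 > high_cutoff)

# Part 2: the tail probability for 879 or more girls is far below 0.05.
print(prob_at_least(879, n) <= 0.05)

# Part 3: P(exactly 474) is small, but the rule uses P(474 or more).
print(round(prob_at_least(474, n), 3), prob_at_least(474, n) <= 0.05)
```

Part 3 illustrates the note on the previous slide: a small probability for a single outcome does not by itself make that outcome significantly high.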
Rare Event Rule for Inferential Statistics
Rare Event Rule for Inferential Statistics: If, under a given
assumption, an observed outcome is significantly high or significantly
low, we conclude that the assumption is probably not correct.
Example: If 945 couples each have one child and P(girl) = 1/2, then
879 girls is significantly high. What can we conclude?
Example
In a group of 12 people, 8 have brown eyes, 4 have green eyes. Three
people are chosen at random and 2 of them have green eyes. Is this
significantly high?
1 Use significant results with probabilities
2 Use range rule of thumb
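A sketch of both approaches follows. It assumes the three people are drawn without replacement, so the count of green-eyed people is found by counting selections with combinations; the function name `p_green` is hypothetical.

```python
from fractions import Fraction
from math import comb, sqrt

N, K, n = 12, 4, 3  # group size, green-eyed people, people chosen

def p_green(x: int) -> Fraction:
    """P(exactly x of the n chosen have green eyes), sampling without replacement."""
    return Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n))

pmf = {x: p_green(x) for x in range(n + 1)}

# Method 1: significant results with probabilities, P(2 or more green).
p_two_or_more = pmf[2] + pmf[3]
print(float(p_two_or_more), p_two_or_more <= Fraction(5, 100))

# Method 2: range rule of thumb, using the mean and variance of the pmf.
mean = sum(x * p for x, p in pmf.items())
variance = sum(x**2 * p for x, p in pmf.items()) - mean**2
high_cutoff = float(mean) + 2 * sqrt(variance)
print(round(high_cutoff, 3), 2 > high_cutoff)
```

Under these assumptions both methods agree that 2 green-eyed people out of 3 is not significantly high.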
Section 5.2: The Binomial Distribution
A Bernoulli trial is an experiment with two outcomes, "success" and
"failure".
A Bernoulli experiment is also called a "binary" or "dichotomous" experiment.
Let p denote the probability of a "success" and q = 1 − p the
probability of a "failure".
A Bernoulli process (or binomial experiment) consists of
1 n independent Bernoulli trials.
2 The probability of success p is the same for each trial.
Combinations (Number of Arrangements)
The notation \binom{n}{x}, C_x^n, or nCx represents the number of ways to
select x items from n when the order of selection doesn't
matter, for example the number of ways to place x successes among n trials.
The formula for a combination is
$$\binom{n}{x} = C_x^n = {}_nC_x = \frac{n!}{x!\,(n-x)!}$$
where n! = n × (n − 1) × ... × 2 × 1.
Example
In how many ways can 3 S’s and 2 F’s be arranged?
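As a check on the combination formula, the arrangements can be listed directly and counted. This is only an illustrative sketch: it generates every ordering of the letters SSSFF, removes duplicates, and compares the count with the formula.

```python
from itertools import permutations
from math import comb, factorial

# All distinct orderings of three S's and two F's.
arrangements = sorted(set("".join(p) for p in permutations("SSSFF")))

# The same count from the combination formula: choose which 3 of 5 positions hold S.
by_formula = factorial(5) // (factorial(3) * factorial(2))

print(len(arrangements), by_formula, comb(5, 3))
print(arrangements)
```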
Binomial Distribution
If X is the number of successes in a Bernoulli process, then X is
called a binomial random variable with parameters n and p.
If X is a binomial random variable with parameters n and p, then the
probability of obtaining x successes in these n trials is given by:
$$P(x) = P(x \text{ successes}) = \binom{n}{x} p^x (1-p)^{n-x}, \qquad x = 0, 1, \ldots, n$$
We denote this by X ∼ Binomial(n, p) or X ∼ Bin(n, p).
We use the binomial distribution when we are interested in the
probability of having x successes out of n trials.
If X is a binomial random variable, then
The mean µ = E(X) = np
The variance σ 2 = V ar(X) = np(1 − p)
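The binomial formula and its mean and variance can be checked numerically. The sketch below is a minimal version; the function name `binomial_pmf` and the values n = 20, p = 0.2 are chosen here for illustration only.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(x successes) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 20, 0.2  # illustrative values, not taken from the slide above

probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

# The probabilities should sum to 1.
total = sum(probs)

# Mean and variance computed from the general discrete formulas.
mean = sum(x * px for x, px in enumerate(probs))
variance = sum(x**2 * px for x, px in enumerate(probs)) - mean**2

print(round(total, 6), round(mean, 6), round(variance, 6))
print(round(binomial_pmf(6, n, p), 4))  # probability of exactly 6 successes
```

Up to floating-point rounding, the computed mean and variance match np and np(1 − p).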
Probability Keywords and Translation
Exactly 5: P(X = 5).
At most 5: P(X ≤ 5).
At least 5: P(X ≥ 5).
Fewer than 5, below 5: P(X < 5).
More than 5, exceed 5: P(X > 5).
Between 3 and 5, inclusive: P(3 ≤ X ≤ 5).
Example
A multiple choice test has 20 questions, with 5 choices for each
question. Suppose that a student guesses on every question.
1 Find the probability of getting exactly 6 correct answers.
2 Find the probability of passing the test.
3 Find the probability of failing the test.
4 Find the mean.
5 Find the variance.
6 Find the standard deviation.
Example
A surgeon performs 10 surgeries. Assume surgeries are independent
and each has a 0.1 probability of later developing a complication. Let
X be the number of surgeries that lead to a complication.
1 Find the probability that exactly 2 surgeries develop complications.
2 Find the probability that at most 2 surgeries develop complications.
3 Find the probability that more than 2 surgeries develop complications.
4 Find the mean.
5 Find the variance.
6 Find the standard deviation.
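The six parts can be worked through as follows, treating X as Bin(10, 0.1) as the example states. The sketch shows how "exactly", "at most", and "more than" translate into a single term, a sum, and a complement; the helper `binomial_pmf` is an illustrative name.

```python
from math import comb, sqrt

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(x successes) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.1  # 10 independent surgeries, each with complication probability 0.1

p_exactly_2 = binomial_pmf(2, n, p)                            # P(X = 2)
p_at_most_2 = sum(binomial_pmf(x, n, p) for x in range(0, 3))  # P(X <= 2)
p_more_than_2 = 1 - p_at_most_2                                # P(X > 2), by complement

mean = n * p
variance = n * p * (1 - p)
sd = sqrt(variance)

print(round(p_exactly_2, 4), round(p_at_most_2, 4), round(p_more_than_2, 4))
print(mean, variance, round(sd, 4))
```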
Example
A fair die is rolled 26 times and lands on 6 once. Is it significantly
low?
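One way to answer this is to model the number of sixes as Bin(26, 1/6) and apply both significance rules. This is a sketch under that modelling assumption, with variable names chosen for illustration.

```python
from math import comb, sqrt

n, p = 26, 1 / 6  # 26 rolls of a fair die, success = rolling a 6
observed = 1

# Probability rule: P(1 or fewer sixes) compared with 0.05.
p_at_most_observed = sum(
    comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(observed + 1)
)

# Range rule of thumb: values below mu - 2*sigma are significantly low.
mu = n * p
sigma = sqrt(n * p * (1 - p))
low_cutoff = mu - 2 * sigma

print(round(p_at_most_observed, 4), p_at_most_observed <= 0.05)
print(round(low_cutoff, 3), observed < low_cutoff)
```

Under this model the tail probability is slightly above 0.05 and the observed count is above µ − 2σ, so neither rule flags one six as significantly low.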
The Poisson Distribution
Let X be the number of occurrences of an event per unit of "time",
where there are a large number of possible occurrences of the event,
but the probability of each occurrence is small.
Let λ be the average number of occurrences of the event per unit of
"time".
Then X is called a Poisson random variable with parameter λ.
The random variable X counts the number of successes in a certain amount
of time or space.
It is used to model the number of events that occur in an interval of time.
Poisson Distribution
If X is a Poisson random variable with parameter λ, then the
probability of x occurrences is given by
$$P(x) = P(x \text{ occurrences}) = \frac{\lambda^x e^{-\lambda}}{x!}, \qquad x = 0, 1, 2, \ldots$$
We denote this by X ∼ Poisson(λ) or X ∼ Poi(λ).
If X is a Poisson random variable, then
The mean µ = E(X) = λ
The variance σ 2 = V ar(X) = λ
Example
A hospital emergency room receives an average of 5 new patients per
hour. Find the probability that
1 Exactly 8 people will be admitted in the next hour.
2 At least 3 people will be admitted in the next hour.
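Both parts can be computed from the Poisson formula with λ = 5. The sketch below uses a small helper, `poisson_pmf`, whose name is an illustrative choice; "at least 3" is handled through the complement 1 − P(X ≤ 2), since the upper tail has infinitely many terms.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(x occurrences) = lam^x * e^(-lam) / x!."""
    return lam**x * exp(-lam) / factorial(x)

lam = 5  # average of 5 new patients per hour

# Part 1: exactly 8 patients in the next hour.
p_exactly_8 = poisson_pmf(8, lam)

# Part 2: at least 3 patients, via the complement of P(X = 0, 1, or 2).
p_at_least_3 = 1 - sum(poisson_pmf(x, lam) for x in range(3))

print(round(p_exactly_8, 4), round(p_at_least_3, 4))
```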
Example
Let’s assume a doctor writes on average 15 prescriptions per day. Let
X be the number of prescriptions the doctor writes on a given day.
Find the:
1 Mean (average) number of prescriptions written in one day.
2 Variance of the number of prescriptions written in one day.
3 Standard deviation of the number of prescriptions written in one day.
Section 6.1 The Standard Normal Distribution
A random variable is continuous if it can assume all values in a given
interval.
Continuous random variables arise when the random experiment involves
measuring, so that the outcome lies in a continuous set of values.
A continuous random variable takes values in a range (finite or
infinite), e.g. [0, 1], (0, ∞), (−∞, ∞), [a, b].
Continuous Random Variable
A random variable X is continuous if there is a function f(x), called
the probability density function (pdf) of X, that satisfies these rules:
1 f(x) ≥ 0.
2 $\int_{-\infty}^{\infty} f(x)\,dx = 1$ (the total area under f(x) is 1).
3 P(a ≤ X ≤ b) is the area under the curve between a and b.
For any continuous random variable X,
P(a ≤ X ≤ b) = P(a < X < b) = P(a < X ≤ b) = P(a ≤ X < b)
Normal Distribution (Gaussian Distribution)
Symmetric, bell-shaped distribution.
A Normal random variable X can take any value between −∞ and ∞.
A continuous random variable X is said to have a Normal distribution if it has
the following pdf:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x-\mu)^2 / (2\sigma^2)}$$
where −∞ < x < ∞, µ ∈ R, σ > 0.
We denote this by X ∼ N(µ, σ²).
[Figure: pdf of Normal]
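The pdf rules for a continuous random variable can be checked numerically for the Normal pdf. The sketch below approximates areas with the trapezoid rule; the helper names and the values µ = 2, σ = 1.5 are assumptions made for illustration, and integrating over µ ± 10σ stands in for the full real line.

```python
from math import exp, pi, sqrt

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def area(f, a: float, b: float, steps: int = 10_000) -> float:
    """Trapezoid-rule approximation of the area under f between a and b."""
    h = (b - a) / steps
    inner = sum(f(a + i * h) for i in range(1, steps))
    return h * (0.5 * f(a) + inner + 0.5 * f(b))

mu, sigma = 2.0, 1.5  # illustrative parameters

def f(x: float) -> float:
    return normal_pdf(x, mu, sigma)

# Total area under the pdf (approximated over mu +/- 10 sigma) should be about 1.
total_area = area(f, mu - 10 * sigma, mu + 10 * sigma)

# P(mu - sigma <= X <= mu + sigma) is the area between those two points.
within_one_sd = area(f, mu - sigma, mu + sigma)

print(round(total_area, 6), round(within_one_sd, 4))
```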
Normal Distribution (Gaussian Distribution)
The mean µ and variance σ² affect the location and
shape of the pdf, respectively.
[Figure: "Normal Distribution", pdf f(x) plotted for x from −6 to 6, with curves N(0, 1), N(2, 2), N(0, 0.5), and N(−2, 2)]
Standard Normal Distribution
When µ = 0 and σ² = 1, it is called the Standard Normal distribution.
The standard normal random variable is denoted by Z.
The standard normal pdf is denoted by ϕ(z):
$$\phi(z) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2} z^2}$$
where −∞ < z < ∞.
We denote this by Z ∼ N(0, 1).
Probabilities: If Z is a standard normal random variable, then
P(a ≤ Z ≤ b) is the area under the standard normal curve between a and b.
Areas Under the Standard Normal Curve
Table A-2 (pages 626 and 627) gives the area under the standard
normal curve to the left of z
Example: Find the area under the standard normal curve
1 To the left of z = 1.83
2 To the right of z = −2.17
3 Between z = 0.87 and z = 3.15
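As an alternative to reading the table, the same areas can be computed with Python's standard library. This sketch uses `statistics.NormalDist`, whose `cdf` method gives the area to the left of a value; results may differ from table answers in the last digit because the table entries are rounded.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# 1. Area to the left of z = 1.83.
left_of_183 = Z.cdf(1.83)

# 2. Area to the right of z = -2.17 is 1 minus the area to its left.
right_of_neg217 = 1 - Z.cdf(-2.17)

# 3. Area between two z values is the difference of the two left areas.
between = Z.cdf(3.15) - Z.cdf(0.87)

print(round(left_of_183, 4), round(right_of_neg217, 4), round(between, 4))
```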
Example
Find:
1 P(−1.25 < Z ≤ 2.46)
2 z1 such that P(z1 < Z < 1.87) = 0.8340
Example
Let Z ∼ N(0, 1). Find z0 such that:
1 P (Z ≤ z0 ) = 0.7123
2 P (Z < z0 ) = 0.3405
3 P (Z ≥ z0 ) = 0.1515
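These are "reverse lookup" problems: the area is given and the z value is wanted. A sketch using `statistics.NormalDist.inv_cdf`, which returns the value with a given area to its left, is shown below; for part 3 the right-hand area is first converted to a left-hand area.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# 1. Area 0.7123 to the left of z0.
z0_part1 = Z.inv_cdf(0.7123)

# 2. Area 0.3405 to the left of z0 (strict vs. non-strict inequality makes no difference).
z0_part2 = Z.inv_cdf(0.3405)

# 3. Area 0.1515 to the right of z0 means area 1 - 0.1515 to the left.
z0_part3 = Z.inv_cdf(1 - 0.1515)

print(round(z0_part1, 2), round(z0_part2, 2), round(z0_part3, 2))
```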
z−Score
Notation zα is a z−score with the property that the area under the
standard normal curve to the right of zα is equal to α.
In some contexts zα is called a critical value.
Example: Find zα/2 if
1 α = 0.1
2 α = 0.05
3 α = 0.01
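Since zα has area α to its right, zα/2 has area 1 − α/2 to its left. The sketch below applies that to the three values of α; the function name `critical_value` is an illustrative choice.

```python
from statistics import NormalDist

def critical_value(alpha: float) -> float:
    """z_(alpha/2): the z-score with area alpha/2 to its right under N(0, 1)."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.1, 0.05, 0.01):
    print(f"alpha = {alpha}: z_(alpha/2) = {critical_value(alpha):.3f}")
```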