PROBABILITY DISTRIBUTION
T.S Nwanamidwa
Learning outcomes
At the end of this study unit, learners should be able to:
define the concept of a probability distribution and compute probabilities from it
describe four common probability distributions used in management practice
recognize when to apply each probability distribution
Probability distribution
Definition
What is a probability distribution?
It is a list of all possible outcomes of a random variable and their associated probabilities
of occurrence.
Alternatively, it is a table or an equation that links each outcome of a statistical experiment
with its probability of occurrence.
To understand probability distributions, it is important to understand variables, random
variables and some notation.
Remember: an experiment is a trial that results in any one of several possible
outcomes.
What is a random variable?
A variable is a symbol that can take on any of a specified set of values (A, B, x, y, etc.).
A random variable is a variable whose value is the outcome of a statistical
experiment. It can be any attribute of interest on which data is collected and
analyzed.
It summarises the results of an experiment in terms of numerical values.
To represent a random variable, we generally use a capital letter, and a lower-case
letter to represent one of its values.
Examples of a random variable
X represents the random variable
P(X) represents the probability of X
P(X=x) refers to the probability that the random variable X is equal to a particular
value denoted x
• Examples:
• What is the probability that three out of 10 Nike sneakers will be
accepted by the market?
• What is the likelihood that, for a batch of 20 Samsung phones
received by a Vodacom store, no more than two will be returned
for repairs during the one-year warranty period?
Statistics and numbers
• Need to know nature of numbers collected
• Continuous variables: numbers that come from measuring or weighing; they can take any
value in a continuous interval of measurement.
• Continuous variables (such as time or weight) are infinitely divisible into whatever
units a researcher may choose.
• Examples:
• Weight of students, height of plants, time to flowering, time
measured to the nearest minute, second, half-second, etc.
• Discrete variables: type of numbers that are counted or categorical
• Discrete variables consist of indivisible categories
• Examples:
• Numbers of boys, girls, insects, plants, class size.
Examples
Which type of numbers (discrete or continuous)?
• Number of persons preferring Brand X in 5 different towns
• The weights of high school seniors
• The lengths of oak leaves
• The number of seeds germinating
• 35 tall and 12 dwarf pea plants
Types of probability distributions
Discrete
Continuous
Discrete
A discrete probability distribution applies when the random variable can take on
only specific values, usually integers (x = 1, 2, 3, …).
Example: the North-West University library has only 1, 2, 3 or 4 copies of
Applied Business Statistics on its shelves.
Types of discrete probability distribution
Binomial distribution
Poisson distribution
Binomial distribution
It is one of the most widely encountered discrete probability distributions.
It is derived from a process known as a Bernoulli trial.
It gives the probability of a given number of successes when each trial has only two
mutually exclusive outcomes (yes or no, success or failure).
The binomial experiment
You flip a coin 2 times and count the number of times the coin lands on heads. This is
a binomial experiment because:
The experiment consists of repeated trials. We flip a coin 2 times.
Each trial can result in just two possible outcomes (heads or tails).
The probability of success is constant, i.e. 0.5 on every trial.
The trials are independent, i.e. getting heads on one trial does not affect whether
we get heads on other trials.
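As a minimal sketch of the two-flip experiment above, the following Python snippet lists every outcome and builds the probability distribution of X = number of heads. The variable names are illustrative only, and the coin is assumed to be fair so that all four outcomes are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for 2 flips of a fair coin: HH, HT, TH, TT (equally likely).
outcomes = list(product("HT", repeat=2))

# Random variable X = number of heads in each outcome.
counts = Counter(flips.count("H") for flips in outcomes)

# Probability distribution: each value x paired with P(X = x).
distribution = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, prob in distribution.items():
    print(f"P(X = {x}) = {prob}")
```

The probabilities are kept as exact fractions so that it is easy to see they add up to 1, as any probability distribution must.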
Example
[Figure: Binomial distribution for 10 tosses of a fair coin. Horizontal axis: number of heads (0 to 10); vertical axis: P(x), from 0 to 0.25.]
The Binomial question “what is the probability that r successes will occur in n trials of the
process under study?”
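Formula for Binomial distribution
$$P(X = r) = \binom{n}{r}\, p^{r} q^{\,n-r}$$
OR
$$P(X = r) = \frac{n!}{r!\,(n-r)!}\, p^{r} (1-p)^{\,n-r}$$
where:
n = number of trials (the sample size)
r = number of successes desired
p = probability of success on a single independent trial
q = probability of failure on a single independent trial (q = 1 − p)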
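The binomial probability of r successes in n trials can be sketched in Python as below. The function name `binomial_pmf` is an illustrative choice, not something defined in these slides; the example values n = 10 and p = 0.5 correspond to 10 tosses of a fair coin.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r): probability of exactly r successes in n independent trials."""
    q = 1 - p  # probability of failure on a single trial
    return comb(n, r) * p**r * q**(n - r)

# 10 tosses of a fair coin: probability of each possible number of heads.
n, p = 10, 0.5
for r in range(n + 1):
    print(f"P(X = {r:2d}) = {binomial_pmf(r, n, p):.4f}")
```

`math.comb(n, r)` computes the number of ways of choosing r successes out of n trials, which is the n! / (r!(n − r)!) term.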
Examples
P(5 or 6 successes) = P(X = 5) + P(X = 6)
P(at most 3 successes) = P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)
P(less than 5 successes) = P(X < 5) = P(X ≤ 4)
P(more than 5 successes) = P(X > 5) = 1 – P(X ≤ 5)
P(at least 5 successes) = P(X ≥ 5) = 1 – P(X ≤ 4)(Note: apply complementary rule of
probability)
P(between 5 and 8 successes) = P(5 < X < 8) = P(X ≤ 7) – P(X ≤ 5)
P(between or equal to 5 and 8 successes) = P(5 ≤ X ≤ 8) = P(X ≤ 8) – P(X ≤ 4)
Note: Use the complementary rule whenever practical, to reduce the number of
calculations
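The cumulative rules above can be checked numerically. The sketch below assumes an illustrative case of n = 10 trials with p = 0.5 (these values and the helper names are assumptions made for the example) and evaluates three of the identities using a cumulative function P(X ≤ r).

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for a binomial random variable."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def binomial_cdf(r, n, p):
    """P(X <= r): sum of the probabilities for 0, 1, ..., r successes."""
    return sum(binomial_pmf(k, n, p) for k in range(r + 1))

n, p = 10, 0.5  # illustrative values only

# P(X >= 5) = 1 - P(X <= 4)  (complementary rule)
at_least_5 = 1 - binomial_cdf(4, n, p)

# P(5 < X < 8) = P(X <= 7) - P(X <= 5), i.e. X = 6 or 7
between_exclusive = binomial_cdf(7, n, p) - binomial_cdf(5, n, p)

# P(5 <= X <= 8) = P(X <= 8) - P(X <= 4), i.e. X = 5, 6, 7 or 8
between_inclusive = binomial_cdf(8, n, p) - binomial_cdf(4, n, p)

print(f"P(X >= 5)      = {at_least_5:.4f}")
print(f"P(5 < X < 8)   = {between_exclusive:.4f}")
print(f"P(5 <= X <= 8) = {between_inclusive:.4f}")
```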
Example
In 2007 approximately 4.7% of the households in the Detroit metropolitan
area were in some stage of foreclosure, the highest foreclosure rate in the
nation. Suppose we sample 100 mortgage-holding households in the
Detroit area:
a) What is the probability that exactly 5 of these households are in some
stage of foreclosure?
b) What is the probability that no more than 5 of these households are in
some stage of foreclosure?
c) What is the probability that more than 5 households are in some stage
of foreclosure?
d) Calculate the expected value, the variance and the standard deviation.
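One possible way to work through questions (a) to (d) is sketched below in Python, treating each sampled household as an independent trial with n = 100 and p = 0.047. The helper name `binomial_pmf` is illustrative, and the mean, variance and standard deviation use the standard binomial results np, npq and √(npq).

```python
from math import comb, sqrt

def binomial_pmf(r, n, p):
    """P(X = r) for a binomial random variable."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n = 100     # households sampled
p = 0.047   # probability that a household is in foreclosure

# a) exactly 5 households
p_exactly_5 = binomial_pmf(5, n, p)

# b) no more than 5 households: P(X <= 5)
p_at_most_5 = sum(binomial_pmf(r, n, p) for r in range(6))

# c) more than 5 households: complementary rule, 1 - P(X <= 5)
p_more_than_5 = 1 - p_at_most_5

# d) expected value, variance and standard deviation
mean = n * p
variance = n * p * (1 - p)
std_dev = sqrt(variance)

print(f"a) P(X = 5)  = {p_exactly_5:.4f}")
print(f"b) P(X <= 5) = {p_at_most_5:.4f}")
print(f"c) P(X > 5)  = {p_more_than_5:.4f}")
print(f"d) mean = {mean:.2f}, variance = {variance:.4f}, std dev = {std_dev:.4f}")
```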
Using the binomial function in Excel
• Click on Formulas > Insert Function > BINOM.DIST
Syntax
BINOM.DIST(number_s, trials, probability_s, cumulative) /
BINOMDIST(number_s, trials, probability_s, cumulative)
The BINOM.DIST function syntax has the following arguments:
• Number_s is the number of successes in trials.
• Trials is the number of independent trials.
• Probability_s is the probability of success on each trial.
• Cumulative is a logical value that determines the form of the
function. If cumulative is TRUE, then BINOM.DIST returns the cumulative
distribution function, which is the probability that there are at most
number_s successes; if FALSE, it returns the probability mass function,
which is the probability that there are exactly number_s successes.
Properties of binomial distribution
A measure of central location and a measure of dispersion can be
computed for any random variable that follows a binomial distribution,
using the following formulae:
Mean: $E(X) = np$
Variance: $\mathrm{Var}(X) = npq$
Standard deviation: $\sigma = \sqrt{npq}$, where $q = 1 - p$
Poisson distribution
The Poisson probability distribution was named after the French mathematician
Siméon Denis Poisson (1781-1840).
The Poisson distribution is concerned with events that occur at random within a
time, space or volume interval, e.g.
The number of telephone calls going through a switchboard (x = 0, 1, 2, …)
The number of accidents at an intersection (x = 0, 1, 2, 3, …)
The number of customers that use McDonald’s drive-thru in a day.
The number of schools of fish in 100 square miles.
The number of bacteria in a specified culture.
The Poisson question
What is the probability of x occurrences of a given event being observed in a
predetermined time, space or volume interval?
Characteristics of a Poisson process
The number of occurrences is a discrete random variable.
It has a single parameter, a, the mean number of occurrences, which must be known.
The number of successes within a specified time or space interval can equal
any integer between zero and infinity (i.e. the random variable
x can take on any integer value between 0 and ∞).
Formula for Poisson distribution
$$P(x) = \frac{e^{-a}\, a^{x}}{x!}$$ for x = 0, 1, 2, 3, …
where:
x=possible outcomes for a Poisson process/ number of occurrences of a
given event for which a probability is required
a=average / mean number of occurrences of a given event of the
random variable for a predetermined time, space or volume interval
e= mathematical constant approximately = 2.718
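The Poisson probability of x occurrences, given a mean of a occurrences per interval, can be sketched in Python as follows. The function name `poisson_pmf` and the value a = 2 are illustrative assumptions, not taken from the slides.

```python
from math import exp, factorial

def poisson_pmf(x, a):
    """P(x): probability of exactly x occurrences when the mean is a."""
    return exp(-a) * a**x / factorial(x)

a = 2  # illustrative mean number of occurrences per interval

for x in range(6):
    print(f"P({x}) = {poisson_pmf(x, a):.4f}")

# The probabilities over all x = 0, 1, 2, ... add up to 1; summing the
# first 50 terms is already indistinguishable from 1 at this precision.
total = sum(poisson_pmf(x, a) for x in range(50))
print(f"Sum of P(x) for x = 0..49: {total:.6f}")
```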
Example
Heritage Properties is a national company specializing in rental office
accommodation. An analysis of lease records at their Durban branch has
established that, on average, five lease agreements are signed per day for
office space in the Durban metropolitan area.
• What is the probability that, on a given day, the Durban branch will sign
only three lease agreements for office space?
• What is the probability that, on a given day, the Durban branch will sign
at most two lease agreements for office space?
• What is the probability that the Durban branch will sign more than four
lease agreements for office space on a given day?
• What is the probability that the Durban branch will sign more than four
lease agreements for office space in any two-day period?
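A possible Python sketch of the four Heritage Properties questions is given below, using a = 5 lease agreements per day. For the two-day question it assumes that the mean scales with the length of the interval, so a = 10 for two days; the helper names are illustrative only.

```python
from math import exp, factorial

def poisson_pmf(x, a):
    """P(x): probability of exactly x occurrences when the mean is a."""
    return exp(-a) * a**x / factorial(x)

def poisson_cdf(x, a):
    """P(X <= x): sum of the probabilities for 0, 1, ..., x occurrences."""
    return sum(poisson_pmf(k, a) for k in range(x + 1))

a_day = 5            # mean lease agreements per day (from the example)
a_two_days = 2 * 5   # assumption: the mean scales with the interval length

p_exactly_3 = poisson_pmf(3, a_day)                    # exactly three in a day
p_at_most_2 = poisson_cdf(2, a_day)                    # at most two in a day
p_more_than_4 = 1 - poisson_cdf(4, a_day)              # more than four in a day
p_more_than_4_two_days = 1 - poisson_cdf(4, a_two_days)  # more than four in two days

print(f"P(X = 3)            = {p_exactly_3:.4f}")
print(f"P(X <= 2)           = {p_at_most_2:.4f}")
print(f"P(X > 4)            = {p_more_than_4:.4f}")
print(f"P(X > 4), two days  = {p_more_than_4_two_days:.4f}")
```

The "more than four" questions use the complementary rule, 1 − P(X ≤ 4), because the Poisson variable has no upper limit and the upper tail cannot be summed term by term.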
Poisson in Excel
Excel provides the following function for the Poisson distribution:
• POISSON(x, μ, cum) where μ = the mean of the distribution (the parameter a above)
and cum takes the values TRUE and FALSE
• POISSON(x, μ, FALSE) = probability mass function value f(x) at
the value x for the Poisson distribution with mean μ, i.e. the probability of exactly x occurrences.
• POISSON(x, μ, TRUE) = cumulative probability distribution
function F(x) at the value x for the Poisson distribution with
mean μ, i.e. the probability of at most x occurrences.
Example in Excel
Heritage Properties is a national company specializing in rental office
accommodation. An analysis of lease records at their Durban branch has
established that, on average, five lease agreements are signed per day for
office space in the Durban metropolitan area.
• What is the probability that, on a given day, the Durban branch will sign
only three lease agreements for office space?
• What is the probability that, on a given day, the Durban branch will sign
at most two lease agreements for office space?
• What is the probability that the Durban branch will sign more than four
lease agreements for office space on a given day?