INTRODUCTION TO
PROBABILITIES
Presented by Jean-Baptiste M.B. SANFO
Introductory Statistics for Economics | Fall 2025
OVERVIEW
• 4.1 An Introduction to Probabilities
• 4.2 Probability Rules for More Than One Event
• 4.3 Counting Principles
4.1 An Introduction to Probabilities
1. A probability is a numerical value ranging from 0 to 1 that
indicates the chance, or likelihood, that a specific event will occur.
• If there is no chance it will occur, the probability is 0 (0%).
• If the event will absolutely occur, the probability is 1 (100%).
2. Statistics Jargon
• Experiment: The process of measuring an activity that leads to uncertain outcome(s) for the purpose of collecting data. Ex: rolling a single six-sided die.
• Sample space S: All the possible outcomes of an experiment. Ex: S = {1, 2, 3, 4, 5, 6} for a single six-sided die.
• Event: One or more outcomes of an experiment. Ex: rolling a pair with two dice.
• Simple event: An event with a single outcome in its most basic form that cannot
be simplified. Ex: rolling a five with a single die.
Classical Probability
Classical probability is used when we know the number of possible outcomes of
the event of interest and can calculate the probability of that event.
$P(A) = \frac{\text{Number of possible outcomes that constitute Event A}}{\text{Total number of possible outcomes in the sample space}}$
where P(A) = the probability that Event A will occur.
[Figure: The Sample Space for a Single Die]
The probability of Event A, defined as rolling a five: $P(A) = \frac{1}{6} \approx 0.167$
There is a 16.7% probability of rolling a five.
When S includes every possible simple event that can occur, it is collectively exhaustive. Ex: {head, tail}, because a coin has two sides.
Classical probability assumes that each simple event in the sample space has the same likelihood of occurring.
[Figure: Sample Space When Rolling Two Dice to Obtain a Sum of 5]
There are four ways to obtain a sum of 5 out of the 36 possible outcomes: (1, 4), (2, 3), (3, 2), and (4, 1).
$P(A) = \frac{4}{36} \approx 0.111$
There is an 11.1% chance of two dice summing to the number 5.
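The two-dice count can be checked by listing the sample space explicitly. The following Python sketch is only an illustration, not part of the course material; the helper name classical_probability is an assumption introduced here, and it presumes that all outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product


def classical_probability(event, sample_space):
    """Favorable outcomes / total outcomes, assuming equally likely outcomes."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))


# Sample space for two six-sided dice: all ordered pairs (die 1, die 2).
two_dice = list(product(range(1, 7), repeat=2))

# Event A: the two dice sum to 5.
p_sum_5 = classical_probability(lambda roll: sum(roll) == 5, two_dice)

print(len(two_dice))              # size of the sample space
print(p_sum_5)                    # exact fraction
print(round(float(p_sum_5), 3))   # decimal value rounded to three places
```

Changing the condition inside the lambda lets the same helper be reused for other sums.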
Your Turn: What is the probability of
obtaining a sum of 8 when rolling two
dice?
Empirical Probability
Empirical probability involves conducting an experiment to observe the frequency with which an event occurs.
It is used when we don’t have the information needed to determine the number of outcomes associated with an event (no “known” sample space). We then compute the ratio:
$P(A) = \frac{\text{Frequency with which Event A occurs}}{\text{Total number of observations}}$
Ex: If I observe 100 people who walk into my store and 15 of them make a purchase, the probability that a person who walks in will make a purchase is 15/100 = 0.15.
Law of large numbers: when an experiment is conducted a large number of
times, the empirical probabilities will converge to the classical probabilities.
Connection between classical and empirical probabilities
Ex: Flipping a fair coin three times and observing three tails
• The empirical probability of tails in this example is 100%.
• The classical probability is 50%.
• If we flipped the coin, say, 500 times, the empirical probability would be very close to 50%.
This is the law of large numbers at work: the more times the experiment is repeated, the closer the empirical probability tends to get to the classical probability.
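A short simulation can make the coin example concrete. The sketch below is written in Python as an illustration; the function name empirical_tail_probability and the fixed seed are assumptions made here so that the run is reproducible, and the exact numbers printed depend on the random number generator.

```python
import random


def empirical_tail_probability(n_flips, seed=0):
    """Simulate n_flips of a fair coin and return the observed share of tails."""
    rng = random.Random(seed)  # fixed seed: the same flips on every run
    tails = sum(rng.random() < 0.5 for _ in range(n_flips))
    return tails / n_flips


# With few flips the empirical probability can be far from 0.5;
# with many flips it tends to settle close to the classical value.
for n in (3, 500, 100_000):
    print(n, empirical_tail_probability(n))
```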
Subjective Probability
• Subjective Probabilities are:
• based on the expertise and judgment of individuals.
• used when data or experiments are not available to calculate
probabilities.
• Example: A market analyst believes that there is a 50% probability that
Apple will announce a new version of the iPhone next month.
• It would be very difficult to determine this probability using classical or
empirical methods.
Basic Properties of a Probability
• Probability Rule 1: If P(A) = 1, then with certainty, Event A must occur.
• Ex: Event A = rolling a single six-sided die and observing 1, 2, 3, 4, 5, or 6.
• Probability Rule 2: If P(A) = 0, then with certainty, Event A will not occur.
• Ex: A = drawing 5 cards from a 52-card deck without replacement and observing five aces in your hand. (A 52-card deck has only four aces.)
• Probability Rule 3: The probability of any event must range from 0 to 1.
• Probability Rule 4: The sum of all the probabilities for the simple events in
the sample space must be equal to 1.
• Probability Rule 5: The complement to Event A is defined as all of the
outcomes in the sample space that are not part of Event A.
• P(A) + P(A’) = 1 or P(A) = 1 – P(A’)
• Ex: rolling a die and Event A is all even numbers: A = {2, 4, 6}. Then Event A’ is all odd numbers: A’ = {1, 3, 5}.
Venn diagram:
• The box represents the entire sample space.
• The shaded circle represents Event A.
• The area outside the circle within the box represents the complement of Event A.
4.2 Probability Rules for More Than One Event
Situations in the real world are rarely this simple. Oftentimes, they
involve two or more events that intersect one another.
Ex: A labor economist wants to know the probability that a person is
unemployed given that they have no college degree.
Two events to consider:
Event A= Person is unemployed
Event B= Person has no college degree
The Intersection of Events
What is the probability that a student will be a female and a freshman?
The intersection of Events A and B ($A \cap B$) is when Events A and B occur at the same time.
Contingency tables show the actual or relative frequency of two categorical variables at the same time: here, the school year and the gender.
Event A = a student living in the dorm being a female
Event B = a student living in the dorm being a freshman
$P(A) = \frac{28}{50} = 0.56$, $P(B) = \frac{29}{50} = 0.58$
$P(A \text{ and } B) = P(A \cap B) = \frac{17}{50} = 0.34$
$P(A \cap B)$ is a joint probability.
The Union of Events
The union of Events A and B ($A \cup B$) is the set of instances in which either Event A or Event B or both occur.
Event A = A female (freshman, sophomore, junior, senior)
Event B = A freshman (female or male)
Probability that the next student is either a female (Event A) or a freshman (Event B):
$P(A \text{ or } B) = P(A \cup B) = \frac{40}{50} = 0.80$
And → Intersection; Or → Union
$P(A \cap B) \neq P(A \cup B)$: 0.34 ≠ 0.80
$P(A \cap B) < P(A \cup B)$: 0.34 < 0.80
The Addition Rule
Two events are mutually exclusive if they cannot occur at the same time during the experiment. Ex: receiving both an A and a B grade on the same exam.
Mutually exclusive events have no overlap in the sample space.
Mutually exclusive events: P(A or B) = P(A) + P(B)
Probability of earning either an A or a B grade:
Event A = Student earned an A grade
Event B = Student earned a B grade
$P(A) = \frac{9}{25} = 0.36$, $P(B) = \frac{12}{25} = 0.48$
P(A or B) = P(A) + P(B) = 0.36 + 0.48 = 0.84
When events are not mutually exclusive, P(A or B) is the sum of the individual probabilities of A and B minus the probability of the intersection of A and B:
P(A or B) = P(A) + P(B) − P(A and B)
Event A = Student earned an A grade
Event B = Student is female
Events A and B are not mutually exclusive: a student can earn an A and be female.
[Figure: Venn diagram; P(A and B) is shown in the orange-shaded region.]
$P(A) = \frac{9}{25} = 0.36$, $P(B) = \frac{10}{25} = 0.40$, $P(A \text{ and } B) = \frac{6}{25} = 0.24$
P(A or B) = 0.36 + 0.40 − 0.24 = 0.52
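Both versions of the addition rule fit in one small function, because mutually exclusive events are simply the case P(A and B) = 0. The Python sketch below is illustrative only; the name p_union is an assumption introduced here, and exact fractions are used so that no rounding enters the check.

```python
from fractions import Fraction


def p_union(p_a, p_b, p_a_and_b=Fraction(0)):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B).

    For mutually exclusive events the intersection term is 0 (the default).
    """
    return p_a + p_b - p_a_and_b


# Mutually exclusive: earning an A grade or a B grade.
p_a_or_b_grade = p_union(Fraction(9, 25), Fraction(12, 25))

# Not mutually exclusive: earning an A grade or being female.
p_a_or_female = p_union(Fraction(9, 25), Fraction(10, 25), Fraction(6, 25))

print(float(p_a_or_b_grade))
print(float(p_a_or_female))
```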
Conditional Probability
The probability that Event A will occur, given the condition that Event B has already occurred, is written P(A | B).
$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$
$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}$
Event A = Student scored in the 601–800 range in math
Event B = Student participated in a prep class
$P(A) = \frac{42}{250} = 0.168 = 16.8\%$
$P(A \text{ and } B) = \frac{22}{250} = 0.088$; $P(B) = \frac{70}{250} = 0.280$
$P(A \mid B) = \frac{0.088}{0.280} \approx 0.314$
Does the prep class increase the chance of scoring in the 601–800 range? $P(A \mid B)$ is nearly twice $P(A)$: 0.168 → 0.314.
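The prep-class calculation can be reproduced directly from the counts. The Python sketch below is an illustration under the counts stated on the slide (42, 22, and 70 out of 250 students); the function name conditional is an assumption introduced here.

```python
from fractions import Fraction


def conditional(p_a_and_b, p_b):
    """Conditional probability P(A | B) = P(A and B) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be positive to condition on B")
    return p_a_and_b / p_b


p_a = Fraction(42, 250)        # scored in the 601-800 range
p_b = Fraction(70, 250)        # took the prep class
p_a_and_b = Fraction(22, 250)  # both

p_a_given_b = conditional(p_a_and_b, p_b)

print(round(float(p_a), 3), round(float(p_a_given_b), 3))
print(round(float(p_a_given_b / p_a), 2))  # how many times larger P(A|B) is than P(A)
```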
Independent and Dependent Events
Events A and B are independent if the occurrence of one has no impact on the occurrence of the other.
Does the warm-up time have an impact on the chances of winning?
Event A = Deb wins the tennis match
Event B = The warm-up time is long
Marginal probability of Event A: $P(A) = \frac{11}{25} = 0.44$
Conditional probability of Event A, given B: P(A | B)
$P(A \text{ and } B) = \frac{7}{25} = 0.28$; $P(B) = \frac{10}{25} = 0.40$
$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} = \frac{0.28}{0.40} = 0.70$
A and B are independent if $P(A \mid B) = P(A)$.
Since 0.44 ≠ 0.70, A and B are not independent: Events A and B are dependent.
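The independence check can also be written as a comparison of the joint probability with the product of the marginals, which is equivalent to comparing P(A | B) with P(A) whenever P(B) > 0. The Python sketch below is illustrative; the name are_independent is an assumption introduced here, and the exact-equality test is only sensible for exact probabilities (with sample data, small differences would normally be judged with a statistical test instead).

```python
from fractions import Fraction


def are_independent(p_a, p_b, p_a_and_b):
    """A and B are independent exactly when P(A and B) = P(A) * P(B)."""
    return p_a_and_b == p_a * p_b


# Tennis example: P(A) = 11/25, P(B) = 10/25, P(A and B) = 7/25.
deb_independent = are_independent(Fraction(11, 25), Fraction(10, 25), Fraction(7, 25))
print(deb_independent)  # the joint probability differs from the product

# Two fair coin flips, for contrast: P(both tails) = 1/4 = (1/2)(1/2).
coins_independent = are_independent(Fraction(1, 2), Fraction(1, 2), Fraction(1, 4))
print(coins_independent)
```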
The Multiplication Rule
The multiplication rule is used to determine the probability of the
intersection (joint probability) of two events occurring, or P (A and B ).
A grocery store unknowingly received a potato chip order containing an unusually high percentage of bags with salt content below quality control standards. The day you walk into the store to purchase two bags, there are 32 bags on the shelf, 9 of which have low salt content. What is the probability that both bags you select will have low salt content?
Event A= The first bag has low salt content
Event B= The second bag has low salt content
A and B are dependent: removing the first bag reduces the sample space.
Estimating P(A and B), the probability that both bags will have low salt content:
1. Probability of Event A: $P(A) = \frac{9}{32} \approx 0.281$
2. Probability that the second bag will have low salt content, given that the first bag selected had low salt content: $P(B \mid A) = \frac{8}{31} \approx 0.258$
3. Using the multiplication rule: $P(A \text{ and } B) = P(A)\,P(B \mid A) = (0.281)(0.258) \approx 0.072$
There is a 7.2% chance that you will leave the grocery store with two bags that are low in salt content.
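The potato chip calculation generalizes to any number of bags drawn without replacement: each draw removes one low-salt bag and one bag overall. The Python sketch below is illustrative; the function name p_all_low_salt is an assumption introduced here. Because it multiplies exact fractions rather than rounded decimals, its result (9/124, about 0.0726) differs slightly in the last digit from the product of the rounded values 0.281 and 0.258.

```python
from fractions import Fraction


def p_all_low_salt(low_salt_bags, total_bags, draws):
    """Multiplication rule for dependent events (sampling without replacement)."""
    probability = Fraction(1)
    for i in range(draws):
        # After i low-salt bags are removed, both counts shrink by i.
        probability *= Fraction(low_salt_bags - i, total_bags - i)
    return probability


p_two_bags = p_all_low_salt(9, 32, 2)  # (9/32) * (8/31)
print(p_two_bags, round(float(p_two_bags), 4))
```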
When two events are independent, the probability that both will occur is the product of their individual probabilities of occurring.
Club Bistro restaurant in Wilmington, Delaware, has observed that a customer will order the Chef’s Special for the evening 18% of the time. Assuming that customer orders are independent, calculate the joint probability of the following two events occurring:
Event A = The 1st customer orders the Chef’s Special
Event B = The 2nd customer orders the Chef’s Special
$P(A \text{ and } B) = P(A)\,P(B) = (0.18)(0.18) \approx 0.032$
There is a 3.2 percent chance that the two customers both order the Chef’s Special.
Is randomly guessing answers a great test-taking strategy?
Suppose you walk into class and your teacher surprises you with a five-question
true/false pop quiz. If you randomly guess at all five answers, what is the
probability you will answer all of them correctly?
A1 = Answering question 1 correctly (50% chance of Event A1 occurring)
A2, A3, A4, and A5 = Answering questions 2, 3, 4, and 5 correctly (50% chance each)
Because the guesses are independent, the multiplication rule gives:
P(5 correct answers) = P(A1)P(A2)P(A3)P(A4)P(A5) = (0.5)(0.5)(0.5)(0.5)(0.5) ≈ 0.031
Randomly guessing answers is not a great test-taking strategy.
Contingency Tables with Probabilities
Club Bistro has two types of entrée, meat and fish. After their meals, customers are individually asked if they were satisfied with them.
Event A= Customer is satisfied with the meal.
Event B= Customer is not satisfied with the meal.
Event C= Customer orders a meat entrée.
Event D= Customer orders a fish entrée.
A contingency table with probabilities converts frequencies into probabilities.
• A customer being satisfied with the meal: P(A) = 0.85
• A customer being satisfied with the meal and having had a meat entrée: P(A and C) = 0.35
• A customer being satisfied with the meal, given that the customer had a meat entrée: P(A | C) = 0.875
Bayes’ Theorem
A mathematical rule, developed by Thomas Bayes (1701–1761), for calculating $P(A \mid B)$ from information about $P(B \mid A)$.
It gives a way to update the probability of a hypothesis after seeing new evidence.
A school administrator wants to know the probability that a student is
struggling academically given that they failed a diagnostic test.
Event A: Student is struggling academically
Event Aᶜ: Student is not struggling academically (doing fine)
Event B: Student fails the diagnostic test
Given Information from past data:
P(A) = 0.20 → 20% of all students struggle academically
P(B|A) = 0.85 → 85% of struggling students fail the test
P(B|Aᶜ) = 0.10 → 10% of students who are not struggling also fail the test
What is the probability that a student is actually struggling given that they
failed the test? That’s P(A|B).
Step 1. Find the overall probability of failing the test:
$P(B) = P(B \mid A)\,P(A) + P(B \mid A^c)\,P(A^c) = (0.85)(0.20) + (0.10)(0.80) = 0.17 + 0.08 = 0.25$
Step 2. Apply Bayes’ Theorem:
$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} = \frac{(0.85)(0.20)}{0.25} = \frac{0.17}{0.25} = 0.68$
There’s a 68% chance that a student who failed the diagnostic test is actually
struggling academically.
Even though the test is not perfect, it is still quite informative: about two-thirds of the students who fail it are in fact struggling.
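The two steps of the diagnostic-test example can be collected in one function. The Python sketch below is an illustration, not a definitive implementation; the function and parameter names are assumptions introduced here, and the inputs are the three probabilities given in the example.

```python
def bayes_posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Return P(H | E) for a hypothesis H and evidence E.

    Step 1: total probability of the evidence, P(E).
    Step 2: Bayes' Theorem, P(H | E) = P(E | H) P(H) / P(E).
    """
    p_evidence = (p_evidence_given_h * prior
                  + p_evidence_given_not_h * (1 - prior))
    return p_evidence_given_h * prior / p_evidence


# H = student is struggling, E = student fails the diagnostic test.
p_struggling_given_fail = bayes_posterior(
    prior=0.20, p_evidence_given_h=0.85, p_evidence_given_not_h=0.10
)
print(round(p_struggling_given_fail, 2))
```

Trying a smaller prior in the call shows how strongly the answer depends on how common struggling students are to begin with.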
COVID-19 rapid test (screening)
Events:
A1= person is infected
A2= person is not infected
B = test is positive
B’ = test is negative
Assume:
Prevalence: $P(A_1) = 2\%$ (2% of people in this community are infected) ⇒ $P(A_2) = 98\%$
Sensitivity: $P(B \mid A_1) = 90\%$ (the test catches 9 out of 10 infected people)
Specificity: $P(B' \mid A_2) = 95\%$ ⇒ $P(B \mid A_2) = 5\%$ (false-positive rate)
False-negative rate: $P(B' \mid A_1) = 10\%$
1. If my test is positive, what’s the chance I’m actually infected, $P(A_1 \mid B)$?
$P(A_1 \mid B) = \frac{P(B \mid A_1)\,P(A_1)}{P(B \mid A_1)\,P(A_1) + P(B \mid A_2)\,P(A_2)} = \frac{0.90 \times 0.02}{0.90 \times 0.02 + 0.05 \times 0.98} = \frac{0.018}{0.067} \approx 26.9\%$
2. If my test is negative, what’s the chance I’m not infected, $P(A_2 \mid B')$?
$P(A_2 \mid B') = \frac{P(B' \mid A_2)\,P(A_2)}{P(B' \mid A_2)\,P(A_2) + P(B' \mid A_1)\,P(A_1)} = \frac{0.95 \times 0.98}{0.95 \times 0.98 + 0.10 \times 0.02} = \frac{0.931}{0.933} \approx 99.8\%$
3. If my test is negative, what’s the chance I am actually infected, $P(A_1 \mid B')$?
$P(A_1 \mid B') = \frac{P(B' \mid A_1)\,P(A_1)}{P(B' \mid A_1)\,P(A_1) + P(B' \mid A_2)\,P(A_2)} = \frac{0.10 \times 0.02}{0.10 \times 0.02 + 0.95 \times 0.98} = \frac{0.002}{0.933} \approx 0.214\%$
10,000-person picture (intuition)
A1: 0.02 × 10,000 = 200 infected → true positives = 0.90 × 200 = 180, false negatives = 20
A2: 9,800 not infected → false positives = 0.05 × 9,800 = 490, true negatives = 9,310

                 A₁ (Infected)   A₂ (Not infected)   Total
B (Positive)     180             490                 670
B′ (Negative)    20              9,310               9,330
Total            200             9,800               10,000
Among B (positives): 180 + 490 = 670 → $P(A_1 \mid B) = 180/670 \approx 26.9\%$.
Among B′ (negatives): 20 + 9,310 = 9,330 → $P(A_2 \mid B') = 9{,}310/9{,}330 \approx 99.8\%$.
20 false negatives among 9,330 total negatives → $P(A_1 \mid B') = 20/9{,}330 \approx 0.214\%$.
With low prevalence $P(A_1)$, even a decent test produces many false positives.
Bayes’ Theorem correctly updates what a test result means.
Improving specificity is crucial in low-prevalence screening (it reduces false positives).
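The 10,000-person picture translates directly into code, which also makes it easy to see the effect of specificity. The Python sketch below is illustrative; the function name is an assumption introduced here, and the 99% specificity in the second run is a hypothetical value chosen only for comparison, not a figure from the example.

```python
def positive_predictive_value(prevalence, sensitivity, specificity,
                              population=10_000):
    """P(infected | positive test), computed by counting people."""
    infected = prevalence * population
    not_infected = population - infected
    true_positives = sensitivity * infected
    false_positives = (1 - specificity) * not_infected
    return true_positives / (true_positives + false_positives)


# Slide values: 2% prevalence, 90% sensitivity, 95% specificity.
ppv_slide = positive_predictive_value(0.02, 0.90, 0.95)

# Hypothetical comparison: same test, but with 99% specificity.
ppv_better_specificity = positive_predictive_value(0.02, 0.90, 0.99)

print(round(ppv_slide, 3), round(ppv_better_specificity, 3))
```

With the slide values the function reproduces 180/670; with the hypothetical higher specificity there are far fewer false positives, so a positive result becomes much more informative.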
4.3 Counting Principles
To use classical probability, we need to be able to count the number of outcomes of interest along with the total number of outcomes that are possible in the sample space.
For simple experiments, such as rolling a single die, the count is obvious.
For more complex events, we need to rely on counting principles.
The Fundamental Counting Principle
The ice cream flavors and toppings
Choosing one of four flavors and one of three toppings:
k1 = The number of ice cream flavors to choose from
and
k2 = The number of toppings to choose from
Total number of ways both events can occur: (k1)(k2) = (4)(3) = 12 ways
The ice cream flavors, toppings, and size
The Fundamental Counting Rule: the number of possible outcomes is
(k1)(k2)(k3) ... (kn)
where
ki = number of choices for the ith event
n = number of events
With four flavors, three toppings, and two sizes: (k1)(k2)(k3) = (4)(3)(2) = 24
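The counting rule can be verified by listing every combination of choices. In the Python sketch below, the flavor, topping, and size names are hypothetical placeholders introduced here; only the counts (4, 3, and 2) come from the example.

```python
from itertools import product
from math import prod

# Hypothetical menu items; only the number of options matters.
flavors = ["vanilla", "chocolate", "strawberry", "mint"]
toppings = ["sprinkles", "nuts", "fudge"]
sizes = ["small", "large"]

# Every possible (flavor, topping, size) order, listed explicitly.
orders = list(product(flavors, toppings, sizes))

# The Fundamental Counting Rule: multiply the number of choices per event.
rule_count = prod(len(options) for options in (flavors, toppings, sizes))

print(len(orders), rule_count)
```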
Permutations
Permutations are the number of different ways in which objects can be arranged in order.
E.g., there are six permutations of 1, 2, 3: 123 132 213 231 312 321
The number of permutations of n distinct objects is n!
Permutations of n distinct objects: $n! = n(n-1)(n-2)\cdots(2)(1)$
If n = 3: 3! = (3)(2)(1) = 6
If n = 6: 6! = (6)(5)(4)(3)(2)(1) = 720
Permutations of n objects selected x at a time: ${}_nP_x = \frac{n!}{(n-x)!} = n(n-1)(n-2)\cdots(n-x+1)$
If there are 12 players on a team, how many ways can any 5 players on the team be announced to start the game?
${}_{12}P_5 = \frac{12!}{(12-5)!} = (12)(11)(10)(9)(8) = 95{,}040$
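These permutation counts can be checked with Python's standard library. The sketch below is illustrative and assumes Python 3.8 or later, where math.perm is available; the variable names are introduced here.

```python
from itertools import permutations
from math import factorial, perm

# Permutations of n distinct objects: n!
arrangements_of_123 = ["".join(p) for p in permutations("123")]
print(arrangements_of_123, factorial(3), factorial(6))

# Permutations of n objects selected x at a time: n! / (n - x)!
starting_lineups = perm(12, 5)
print(starting_lineups)
```

The same call, perm(n, x), works for any other choice of n and x.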
Suppose we have 10 objects (n=10), and we are removing 2 at a time
(x= 2). How many permutations will we have?
Combinations
Combinations are the number of ways in which objects can be selected without regard to order.
The number of combinations of n objects selected x at a time:
${}_nC_x = \frac{n!}{(n-x)!\,x!} = \frac{n(n-1)(n-2)\cdots(n-x+1)}{x!}$
In poker, 5 cards are selected randomly from a deck of 52 cards. How many five-card combinations exist?
${}_{52}C_5 = \frac{52!}{(52-5)!\,5!} = \frac{(52)(51)(50)(49)(48)}{(5)(4)(3)(2)(1)} = 2{,}598{,}960$
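The poker count, and the link between combinations and permutations, can be checked the same way. The Python sketch below is illustrative and assumes Python 3.8 or later for math.comb and math.perm; the variable names are introduced here.

```python
from math import comb, factorial, perm

# Combinations of 52 cards selected 5 at a time.
poker_hands = comb(52, 5)

# Each unordered hand corresponds to 5! ordered arrangements,
# so nCx = nPx / x!.
ordered_hands = perm(52, 5)
hands_from_permutations = ordered_hands // factorial(5)

print(poker_hands, ordered_hands, hands_from_permutations)
```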