
Counting Principles and Probability Basics

The document covers counting principles and probability, including the sum and multiplication rules, permutations, combinations, and various probability distributions. It explains concepts such as classical and empirical probability, conditional probability, and the expected value of random variables. Additionally, it discusses discrete and continuous probability distributions, including binomial and Poisson distributions, along with examples to illustrate these concepts.

Uploaded by

jaeiemmestropia

LESSON 5: COUNTING PRINCIPLES AND PROBABILITY

Sum Rule: If two sets of possible outcomes are disjoint, with m outcomes in the first set and n in the second, then the number of possible outcomes for the event is
m + n

Multiplication Rule: In a sequence of k events in which the 1st has n1 possibilities, the 2nd has n2, the 3rd has n3, and so forth, the total number of possibilities of the sequence will be
No. of ways = n1(n2)(n3)…(nk)

Example:
1. A student has a choice of 5 sandwiches and 6 juices. In how many ways can he choose 1 sandwich and 1 juice?
(5)(6) = 30 ways
2. In how many ways can 8 persons be seated if there are only 4 chairs available?
Solution: The first chair can be occupied by any of the 8 persons; any one of the remaining 7 persons can occupy the second chair; the third chair can be occupied by one of the remaining 6 persons; and finally, the fourth chair can be occupied by one of the remaining 5 persons. Therefore, the number of ways can be calculated as follows:
Number of ways = n1 x n2 x n3 x n4
= 8 x 7 x 6 x 5 = 1,680 ways

Factorial Notation: n! = n(n – 1)(n – 2)(n – 3)…(3)(2)(1)

Permutation: an arrangement of all or part of several things (or objects) in a definite order. The number of permutations of n objects taken r at a time is given by
P(n,r) = nPr = n!/(n – r)!, 0 ≤ r ≤ n
(n = population, r = sample)

Example:
How many different ways can a manager and a supervisor be selected for a company branch in Manila if there are 8 employees available?
Solution:
8P2 = 8!/(8 – 2)! = 56

Circular Permutations: objects arranged in places along a closed curve or a circle, in which any place may be regarded as the 1st or last place. Thus, Pc = (n – 1)!

Example:
In how many ways can 6 ladies be seated at a circular table such that 2 of the ladies must always sit beside each other?
Solution:
Consider the 2 ladies as one object, which leaves 5 objects to arrange around the table. These 2 ladies, taken as one, can themselves be arranged in 2 ways.
Thus, (n – 1)! nPr = (5 – 1)! 2P2 = 24(2) = 48

Permutations with Repeated Elements: It often happens that virtually identical objects get arranged. The number of distinct permutations of n objects, of which n1 are of one kind, n2 of a second kind, …, nk of a kth kind, is
n!/(n1! n2! … nk!)

Example:
There are 4 copies of a Statistics book, 5 copies of a Probability book, and 3 copies of a Forecasting book. In how many ways can they be arranged on a shelf?
Solution:
12!/(4! 5! 3!) = 27,720 ways

Combination: a grouping or selection of all or part of several things (or objects) without reference to the arrangement of the things selected. The number of combinations of n objects taken r at a time is given by
C(n,r) = nCr = n!/(r!(n – r)!), 0 ≤ r ≤ n

Example:
In how many ways can 4 board members be selected out of 15 board members of a company to represent the body in the stockholders' meeting?
Solution:
15C4 = 15!/(4! 11!) = 1,365 ways

Sample Spaces
- An event is a collection of one or more outcomes of an experiment.
- A simple event is an event that includes one and only one of the outcomes for an experiment and is denoted by E (also called an elementary event).
- A compound event is a collection of more than one outcome for an experiment (also called a composite event).

Example:
Find the sample space, the event of getting a sum of 6, and the event of getting a sum of at least 3 if a pair of dice is rolled.
a. To find the sample space, we need to apply the fundamental principle of counting.
Let n1 = the number of possible outcomes for the first die
n2 = the number of possible outcomes for the second die
n(S) = (n1)(n2)
n(S) = (6)(6)
n(S) = 36
b. The event of getting a sum of 6
A = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)}
n(A) = 5
c. The event of getting a sum of at least 3. At least 3 means “3 and above”, which excludes only the outcome (1, 1).
B = {(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 1), (2, 2), ..., (6, 6)}
n(B) = 35

Probability
Classical Probability: Assumes that all outcomes in the sample space are equally likely to occur.
P(E) = n(E)/n(S)

Where: n(E) = number of outcomes in E
n(S) = total number of outcomes in the sample space

Example: An ordinary deck of cards
a. P(getting a spade)
= 13/52 or 1/4
b. P(getting a 5 or a club)
= (4 + 13 – 1)/52
= 16/52 or 4/13

Empirical Probability: uses a frequency distribution based on observations to determine numerical probabilities of events.
P(E) = f/n

Where: f = frequency for the class
n = total frequencies of the distribution

Example:
In a sample of 50 college students, 18 are freshmen, 23 are sophomores, 2 are juniors, and 7 are seniors. Set up a frequency distribution and find the following probabilities.

Year         Frequency
Freshman     18
Sophomore    23
Junior       2
Senior       7
Total        50

a. P(a student is a freshman)
= 18/50 or 9/25
b. P(a student is a freshman or a sophomore)
= (18 + 23)/50
= 41/50
c. P(a student is neither a freshman nor a junior)
= (23 + 7)/50
= 30/50 or 3/5
d. P(a student is not a senior)
= 1 – 7/50
= 50/50 – 7/50
= 43/50

Subjective Probability: assigned to an event based on subjective judgment, experience, information, and belief.
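
The counting results and the dice sample space from this lesson can be checked with a few lines of Python. This is only a sketch: the helper names `circular` and `repeated` are made up for illustration, and `math.perm` and `math.comb` assume Python 3.8 or later.

```python
from itertools import product
from math import comb, factorial, perm

# Permutations of n objects taken r at a time: nPr = n!/(n - r)!
manager_supervisor = perm(8, 2)   # manager and supervisor from 8 employees
four_chairs = perm(8, 4)          # 8 persons, 4 chairs


def circular(n):
    """Circular permutations of n distinct objects: (n - 1)!"""
    return factorial(n - 1)


def repeated(*counts):
    """Distinct arrangements when each group in counts is made of identical objects."""
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)
    return total


ladies = circular(5) * perm(2, 2)   # the pair counts as one object, then swaps seats
books = repeated(4, 5, 3)           # Statistics, Probability, Forecasting copies
board = comb(15, 4)                 # 4 board members chosen from 15

# Sample space for a pair of dice, built by the multiplication rule (6 x 6).
dice = list(product(range(1, 7), repeat=2))
sum_is_6 = [d for d in dice if sum(d) == 6]
sum_at_least_3 = [d for d in dice if sum(d) >= 3]

print(manager_supervisor, four_chairs, ladies, books, board)
print(len(dice), len(sum_is_6), len(sum_at_least_3))
```

Enumerating the sample space directly is practical only for small experiments such as two dice; the counting formulas are what scale.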
Additive Rule
If A and B are two events, then
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)

If A and B are mutually exclusive, then
P(A ∪ B) = P(A) + P(B)

If A1, A2, A3, …, An are mutually exclusive events, then
P(A1 ∪ A2 ∪ A3 ∪ … ∪ An) = P(A1) + P(A2) + P(A3) + … + P(An)

Mutually exclusive – cannot occur at the same time
Non-mutually exclusive – can occur at the same time

Example:
A card is drawn from an ordinary deck of cards. Find these probabilities:
a. drawing a red card or a face card
= 26/52 + 12/52 – 6/52
= 32/52 or 8/13
b. drawing an ace or a face card
= 4/52 + 12/52
= 16/52 or 4/13
c. drawing a heart or a black card or the ace of diamonds
= 13/52 + 26/52 + 1/52
= 40/52 or 10/13

Conditional Probability: The probability of an event occurring when it is known that some other event has occurred.
If A and B are events such that P(A) ≠ 0, the conditional probability of B given A, denoted by P(B/A), is given by
P(B/A) = P(A ∩ B)/P(A)

Example:
A single fair die is rolled once. Let B denote the event that the number obtained is less than 4; let A denote the event that an odd number is rolled. Compute P(B/A).
S = {1, 2, 3, 4, 5, 6}
A = {1, 3, 5}, B = {1, 2, 3}, A ∩ B = {1, 3}
P(B/A) = (2/6)/(3/6) = 2/3

Example:
A box contains white and red balls. Each ball is labeled either A or B. The composition of the box is shown below.

         White   Red   Total
A        6       4     10
B        2       3     5
Total    8       7     15

In getting one ball from the box, what is the probability that a white ball is taken given that it is labeled A?

Solution:
P(White/A) = P(White and A)/P(A)
= (6/15)/(10/15)
= 6/10 or 3/5

Multiplication Rule
When the probability is determined with replacement, the occurrence of the second event is not affected by the occurrence of the first event. This condition describes independent events.
If A and B are independent events, then P(A ∩ B) = P(A) ∗ P(B)

When the probability is determined without replacement, the occurrence of the second event is affected by the occurrence of the first event. This condition describes dependent events.
P(A ∩ B) = P(A) ∗ P(B/A)

Example:
A box contains 5 green, 6 yellow, and 4 blue balls. Find the probability of selecting two balls (a yellow on the first draw and a blue on the second) if the selection is done
a. with replacement
= (6/15)(4/15) = 24/225 or 8/75
b. without replacement
= (6/15)(4/14) = 24/210 or 4/35
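
The conditional probability formula and the two forms of the multiplication rule can be sketched with exact fractions. The function name `conditional` is an illustrative choice, not something from the notes, and it assumes equally likely outcomes.

```python
from fractions import Fraction


def conditional(event_b, event_a, space):
    """P(B/A) = P(A and B) / P(A), assuming equally likely outcomes."""
    p_a = Fraction(len(event_a & space), len(space))
    p_ab = Fraction(len(event_a & event_b & space), len(space))
    return p_ab / p_a


# Die example: B = "less than 4", A = "odd number".
S = {1, 2, 3, 4, 5, 6}
A = {x for x in S if x % 2 == 1}
B = {x for x in S if x < 4}
p_b_given_a = conditional(B, A, S)

# Box example: 5 green, 6 yellow, 4 blue; yellow first, then blue.
total, yellow, blue = 15, 6, 4
with_replacement = Fraction(yellow, total) * Fraction(blue, total)
without_replacement = Fraction(yellow, total) * Fraction(blue, total - 1)

print(p_b_given_a, with_replacement, without_replacement)
```

Using `Fraction` keeps the answers in the same reduced form as the hand computation instead of rounded decimals.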
LESSON 6: PROBABILITY DISTRIBUTION

Random Variable
A random variable is a variable whose possible values are numerical outcomes of a random phenomenon. Random variables are fundamental in probability theory and statistics because they allow us to quantify and analyze random events.
We shall use a capital letter, say X, to denote a random variable and its corresponding small letter, x, for one of its values.

Example:
1. Two balls are drawn in succession without replacement from a box containing 4 red balls and 3 black balls. The possible outcomes and the values y of the random variable Y, where Y is the number of red balls, are

Sample Space    y
RR              2
RB              1
BR              1
BB              0

2. A hatcheck girl returns 3 hats at random to 3 customers who had previously checked them. If Smith, Jones, and Brown, in that order, receive one of the 3 hats, list the sample points for the orders of returning the hats and find the values m of the random variable M that represents the number of correct matches.

Sample Space    m
SJB             3
SBJ             1
JSB             1
JBS             0
BSJ             0
BJS             1

Discrete random variable with a finite number of values:
- The number of values can be counted.
Discrete random variable with an infinite sequence of values:
- The values can be counted, but there is no finite limit to how many there are.

Continuous Random Variable
- A variable that represents measured data.
- A random variable that may take on any value in an interval or a collection of intervals.

Discrete Probability Distribution
- The probability function, denoted by f(x), provides the probability that the random variable X takes on a specific value.
P(X = x) = f(x)

A discrete probability distribution must satisfy these two conditions:
f(x) ≥ 0 or P(x) ≥ 0
Σf(x) = 1.0 or ΣP(x) = 1.0

Expected Value and Variance
The expected value, or mean, of a random variable is a measure of its central location.
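
To make the hatcheck example concrete, the sketch below enumerates the six orders of returning the hats, tabulates the probability function f(m) of the number of correct matches M, and computes the mean as the probability-weighted sum Σ m f(m). The variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

owners = "SJB"  # Smith, Jones, Brown, in the order the hats are handed out

# Each permutation is one way of returning the three hats.
matches = {
    "".join(order): sum(hat == owner for hat, owner in zip(order, owners))
    for order in permutations(owners)
}

# Probability function f(m) = P(M = m); the 6 orders are equally likely.
counts = Counter(matches.values())
f = {m: Fraction(c, len(matches)) for m, c in sorted(counts.items())}

# Expected value as the probability-weighted sum of the values of M.
expected_matches = sum(m * p for m, p in f.items())

print(matches)
print(f, expected_matches)
```

The probabilities in `f` are non-negative and sum to 1, which are the two conditions a discrete probability distribution must satisfy.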
Expected value of a discrete random variable:
E(X) = μ = Σ x f(x)

The variance summarizes the variability in the values of a random variable.
Variance of a discrete random variable:
Var(X) = σ² = Σ (x – μ)² f(x)

The standard deviation, σ, is defined as the positive square root of the variance.

Example:
Find the expected number of boys on a committee of 3 selected at random from 4 boys and 3 girls.
Solution:
Let X = the number of boys on the committee.
f(x) = (4Cx)(3C(3 – x))/(7C3) for x = 0, 1, 2, 3
f(0) = 1/35, f(1) = 12/35, f(2) = 18/35, f(3) = 4/35
E(X) = 0(1/35) + 1(12/35) + 2(18/35) + 3(4/35) = 12/7 ≈ 1.7

Types of Discrete Probability Distributions

Discrete Uniform Distribution
- where a random variable X takes on a finite set of distinct values with equal probability.
f(x; k) = 1/k
for x = x1, x2, …, xk;
k is the number of values of X.
Example: Rolling a die, f(x; 6) = 1/6

Binomial Probability Distribution
- If a binomial trial can result in a success with probability p and a failure with probability q = 1 – p, then the probability distribution of the binomial random variable X, the number of successes in n independent trials, is
b(x; n, p) = nCx p^x q^(n – x), x = 0, 1, 2, …, n

Example:
Find the probability of obtaining exactly three 2’s if an ordinary die is tossed 5 times.
Solution:
Let X denote the number of times a 2 occurs in tossing a die five times.
x = 3
n = 5
p = 1/6, probability of success (getting a 2)
q = 5/6, probability of failure (not getting a 2)
b(3; 5, 1/6) = 5C3 (1/6)^3 (5/6)^2 = 250/7,776 ≈ 0.0322

The Poisson Probability Distribution
The probability distribution of a Poisson random variable X, representing the number of outcomes occurring in a given time interval or specified region, is
p(x; μ) = e^(–μ) μ^x/x!, x = 0, 1, 2, …
where:
μ = the average number of outcomes occurring in the given time interval or specified region
e = 2.71828…

Example 1:
The average number of days school is closed due to rain during the rainy season in Metro Manila is 4. What is the probability that the school will be closed for 6 days during the rainy season?
p(6; 4) = e^(–4) 4^6/6! ≈ 0.1042

Example 2:
The average number of field mice per acre in a 5-acre wheat field is estimated to be 10. Find the probability that a given acre contains more than 15 mice.
P(X > 15) = 1 – P(X ≤ 15) ≈ 1 – 0.9513 = 0.0487, with μ = 10
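
The binomial and Poisson probability functions translate almost directly into code. The following is a minimal sketch using only the standard library; the function names are illustrative, and the three calls use the figures from the die, school-closure, and field-mice examples.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """b(x; n, p) = nCx * p^x * q^(n - x), with q = 1 - p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, mu):
    """p(x; mu) = e^(-mu) * mu^x / x!"""
    return exp(-mu) * mu**x / factorial(x)


# Exactly three 2's in 5 tosses of a die.
three_twos = binomial_pmf(3, 5, 1 / 6)

# School closed exactly 6 days when the average is 4.
six_days = poisson_pmf(6, 4)

# More than 15 mice on an acre when the average per acre is 10:
# complement of the cumulative probability P(X <= 15).
more_than_15 = 1 - sum(poisson_pmf(x, 10) for x in range(16))

print(round(three_twos, 4), round(six_days, 4), round(more_than_15, 4))
```

Summing the probability function over x = 0 to 15 plays the role of the cumulative Poisson table usually consulted for "more than" questions.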

Poisson Approximation of Binomial Distribution
The Poisson probability distribution can be used as an approximation of the binomial probability distribution when p, the probability of success, is small and n, the number of trials, is large.
- Approximation is good when p < .05 and n > 20
- Set μ = np and use the Poisson tables

Continuous Probability Distribution
The function with values f(x) is called a probability density function for the continuous random variable X if the total area under its curve and above the x-axis is equal to 1 and if the area under the curve between any two ordinates x = a and x = b gives the probability that X lies between a and b.

The Uniform Probability Distribution
- Uniform Probability Density Function
f(x) = 1/(b – a) for a ≤ x ≤ b
     = 0 elsewhere
- Expected Value of X
E(X) = (a + b)/2
- Variance of X
Var(X) = (b – a)^2/12

Where: a = smallest value the variable can assume
b = largest value the variable can assume

The probability of a continuous random variable assuming a specific value is 0.
P(x = x1) = 0

Example:
Slater customers are charged for the amount of salad they take. Sampling suggests that the amount of salad taken is uniformly distributed between 5 ounces and 15 ounces.
f(x) = 1/(15 – 5) for 5 ≤ x ≤ 15
     = 1/10 for 5 ≤ x ≤ 15
     = 0 elsewhere
where: x = salad plate filling weight

What is the probability that a customer will take between 12 and 15 ounces of salad?
P(12 ≤ x ≤ 15) = (1/10)(15 – 12) = 0.30
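
For a uniform density, the area over any sub-interval is just its length divided by (b – a). A small sketch for the salad example follows; `uniform_prob` is a hypothetical helper name, and the same function handles any other range inside 5 to 15 ounces.

```python
def uniform_prob(c, d, a, b):
    """P(c <= X <= d) for X uniform on [a, b]: overlap length / (b - a)."""
    lo, hi = max(a, c), min(b, d)
    return max(0.0, hi - lo) / (b - a)


a, b = 5, 15                      # ounces of salad
mean = (a + b) / 2                # E(X) = (a + b)/2
variance = (b - a) ** 2 / 12      # Var(X) = (b - a)^2/12

p_12_15 = uniform_prob(12, 15, a, b)
p_8_12 = uniform_prob(8, 12, a, b)
p_5_12 = uniform_prob(5, 12, a, b)

print(mean, round(variance, 2), p_12_15, p_8_12, p_5_12)
```

Clipping the requested range to [a, b] means a range that falls partly outside the distribution contributes only its overlapping part.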

What is the probability that a customer will take between 8 and 12 ounces of salad?
P(8 ≤ x ≤ 12) = (1/10)(12 – 8) = 0.40

What is the probability that a customer will take between 5 and 12 ounces of salad?
P(5 ≤ x ≤ 12) = (1/10)(12 – 5) = 0.70

The Normal Probability Density Function
f(x) = (1/(σ√(2π))) e^(–(x – μ)²/(2σ²))

Where:
μ = mean
σ = standard deviation
π = 3.14159…
e = 2.71828...

Standard Normal Probability Distribution
- A random variable that has a normal distribution with a mean of zero and a standard deviation of one is said to have a standard normal probability distribution.
- The letter z is commonly used to designate this normal random variable.
- The following expression converts any Normal Distribution into the Standard Normal Distribution
z = (x – μ)/σ

Standard Score (Z-Score)
The standard score is the distance of the score from the mean (x̄) in terms of the standard deviation (s). It tells how many standard deviations the observed value (x) lies above or below the mean of the distribution. The standard score is useful in comparing observed values from different distributions. To find areas under the normal curve, observed values must first be converted into standard scores.
To change an observed value (x) into a standard score, use the equation:
z = (x – x̄)/s

where:
x = raw score/observed value
x̄ = mean
s = standard deviation

Note:
A positive (+) z-score means that the score/observed value is above the mean.
A negative (–) z-score means that the score/observed value is below the mean.

Example
1. If scores are normally distributed with a mean of 30 and a standard deviation of 5, what percent of the scores is:
a. equal to or greater than 30?
z = (30 – 30)/5 = 0; P(z ≥ 0) = 0.5000 or 50%
b. equal to or greater than 37?
z = (37 – 30)/5 = 1.40; P(z ≥ 1.40) = 1 – 0.9192 = 0.0808 or 8.08%
c. from 28 to 34?
z = (28 – 30)/5 = –0.40 and z = (34 – 30)/5 = 0.80; P(–0.40 ≤ z ≤ 0.80) = 0.7881 – 0.3446 = 0.4435 or 44.35%

Example:
Aptitude Test
A firm has assumed that the distribution of the aptitude test scores of people applying for a job in this firm is normal. The following sample is available.

We first need to estimate the mean and standard deviation.

Z-Values and X-Values
The standard normal value (z value) such that 10% of z values are less than or equal to it is z = –1.28.
To transform this standard normal value to a similar value in our example, we use the following relationship
x = x̄ + z(s)
The normal value of test marks such that 10% of the values are less than it is 55.1.
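
Areas under the normal curve can also be computed directly instead of being read from a z table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) with the mean-30, standard-deviation-5 scores example; the `z_score` helper is an illustrative name.

```python
from statistics import NormalDist

scores = NormalDist(mu=30, sigma=5)


def z_score(x, mean, sd):
    """Standard score: distance of x from the mean in standard deviations."""
    return (x - mean) / sd


at_least_30 = 1 - scores.cdf(30)
at_least_37 = 1 - scores.cdf(37)
between_28_34 = scores.cdf(34) - scores.cdf(28)

# z value with 10% of the standard normal distribution at or below it.
z_10 = NormalDist().inv_cdf(0.10)

print(z_score(37, 30, 5))
print(round(at_least_30, 4), round(at_least_37, 4), round(between_28_34, 4))
print(round(z_10, 2))
```

Because printed z tables round each area to four decimals, a table-based answer can differ from the computed one in the last digit.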

Normal Approximation of Binomial Probabilities
If X is a binomial random variable with mean μ = np and variance σ² = npq, then the limiting form of the distribution of
Z = (X – np)/√(npq)
as n → ∞ is the standard normal distribution.

The Exponential Probability Distribution

Exponential Probability Density Function
f(x) = μ e^(–μx) for x ≥ 0
Cumulative probability: P(x ≤ x0) = 1 – e^(–μ x0)

Where:
μ = mean rate of occurrence (average number of occurrences per unit of x)
e = 2.71828…
m = 1/μ = mean of the exponential probability distribution
σ² = 1/μ² = variance of the exponential probability distribution

Example:
The time between arrivals of cars at Al’s Carwash follows an exponential probability distribution with a mean time between arrivals of 3 minutes (m = 3, so μ = 1/3). Al would like to know the probability that the time between two successive arrivals will be 2 minutes or less.

P(x ≤ 2) = 1 – 2.71828^(–2/3)
= 1 – 0.5134
= 0.4866

LESSON 7: HYPOTHESIS TESTING AND CORRELATION REGRESSION

Hypothesis Testing
- It is a statistical method that is used in making statistical decisions using experimental data.
- A hypothesis is an assumption that we make about the population parameter.

Two types of Statistical Hypothesis
Null Hypothesis
- Symbolized by H0; it assumes that the observation is due to a chance factor.
Alternative Hypothesis
- Symbolized by H1 or Ha; it states that there is a difference between two population means (or parameters).

Level of significance
- refers to the degree of significance at which we accept or reject the null hypothesis.

Critical or rejection region = the range of test values indicating a significant difference, so that the null hypothesis (H0) should be rejected.

Noncritical or nonrejection region = the range of test values indicating that the difference was probably due to chance, so that the null hypothesis (H0) should not be rejected.
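
The boundary between the rejection region and the nonrejection region of a z-test is the critical value for the chosen level of significance. As a rough sketch (the function names are illustrative, and `statistics.NormalDist` needs Python 3.8 or later), it can be obtained from the inverse of the standard normal cumulative distribution:

```python
from statistics import NormalDist


def z_critical(alpha, tails=1):
    """Positive critical z value for a one- or two-tailed test at level alpha."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)


def in_rejection_region(z_computed, alpha, tails=1):
    """True when the computed z falls in the critical (rejection) region.

    A one-tailed test is treated here as right-tailed.
    """
    z = abs(z_computed) if tails == 2 else z_computed
    return z >= z_critical(alpha, tails)


one_tail_01 = round(z_critical(0.01, tails=1), 2)
one_tail_05 = round(z_critical(0.05, tails=1), 3)
two_tail_01 = round(z_critical(0.01, tails=2), 3)
print(one_tail_01, one_tail_05, two_tail_01)
```

A two-tailed test splits alpha between both tails, which is why its critical value is larger than the one-tailed value at the same level of significance.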
Critical Value Approach to Hypothesis Testing [figure]

Possible Outcome of a Hypothesis Test [figure]

One sample Z-Test
- Used when n ≥ 30, or when the population is normally distributed and the population standard deviation is known.
Test value = (observed value – expected value)/standard error

Procedure:
          Z-TEST                                       T-TEST
n         n ≥ 30                                       n < 30
Step 1    H0: µ = specified value                      H0: µ = specified value
          H1: µ ≠, <, > specified value                H1: µ ≠, <, > specified value
Step 2    Calculate the sample mean                    Calculate the sample mean and standard deviation
Step 3    Calculate the value of the one sample        Calculate the value of the one sample
          Z-test                                       t-test
Step 4    N/A                                          Calculate the degrees of freedom, df = n – 1
Step 5    If Z(computed) < Z(critical),                If T(computed) < T(critical),
          do not reject H0 (null hypothesis)           do not reject H0 (null hypothesis)
          If Z(computed) ≥ Z(critical),                If T(computed) ≥ T(critical),
          reject H0 (null hypothesis)                  reject H0 (null hypothesis)
Step 6    Conclusion                                   Conclusion

Example z-test:
A researcher reports that the average salary of College Deans is more than P63,000. A sample of 35 College Deans has a mean salary of P65,700. At α = 0.01, test the claim that the College Deans earn more than P63,000 a month. The standard deviation of the population is P5,250.

Given:
x̄ = 65,700
μ = 63,000
σ = 5,250
n = 35
Step 1: State the hypotheses and identify the claim.
H0: μ ≤ 63,000; H1: μ > 63,000 (claim)
Step 2: Level of significance is α = 0.01.
Step 3: The z-critical value is 2.33 (one-tailed test)
Step 4: Compute the one sample z test
z = (x̄ – μ)/(σ/√n) = (65,700 – 63,000)/(5,250/√35) ≈ 3.04
Step 5: Since 3.04 > 2.33, reject H0.
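
The test value for the College Deans example follows the pattern (observed value – expected value)/standard error. A minimal sketch of that computation is shown below; `one_sample_z` is an illustrative name and the critical value is the one quoted in Step 3.

```python
from math import sqrt


def one_sample_z(sample_mean, mu0, sigma, n):
    """z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


z = one_sample_z(65_700, 63_000, 5_250, 35)
z_crit = 2.33                 # one-tailed critical value at alpha = 0.01
reject_h0 = z >= z_crit

print(round(z, 2), reject_h0)
```

Comparing the computed z with the critical value is the decision step; the conclusion then restates that decision in terms of the claim.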

Step 6: Conclusion
We can conclude that there is enough evidence to support the claim that the monthly salary of College Deans is more than P63,000.

One sample T-Test
- is a statistical procedure that is used to test the difference between the sample mean and the known value of the population mean. The sample size should be n < 30.

Example:
One of the undersecretaries of the Department of Labor and Employment (DOLE) claims that the average salary of a civil engineer is P18,000. A sample of 19 civil engineers’ salaries has a mean of P17,350 and a standard deviation of P1,230. Is there enough evidence to reject the undersecretary’s claim at α = 0.01?

Solution:
Given:
x̄ = 17,350
μ = 18,000
s = 1,230
n = 19
Step 1: State the hypotheses and identify the claim.
H0: μ = 18,000 (claim); H1: μ ≠ 18,000
Step 2: Level of significance is α = 0.01.
Step 3: The t-critical value is ±2.878 (two-tailed test, df = n – 1 = 18)
Step 4: Compute the one sample t test
t = (x̄ – μ)/(s/√n) = (17,350 – 18,000)/(1,230/√19) ≈ –2.30
Step 5: Since |–2.30| < 2.878, do not reject H0.
Step 6: Conclusion
There is not enough evidence to reject the claim that the average salary of civil engineers is P18,000.

Summary of which test to use: [figure]

Z for Proportion
This test can be considered as a binomial experiment when there are only two outcomes and the probability of success does not change from trial to trial (the outcomes of each trial are independent).

Procedure for one sample z-test
- H0: p = specified value
H1: p ≠, <, > specified value
- Calculate the sample proportion
- Calculate the value for the z-test for proportion
- Statistical Decision
If Z(computed) < Z(critical), do not reject H0 (null hypothesis)
If Z(computed) ≥ Z(critical), reject H0 (null hypothesis)
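
The procedure for the z-test for a proportion can be sketched as follows. The test statistic used here, z = (p̂ – p)/√(pq/n), is the usual textbook form and is an assumption, since the formula itself is not shown in the notes; the inputs are the figures from the housing-survey example in this lesson (78 owners in a sample of 240, hypothesized proportion 0.35).

```python
from math import sqrt


def proportion_z(x, n, p0):
    """z = (p_hat - p0) / sqrt(p0 * q0 / n), with p_hat = x/n and q0 = 1 - p0."""
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


z = proportion_z(78, 240, 0.35)
z_crit = 2.576                # two-tailed critical value at alpha = 0.01
reject_h0 = abs(z) >= z_crit  # two-tailed: compare the absolute value

print(round(z, 2), reject_h0)
```

For a two-tailed test the comparison uses the absolute value of the computed z, so a large negative z leads to rejection just as a large positive one does.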

Example:
A recent survey done by the Philippine Housing Authority found that 35% of the population owns their homes. In a random sample of 240 heads of households, 78 responded that they owned their homes. At the 0.01 level of significance, does that indicate a difference from the national proportion?
Given:
X = 78
n = 240
p = 35% = 0.35

Step 1:
H0: p = 0.35 (claim); H1: p ≠ 0.35
Step 2: Level of significance is α = 0.01.
Step 3: The z-critical value is ±2.576 (two-tailed test)
Step 4:
p̂ = X/n = 78/240 = 0.325
z = (p̂ – p)/√(pq/n) = (0.325 – 0.35)/√((0.35)(0.65)/240) ≈ –0.81
Step 5: Since |–0.81| < 2.576, do not reject H0.
Step 6: Conclusion
We can conclude that there is not enough evidence to reject the claim that 35% of Filipinos own their homes.

Z test for two large independent samples
When comparing the means of two populations, we usually consider the difference between their means, µ1 − µ2.

Procedure for two independent samples
          Z-Test (Large)                               T-Test (Small)
Step 1    H0: µ1 = µ2                                  H0: µ1 = µ2
          H1: µ1 ≠ µ2, µ1 < µ2, or µ1 > µ2             H1: µ1 ≠ µ2, µ1 < µ2, or µ1 > µ2
Step 2    Calculate the sample mean                    Calculate the sample mean & standard deviation
Step 3    Calculate the value for the z-test           Calculate the value for the t-test
Step 4    If Z(computed) < Z(critical),                If t(computed) < t(critical),
          do not reject H0 (null hypothesis)           do not reject H0 (null hypothesis)
          If Z(computed) ≥ Z(critical),                If t(computed) ≥ t(critical),
          reject H0 (null hypothesis)                  reject H0 (null hypothesis)
Step 5    Conclusion                                   Conclusion

Example:
A real estate agent compares the selling price of townhouses in two major cities in the National Capital Region to see if there is a difference in price. The results of the study are shown. Is there enough evidence to reject the claim that the average price of a townhouse in Quezon City is higher than in Makati City? Use α = 0.05.

Step 1: State the hypotheses and identify the claim.
H0: μ1 ≤ μ2
There is no significant difference between the average price of townhouses in Quezon City and Makati City.
H1: μ1 > μ2
The average price of townhouses in Quezon City is higher than in Makati City.
Step 2: Level of significance is α = 0.05.
Step 3: The z critical value is 1.645 (one-tailed test)
Step 4: Compute the test value

Step 6: Conclusion
There is a significant difference in the average price of townhouses: the average price in Quezon City is higher than in Makati City.

T-test for two small independent samples
- is used to test the significance of the difference between two samples.

Example:
A researcher wishes to determine whether the monthly salaries of professional elementary teachers in private schools and elementary teachers in public schools differ. He selects a sample of elementary teachers from each type of school and calculates the means and standard deviations of their salaries. At α = 0.01, can he conclude that the private school teachers do not receive the same salary as the public school teachers? Assume that the populations are approximately normally distributed.

Solution:
Step 1: State the hypotheses and identify the claim.
H0: μ1 = μ2
There is no significant difference between the salaries of private and public school elementary teachers.
H1: μ1 ≠ μ2
There is a significant difference between the salaries of private and public school elementary teachers.
Step 2: Level of significance is α = 0.05
df = n1 + n2 – 2 = 15 + 9 – 2 = 22
Step 3: The t-critical value is ±1.729 (two-tailed test)
Step 4: Compute the test value

Step 6: Conclusion
There is enough evidence to support the claim that the salaries paid to elementary teachers employed in private schools are different from those in public schools.

T-test for Paired Samples
- is a statistical technique that is used to compare two population means in the case of two samples that are correlated. It is used in “before and after” studies, when the samples are matched pairs, or in a case-control study.

Procedure for t-test
- H0: µ1 = µ2
H1: µ1 ≠ µ2, µ1 < µ2, or µ1 > µ2
- Calculate the sample mean & standard deviation & df = n – 1
- Calculate the value for the t-test
- Statistical Decision
If t(computed) < t(critical), do not reject H0 (null hypothesis)
If t(computed) ≥ t(critical), reject H0 (null hypothesis)

Example:
The management of Resale Furniture, a chain of second-hand furniture stores in Metro Manila, designed an incentive plan for salespeople. To evaluate this innovative incentive plan, 10 salespeople were selected at random, and their weekly incomes before and after the incentive plan were recorded.

Was there a significant increase in the typical salesperson’s weekly income due to the innovative incentive plan? Use the 0.05 significance level.

Solution:
Step 1: State the hypotheses and identify the claim.
H0: μD ≤ 0
The income after the incentive plan is not greater than before the incentive plan.
H1: μD > 0
The income after the incentive plan is greater than before the incentive plan.
Step 2: Level of significance is α = 0.05.
Step 3: The t-critical value is 1.833 (one-tailed test, df = n – 1 = 9)
Step 4: Compute the test value
Determine the mean of the differences

Determine the standard deviation of the differences

Step 6: Conclusion
We can conclude that there is enough evidence to support the claim that the income after the incentive plan is greater than before the incentive plan.

Z-test between two proportions
- It deals with the procedures for drawing inferences about the difference between populations whose data are nominal.
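
A sketch of the z-test between two proportions is given below. It assumes the common pooled-proportion form of the test statistic, which is not shown in the notes, and uses the figures from the Visa/MasterCard example in this lesson (72 of 240 and 76 of 190, α = 0.10). The function name is illustrative.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """z for H0: p1 = p2, using the pooled proportion (x1 + x2)/(n1 + n2)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


z = two_proportion_z(72, 240, 76, 190)
z_crit = 1.645                # two-tailed critical value at alpha = 0.10
reject_h0 = abs(z) >= z_crit

print(round(z, 2), reject_h0)
```

Pooling the two samples reflects the null hypothesis that both groups share one common proportion.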
Example:
In a sample of 240 store customers, 72 used a Visa card. In another sample of 190, 76 used a MasterCard. At α = 0.10, is there a difference in the proportion of people who use each type of credit card?
Solution:
Given:
Let p1 be the proportion of Visa card users and p2 be the proportion of MasterCard users.

Step 1: State the hypotheses and identify the claim.
H0: p1 = p2; H1: p1 ≠ p2 (claim)
Step 2: Level of significance is α = 0.10.
Step 3: The z-critical value is ±1.645 (two-tailed test)
Step 4: Compute the test value
p̂1 = 72/240 = 0.30, p̂2 = 76/190 = 0.40, pooled proportion = (72 + 76)/(240 + 190) ≈ 0.3442
z ≈ –2.17
Step 5: Since |–2.17| > 1.645, reject H0.
Step 6: Conclusion
There is enough evidence to support the claim that there is a difference in the proportions.

CORRELATION AND REGRESSION

Correlation
- refers to the departure of two random variables from independence.

Pearson’s product-moment correlation coefficient
- also known as the simple correlation coefficient (or Pearson’s r), is a measure of the linear strength of the association between two variables.
- Developed by Karl Pearson.
- The value of the correlation coefficient varies between +1 and –1.

Correlation Coefficient
Pearson’s product-moment correlation coefficient
r = [nΣxy – (Σx)(Σy)] / √([nΣx² – (Σx)²][nΣy² – (Σy)²])

Test of significance
t = r√(n – 2)/√(1 – r²), with df = n – 2

Correlation Coefficient & Strength of Relationships
● 0.00 – no correlation, no relationship
● ±0.01 to ±0.20 – slight correlation, almost negligible relationship
● ±0.21 to ±0.40 – low correlation, definite but small relationship
● ±0.41 to ±0.70 – moderate correlation, substantial relationship
● ±0.71 to ±0.90 – high correlation, marked relationship
● ±0.91 to ±0.99 – very high correlation, very dependable relationship
● ±1.00 – perfect correlation, perfect relationship

Assumptions
- Subjects are randomly selected and independently assigned to groups.
- Both populations are normally distributed

          Pearson’s                                    Spearman
Step 1    H0: ρ = 0                                    H0: ρ = 0
          (The correlation in the population           (The correlation in the population
          is zero.)                                    is zero.)
          H1: ρ ≠ 0, ρ > 0, or ρ < 0                   H1: ρ ≠ 0, ρ > 0, or ρ < 0
          (The correlation in the population           (The correlation in the population
          is different from 0)                         is different from 0)
Step 2    Level of significance                        Level of significance
Step 3    df = n – 2 and t-critical value              df = n – 2 and t-critical value
Step 4    Calculate the value of Pearson’s r.          Calculate the value of Spearman’s ρ.
Step 5    Calculate the value of the t value           Calculate the value of the t value
Step 6    If t(computed) < t(critical),                If t(computed) < t(critical),
          do not reject H0 (null hypothesis)           do not reject H0 (null hypothesis)
          If t(computed) ≥ t(critical),                If t(computed) ≥ t(critical),
          reject H0 (null hypothesis)                  reject H0 (null hypothesis)
Step 7    Conclusion                                   Conclusion

Example:
The owner of a chain of fruit shake stores would like to study the correlation between atmospheric temperature and sales during the summer season. A random sample of 12 days is selected with the results given as follows:

Plot the data on a scatter diagram. Does it appear there is a relationship between atmospheric temperature and sales? Compute the coefficient of correlation. Determine at the 0.05 significance level whether the correlation in the population is greater than zero.
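
The computation of Pearson's r and its t value can be sketched as below. The data table for the fruit shake example is not reproduced in these notes, so the temperature and sales figures here are hypothetical placeholders chosen only to show the mechanics; the formulas are the usual computational forms of r and t = r√(n – 2)/√(1 – r²).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from the raw sums of x, y, xy, x^2 and y^2."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx**2) * (n * syy - sy**2))


def t_for_r(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with df = n - 2 (undefined for |r| = 1)."""
    return r * sqrt(n - 2) / sqrt(1 - r**2)


# HYPOTHETICAL data, not the values from the example's table.
temperature = [30, 31, 32, 33, 34, 35]
sales = [200, 215, 240, 238, 260, 275]

r = pearson_r(temperature, sales)
t = t_for_r(r, len(temperature))
print(round(r, 3), round(t, 2))
```

The computed t is then compared with the t-critical value for df = n – 2 exactly as in Step 6 of the procedure.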
Scatter Plot [figure]

Step 1: State the hypotheses.
H0: r = 0
There is no correlation between atmospheric temperature and total sales of fruit shakes.
H1: r ≠ 0
There is a correlation between atmospheric temperature and total sales of fruit shakes.
Step 2: Level of significance is α = 0.05.
Step 3: df = n – 2 = 12 – 2 = 10 & t-critical value is ±2.228.
Step 4: Compute the Pearson’s r.

The atmospheric temperature and total sales indicate a very high positive correlation (a very dependable relationship); that is, an increase in atmospheric temperature is highly associated with an increase in total sales of fruit shakes.

Step 6: Conclusion
We can conclude that there is evidence that shows a significant association between the atmospheric temperature and the total sales of fruit shakes.

Correlation Between Ordinal Variables

SPEARMAN RANK ORDER CORRELATION COEFFICIENT
ρ = 1 – 6ΣD²/(N(N² – 1))

Where:
D = the difference between a subject's ranks on the two variables
N = the number of subjects

Example:
The owner of a chain of fruit shake stores would like to study the correlation between atmospheric temperature and sales during the summer season. A random sample of 12 days is selected with the results given as follows:

Plot the data on a scatter diagram. Does it appear there is a relationship between atmospheric temperature and sales? Compute the coefficient of correlation. Determine at the 0.05 significance level whether the correlation in the population is greater than zero.
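
The Spearman rank order coefficient replaces each observation with its rank and then applies 1 – 6ΣD²/(N(N² – 1)). The sketch below uses hypothetical data (the example's table is not reproduced in these notes) and a simple ranking helper that assumes there are no tied values.

```python
def ranks(values):
    """Rank from 1 (smallest) upward; this simple version assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rho(xs, ys):
    """rho = 1 - 6 * sum(D^2) / (N * (N^2 - 1)), D = difference in ranks."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n**2 - 1))


# HYPOTHETICAL data, not the values from the example's table.
temperature = [30, 31, 32, 33, 34, 35]
sales = [200, 215, 240, 238, 260, 275]

rho = spearman_rho(temperature, sales)
print(ranks(sales), round(rho, 3))
```

With tied observations the usual practice is to assign averaged ranks, which this sketch deliberately leaves out.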
Scatter Plot [figure]

Step 1: State the hypotheses.
H0: r = 0
There is no correlation between atmospheric temperature and total sales of fruit shakes.
H1: r ≠ 0
There is a correlation between atmospheric temperature and the total sales of fruit shakes.
Step 2: Level of significance is α = 0.05.
Step 3: df = n – 2 = 12 – 2 = 10 & t-critical value is ±2.228.
Step 4: Compute Spearman’s ρ.

The atmospheric temperature and total sales indicate a very high positive correlation (very dependable relationship); that is, an increase in atmospheric temperature is highly associated with an increase in total sales of fruit shakes.

Step 6: Conclusion
We can conclude that there is evidence that shows a significant association between the atmospheric temperature and the total sales of fruit shakes.

Simple Regression Equation
Regression analysis is a simple statistical tool used to model the dependence of a variable on one (or more) explanatory variables.
A simple linear regression is the least squares estimator of a linear regression model with a single predictor (or one independent variable).

The least squares model determines a regression equation by minimizing the sum of squares of the vertical distances between the actual Y values and the predicted values of Y.

Assumptions of Linear Regression Equation
- Linearity – The mean of each error component is zero.
- Independence of Error Terms – The errors are independent of each other.
- Normally Distributed Error Terms – Each error component (random variable) follows an approximate normal distribution.
- Homoscedasticity – The variance of the error components is the same for each value of the independent variable.

Estimating the Coefficient

The standard error of estimate is the standard deviation of the observed Y values about the predicted Ŷ values.

The coefficient of determination is the proportion of the variation of the dependent variable that is explained by the regression line and the independent variable.

Coefficient of non-determination is the proportion of the variation in the dependent variable that is left unexplained by the independent variable, determined by 1 – r².
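
A least squares fit, the standard error of estimate, and the coefficients of determination and non-determination can be sketched as follows. The notation Ŷ = a + bX, the n – 2 divisor in the standard error, and the data values are all assumptions made for illustration; none of them come from the notes.

```python
from math import sqrt


def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y_hat = a + b*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b


def fit_quality(xs, ys, a, b):
    """Standard error of estimate (n - 2 divisor) and r^2 for the fitted line."""
    n = len(xs)
    y_bar = sum(ys) / n
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    sst = sum((y - y_bar) ** 2 for y in ys)                    # total variation
    return sqrt(sse / (n - 2)), 1 - sse / sst


# HYPOTHETICAL data for illustration only.
temperature = [30, 31, 32, 33, 34, 35]
sales = [200, 215, 240, 238, 260, 275]

a, b = fit_line(temperature, sales)
s_e, r2 = fit_quality(temperature, sales, a, b)
non_determination = 1 - r2

print(round(a, 2), round(b, 2), round(s_e, 2), round(r2, 4))
```

The slope and intercept returned by `fit_line` are the values that minimize the sum of squared vertical distances between the observed and predicted Y values.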

The Enddd
