
Understanding STTF in Probability

The document discusses probability theory and its foundational concepts, including sets, events, and types of probability. It explains the use of Venn diagrams, tree diagrams, and various probability calculations, such as conditional and independent events. Additionally, it covers the characteristics and conditions of probabilities, as well as examples of calculating probabilities in different scenarios.

Uploaded by

NoteGhost
Copyright
© All Rights Reserved

Chapter 8

Probability theory provides a foundation for statistical inference about population data.

· S is the universal set: it denotes the set of all elements under consideration.

· Venn diagrams can be used to display sets and the relationships between sets visually.

· A is a subset of B: $A \subset B$
· Union of A and B: $A \cup B$
· Intersection of A and B: $A \cap B$
· Complement of A: $A^c$ (note: $A \cup A^c = S$)
· Mutually exclusive or disjoint sets A and B: $A \cap B = \emptyset$

Experiment - the process of making an observation or taking a measurement that leads to a collection of outcomes.

Outcome space - the collection of all possible outcomes of an experiment (S).


Tree diagram

A bag contains 10 marbles. Three of the marbles are orange and the remaining 7 are blue. Kate draws one marble, places it back and then draws a second marble.

Draw a tree diagram to illustrate the above information.

First draw       Second draw      Outcome   Probability
blue (7/10)      blue (7/10)      BB        0.49
blue (7/10)      orange (3/10)    BO        0.21
orange (3/10)    blue (7/10)      OB        0.21
orange (3/10)    orange (3/10)    OO        0.09

Note: "and" means multiply (×); "or" means add (+).

Determine the probability that Kate draws 2 orange marbles.

$\frac{3}{10} \times \frac{3}{10} = \frac{9}{100} = 0.09$

Determine the probability that Kate draws a blue marble (at least one).

$0.49 + 0.21 + 0.21 = 0.91$

Determine the probability that Kate draws at least 1 orange marble.

$0.09 + 0.21 + 0.21 = 0.51$
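The tree diagram can be checked by enumerating the four branches. The following is a minimal sketch, assuming draws with replacement as stated in the example; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Probability of each colour on a single draw (with replacement).
p = {"B": Fraction(7, 10), "O": Fraction(3, 10)}

# Each branch of the tree is a pair of draws; "and" means multiply.
outcomes = {a + b: p[a] * p[b] for a, b in product("BO", repeat=2)}

# "or" means add the probabilities of the branches that qualify.
p_two_orange = outcomes["OO"]
p_at_least_one_blue = sum(v for k, v in outcomes.items() if "B" in k)
p_at_least_one_orange = sum(v for k, v in outcomes.items() if "O" in k)

for name, value in outcomes.items():
    print(name, float(value))
print(float(p_two_orange), float(p_at_least_one_blue), float(p_at_least_one_orange))
```

Using exact fractions avoids rounding noise, and the four branch probabilities add up to 1 as required.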

Distributive law:

$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$
$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$

De Morgan's law:

$(A \cup B)^c = A^c \cap B^c$
$(A \cap B)^c = A^c \cup B^c$

NB: $\cup$ = "or", $\cap$ = "and".

Probability
· is a numerical measure of the chance that an event will occur.

Types of probability

Subjective probability
- the probability that an event will occur is based on an educated guess, expert opinion or intuition.
- can't be verified statistically.
- e.g. weather conditions.

Objective probability:
- a random or stochastic event that cannot be predicted with certainty.
- the relative frequency with which such events occur in a long series of trials is stable.
- can be verified statistically.
- e.g. the occurrence of any number on the roll of a die is near 1/6.

Characteristics of probability

0 = the event is very unlikely to occur.
0.5 = the occurrence of the event is just as likely as it is unlikely.
1 = the event is almost certain to occur.

Conditions of probabilities
· the probability of each outcome must be between 0 and 1.
· the sum of the probabilities of all the outcomes in the sample space = 1.

Probability of events

Equally likely events

· Probability of an event:

$P(A) = \frac{n(A)}{n(S)}$

Where:

A = an event of a finite sample space S
n(A) = number of outcomes in event A
P(A) = probability of event A occurring
n(S) = number of outcomes in the sample space

e.g. Ten people, four men and six women, are in a meeting. At random, they select a person to preside. What is the probability that the person selected is a woman?

$P(\text{woman}) = \frac{6}{10} = 0.6$

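Multiplication rule

When a task has several stages, multiply the number of choices at each stage, e.g. 2 × 2 × 2 = 8 or 2 × 3 × 2 = 12.

There are 6 pictures to arrange from the 1st position to the 6th. How many ways are there to arrange them?

6 × 5 × 4 × 3 × 2 × 1 = 720 ways

OR

6! = 720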

Only 3 pictures can be chosen for the positions:

6 × 5 × 4 = 120

OR

Formula: $^nP_r = \frac{n!}{(n-r)!}$, so $^6P_3 = \frac{6!}{3!} = 120$

Books: ENG 1, ENG 2, ENG 3, MTHS 1, MTHS 2

Arrange the books on a bookshelf with no restriction on the order:

5! = 120 (5 = number of items)

The English books have to be together and the maths books have to be together:

2! × 3! × 2! = 24

Here the first 2! arranges the 2 groups (units), 3! arranges the books within the English group and the last 2! arranges the books within the maths group.

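We have 7 items to arrange in a line. What is the probability that the table and the ball will be placed next to each other?

Total arrangements: 7! = 5040

Treat the table and the ball as one unit: 6! × 2! = 1440 arrangements

$P = \frac{1440}{5040} = \frac{2}{7} \approx 0.286$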
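Books: ENG 1, ENG 2, ENG 3, ENG 4, MTHS 1, MTHS 2, MTHS 3

We have 7 books to arrange on a shelf (3 maths and 4 English). What is the probability that the maths books will be next to each other?

Total arrangements: 7! = 5040

Treat the 3 maths books as one unit: 5! × 3! = 720 arrangements

$P = \frac{720}{5040} = \frac{1}{7} \approx 0.143$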
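The two "next to each other" questions can be checked by brute force over all 7! orderings. This is only a sketch: the item labels other than the table, the ball and the book names are placeholders, and `prob_adjacent` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction
from itertools import permutations

def prob_adjacent(items, group):
    """Probability that all members of `group` occupy consecutive positions."""
    total = 0
    favourable = 0
    for order in permutations(items):
        total += 1
        positions = sorted(order.index(g) for g in group)
        # Consecutive positions span exactly len(group) - 1 steps.
        if positions[-1] - positions[0] == len(group) - 1:
            favourable += 1
    return Fraction(favourable, total)

# 7 items in a line: table and ball plus five placeholder items.
items = ["table", "ball", "c", "d", "e", "f", "g"]
p_table_ball = prob_adjacent(items, ["table", "ball"])

# 7 books on a shelf: 4 English and 3 maths.
books = ["E1", "E2", "E3", "E4", "M1", "M2", "M3"]
p_maths_together = prob_adjacent(books, ["M1", "M2", "M3"])

print(p_table_ball, p_maths_together)
```

The enumeration agrees with the "treat the group as one unit" shortcut: (6! × 2!)/7! and (5! × 3!)/7!.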

Codes

We are creating a 7 character code:
- the first 2 characters must be digits (1-9)
- the remaining characters must be letters (A-Z)

Repetition is allowed

Number of choices: 9 × 9 × 26 × 26 × 26 × 26 × 26

$9^2 \times 26^5 = 962391456$

Working with words:

MATHS

Repetition is not allowed

How many 5 letter "words" can be formed?

5! = 120

Repetition is allowed

$5^5 = 3125$

The word must start with an M and end with an A

Repetition is allowed

$1 \times 5^3 \times 1 = 125$

Repetition is not allowed

$1 \times 3! \times 1 = 6$

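Working with words:

SOCCER

Repetition is not allowed

How many 6 letter words can be formed?

6! = 720 if all the letters were different.

If there are repeating letters, you have to divide by the factorial of the number of times each letter repeats. SOCCER contains the letter C twice, so the number of distinct words is 6!/2! = 360.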
Working with words:

CHOCOLATE

Repetition is not allowed

· Word must start with an O

How many 9 letter words can be formed?

$\frac{1 \times 8!}{2!} = \frac{40320}{2} = 20160$

· Word must start and end with an O

How many 9 letter words can be formed?

$\frac{1 \times 7! \times 1}{2!} = \frac{5040}{2} = 2520$

Working with words:

REAPPEARS

Repetition is not allowed

How many 9 letter words can be formed?

$\frac{9!}{2! \times 2! \times 2! \times 2!} = 22680$

Word must start and end with R.

$\frac{1 \times 7! \times 1}{2! \times 2! \times 2!} = \frac{5040}{8} = 630$
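The "divide by the factorials of the repeated letters" rule can be sketched in a few lines. The function names `distinct_arrangements` and `fixed_ends` are hypothetical helpers for this illustration, and the sketch assumes every letter of the word is used exactly once.

```python
from collections import Counter
from math import factorial

def distinct_arrangements(word):
    """Number of distinct orderings of the letters of `word`."""
    count = factorial(len(word))
    # Divide by k! for every letter that appears k times.
    for repeats in Counter(word).values():
        count //= factorial(repeats)
    return count

def fixed_ends(word, first, last):
    """Distinct orderings when the first and last letters are fixed."""
    remaining = list(word)
    remaining.remove(first)
    remaining.remove(last)
    return distinct_arrangements("".join(remaining))

print(distinct_arrangements("MATHS"))
print(distinct_arrangements("SOCCER"))
print(distinct_arrangements("REAPPEARS"))
print(fixed_ends("REAPPEARS", "R", "R"))
print(fixed_ends("CHOCOLATE", "O", "O"))
```

Fixing a letter in a position simply removes it from the pool, after which the same rule is applied to the letters that remain.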

Permutations

$r! = r(r-1)(r-2) \ldots 2 \times 1$

Only 3 pictures can be chosen for the positions:

6 × 5 × 4 = 120

OR

Formula: $^nP_r = \frac{n!}{(n-r)!}$

Combinations

- unordered
- the number of possible subsets of r objects chosen from n distinct objects.

Formula: $^nC_r = \frac{n!}{r!(n-r)!}$

There are 5 qualified applicants for 2 editorial positions on a university newspaper. Two of these applicants are men and three are women. If the positions are filled by randomly selecting two of the five applicants, what is the probability that neither of the men is selected?

$n(S) = {}^5C_2 = 10$

$n(A) = {}^3C_2 = \frac{3!}{2!(3-2)!} = 3$

$P(A) = \frac{n(A)}{n(S)} = \frac{3}{10} = 0.3$
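The permutation and combination counts above can be reproduced with Python's standard library. This is a small sketch; `math.perm` and `math.comb` require Python 3.8 or later.

```python
from fractions import Fraction
from math import comb, perm

# Ordered choice of 3 pictures out of 6 (permutation).
ways_pictures = perm(6, 3)

# Unordered choice of 2 applicants out of 5 (combination).
n_sample_space = comb(5, 2)

# Neither man selected: both positions go to the 3 women.
n_event = comb(3, 2)

p_no_men = Fraction(n_event, n_sample_space)
print(ways_pictures, n_sample_space, n_event, float(p_no_men))
```

The only difference between the two counts is whether the order of selection matters.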

Conditional probability

· the conditional probability of an event A, given that an event B occurred, is equal to

$P(A|B) = \frac{P(A \cap B)}{P(B)}$

$P(A|B)$ = "probability of A given B", provided $P(B) > 0$

Suppose a balanced die is tossed once. Calculate the probability of obtaining a 1, given that an odd number was obtained.

A = observe a 1.
B = observe an odd number.

$A \cap B = \{1\}$, so $P(A \cap B) = \frac{1}{6}$

$P(B) = \frac{3}{6} = \frac{1}{2}$

$P(A|B) = \frac{1/6}{1/2} = \frac{1}{3} = 0.33$

Independent events

· When events are dependent, they can be affected by previous events.

· Thus, these events depend on what happened before.

· When events are dependent, the chances or probability that an event will happen will change.

· When events are independent, they are not affected by previous events.

· Thus, these events do not depend on what happened before.

Events A and B are independent.

P(A) = 0.4
P(B) = 0.3

Formula for independent events: $P(A \cap B) = P(A) P(B)$

Determine $P(A \cup B)$.

$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.4 + 0.3 - (0.4)(0.3) = 0.58$

Determine $P(A \cup B)^c$.

$1 - 0.58 = 0.42$

Are the events A and B complementary?

$0.4 + 0.3 = 0.7 \neq 1$

∴ No

Are the events A and B mutually exclusive?

Mutually exclusive events require $P(A \cap B) = 0$, but $P(A \cap B) = 0.4 \times 0.3 = 0.12$.

∴ They are not mutually exclusive.
· Two events A and B are said to be independent if any one of the following holds:

$P(A|B) = P(A)$

$P(B|A) = P(B)$

$P(A \cap B) = P(A) P(B)$

Otherwise, the events are said to be dependent.

Example 1:

$A \cap B = \emptyset$, so $P(A \cap B) = \frac{0}{6} = 0$, whereas $P(A) \times P(B) > 0$.

No, A and B are not independent because $P(A \cap B) \neq P(A) \times P(B)$.

For A and C, $P(A \cap C) = P(A) \times P(C)$.

Yes, A and C are independent because $P(A \cap C) = P(A) \times P(C)$.

Mutually exclusive events

$P(A \cup B) = P(A) + P(B)$

The complement law

$P(A^c) = 1 - P(A)$

e.g. P(A) = 0.4

$P(A^c) = 1 - 0.4 = 0.6$

General additive law

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Students were randomly assigned to one of six different sections of an introductory English course, i.e. sections 1; 2; 3; 4; 5 or 6.

A: Susan is assigned to an even-numbered section.

B: Susan is assigned to a section numbered lower than 3.

$A = \{2; 4; 6\}$, $P(A) = \frac{3}{6} = \frac{1}{2}$
$B = \{1; 2\}$, $P(B) = \frac{2}{6} = \frac{1}{3}$

$A \cap B = \{2\}$, $P(A \cap B) = \frac{1}{6}$

$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{1}{2} + \frac{1}{3} - \frac{1}{6} = \frac{2}{3} = 0.667$
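The additive law can be verified against a direct count of the union. The sketch below models the six sections as a Python set; the helper `prob` is a name introduced here for illustration and assumes all sections are equally likely.

```python
from fractions import Fraction

sections = {1, 2, 3, 4, 5, 6}
A = {s for s in sections if s % 2 == 0}   # even-numbered section
B = {s for s in sections if s < 3}        # section numbered lower than 3

def prob(event):
    """Probability of an event when all sections are equally likely."""
    return Fraction(len(event), len(sections))

# General additive law versus counting the union directly.
p_union_law = prob(A) + prob(B) - prob(A & B)
p_union_direct = prob(A | B)

print(p_union_law, p_union_direct)
```

Subtracting P(A ∩ B) removes section 2, which would otherwise be counted twice.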

Multiplicative law of probability

$P(A \cap B) = P(A) P(B|A) = P(B) P(A|B)$

e.g. $P(B_1 \cap B_2) = P(B_1) \times P(B_2|B_1) = 0.143$

Random variables and probability distributions

$\mu = E(X) = n\pi$

$\sigma^2 = \mathrm{var}(X) = n\pi(1 - \pi)$

$\sigma = \mathrm{sd}(X) = \sqrt{n\pi(1 - \pi)}$

Note the difference in the symbols: mean $\mu$, standard deviation $\sigma$, variance $\sigma^2$.

Normal probability distribution

· The normal distribution is a common probability distribution used in statistics. It describes how the values of a continuous random variable are distributed in a symmetric, bell-shaped curve.

· Bell-shaped and symmetrical: the curve is symmetric about its mean ($\mu$), meaning that the left and right sides of the curve are mirror images. Most data points are clustered around the mean, with fewer and fewer occurring as you move further from the mean.

· Mean = Median = Mode: in a normal distribution, the mean, median and mode are all the same and located at the peak of the curve.

· Standard deviation ($\sigma$): this measures the spread or dispersion of the data. A small standard deviation means the data points are close to the mean, while a large standard deviation indicates they are spread out. The curve becomes wider or narrower based on the value of $\sigma$.

· Asymptotic: the tails of the curve approach but never touch the horizontal axis, meaning that extreme values are possible but less likely as you move further from the mean.

Empirical rule

· 68% of the data lies within 1 standard deviation of the mean (between $\mu - \sigma$ and $\mu + \sigma$).

· 95% of the data lies within 2 standard deviations of the mean (between $\mu - 2\sigma$ and $\mu + 2\sigma$).

· 99.7% of the data lies within 3 standard deviations of the mean (between $\mu - 3\sigma$ and $\mu + 3\sigma$).
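The empirical rule can be checked numerically from the standard normal distribution. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); `within` is a helper name introduced for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def within(k):
    """Probability of lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(within(k), 4))
```

The rounded percentages 68%, 95% and 99.7% are approximations of these areas, and they hold for any normal distribution because the result depends only on the number of standard deviations.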

Example :

So
follow normal distribution with

population
a Melo a n d a

and 110
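Density curve for the normal probability distribution:

$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ for $-\infty < x < \infty$

Where:

$\mu$ = mean
$\sigma$ = standard deviation
$\pi$ = 3.14159
e = base of the natural logarithm (approximately 2.71828)

Notation: $X \sim N(\mu, \sigma^2)$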

Z-score and Z-table

A z-score tells you how many standard deviations a data point is from the mean in a standard normal distribution. It helps standardize values so you can compare them to the standard normal distribution, where the mean is 0 and the standard deviation is 1.

Z-score formula:

$z = \frac{x - \mu}{\sigma}$

Let:

x = the value or data point
$\mu$ = the population mean
$\sigma$ = the population standard deviation

Example: x = 200; $\mu$ = 100; $\sigma$ = 50

$z = \frac{200 - 100}{50} = 2$

x = R200 is 2 standard deviations (increments of R50 units) above the mean of R100.

The z-score formula transforms a normal distribution into a standard normal distribution, which allows you to use the Z-table to find probabilities.

The Z-table gives the cumulative probability from the far left of the curve to a particular z-value. The table helps you find the probability that a random variable is less than a certain value. If you know the z-score, you can look it up in the Z-table to find the corresponding probability.

e.g. population mean = 100, standard deviation = 15, x = 115

Steps:

① z-score: $z = \frac{115 - 100}{15} = 1.00$

② Look up z = 1.00 in the Z-table. It equals 0.8413, meaning there is an 84.13% chance of selecting a value below 115.

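Example:

The calorie content of a salad is normally distributed with $\mu = 200$ and $\sigma = 5$. Find the probability that the salad you select on the menu contains more than 208 calories.

(1) z-value:

$z = \frac{208 - 200}{5} = 1.60$

(2) Probability:

$P(X > 208) = P(z > 1.6) = 1 - 0.9452 = 0.0548$

∴ There is a 5.48% probability that the salad that you select on the menu contains more than 208 calories.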
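The Z-table lookups in the two examples above can be reproduced in code. The sketch assumes the values read from the notes (a mean of 100 with standard deviation 15 and x = 115; a mean of 200 with standard deviation 5 and x = 208); `z_score` is an illustrative helper.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

std_normal = NormalDist()

# Probability of a value below 115 when mu = 100 and sigma = 15.
z1 = z_score(115, 100, 15)
p_below = std_normal.cdf(z1)

# Probability of more than 208 calories when mu = 200 and sigma = 5.
z2 = z_score(208, 200, 5)
p_above = 1 - std_normal.cdf(z2)

print(z1, round(p_below, 4))
print(z2, round(p_above, 4))
```

`cdf` plays the role of the Z-table: it returns the area to the left of z, so an upper-tail probability is obtained by subtracting from 1.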

Parameters vs Statistics

· Parameter: this is a fixed, unknown value that describes a characteristic of the entire population. The population mean ($\mu$) and the population standard deviation ($\sigma$) are parameters.

· Statistic: this is a value calculated from a sample, which is a subset of the population.

Key difference:

· A parameter is a number that describes the whole population, and it doesn't change.
Central limit theorem (CLT)

· The CLT is one of the most important concepts in statistics. It states that, regardless of the population's distribution, the sampling distribution of the sample mean will approach a normal distribution as the sample size increases, typically when $n \geq 30$.

Key points:

· Normality: even if the population is not normally distributed, the CLT guarantees that the distribution of the sample mean is approximately normal for large samples.

· Implication: the CLT allows you to make inferences about the population mean using the sample mean, even if the original data is skewed or non-normal.

$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

$\bar{x}$ = sample mean
$\mu$ = population mean
$\sigma$ = population standard deviation
n = sample size
Calculating probabilities using the CLT

The CLT allows you to calculate probabilities involving sample means, just like you do with individual data points using z-scores.

Steps:

① Standardize the sample mean: convert the sample mean into a z-score using the formula:

$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

This z-score now corresponds to a position on the standard normal distribution.

② Use the Z-table: look up the z-value in the Z-table to find the cumulative probability.

Example:

Suppose that the population distribution of the gripping strengths of industrial workers is known to have a mean of 110 and a standard deviation of 10. For a random sample of 75 workers, what is the probability that the sample mean gripping strength will be:

(a) between 109 and 112?

$z = \frac{109 - 110}{10/\sqrt{75}} = -0.87$ and $z = \frac{112 - 110}{10/\sqrt{75}} = 1.73$

$P[109 < \bar{X} < 112] = P[-0.87 < z < 1.73]$
$= P[z < 1.73] - P[z < -0.87]$
$= 0.9582 - 0.1922$
$= 0.766$

∴ There is a 76.6% probability that the sample mean gripping strength of the 75 workers lies between 109 and 112.

(b) greater than 111?

$z = \frac{111 - 110}{10/\sqrt{75}} = 0.87$

$P[\bar{X} > 111] = P[z > 0.87]$
$= 1 - P[z < 0.87]$
$= 1 - 0.8078$
$= 0.1922$

∴ There is a 19.22% probability that the sample mean gripping strength of the 75 workers is greater than 111.
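The gripping-strength example can be recomputed without rounding z to two decimals. The sketch below treats the sample mean as normally distributed with standard error σ/√n, which is the CLT approximation used in the example; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 110, 10, 75

# Sampling distribution of the sample mean under the CLT.
standard_error = sigma / sqrt(n)
sample_mean_dist = NormalDist(mu, standard_error)

# (a) probability that the sample mean lies between 109 and 112
p_between = sample_mean_dist.cdf(112) - sample_mean_dist.cdf(109)

# (b) probability that the sample mean exceeds 111
p_above = 1 - sample_mean_dist.cdf(111)

print(round(standard_error, 4), round(p_between, 4), round(p_above, 4))
```

Small differences from the table-based answers are expected, because the Z-table rounds each z-score to two decimal places before the lookup.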
Definitions:

· Normal distribution: a probability distribution that is symmetric and bell-shaped, characterized by its mean and standard deviation.

· Z-score: the number of standard deviations a data point is from the mean, used to standardize values.

· Parameter: a fixed, unknown value that describes the entire population.

· Statistic: a value calculated from a sample, used to estimate a population parameter.

· Sampling distribution: the probability distribution of a sample statistic.

· Standard error (SE): the standard deviation of the sampling distribution, showing how much the sample mean varies from the population mean.

· Central limit theorem (CLT): the theorem that states that the sampling distribution of the sample mean will be approximately normal for large sample sizes.

Using the t-distribution to calculate probabilities

The t-distribution is used instead of the normal distribution when:

· the sample size is small (n < 30)

· the population standard deviation ($\sigma$) is unknown and the sample standard deviation (s) is used instead.

The t-distribution has heavier tails than the normal distribution, which means it accounts for more variability, especially with smaller sample sizes. As the sample size increases, the t-distribution approaches the normal distribution.

Formula:

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

Degrees of freedom (df)

The t-distribution is defined by its degrees of freedom (df), which is equal to n - 1.

Using the t-distribution table:

To calculate probabilities using the t-distribution:

① Find the t-score using the formula.

② Determine the degrees of freedom (df): df = n - 1

③ Use the t-distribution table to find the cumulative probability for the calculated t-score and degrees of freedom.


Example

You want to test if a new teaching method improves student performance, using a sample of 25 students.

① Calculate the t-score: t = 2.5

② Degrees of freedom: df = 25 - 1 = 24

③ Look up t = 2.5 with df = 24 in the t-table. You can use linear interpolation if necessary to get the precise probability value.

Calculating probabilities: $P(T > t) = 1 - P(T < t)$

Example:

Calculate the following probability with 24 degrees of freedom from the t-table: $P(T > 1.873)$

$P(T < 1.873) = 0.963$, interpolating between the table values $P_1 = 0.960$ and $P_2 = 0.965$

$P(T > 1.873) = 1 - 0.963 = 0.037$
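When a t-table does not list the exact value, the cumulative probability can be approximated numerically instead of by interpolation. The sketch below integrates the t density with Simpson's rule using only the standard library; `t_pdf` and `t_cdf` are hypothetical helper names, and the t-value 1.873 with 24 degrees of freedom is the reading of the example above assumed here.

```python
import math

def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T < t), using Simpson's rule on [0, |t|] and the symmetry about 0."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

upper_tail = 1 - t_cdf(1.873, 24)
print(round(t_cdf(1.873, 24), 4), round(upper_tail, 4))
```

In practice a statistics package would normally be used for this; the hand-rolled integration is shown only to make the meaning of the table entries concrete.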

Using the chi-squared distribution to calculate probabilities

The chi-squared distribution is commonly used in two situations:

① To test hypotheses about the variance of a population.

② In goodness-of-fit tests and tests of independence.

Key properties:

· The distribution is skewed to the right, especially for small degrees of freedom.

· It is defined by its degrees of freedom, which depend on the problem [e.g. df = n - 1 when testing variance, or df = (r - 1)(c - 1) for contingency tables].

· A random variable with this distribution can only take on positive values, and its density is skewed to the right but becomes more symmetric as df increases.
Steps to calculate probability using the chi-square distribution

① Determine the degrees of freedom (df): the degrees of freedom are typically determined by the problem context. In many cases, for a test of independence or goodness-of-fit, df is related to the number of categories minus 1.

② Formulate the problem: determine whether you are looking for $P(X > k)$, $P(X < k)$ or $P(a < X < b)$, where X is the chi-square random variable.

③ Use a chi-square table or software: the chi-square table provides the critical values for different probabilities and degrees of freedom. If the table does not give you exactly the probability you want, interpolate between the nearest table values or use software.

Example 1: Finding $P(X > k)$ for a given value of k

Let's say you want to calculate the probability $P(X > 10)$, where X follows a chi-square distribution with 6 degrees of freedom.

① Find the cumulative probability $P(X < 10)$:

Look at the chi-square table for 6 degrees of freedom. For X = 10 and df = 6, the cumulative probability $P(X < 10)$ is approximately 0.8753.

② Find $P(X > 10)$:

$P(X > 10) = 1 - P(X < 10) = 1 - 0.8753 = 0.1247$

Interpretation: the probability that the chi-square random variable X with 6 degrees of freedom is greater than 10 is approximately 0.1247 (12.47%).
Example 2: Finding $P(X < k)$ for a given value of k

· Interpretation: the probability that X is less than 5 when X follows a chi-square distribution with 8 degrees of freedom is approximately 0.2424, or 24.24%.

Example 3: Finding $P(a < X < b)$

Let's say you want to calculate $P(5 < X < 15)$ for a chi-square distribution with 7 degrees of freedom.

① Find $P(X < 15)$: from the table for df = 7 we find $P(X < 15) = 0.9640$.

② Find $P(X < 5)$: again using the table for df = 7, $P(X < 5) = 0.3400$.

③ Find $P(5 < X < 15) = P(X < 15) - P(X < 5) = 0.9640 - 0.3400 = 0.6240$

Interpretation: the probability that X lies between 5 and 15 for a chi-square distribution with 7 degrees of freedom is approximately 0.6240 (or 62.40%).
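The three chi-square probabilities in the examples above can be evaluated without a table. The sketch below uses the series expansion of the regularized lower incomplete gamma function, since the standard library has no chi-square distribution; `chi2_cdf` is a hypothetical helper name and the approach is one of several possible.

```python
import math

def chi2_cdf(x, df):
    """P(X < x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a = df / 2.0
    z = x / 2.0
    # Series: P(a, z) = z^a e^(-z) / Gamma(a) * sum_n z^n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

p_example_1 = 1 - chi2_cdf(10, 6)                 # P(X > 10), df = 6
p_example_2 = chi2_cdf(5, 8)                      # P(X < 5), df = 8
p_example_3 = chi2_cdf(15, 7) - chi2_cdf(5, 7)    # P(5 < X < 15), df = 7

print(round(p_example_1, 4), round(p_example_2, 4), round(p_example_3, 4))
```

The same function covers all three question types from step ②: an upper tail is 1 minus the cumulative probability, and an interval is the difference of two cumulative probabilities.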
+

The F-distribution

It is used in hypothesis testing. It compares the variances of two populations to determine if they are equal. The shape of the F-distribution is asymmetrical and skewed to the right, meaning that most of the values are concentrated on the left side but extend indefinitely towards the right.

Key properties:

· It is defined by two degrees of freedom, denoted as $v_1$ (numerator) and $v_2$ (denominator).

· F-values cannot be negative.

· As $v_1$ and $v_2$ increase, the F-distribution becomes more symmetric and approaches the normal distribution.
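Degrees of freedom (df)

· $v_1$ (numerator degrees of freedom): this is the degrees of freedom for the variance in the numerator, e.g. the number of groups minus 1.

· $v_2$ (denominator degrees of freedom): this is the degrees of freedom for the variance in the denominator. This is usually associated with the residuals or error terms and equals the total number of observations minus the number of groups.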

F-table

An F-table provides the critical values for the F-distribution given certain significance levels (often $\alpha = 0.05$ or $\alpha = 0.01$) and degrees of freedom for the numerator and denominator.

How to read the F-table:

· The rows correspond to the denominator degrees of freedom (df = $v_2$).

· The columns correspond to the numerator degrees of freedom (df = $v_1$).

· The intersection of a row and a column gives the critical F-value for a particular significance level.

Calculating probabilities using the F-table

To calculate the probability that an observed F-value is less than or greater than a critical value, you can follow these steps:

1. Determine the degrees of freedom:

$v_1$: the df associated with the variance in the numerator, e.g. number of groups - 1

$v_2$: the df associated with the variance in the denominator, e.g. number of total observations - number of groups

2. Select the significance level ($\alpha$): common choices are 0.05 (5%) or 0.01 (1%).

3. Look up the critical value in the F-table using the corresponding $v_1$ and $v_2$ values and the chosen significance level.

4. Compare the observed F-value to the critical value:

· If the observed F-value is greater than the critical value from the F-table, the result is statistically significant.

· If the observed F-value is less than the critical value, the result is not statistically significant.

Example: 3 groups and 15 observations in total.

df:

$v_1 = 3 - 1 = 2$

$v_2 = 15 - 3 = 12$

For a significance level of 0.05, you would refer to an F-table for the critical F-value corresponding to $v_1 = 2$ and $v_2 = 12$. Let's say the table gives the value 3.89. If your calculated F-value is greater than 3.89, the result is statistically significant.

Note:

· The F-distribution is used only in right-tailed tests because it tests for differences in variances.

· It assumes that the populations from which the samples are drawn are normally distributed.

Definitions

· t-distribution: a distribution used for small samples where the population standard deviation is unknown.

· t-score: the standardized score used in the t-distribution.

· Degrees of freedom (df): the number of independent pieces of information used to estimate a parameter.

· Chi-squared distribution ($\chi^2$): a distribution used for testing independence, goodness-of-fit and population variance.

· F-distribution: a distribution used in variance comparisons and ANOVA.

· F-statistic: the ratio of variances used in testing hypotheses.
Chapter 9

Point Estimators vs Interval Estimators

Point estimators: a point estimator provides a single value as an estimate of a population parameter based on sample data.

- sample mean ($\bar{x}$): estimate of the population mean ($\mu$)

- sample variance ($s^2$): estimate of the population variance ($\sigma^2$)

Interval Estimators

Definition: an interval estimator provides a range of values within which the population parameter is expected to lie, with a specified level of confidence.

Purpose: to quantify the uncertainty attached to the estimate and provide a range where the true parameter is likely to be found.

· Confidence interval for the population proportion: provides a range for the true proportion based on the sample proportion.

Example

Certain computer components can be very sensitive to high temperatures. A manufacturer of computer components conducts experiments to determine the temperature at which the components will fail.

Let $\mu$ = mean temperature at which the components fail.

Point estimate: a single value, e.g. $\bar{x} = 50°$

Interval estimate: a 90% confidence interval for $\mu$

Calculating point estimators

① Population mean ($\mu$):

· Point estimator: sample mean ($\bar{x}$)

· Formula: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

$x_i$: each observation in the sample
n: sample size

② Population variance ($\sigma^2$):

· Point estimator: sample variance ($s^2$)

· Formula: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

$\bar{x}$ = sample mean
n = sample size

③ Population proportion ($\pi$):

· Point estimator: sample proportion (p)

· Formula: $p = \frac{x}{n}$

x: number of successes in the sample
n: sample size

Characteristics of estimators

Unbiasedness:

· Definition: an estimator is unbiased if its expected value equals the true population parameter.

· Example: the sample mean ($\bar{x}$) is an unbiased estimator of the population mean ($\mu$).

Consistency:

· An estimator is consistent if, as the sample size increases, it converges in probability to the true population parameter.

· e.g. the sample mean becomes closer to the population mean as the sample size increases.

Efficiency:

· An estimator is efficient if it has the smallest variance among all unbiased estimators.

· Among different estimators of the population mean, the sample mean typically has the smallest variance.

Sufficiency:

· An estimator is sufficient if it captures all the information about the parameter contained in the sample.

Calculating and interpreting confidence intervals for the population mean

Formula: $\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$

$\bar{x}$: sample mean
z: z-value
$\sigma$: population standard deviation
n: sample size

Example: $\bar{x} = 50$, $\sigma = 10$, n = 100, 95% confidence (z = 1.96)

① Standard error: $SE = \frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{100}} = 1$

② Margin of error: $z \times SE = 1.96 \times 1 = 1.96$

③ Interval:

lower bound: 50 - 1.96 = 48.04
upper bound: 50 + 1.96 = 51.96

∴ The 95% confidence interval for the population mean is [48.04; 51.96].

Confidence interval for the population proportion, large samples: n > 30

$CI = p \pm z_{\alpha/2} \sqrt{\frac{p(1 - p)}{n}}$

where:

p = sample proportion
$z_{\alpha/2}$ = value from the z-distribution
n = sample size

Example

A survey of 500 people finds that 320 people prefer a particular brand. Compute the 95% confidence interval for the population proportion.

Step 1: sample proportion

$p = \frac{320}{500} = 0.64$

Step 2: standard error

$SE = \sqrt{\frac{0.64 \times 0.36}{500}} = \sqrt{0.0004608} = 0.0215$

Step 3: $z_{\alpha/2} = 1.96$

Step 4: margin of error

$1.96 \times 0.0215 = 0.0421$

Step 5: confidence interval

$CI = 0.64 \pm 0.0421 = [0.5979; 0.6821]$

∴ We are 95% confident that the true proportion of people who prefer the brand is between 59.79% and 68.21%.
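Both z-based confidence intervals follow the same pattern: estimate ± z × standard error. The sketch below wraps that pattern in a hypothetical helper, `z_interval`, and applies it to the mean example (50, σ = 10, n = 100) and the survey example (320 out of 500); the z-value comes from `NormalDist.inv_cdf` rather than a table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(estimate, std_error, confidence=0.95):
    """Two-sided confidence interval: estimate +/- z * standard error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * std_error
    return estimate - margin, estimate + margin

# Population mean: sample mean 50, sigma 10, n 100.
mean_ci = z_interval(50, 10 / sqrt(100))

# Population proportion: 320 of 500 prefer the brand.
p_hat = 320 / 500
prop_ci = z_interval(p_hat, sqrt(p_hat * (1 - p_hat) / 500))

print([round(v, 2) for v in mean_ci])
print([round(v, 4) for v in prop_ci])
```

Changing `confidence` to 0.90 or 0.99 changes only the z-value, which is why wider intervals go with higher confidence levels.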
Small samples

n < 30 and $\sigma$ is unknown: use the t-distribution.

Formula:

$CI = \bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$

$t_{\alpha/2}$ = value from the t-distribution (df = n - 1)
s = sample standard deviation

Example:

A random sample of 15 students' test scores has a mean of 78 and a standard deviation of 10. Calculate the 95% confidence interval for the population mean.

S1: $\bar{x} = 78$

S2: $SE = \frac{10}{\sqrt{15}} = 2.582$

S3: $t_{\alpha/2} = 2.145$ (df = 14)

S4: $ME = 2.145 \times 2.582 = 5.54$

S5: $CI = 78 \pm 5.54 = [72.46; 83.54]$

∴ We are 95% confident that the true mean test score for the population is between 72.46 and 83.54.

Key points:

· For large sample proportions, use the z-distribution and normal approximation.

· For small sample means, use the t-distribution.

· The confidence interval gives a range within which the true population parameter is likely to fall, with a given level of confidence.
Hypothesis testing

· Hypothesis = a claim or statement about a population parameter.

· Hypothesis testing = a standard procedure to test a claim about a population parameter and make a decision regarding the validity of the claim.

· A manufacturer of car tyres claims that his annual mean turnover exceeds R1 million. (Test a single population mean.)

· Panasonic claims that one out of three households is in possession of a Panasonic product. (Test a single population proportion.)
Steps of hypothesis testing

Step 1: State the hypotheses

· Null hypothesis ($H_0$):
- Represents the status quo or no effect.
- E.g. "The mean salary of employees is R50 000": $H_0: \mu = 50000$

· Alternative hypothesis ($H_1$ or $H_a$):
- Represents the research claim or effect you want to test.
- E.g. "The mean salary of employees is not R50 000": $H_a: \mu \neq 50000$

Left-sided (less than): $H_0: \mu = 5$, $H_a: \mu < 5$

Right-sided (more than): $H_0: \mu = 5$, $H_a: \mu > 5$

Two-sided (not equal): $H_0: \mu = 5$, $H_a: \mu \neq 5$

Step 2: Choose the significance level ($\alpha$)

· The significance level ($\alpha$) is the probability of rejecting the null hypothesis when it is actually true (Type I error).

Common values of $\alpha$ are:

· $\alpha = 0.05$: 5% chance of rejecting $H_0$ when it's true.

· $\alpha = 0.01$: 1% chance of rejecting $H_0$ when it's true.

· The lower the significance level, the stronger the evidence required to reject $H_0$.

Step 3: Collect data and compute the test statistic

· For a small sample (n < 30), use a t-test.

· For a large sample ($n \geq 30$) or known population variance, use a z-test.

· Z-test formula (for large samples): $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$

· T-test formula (for small samples): $T = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$

Step 4: Determine the P-value or critical value

· Compare the test statistic to a critical value, or the P-value to the significance level.

· P-value approach: reject $H_0$ if the P-value is less than $\alpha$.

· Critical value approach: reject $H_0$ if the test statistic exceeds the critical value.

Step 5: Make a decision

· Reject $H_0$: if the evidence is strong enough (P-value < $\alpha$ or test statistic exceeds critical value).

· Fail to reject $H_0$: if the evidence is not strong enough.

Example 10.1

A manufacturer of a new cell phone battery claims that the mean lifetime of the battery is 30 hours. An employee claims that the mean is less than 30 hours. A random sample of 81 batteries renders a sample mean of 28.7 hours with a sample standard deviation of 8 hours. Test the hypothesis at the 5% significance level.

① $\mu_0 = 30$, n = 81, $\bar{x} = 28.7$, s = 8

$H_0: \mu = 30$
$H_a: \mu < 30$

② $\alpha = 0.05$

③ $z = \frac{28.7 - 30}{8 / \sqrt{81}} = -1.463$

④ $p = P(z < -1.46) = 0.0721$

⑤ $p > \alpha$

∴ Do not reject $H_0$

Critical value approach: z-value = -1.46, critical value = -1.645. Since z-value > critical value, the test statistic does not fall in the rejection region.

∴ There is not sufficient evidence that the population mean lifetime is less than 30 hours. The employee's statement is not supported.
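The five steps of the battery example can be traced in code. This is a sketch of a left-sided one-sample z-test with the sample standard deviation standing in for σ, as in the example; `one_sample_z` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(x_bar, mu0, s, n):
    """Test statistic for a single population mean (large sample)."""
    return (x_bar - mu0) / (s / sqrt(n))

alpha = 0.05
z = one_sample_z(x_bar=28.7, mu0=30, s=8, n=81)

# Left-sided test: the P-value is the area to the left of z.
p_value = NormalDist().cdf(z)
critical_value = NormalDist().inv_cdf(alpha)

reject_h0 = p_value < alpha
print(round(z, 3), round(p_value, 4), round(critical_value, 3), reject_h0)
```

The P-value approach and the critical value approach always agree: here the P-value is above α and z lies to the right of the critical value, so the null hypothesis is not rejected either way.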

Example 10.2

showed that the mean number of tries per match was 6.2 with a standard deviation of 3.2. Does the sample contain sufficient information to contradict the rugby enthusiast's statement? Assume $\alpha = 0.1$. Assume that the population is normally distributed.

① $\mu_0 = 5$, n = 25, $\bar{x} = 6.2$, s = 3.2, $\alpha = 0.1$

$H_0: \mu = 5$
$H_a: \mu \neq 5$

② $T = \frac{6.2 - 5}{3.2 / \sqrt{25}} = 1.875$

③ $p = 2P(T > 1.875) = 2(1 - P(T < 1.875))$ with df = 24

$p \approx 2(1 - 0.963) = 0.074$

④ $p < \alpha$

∴ Reject $H_0$
Example 10.3

A sample of size 6 is available to test if the mean mass of horses in a large herd exceeds 450 kg. The significance level is 5%. The sample values are as follows: 449; 521; 452; 459; 538; 588.

$\mu_0 = 450$, n = 6

$\bar{x} = \frac{\sum x_i}{n} = 501.167$

s = 56.933

$H_0: \mu = 450$ kg
$H_a: \mu > 450$

$\alpha = 0.05$

$T = \frac{501.167 - 450}{56.933 / \sqrt{6}} = 2.201$

$p = P(T > 2.201) = 1 - 0.960 = 0.04$ (df = 5)

$p < \alpha$

∴ Reject $H_0$

The mean mass of horses in the large herd exceeds 450 kg.
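The summary statistics and the test statistic for the horse example can be recomputed from the raw data. The sketch assumes the first observation is 449, the value consistent with the stated mean of 501.167 and standard deviation of 56.933; the P-value itself still has to come from a t-table or a statistics package.

```python
from math import sqrt
from statistics import mean, stdev

masses = [449, 521, 452, 459, 538, 588]  # kg; first value assumed to be 449
mu0 = 450

n = len(masses)
x_bar = mean(masses)
s = stdev(masses)          # sample standard deviation (divides by n - 1)

# One-sample t statistic with n - 1 degrees of freedom.
t_stat = (x_bar - mu0) / (s / sqrt(n))
df = n - 1

print(round(x_bar, 3), round(s, 3), round(t_stat, 3), df)
```

`statistics.stdev` uses the n - 1 denominator, matching the sample variance formula used for point estimation.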
Example 10.4

A manufacturer claims that 5% of goods manufactured are defective. Out of 200 items randomly chosen from the supply store, 14 items were defective. Is there sufficient evidence to show that more than 5% of items are defective? Assume that $\alpha = 0.1$.

① $\pi_0 = 0.05$, n = 200, $p = \frac{14}{200} = 0.07$

$H_0: \pi = 0.05$
$H_a: \pi > 0.05$

② $z = \frac{p - \pi_0}{\sqrt{\frac{\pi_0 (1 - \pi_0)}{n}}} = \frac{0.07 - 0.05}{\sqrt{\frac{0.05(1 - 0.05)}{200}}} = 1.298 \approx 1.30$

③ $\alpha = 0.1$

④ $p\text{-value} = P(z > 1.30) = 1 - P(z < 1.30) = 1 - 0.9032 = 0.0968$

⑤ p-value < $\alpha$

∴ Reject $H_0$

More than 5% of the manufactured items are defective.
Example 10.5

A manufacturer claims that his market share is 60%. However, a random sample of 500 clients shows that only 275 support his product. Test the manufacturer's statement at the 1% level of significance.

① $\pi_0 = 0.6$, n = 500, x = 275, $p = \frac{275}{500} = 0.55$

$H_0: \pi = 0.6$
$H_a: \pi \neq 0.6$

② $z = \frac{p - \pi_0}{\sqrt{\frac{\pi_0 (1 - \pi_0)}{n}}} = \frac{0.55 - 0.6}{\sqrt{\frac{0.6(1 - 0.6)}{500}}} = -2.282$

③ $\alpha = 0.01$

④ $p\text{-value} = 2P(z > 2.28) = 2(1 - P(z < 2.28)) = 2 - 2(0.9887) = 0.0226$

⑤ p-value > $\alpha$

∴ Do not reject $H_0$

There is not sufficient evidence that the manufacturer's market share is different from 60%.
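Examples 10.4 and 10.5 use the same test statistic for a single proportion and differ only in the direction of the alternative hypothesis. The sketch below shows both; `one_proportion_z` is a hypothetical helper name, and the P-values are computed from the unrounded z, so they differ slightly from the table-based answers.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(successes, n, pi0):
    """z statistic for testing a single population proportion against pi0."""
    p_hat = successes / n
    return (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)

std_normal = NormalDist()

# Example 10.4: right-sided test, 14 defective out of 200, pi0 = 0.05.
z_defective = one_proportion_z(14, 200, 0.05)
p_defective = 1 - std_normal.cdf(z_defective)

# Example 10.5: two-sided test, 275 supporters out of 500, pi0 = 0.60.
z_share = one_proportion_z(275, 500, 0.60)
p_share = 2 * (1 - std_normal.cdf(abs(z_share)))

print(round(z_defective, 3), round(p_defective, 4), p_defective < 0.10)
print(round(z_share, 3), round(p_share, 4), p_share < 0.01)
```

Note that the standard error uses the hypothesized proportion π0, not the sample proportion, because the test statistic is computed assuming the null hypothesis is true.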
The difference between two population means

· Alternative hypotheses:

left-sided: $H_a: \mu_1 - \mu_2 < D$, or $H_a: \mu_1 < \mu_2$

right-sided: $H_a: \mu_1 - \mu_2 > D$, or $H_a: \mu_1 > \mu_2$

Hypothesis testing for the difference between two population means

Large samples (both $n_1$ and $n_2$ > 30)

$z = \frac{(\bar{x}_1 - \bar{x}_2) - D}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Example 11.1

A study is done to estimate the difference in salaries of teachers in private and government schools in a specific education community. A random sample of 100 teachers at private schools showed an annual mean salary of R105 000 with a standard deviation of R15 200. A random sample of 200 teachers at government schools realised an annual mean salary of R88 000 with a standard deviation of R13 500. Test the hypothesis that the mean salary of teachers at private schools is higher than that of teachers at government schools. Test at a 5% significance level.

$n_1 = 100$, $n_2 = 200$
$\bar{x}_1 = 105\,000$, $\bar{x}_2 = 88\,000$
$s_1 = 15\,200$, $s_2 = 13\,500$

① $H_0: \mu_1 = \mu_2$
$H_a: \mu_1 > \mu_2$

② $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{105\,000 - 88\,000}{\sqrt{\dfrac{15\,200^2}{100} + \dfrac{13\,500^2}{200}}} = 9.471$

③ $\alpha = 0.05$

④ $p = P(z > 9.47) = 1 - P(z < 9.47) < 1 - 0.9998$, so $p < 0.001$

⑤ $p < \alpha$ ∴ Reject $H_0$
The mean salary of teachers at private schools is higher than that of teachers at government schools.
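The large-sample test in Example 11.1 can be reproduced numerically. The following is a minimal sketch with the standard library; the function name `two_sample_z` is an illustrative choice, and the p-value is computed from `NormalDist` rather than read from a table.

```python
import math
from statistics import NormalDist

def two_sample_z(x1, s1, n1, x2, s2, n2):
    """Large-sample z statistic for the difference between two means."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (x1 - x2) / standard_error

# Example 11.1: sample 1 = private schools, sample 2 = government schools
z = two_sample_z(105_000, 15_200, 100, 88_000, 13_500, 200)

# Right-sided alternative Ha: mu1 > mu2
p_value = 1 - NormalDist().cdf(z)
alpha = 0.05

print(f"z = {z:.3f}, p-value = {p_value:.6f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

Because z is so far in the tail, the computed p-value is effectively zero, which agrees with the table-based bound $p < 0.001$.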


Example 11.2

A transport company in Helsinki, Finland, wishes to know if the mean travel time to work differs between commuters travelling by bus or by train. A survey was conducted among bus and train commuters, which resulted in the following statistics:

Test the hypothesis that travel time for train commuters is shorter than travel time for commuters travelling by bus ($\alpha = 0.1$).

① $H_0: \mu_1 = \mu_2$
$H_a: \mu_1 < \mu_2$ (1 = train, 2 = bus)

② $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = -2.40$

③ $\alpha = 0.1$

④ p-value:
$p = P(z < -2.40) = 1 - P(z < 2.40) = 0.0082$

⑤ $p < \alpha$ ∴ Reject $H_0$

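Small samples: either $n_1$ or $n_2 < 30$

$T = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$

where

$s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$ and $df = n_1 + n_2 - 2$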
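Example 11.3

$\bar{x}_1 = 68$, $\bar{x}_2 = 60$
$s_1 = 7.8$, $s_2 = 8.4$
$n_1 = 11$, $n_2 = 14$

① $H_0: \mu_1 = \mu_2$
$H_a: \mu_1 \neq \mu_2$

② $T = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$

where

$s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\dfrac{(11 - 1)(7.8)^2 + (14 - 1)(8.4)^2}{11 + 14 - 2}} = 8.145$

$T = \dfrac{68 - 60}{8.145\sqrt{\dfrac{1}{11} + \dfrac{1}{14}}} = 2.438$

③ $\alpha = 0.01$

④ p-value, with $df = 23$:
$p = 2P(T > 2.438) = 2(1 - P(T < 2.438))$
Interpolating in the t-table between $T_1 = 2.313$ ($P_1 = 0.985$) and $T_2 = 2.500$ ($P_2 = 0.99$):
$P(T < 2.438) = 0.985 + \dfrac{0.99 - 0.985}{2.500 - 2.313}(2.438 - 2.313) = 0.988$
$p = 2 - 2(0.988) = 0.024$

⑤ $p > \alpha$ ∴ Do not reject $H_0$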
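The pooled standard deviation, the test statistic and the table interpolation of Example 11.3 can be sketched as follows. The helper names `pooled_t` and `interpolate` are illustrative. The inputs assume $s_2 = 8.4$ and the table entries $P(T < 2.313) = 0.985$ and $P(T < 2.500) = 0.99$ for 23 degrees of freedom, as read from the worked solution; treat them as assumptions if your table differs.

```python
import math

def pooled_t(x1, s1, n1, x2, s2, n2):
    """Return (s_p, T, df) for the small-sample pooled-variance t test."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    t = (x1 - x2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return sp, t, df

def interpolate(t, t_lo, p_lo, t_hi, p_hi):
    """Linear interpolation of a cumulative probability between two table entries."""
    return p_lo + (p_hi - p_lo) * (t - t_lo) / (t_hi - t_lo)

sp, t_stat, df = pooled_t(68, 7.8, 11, 60, 8.4, 14)

# Assumed t-table entries for df = 23 that bracket the test statistic
cdf_at_t = interpolate(t_stat, 2.313, 0.985, 2.500, 0.99)
p_value = 2 * (1 - cdf_at_t)   # two-sided alternative
alpha = 0.01

print(f"s_p = {sp:.3f}, T = {t_stat:.3f}, df = {df}, p-value = {p_value:.3f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

The script keeps more decimals than the hand calculation, so its p-value is a little below the 0.024 obtained from the rounded 0.988; the decision is unchanged.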
Example 11.4

$\bar{x}_1 = \dfrac{\sum x_1}{n_1} = 3.885$, $s_1 = 1.628$

$\bar{x}_2 = \dfrac{\sum x_2}{n_2} = 7.76$

$s_2 = \sqrt{\dfrac{703.529 - (11)(7.76)^2}{11 - 1}} = 2.028$

① $H_0: \mu_1 = \mu_2$
$H_a: \mu_1 < \mu_2$
$n_1 = 11$, $n_2 = 11$
$df = 22 - 2 = 20$

② $s_p = \sqrt{\dfrac{(11 - 1)(1.628)^2 + (11 - 1)(2.028)^2}{11 + 11 - 2}} = 1.839$

$T = \dfrac{3.885 - 7.76}{1.839\sqrt{\dfrac{1}{11} + \dfrac{1}{11}}} = -4.942$

④ $p = P(T < -4.942) = 1 - P(T < 4.942) < 1 - 0.995$, so $p < 0.005$

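The difference between two population proportions

· Alternative hypotheses:
left-sided: $H_a: \pi_1 - \pi_2 < 0 \Leftrightarrow H_a: \pi_1 < \pi_2$
right-sided: $H_a: \pi_1 - \pi_2 > 0 \Leftrightarrow H_a: \pi_1 > \pi_2$
two-sided: $H_a: \pi_1 - \pi_2 \neq 0 \Leftrightarrow H_a: \pi_1 \neq \pi_2$

Test statistic for population proportions

$z = \dfrac{p_1 - p_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$

where $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$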

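Example 11.5

$n_1 = 175$, $n_2 = 150$
$x_1 = 60$, $x_2 = 40$
$p_1 = \dfrac{60}{175} = 0.343$, $p_2 = \dfrac{40}{150} = 0.267$

① $H_0: \pi_1 = \pi_2$
$H_a: \pi_1 > \pi_2$

② $\hat{p} = \dfrac{60 + 40}{175 + 150} = 0.308$

$z = \dfrac{0.343 - 0.267}{\sqrt{0.308(1 - 0.308)\left(\dfrac{1}{175} + \dfrac{1}{150}\right)}} = 1.479$

③ $\alpha = 0.1$

④ p-value $= P(z > 1.479) = 1 - P(z < 1.479) = 1 - 0.9306 = 0.0694$

⑤ p-value $< \alpha$ ∴ Reject $H_0$
cornbix is more popular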
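The pooled two-proportion test of Example 11.5 can be sketched in the same way. The function name `two_proportion_z` is an illustrative choice. The script works with unrounded proportions, so it gives z of about 1.484 rather than the 1.479 obtained by hand from the rounded values 0.343, 0.267 and 0.308; the conclusion at $\alpha = 0.1$ is the same.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Return (z, pooled proportion) for H0: pi1 = pi2."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)   # combined estimate under H0
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error, pooled

z, pooled = two_proportion_z(60, 175, 40, 150)

# Right-sided alternative Ha: pi1 > pi2
p_value = 1 - NormalDist().cdf(z)
alpha = 0.1

print(f"pooled p = {pooled:.3f}, z = {z:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```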

Common questions


A chi-squared distribution is commonly used in tests for independence in contingency tables and in goodness-of-fit tests, as well as in hypothesis tests about a population variance. It is right-skewed, particularly for small degrees of freedom, and is defined by its degrees of freedom, which depend on the problem context, such as (r - 1)(c - 1) for contingency tables. Chi-squared variables take non-negative values only, and the distribution becomes more symmetric as the degrees of freedom increase.

The F-distribution is used primarily in variance comparisons, such as in Analysis of Variance (ANOVA), where the ratio of between-group to within-group variance is used to test for differences among group means. To use an F-table, first determine the degrees of freedom for the numerator (number of groups minus one) and the denominator (total observations minus number of groups). Next, choose a significance level, typically 0.05 or 0.01, and find the corresponding critical F-value in the F-table. Compare the observed F-value to this critical value to determine if the result is statistically significant.

The t-distribution is used instead of the normal distribution when the sample size is small (n < 30) or when the population standard deviation is unknown and the sample standard deviation is used in its place. Its heavier tails, compared to the normal distribution, compensate for the extra uncertainty in smaller samples by giving more conservative probability estimates. It approaches the normal distribution as the sample size increases.

The Central Limit Theorem (CLT) is a fundamental principle in statistics. It states that, regardless of the population's distribution, the sampling distribution of the sample mean approaches a normal distribution as the sample size increases; a common rule of thumb is n ≥ 30. This theorem is crucial because it enables statisticians to make inferences about population parameters using sample statistics, even if the original population distribution is skewed or non-normal. The CLT allows probabilities involving sample means to be calculated through z-scores, in the same way as for individual normally distributed observations.

Tree diagrams help solve probability problems by visually representing all possible outcomes of an experiment and their associated probabilities. They are particularly useful for sequential events where outcomes of earlier events affect later ones. For example, when drawing two marbles from a bag containing 3 orange and 7 blue marbles, a tree diagram can illustrate scenarios such as drawing one marble, replacing it, then drawing another, showing possibilities like blue-blue, orange-orange, etc., aiding the calculation of probabilities for these events.
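The marble example can be enumerated as the branches of a two-level tree. This is a small sketch for the with-replacement case only, using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# One draw from a bag with 3 orange and 7 blue marbles
single_draw = {"orange": Fraction(3, 10), "blue": Fraction(7, 10)}

# With replacement the two draws are independent, so each branch probability
# is the product of the probabilities along the path.
paths = {
    (first, second): single_draw[first] * single_draw[second]
    for first, second in product(single_draw, repeat=2)
}

for path, probability in paths.items():
    print(path, probability)

# Probability of one marble of each colour, in either order
one_of_each = paths[("orange", "blue")] + paths[("blue", "orange")]
print("one of each colour:", one_of_each)
```

The four branch probabilities add to 1, which is a quick check that the tree is complete.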

A normal distribution is crucial in statistical inference due to its properties: it is symmetric, bell-shaped, and fully described by its mean and standard deviation. This distribution serves as a foundational reference for inferential statistics because many statistical tests assume normality for data distribution. Combined with the Central Limit Theorem, which states that the distribution of sample means approaches normality as sample size grows, it enables the use of z-scores and other parametric tests to make valid inferences about population parameters even when the original data is not normally distributed.

The standard error of a sample mean measures the variability of sample means around the population mean, calculated as the sample standard deviation divided by the square root of the sample size. It is crucial in statistical analysis because it indicates how accurately a sample mean estimates the population mean. A smaller standard error suggests that the sample mean is a more precise estimate of the population mean, thus enhancing the reliability of statistical inferences drawn from the sample data.

Subjective probability involves the use of personal judgment, opinions, or intuition to estimate the likelihood of an event occurring and cannot be verified statistically, making it reliant on expert experience, such as predicting weather conditions. In contrast, objective probability uses the relative frequency of events over a long series of trials to quantify the likelihood of events, allowing it to be statistically verified. For instance, the probability of rolling a certain number on a die is objective as it is based on stable, long-term frequencies.

Permutations refer to the number of ways to arrange a certain number of objects, where the order matters, calculated as n!/(n-r)! for choosing r items from n. They are used when sequence is crucial, such as arranging books on a shelf. Combinations, however, describe the number of ways to select objects from a set when order does not matter, calculated using n!/[r!(n-r)!], such as picking a committee from a group regardless of positioning. Both concepts are fundamental in probability theory to determine event likelihoods under varying conditions of order importance.
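Both counting formulas are available in Python's standard library (version 3.8 or later). As a small illustration, the numbers n = 5 and r = 3 below are made up for the example and do not come from the notes.

```python
import math

n, r = 5, 3  # illustrative values: choose 3 items out of 5

# Order matters: n! / (n - r)!
permutations = math.perm(n, r)

# Order does not matter: n! / (r! (n - r)!)
combinations = math.comb(n, r)

# The same results from the factorial formulas directly
perm_by_formula = math.factorial(n) // math.factorial(n - r)
comb_by_formula = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(permutations, combinations)
```

Each combination of 3 items can be ordered in 3! = 6 ways, which is why the permutation count is six times the combination count.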

Venn diagrams visually represent the relationships between sets including the universal set, subsets, and various set operations such as union and intersection. They help illustrate concepts like disjoint sets, where two sets A and B have no common elements (their intersection is empty). Venn diagrams also aid in understanding the complement of a set, which includes all elements not in the set, and the mutual exclusivity of events, crucial for understanding probability calculations.
