
Estimating Shot Success Probabilities

This document introduces several standard discrete probability distributions: - The Bernoulli distribution describes experiments with two possible outcomes (success/failure) with probability p of success. The binomial distribution describes the sum of n independent Bernoulli trials. - The multinomial distribution generalizes the binomial to experiments with more than two outcomes. - The hypergeometric distribution describes sampling without replacement from a finite population. It provides examples of medical and psychic experiments that can be modeled by these distributions. The goals are to understand the assumptions and properties of these distributions like their means, variances, and probability mass functions.

Uploaded by

Carbideman

UNIT 8 STANDARD PROBABILITY DISTRIBUTIONS : PART I
Structure
8.1 Introduction
Objectives
8.2 The Bernoulli Distribution
8.3 The Binomial Distribution
8.4 The Multinomial Distribution
8.5 The Hypergeometric Distribution
8.6 Summary
8.7 Solutions and Answers

8.1 INTRODUCTION

In this unit and the next, we describe some frequently encountered discrete probability
distributions. But what do we mean by a probability distribution? To answer this question
we consider the two different situations below. Each involves repeated trials of a random
experiment. At the end of these trials we have to take an appropriate decision based on the
results of the experiment.
1) Suppose 25 patients of high blood pressure are given a drug and their blood
pressures are measured before and after the administration of the drug. Let
$u_1, \ldots, u_{25}$ and $u'_1, \ldots, u'_{25}$ denote the measurements on the corresponding patients before
and after the drug is administered. Medical experts say that the drug is a success for
the j-th patient if $u_j > u'_j$ and a failure if $u_j \le u'_j$. Is it possible to decide whether the
drug is effective for patients of high blood pressure on the basis of the measurements
on the 25 patients included in the experiment?
2) A lady claims that she has psychic powers and that she can identify the cards drawn
from a pack of 52 playing cards by any person sitting in another room. The lady is
able to correctly identify 3 of the ten cards drawn with replacement from a well-
shuffled pack of cards. Is her claim of psychic powers justified?
These two real life situations have one thing in common. Both involve repetitions of a
random experiment a specified number of times. We have 25 repetitions in the first
situation, and 10 in the second. Each trial results either in a "success" or in a "failure". Thus,
success stands for $u_j > u'_j$ in the first case and a correct guess by the lady in the second case.
The questions we asked in each case can be answered on the basis of the total number X of
successes. This number X is a random variable. In Block 4 you will study the techniques
which would enable you to answer the questions raised in the above illustrations. However,
these techniques require that the probability distribution of the total number of successes
be known.
In order to obtain this knowledge, statisticians make certain assumptions. For example, in
the situations described above, we may assume that
i) the successive trials are independent, i.e. the outcomes of different trials are treated as
independent events, and
ii) the probability of success at each trial is the same.
These assumptions would enable us to obtain the distribution of the total number of
successes in n trials. We would like to emphasise here that (i) and (ii) are assumptions made
by the statisticians, and there is no reason why the natural phenomena described above
should follow these assumptions.
Nevertheless, assumptions of the above type provide us an initial solution to the problems
we face.
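
As a rough illustration of what assumptions (i) and (ii) make possible, the sketch below anticipates the binomial formula of Sec. 8.3 and computes the chance that the lady in situation 2 identifies at least 3 of the 10 cards by pure guessing. The guess probability 1/52 (naming the exact card) is an assumption made for this illustration; it is not stated in the text. The helper name binom_pmf is likewise only illustrative.

```python
from math import comb

def binom_pmf(j, n, p):
    """P[X = j] when X counts successes in n independent trials, each with success probability p."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)

n = 10          # cards drawn with replacement
p = 1 / 52      # assumed chance of naming the exact card by guessing

# P[X >= 3] = 1 - P[X = 0] - P[X = 1] - P[X = 2]
p_at_least_3 = 1 - sum(binom_pmf(j, n, p) for j in range(3))
print(f"P[X >= 3] under pure guessing = {p_at_least_3:.6f}")
```

Under these assumptions the probability is well below one in a thousand, which is the kind of figure the techniques of Block 4 would then weigh against the lady's claim.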

The assumptions like those made above and the resulting probability distribution of the total
number of successes are said to constitute a probabilistic or a stochastic model for our
random experiment. We begin with the simplest of such models. Here we would like to
point out that the models we are about to describe have been found useful in a wide variety
of situations and in different disciplines like medicine, agriculture, biology, industrial
engineering, psychiatry etc.

The discrete probability distributions which we shall be discussing in this unit and in Unit 9,
are called standard discrete distributions, because of their wide applicability and simplicity.
In this unit we'll take up the study of the Bernoulli, the binomial, the multinomial and the
hypergeometric distributions. Make sure that you achieve the following objectives by the
end of this unit.

Objectives
After reading this unit you should be able to:
state the assumptions underlying the binomial, multinomial and the hypergeometric
distributions,
compute their means and variances,
obtain the distribution of the sum of two independent binomial variates,
compute probabilities of events associated with these standard probability distributions.

8.2 THE BERNOULLI DISTRIBUTION

We begin with the simplest probability distribution, which is the distribution of a r.v. X
which assumes two values, 0 and 1. Let
\[ P[X = 0] = 1 - p \quad \text{and} \quad P[X = 1] = p, \]
or
\[ P[X = x] = p^x (1 - p)^{1 - x}, \quad x = 0, 1, \tag{1} \]
where p is a number such that $0 \le p \le 1$. What happens when p = 0 or p = 1? When p = 0,
P[X = 0] = 1, i.e. X is degenerate at zero, and when p = 1, P[X = 1] = 1, i.e. X is degenerate at
one. We shall usually ignore these cases.

Notice that the probability distribution of the r.v. X changes with p. Thus, in fact, (1) defines
a family or a class of probability distributions of the same kind. Every member of this family
is uniquely determined by the value of p and to every value of p in the interval [0, 1], there
is a unique probability distribution specified by (1). It is for this reason that p is called the
parameter of the distribution of the r.v. X.
The r.v. X and its probability distribution specified by the p.m.f. (1) are, respectively, called
the Bernoulli variate and the Bernoulli distribution in honour of Jacob Bernoulli
(1654-1705). He made a systematic study of problems connected with this distribution.
[Figure: Jacob Bernoulli (1654-1705)]
Can you think of an example of a Bernoulli variate? What about the toss of an unbiased
coin? Here p = 1/2. If, however, the coin is not a balanced one, p can be any value in [0, 1].
In most practical situations we would not know the value of p. Therefore, it is best to study
the properties of the Bernoulli distribution for a general p. In fact, we shall adopt this
approach in the study of all the standard discrete distributions in Units 8 and 9. We shall
study their properties in terms of their general parameter or parameters, without specifying
their numerical values.

If X has the Bernoulli distribution given by (1), then
\[ E(X) = 0 \cdot (1 - p) + 1 \cdot p = p \]
and
\[ \operatorname{Var}(X) = 0^2 \cdot (1 - p) + 1^2 \cdot p - p^2 = p(1 - p). \]
Notice that $\operatorname{Var}(X) \le E(X)$.
Its moment generating function (m.g.f.) is
\[ M_X(t, p) = E[e^{tX}] = (1 - p) + p e^t, \]
which is valid for all real t and for $0 \le p \le 1$. We have deliberately introduced p in the
symbol $M_X(t, p)$ for the m.g.f. of X to emphasise its dependence on the parameter p of the
distribution.
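
The mean, variance and m.g.f. of a Bernoulli variate can be checked directly from the p.m.f. (1). The following is a minimal sketch; the value p = 0.3 and the function names are chosen only for illustration.

```python
from math import exp

def bernoulli_pmf(x, p):
    """P[X = x] = p^x (1 - p)^(1 - x) for x in {0, 1}."""
    return p**x * (1 - p) ** (1 - x)

def mgf_closed_form(t, p):
    """M_X(t, p) = (1 - p) + p e^t."""
    return (1 - p) + p * exp(t)

def mgf_from_definition(t, p):
    """E[e^{tX}] computed as a sum over the two values of X."""
    return sum(exp(t * x) * bernoulli_pmf(x, p) for x in (0, 1))

p = 0.3  # illustrative value of the parameter
mean = sum(x * bernoulli_pmf(x, p) for x in (0, 1))
variance = sum(x**2 * bernoulli_pmf(x, p) for x in (0, 1)) - mean**2

print("E(X)   =", mean)       # agrees with p
print("Var(X) =", variance)   # agrees with p(1 - p)
print("m.g.f. at t = 0.7:", mgf_closed_form(0.7, p), mgf_from_definition(0.7, p))
```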
The Bernoulli distribution is useful whenever the random experiment has only two possible
outcomes, which may be labelled as success and failure. In the two situations discussed in
the Introduction, we could identify success and failure. But in both these situations we were
interested in a specified number of repetitions of the experiment. In other words, in each
case, we were interested in the distribution of the sum of independent Bernoulli variates
with the same value p of the parameter. We discuss this in the next section.

8.3 THE BINOMIAL DISTRIBUTION

In this section we are going to talk about the distribution of the sum of independent
Bernoulli variables. You will see, in Theorem 1, that such a sum has a binomial distribution.
But what is a binomial distribution?
We begin with the following definition.
Definition 1: We say that a random variable X has a binomial distribution with
parameters (n, p) if its p.m.f. is given by
\[ b(j; n, p) = P[X = j] = \binom{n}{j} p^j (1 - p)^{n - j}, \quad j = 0, 1, \ldots, n, \tag{2} \]
where n is a positive integer and 0 < p < 1.

Why is it called a binomial distribution? It's because b(j; n, p) is the (j + 1)th term in the
binomial expansion of $[(1 - p) + p]^n$. This observation also leads to the conclusion that
\[ \sum_{j=0}^{n} b(j; n, p) = \sum_{j=0}^{n} \binom{n}{j} p^j (1 - p)^{n - j} = (1 - p + p)^n = 1, \]
which is as it should be.
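
A quick numerical check that the binomial probabilities add up to one is sketched below. Exact rational arithmetic is used so that the sum is exactly 1; the values n = 5 and p = 1/3 are arbitrary choices for illustration.

```python
from fractions import Fraction
from math import comb

def b(j, n, p):
    """Binomial p.m.f. b(j; n, p) = C(n, j) p^j (1 - p)^(n - j)."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)

n, p = 5, Fraction(1, 3)          # illustrative parameters
pmf = [b(j, n, p) for j in range(n + 1)]
total = sum(pmf)

for j, prob in enumerate(pmf):
    print(f"b({j}; {n}, {p}) = {prob}")
print("sum =", total)
```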
We can interpret the binomial distribution as the distribution of the total number of successes in
n independent trials, each with the same probability p of success. (We have discussed independent
trials in Sec. 6.6.) With this interpretation
you will see that there are many situations in which this distribution can be applied. We
have given some such situations in the examples a little later. Now, suppose $X_1, X_2, \ldots, X_n$
are independent Bernoulli r.v.s with the same p.m.f.,
\[ P[X_j = 0] = 1 - p, \quad P[X_j = 1] = p, \quad j = 1, \ldots, n. \]
We may identify a success at the j-th trial with the event $[X_j = 1]$ and a failure at the j-th trial
with the event $[X_j = 0]$. Then $X = X_1 + \cdots + X_n$ is the total number of successes in n trials.
To understand this, let us consider the coin-tossing experiment again. Suppose we toss an
unbiased coin 5 times, i.e. n = 5. Now, the result of each toss could be either H or T.
Suppose we call the result H a success. Then H at the j-th toss is equivalent to $X_j = 1$ and T at
the j-th toss is equivalent to $X_j = 0$, j = 1, 2, ..., 5. So, if $X = X_1 + X_2 + \cdots + X_5$, and if we get
H in the first, second and the fifth toss and T in the rest, then X takes the value
1 + 1 + 0 + 0 + 1 = 3, which is the number of Hs, or successes, in the 5 tosses.
Now let us obtain P[X = j], j = 0, 1, ..., n.
Notice that the sum $X_1 + \cdots + X_n$ equals j iff j of the $X_i$'s are equal to 1 and the remaining
(n - j) are all equal to zero. The probability that a specific set of j $X_i$'s equal one and the
remaining $X_i$'s equal zero is $p^j (1 - p)^{n - j}$. This is so because there is one factor p for each
$X_i = 1$ and one factor (1 - p) for each $X_i$ which is zero. The j factors p and (n - j) factors
(1 - p) get multiplied because of independence. However, there are $\binom{n}{j}$
ways of choosing the j $X_i$'s which equal one, the rest, n - j, of the $X_i$'s being zero.
Hence by the finite additivity property (P7 of Sec. 6.2.2),
\[ P[X = j] = \binom{n}{j} p^j (1 - p)^{n - j}, \quad j = 0, 1, \ldots, n. \]

Hence the distribution of the total number of successes under the above conditions is the
binomial distribution.
We have thus proved the following theorem.
Theorem 1: Let $X_1, \ldots, X_n$ be n independent Bernoulli r.v.s with common p.m.f.
\[ P[X_j = 1] = p, \quad P[X_j = 0] = 1 - p, \quad j = 1, 2, \ldots, n, \]
0 < p < 1. Then $X = X_1 + \cdots + X_n$ has binomial distribution with parameters n and p,
specified by (2).
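
The counting argument behind Theorem 1 can be mirrored by brute force for small n: list every sequence of zeros and ones, multiply the factors p and (1 - p) as independence dictates, and add up the probabilities of the sequences having the same number of ones. The sketch below does this with exact fractions; the function names and the choice n = 5, p = 1/2 (the coin-tossing case) are illustrative only.

```python
from fractions import Fraction
from itertools import product
from math import comb

def distribution_of_sum(n, p):
    """Distribution of X_1 + ... + X_n by enumerating all 2^n outcomes."""
    dist = [Fraction(0)] * (n + 1)
    for outcome in product((0, 1), repeat=n):
        prob = Fraction(1)
        for x in outcome:
            # one factor p for each success, one factor (1 - p) for each failure
            prob *= p if x == 1 else 1 - p
        dist[sum(outcome)] += prob   # finite additivity over disjoint outcomes
    return dist

def b(j, n, p):
    """Binomial p.m.f. from formula (2)."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)

n, p = 5, Fraction(1, 2)
enumerated = distribution_of_sum(n, p)
formula = [b(j, n, p) for j in range(n + 1)]
print("enumeration agrees with formula (2):", enumerated == formula)
print("P[X = 3] =", enumerated[3])
```

Enumeration grows as 2^n, so this is only a check for small n, not a practical way to compute binomial probabilities.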
Given this interpretation of a binomial distribution, let us look at some situations where this
distribution is useful.
Example 1: A machine produces identical units. The proportion of defective units produced
by the machine is known to be 1/20. We also know that successive units are statistically
independent. Let us obtain the probability that in a sample of 10 units, there are at most 2
defectives.
If X denotes the number of defectives in a sample of 10 units, then X has binomial
distribution with n = 10 and p = 1/20. Hence,
\[ P[X \le 2] = \sum_{j=0}^{2} \binom{10}{j} \left(\frac{1}{20}\right)^{j} \left(\frac{19}{20}\right)^{10 - j} \approx 0.9885. \]
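
The probability asked for in Example 1 is a sum of three binomial terms, which can be evaluated numerically. A minimal sketch, with an illustrative helper b for the p.m.f.:

```python
from math import comb

def b(j, n, p):
    """Binomial p.m.f. b(j; n, p)."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)

n, p = 10, 1 / 20                      # sample size and proportion defective
terms = [b(j, n, p) for j in range(3)] # j = 0, 1, 2 defectives
p_at_most_2 = sum(terms)

for j, term in enumerate(terms):
    print(f"P[X = {j}] = {term:.6f}")
print(f"P[X <= 2] = {p_at_most_2:.4f}")
```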

Example 2: The probability that a person recovers from a serious disease is 0.40. Let's find
the probability that at least one of the 8 persons admitted to a hospital will survive.
For this, let us assume that the recovery or otherwise of the 8 patients is independent of each
other. Thus, we want to know $P[X \ge 1]$, when X has binomial distribution with n = 8 and
p = 0.40.
Observe that
\[ P[X \ge 1] = 1 - P[X = 0] = 1 - (0.60)^8 \approx 0.9832. \]
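
The complement trick used in Example 2 can be verified by also adding the terms for j = 1, ..., 8 directly. The sketch below assumes nothing beyond n = 8 and p = 0.40 from the example; the variable names are illustrative.

```python
from math import comb

n, p = 8, 0.40

# Via the complement: P[X >= 1] = 1 - P[X = 0] = 1 - (1 - p)^n
via_complement = 1 - (1 - p) ** n

# Direct sum of b(j; n, p) over j = 1, ..., n
direct_sum = sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(1, n + 1))

print(f"1 - P[X = 0]      = {via_complement:.6f}")
print(f"sum over j >= 1   = {direct_sum:.6f}")
```

The complement needs one term instead of eight, which is why it is the preferred route for "at least one" questions.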

Have you understood how we have solved these examples? See if you can solve some on
your own now.

E1) Ten workers use electric power intermittently. Each worker has the same probability
p = 1/5 of requiring a unit of power. If they work independently, find the probability
that six or more workers require electric power simultaneously. If the supply is
adjusted to five power units, this is the probability that the system would be
overloaded.
E2) How many independent trials each with p = 0.01 must be performed to ensure that the
probability of at least one success is 0.5 or more?
The calculation of probabilities associated with the binomial distribution is often complex.
We now ask you to prove a result which is quite useful in this connection.

E3) Prove that
a) b(j; n, p) = b(n - j; n, 1 - p),

We can use this result to calculate b(j; n, p) recursively, starting with b(0; n, p).
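
The second part of E3 is not legible in this copy, so the sketch below rests on an assumption: it uses the standard ratio identity b(j; n, p) = [(n - j + 1)/j] [p/(1 - p)] b(j - 1; n, p), which follows from formula (2), to build the probabilities recursively from b(0; n, p) = (1 - p)^n. It also checks the symmetry stated in part (a). Parameter values and function names are illustrative.

```python
from fractions import Fraction
from math import comb

def binomial_pmf_recursive(n, p):
    """All of b(0; n, p), ..., b(n; n, p), each obtained from the previous one."""
    q = 1 - p
    probs = [q**n]                                  # b(0; n, p)
    for j in range(1, n + 1):
        # assumed recurrence: b(j) = b(j - 1) * (n - j + 1) / j * p / q
        probs.append(probs[-1] * (n - j + 1) * p / (j * q))
    return probs

def b(j, n, p):
    """Direct evaluation of formula (2), for comparison."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)

n, p = 6, Fraction(1, 4)
recursive = binomial_pmf_recursive(n, p)
direct = [b(j, n, p) for j in range(n + 1)]
print("recursion matches formula (2):", recursive == direct)

# Part (a): b(j; n, p) = b(n - j; n, 1 - p)
symmetric = all(b(j, n, p) == b(n - j, n, 1 - p) for j in range(n + 1))
print("symmetry of part (a) holds:", symmetric)
```

The recursion avoids recomputing factorials for every j, which is the practical point of the remark above.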


Now let us find the mean and variance of the binomial distribution.

Theorem 2: If X has binomial distribution with parameters n and p, then


E(X) = np, Var(X) = np(1 - p).
Proof: By definition
\[ E(X) = \sum_{j=0}^{n} j \, b(j; n, p) = \sum_{j=1}^{n} j \frac{n!}{j! (n - j)!} p^j (1 - p)^{n - j}, \]
where we have omitted the term corresponding to j = 0, since it is zero. Simplifying by using
the relations n! = n((n - 1)!) and $\frac{j}{j!} = \frac{1}{(j - 1)!}$, we have
\[ E(X) = np \sum_{j=1}^{n} \frac{(n - 1)!}{(j - 1)! \, [n - 1 - (j - 1)]!} p^{j - 1} (1 - p)^{n - 1 - (j - 1)} = np(1 - p + p)^{n - 1} = np. \]
Now let's compute the variance.

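You know that
\[ \operatorname{Var}(X) = E(X^2) - [E(X)]^2. \]
Since we have already computed E(X), we can find out Var(X) if we are able to calculate
$E(X^2)$. The computation of $E(X^2)$ is simplified if we use the fact
\[ E(X^2) = E[X(X - 1)] + E(X). \]
Now,
\[ E[X(X - 1)] = \sum_{j=0}^{n} j(j - 1) \frac{n!}{j! (n - j)!} p^j (1 - p)^{n - j} = \sum_{j=2}^{n} \frac{n!}{(j - 2)! (n - j)!} p^j (1 - p)^{n - j}, \]
since the first two terms, corresponding to j = 0 and j = 1, vanish. Hence
\[ E[X(X - 1)] = n(n - 1) p^2 \sum_{j=2}^{n} \frac{(n - 2)!}{(j - 2)! \, [n - 2 - (j - 2)]!} p^{j - 2} (1 - p)^{n - 2 - (j - 2)} = n(n - 1) p^2 (1 - p + p)^{n - 2} = n(n - 1) p^2. \]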

Have you noticed that in this computation we have carried out simplifications which are
similar to the ones used in the computation of E(X)? Finally,
\[ \operatorname{Var}(X) = E[X(X - 1)] + E(X) - \{E(X)\}^2 = n(n - 1) p^2 + np - (np)^2 = np(1 - p), \]
as required.
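
The three quantities appearing in the proof of Theorem 2 can be computed straight from the p.m.f. and compared with np, n(n - 1)p^2 and np(1 - p). The following sketch uses exact fractions; n = 8 and p = 2/5 are illustrative values.

```python
from fractions import Fraction
from math import comb

def b(j, n, p):
    """Binomial p.m.f. b(j; n, p)."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)

def binomial_moments(n, p):
    """Return E(X), E[X(X - 1)] and Var(X), each computed as a sum over j."""
    mean = sum(j * b(j, n, p) for j in range(n + 1))
    factorial_moment = sum(j * (j - 1) * b(j, n, p) for j in range(n + 1))
    variance = factorial_moment + mean - mean**2
    return mean, factorial_moment, variance

n, p = 8, Fraction(2, 5)
mean, factorial_moment, variance = binomial_moments(n, p)
print("E(X)        =", mean, " vs np          =", n * p)
print("E[X(X - 1)] =", factorial_moment, " vs n(n-1)p^2 =", n * (n - 1) * p**2)
print("Var(X)      =", variance, " vs np(1-p)    =", n * p * (1 - p))
```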
There is an easy consequence of this theorem, which we would like you to prove now.

E4) If Y = X/n denotes the proportion of successes in n independent Bernoulli trials with
constant probability p of success, then show that E(Y) = p and Var(Y) = p(1 - p)/n.

E5) Use the results about the mean and variance of the sum of n independent r.v.s in Unit
7 for an alternative derivation of the mean and variance of a binomial r.v.

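We conclude our discussion of the binomial distribution by obtaining its m.g.f.

Theorem 3: The moment generating function $M_X(t)$ of the binomial distribution with
parameters n and p is
\[ M_X(t) = \{1 + p(e^t - 1)\}^n. \]
Proof: By definition
\[ M_X(t) = E[\exp(tX)] = \sum_{j=0}^{n} e^{tj} \binom{n}{j} p^j (1 - p)^{n - j} = \sum_{j=0}^{n} \binom{n}{j} (p e^t)^j (1 - p)^{n - j} = (1 - p + p e^t)^n = \{1 + p(e^t - 1)\}^n, \]
which is the required result.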

From the m.g.f. also you can see that E(X) = np and Var(X) = np(1 - p).
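
One way to see this numerically is to differentiate the m.g.f. at t = 0, since M'(0) = E(X) and M''(0) = E(X^2). The sketch below is only an approximation: it replaces the derivatives by central finite differences with a step h = 0.001 (an arbitrary choice), and uses the closed form {1 + p(e^t - 1)}^n of Theorem 3 with illustrative values n = 6, p = 0.25.

```python
from math import comb, exp

def mgf_closed_form(t, n, p):
    """M_X(t) = {1 + p(e^t - 1)}^n."""
    return (1 + p * (exp(t) - 1)) ** n

def mgf_by_definition(t, n, p):
    """E[exp(tX)] as a sum over the binomial p.m.f."""
    return sum(exp(t * j) * comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(n + 1))

n, p, h = 6, 0.25, 1e-3

# Central differences approximating M'(0) and M''(0)
first = (mgf_closed_form(h, n, p) - mgf_closed_form(-h, n, p)) / (2 * h)
second = (mgf_closed_form(h, n, p) - 2 * mgf_closed_form(0.0, n, p)
          + mgf_closed_form(-h, n, p)) / h**2
variance = second - first**2

print(f"M'(0)  ~ {first:.5f}   (np = {n * p})")
print(f"Var(X) ~ {variance:.5f}   (np(1 - p) = {n * p * (1 - p)})")
print("closed form vs definition at t = 0.4:",
      mgf_closed_form(0.4, n, p), mgf_by_definition(0.4, n, p))
```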
We now prove that the sum of two independent binomial variates with common probability
'p' of success is again a binomial variate.
Corollary: Let X and Y be independent binomial variates with parameters (n, p) and (m,p),
respectively. Then X+Y has a binomial distribution with parameters (m + n, p).
Proof: Now, X and Y can be regarded as the sum of n and m independent Bernoulli variates,
respectively. Suppose
\[ X = X_1 + X_2 + \cdots + X_n \quad \text{and} \quad Y = X_{n+1} + X_{n+2} + \cdots + X_{n+m}. \]
Then $X + Y = X_1 + X_2 + \cdots + X_{n+m}$ and $X_i$, i = 1, 2, ..., n + m are independent.

Thus, X + Y is a sum of n + m independent Bernoulli variates with probability p of success.
Hence X + Y has a binomial distribution with parameters (m + n, p).
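
The corollary can also be checked by direct computation: for independent X and Y, P[X + Y = k] is the sum of P[X = i] P[Y = k - i] over i. The sketch below carries out this convolution with exact fractions and compares it with the binomial p.m.f. with parameters (m + n, p). The values n = 3, m = 4, p = 1/3 and the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def b(j, n, p):
    """Binomial p.m.f. b(j; n, p)."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)

def pmf_of_sum(n, m, p):
    """Distribution of X + Y for independent X ~ binomial(n, p), Y ~ binomial(m, p)."""
    dist = [Fraction(0)] * (n + m + 1)
    for i in range(n + 1):
        for j in range(m + 1):
            dist[i + j] += b(i, n, p) * b(j, m, p)   # independence: probabilities multiply
    return dist

n, m, p = 3, 4, Fraction(1, 3)
convolved = pmf_of_sum(n, m, p)
direct = [b(k, n + m, p) for k in range(n + m + 1)]
print("X + Y is binomial(m + n, p):", convolved == direct)
```

Note that the check relies on the two variates sharing the same p; with different success probabilities the sum is in general not binomial.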
We have mentioned earlier that a binomial distribution is the distribution of the total number
of successes in n trials, each with the same probability of success, where
1) each trial can result in two mutually exclusive outcomes and
2) successive trials are independent.
Now, there are two ways in which we can generalise the binomial distribution. We can
assume that either
1) each trial can result in more than two mutually exclusive outcomes, or
2) successive trials are not independent.
We shall follow the first approach in the next section and the second approach in Sec. 8.5.

8.4 THE MULTINOMIAL DISTRIBUTION

Sometimes we come across situations where a trial of an experiment may result in more than
two outcomes. Here are some examples of such situations.
i) A group of 100 persons is classified according to their blood groups O, A, B and AB.
Let $r_1$, $r_2$, $r_3$ and $r_4$ denote the number of persons with the blood groups O, A, B and
AB, respectively. Then $r_1$, $r_2$, $r_3$ and $r_4$ are non-negative integers with
$r_1 + r_2 + r_3 + r_4 = 100$. Here each person can be classified into one and only one of the
k = 4 classes.
ii) In a game of bridge, the 52 playing cards are divided amongst k = 4 players such that
each player gets 13 cards.
iii) The population of a town can be classified into k = 21 different age groups, 0 - 2,
3 - 7, 8 - 12, . . ., 98 and above.
iv) The teachers in a university can be classified into k = 3 categories: lecturer, reader and
professor.
v) In a Lok Sabha constituency, there are 5 candidates. Before the polling date, the voters
can be classified into six classes, five according to their choice of the candidate, the
sixth class being of those who are still undecided.
To deal with such situations, we first need to find the total number of ways in
which n distinct objects can be classified into k different classes so that $r_1$ belong to
Class 1, $r_2$ belong to Class 2, . . ., and $r_k$ to Class k. Of course, it is necessary to have
$r_1 + r_2 + \cdots + r_k = n$.

You are already familiar with the case, k = 2.


When k = 2, we can classify n objects into two classes such that $r_1$ belong to Class 1 and
$r_2 = (n - r_1)$ belong to Class 2 in
\[ \binom{n}{r_1} = \frac{n!}{r_1! \, (n - r_1)!} = \frac{n!}{r_1! \, r_2!} \]
ways. This is so because we can choose $r_1$ objects out of n objects in $\binom{n}{r_1}$ ways, and each
such choice leaves a unique group of $n - r_1 = r_2$ objects which form Class 2.

We now generalise this argument in the following theorem.


Theorem 4: The number of ways of classifying n distinct objects in k classes, such that
$r_1$ belong to Class 1, $r_2$ belong to Class 2, . . ., $r_k$ belong to Class k, subject to the condition
$r_1 + r_2 + \cdots + r_k = n$, is
\[ \frac{n!}{r_1! \, r_2! \cdots r_k!}. \]
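
Before turning to the proof, the count in Theorem 4 can be compared with a brute-force tally for a small case. The sketch below labels each of n distinct objects with a class number, keeps the labellings whose class sizes match the prescribed $r_1, \ldots, r_k$, and compares the tally with n!/(r_1! r_2! ... r_k!). The class sizes (2, 1, 3) and the function names are illustrative choices.

```python
from itertools import product
from math import factorial

def multinomial_count(counts):
    """n! / (r_1! r_2! ... r_k!) where n = r_1 + ... + r_k."""
    result = factorial(sum(counts))
    for r in counts:
        result //= factorial(r)     # each division is exact
    return result

def brute_force_count(counts):
    """Count assignments of n distinct objects to k classes with the given class sizes."""
    n, k = sum(counts), len(counts)
    target = list(counts)
    return sum(
        1
        for labels in product(range(k), repeat=n)   # labels[i] = class of object i
        if [labels.count(c) for c in range(k)] == target
    )

counts = (2, 1, 3)   # r_1, r_2, r_3 with n = 6
print("formula     :", multinomial_count(counts))
print("brute force :", brute_force_count(counts))

# The bridge example: 52 cards dealt as 13 to each of 4 players
print("bridge deals:", multinomial_count((13, 13, 13, 13)))
```

The brute-force tally examines k^n labellings, so it is feasible only for very small n; the formula handles the bridge example instantly.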

Proof: We know that the $r_1$ objects belonging to Class 1 can be chosen in $\binom{n}{r_1}$ ways out of
