
Binomial and Poisson Distributions Explained

The document discusses two probability distributions: 1) The binomial distribution describes the probability of getting m successes in N trials of a Bernoulli process with probability of success p. The probability is given by the binomial coefficient multiplied by p^m (1 - p)^(N - m). 2) The Poisson distribution models the number of rare, independent events occurring in a fixed interval of time or space. It approximates the binomial when the number of trials N is large and the probability of success p is small, while the product Np stays fixed.

Uploaded by

pankaj_dogra7070
Copyright
© Attribution Non-Commercial (BY-NC)

Binomial and Poisson Probability Distributions

Binomial Probability Distribution

• Consider a situation where there are only two possible outcomes (a "Bernoulli trial").
Examples:
  - flipping a coin: the outcome is either a head or a tail
  - rolling a die: for example, 6 or not 6 (i.e. 1, 2, 3, 4, 5)
Label the possible outcomes by the variable k.
We want to find the probability P(k) for event k to occur.
Since k can take on only 2 values we define those values as: k = 0 or k = 1.
• Let the probability for outcome k = 0 to occur be: P(k = 0) = q (remember 0 ≤ q ≤ 1).
• Something must happen, so
P(k = 0) + P(k = 1) = 1 (mutually exclusive events)
P(k = 1) = p = 1 - q
• We can write the probability distribution P(k) as:
$P(k) = p^k q^{1-k}$ (Bernoulli distribution)
• coin toss: define the probability for a head as P(k = 1)
P(k = 1) = 0.5 and P(k = 0 = tail) = 0.5 too!
• die rolling: define the probability for a six to be rolled from a six-sided die as P(k = 1)
P(k = 1) = 1/6 and P(k = 0 = not a six) = 5/6.

Sidebar: James Bernoulli (Jacob I), born in Basel, Switzerland, Dec. 27, 1654-Aug. 16, 1705. He is one of 8 mathematicians in the Bernoulli family (from Wikipedia).
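The Bernoulli distribution can be written out as a short Python sketch. The function name `bernoulli_pmf` is an illustrative choice, not something from the lecture; the two calls reproduce the coin and die numbers quoted on the slide.

```python
def bernoulli_pmf(k: int, p: float) -> float:
    """P(k) = p**k * q**(1 - k) for k in {0, 1}, with q = 1 - p."""
    if k not in (0, 1):
        raise ValueError("k must be 0 or 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    q = 1.0 - p
    return p**k * q**(1 - k)


# Coin toss: P(tail), P(head). Die: P(not a six), P(six).
coin = [bernoulli_pmf(k, 0.5) for k in (0, 1)]
die = [bernoulli_pmf(k, 1 / 6) for k in (0, 1)]
print(coin)
print(die)
```

In both cases the two probabilities add up to one, as required by "something must happen".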

[Link]/Sp07 P416 Lecture 2 1


Mean and Variance of a discrete distribution
• What is the mean (μ) of P(k)?
$\mu = \frac{\sum_{k=0}^{1} k P(k)}{\sum_{k=0}^{1} P(k)} = \frac{0 \cdot q + 1 \cdot p}{q + p} = p$ (remember p + q = 1)
• What is the variance (σ²) of P(k)?
$\sigma^2 = \frac{\sum_{k=0}^{1} k^2 P(k)}{\sum_{k=0}^{1} P(k)} - \mu^2 = 0^2 P(0) + 1^2 P(1) - \mu^2 = p - p^2 = p(1 - p) = pq$
• Let's do something more complicated:
Suppose we have N trials (e.g. we flip a coin N times). What is the probability to get m successes (= heads)?
• Consider tossing a coin twice. The possible outcomes are:
no heads: $P(m = 0) = q^2$
one head: $P(m = 1) = qp + pq = 2pq$ (toss 1 is a tail and toss 2 is a head, or toss 1 is a head and toss 2 is a tail; we don't care which of the tosses is a head, so there are two outcomes that give one head)
two heads: $P(m = 2) = p^2$
Note: $P(m=0) + P(m=1) + P(m=2) = q^2 + qp + pq + p^2 = (p + q)^2 = 1$ (as it should!)
• We want the probability distribution P(m, N, p) where:
m = number of successes (e.g. number of heads in a coin toss)
N = number of trials (e.g. number of coin tosses)
p = probability for a success (e.g. 0.5 for a head)
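The two-toss bookkeeping can be checked by brute force. The sketch below is only an illustration (the helper name `heads_distribution` is made up here): it lists every sequence of heads and tails, multiplies the per-toss probabilities, and adds the result to the bin for the number of heads m.

```python
from itertools import product


def heads_distribution(n_tosses: int, p: float) -> list[float]:
    """Return [P(m=0), ..., P(m=n_tosses)] by enumerating every toss sequence."""
    q = 1.0 - p
    probs = [0.0] * (n_tosses + 1)
    for sequence in product((0, 1), repeat=n_tosses):  # 1 = head, 0 = tail
        m = sum(sequence)                              # number of heads
        probs[m] += p**m * q**(n_tosses - m)           # probability of this sequence
    return probs


two_tosses = heads_distribution(2, 0.5)
print(two_tosses)  # [0.25, 0.5, 0.25], i.e. q^2, 2pq, p^2 for p = q = 0.5
```

Enumeration costs 2^N steps, so it is only practical for small N; it is meant as a check on the counting argument, not as a way to compute the distribution.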

• If we look at the three choices for the coin flip example, each term is of the form:
$C_m p^m q^{N-m}$, with m = 0, 1, 2 and N = 2 for our example (q = 1 - p always!)
The coefficient $C_m$ takes into account the number of ways an outcome can occur without regard to order.
• for m = 0 or 2 there is only one way for the outcome (both tosses give tails, or both give heads): $C_0 = C_2 = 1$
• for m = 1 (one head, two tosses) there are two ways that this can occur: $C_1 = 2$.
Binomial coefficients: the number of ways of taking N things m at a time
$C_{N,m} = \binom{N}{m} = \frac{N!}{m!(N - m)!}$
0! = 1! = 1, 2! = 1·2 = 2, 3! = 1·2·3 = 6, m! = 1·2·3···m

Order of occurrence is not important
• e.g. 2 tosses, one head case (m = 1):
we don't care if toss 1 produced the head or if toss 2 produced the head
Unordered groups such as our example are called combinations.
Ordered arrangements are called permutations.
For N distinguishable objects, if we want to group them m at a time, the number of permutations is:
$P_{N,m} = \frac{N!}{(N - m)!}$
• example: If we tossed a coin twice (N = 2), there are two ways of getting one head (m = 1).
• example: Suppose we have 3 balls, one white, one red, and one blue.
  - The number of possible pairs we could have, keeping track of order, is 6 (rw, wr, rb, br, wb, bw):
$P_{3,2} = \frac{3!}{(3 - 2)!} = 6$
  - If order is not important (rw = wr), then the binomial formula gives the number of "two color" combinations:
$C_{3,2} = \frac{3!}{2!(3 - 2)!} = 3$
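The three-ball example can be reproduced with the Python standard library. This is a small sketch, assuming Python 3.8 or later for `math.comb` and `math.perm`; the one-letter ball labels are just shorthand for white, red and blue.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

balls = ["w", "r", "b"]  # white, red, blue

# Ordered pairs (permutations) and unordered pairs (combinations).
ordered_pairs = ["".join(pair) for pair in permutations(balls, 2)]
unordered_pairs = ["".join(pair) for pair in combinations(balls, 2)]

# Explicit listing, library formula, and N!/(N-m)! all agree on 6.
print(len(ordered_pairs), perm(3, 2), factorial(3) // factorial(3 - 2))
# Explicit listing and N!/(m!(N-m)!) both give 3.
print(len(unordered_pairs), comb(3, 2))
```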


• Binomial distribution: the probability of m successes out of N trials:
$P(m, N, p) = C_{N,m}\, p^m q^{N-m} = \binom{N}{m} p^m q^{N-m} = \frac{N!}{m!(N - m)!}\, p^m q^{N-m}$
p is the probability of a success and q = 1 - p is the probability of a failure
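A direct transcription of the binomial formula into Python might look as follows. The name `binomial_pmf` and the choice N = 7, p = 1/3 are illustrative assumptions; the two printed numbers are sanity checks (total probability and average number of successes).

```python
from math import comb


def binomial_pmf(m: int, n_trials: int, p: float) -> float:
    """P(m, N, p) = C(N, m) * p**m * q**(N - m), with q = 1 - p."""
    if not 0 <= m <= n_trials:
        return 0.0
    q = 1.0 - p
    return comb(n_trials, m) * p**m * q**(n_trials - m)


# Distribution for N = 7 trials with p = 1/3.
pmf = [binomial_pmf(m, 7, 1 / 3) for m in range(8)]
total = sum(pmf)                                      # 1 up to rounding
mean = sum(m * prob for m, prob in enumerate(pmf))    # close to N*p = 7/3
print(total, mean)
```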
[Figure: two binomial distributions plotted against k. Left panel: P(k, 7, 1/3), expectation value μ = np = 7 * 1/3 = 2.333... Right panel: P(k, 50, 1/3), expectation value μ = np = 50 * 1/3 = 16.667...]

• To show that the binomial distribution is properly normalized, use the Binomial Theorem:
$(a + b)^k = \sum_{l=0}^{k} \binom{k}{l} a^{k-l} b^l$
$\sum_{m=0}^{N} P(m, N, p) = \sum_{m=0}^{N} \binom{N}{m} p^m q^{N-m} = (p + q)^N = 1$

⇒ the binomial distribution is properly normalized


Mean of binomial distribution:
$\mu = \frac{\sum_{m=0}^{N} m P(m, N, p)}{\sum_{m=0}^{N} P(m, N, p)} = \sum_{m=0}^{N} m P(m, N, p) = \sum_{m=0}^{N} m \binom{N}{m} p^m q^{N-m}$

A cute way of evaluating the above sum is to take the derivative:
$\frac{\partial}{\partial p} \left[ \sum_{m=0}^{N} \binom{N}{m} p^m q^{N-m} \right] = 0$
$\sum_{m=0}^{N} m \binom{N}{m} p^{m-1} q^{N-m} - \sum_{m=0}^{N} \binom{N}{m} p^m (N - m)(1 - p)^{N-m-1} = 0$
$p^{-1} \sum_{m=0}^{N} m \binom{N}{m} p^m q^{N-m} = N(1 - p)^{-1} \sum_{m=0}^{N} \binom{N}{m} p^m (1 - p)^{N-m} - (1 - p)^{-1} \sum_{m=0}^{N} m \binom{N}{m} p^m (1 - p)^{N-m}$
$p^{-1} \mu = N(1 - p)^{-1} \cdot 1 - (1 - p)^{-1} \mu$
$\mu = Np$

Variance of binomial distribution (obtained using a similar trick):

$\sigma^2 = \frac{\sum_{m=0}^{N} (m - \mu)^2 P(m, N, p)}{\sum_{m=0}^{N} P(m, N, p)} = Npq$
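The results μ = Np and σ² = Npq can be checked numerically by summing over the distribution. This is a rough sketch rather than a proof; `binomial_moments` is a hypothetical helper name and N = 50, p = 1/3 are example values.

```python
from math import comb


def binomial_moments(n_trials: int, p: float) -> tuple[float, float]:
    """Return (mean, variance) of the binomial distribution by direct summation."""
    q = 1.0 - p
    pmf = [comb(n_trials, m) * p**m * q**(n_trials - m) for m in range(n_trials + 1)]
    mean = sum(m * prob for m, prob in enumerate(pmf))
    variance = sum((m - mean) ** 2 * prob for m, prob in enumerate(pmf))
    return mean, variance


n_trials, p = 50, 1 / 3
mean, variance = binomial_moments(n_trials, p)
print(mean, n_trials * p)                # summed mean next to N*p
print(variance, n_trials * p * (1 - p))  # summed variance next to N*p*q
```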


Example: Suppose you observed m special events (successes) in a sample of N events.
• The measured probability ("efficiency") for a special event to occur is:
$\varepsilon = \frac{m}{N}$
• What is the error (standard deviation) on the probability ("error on the efficiency")?
$\sigma_\varepsilon = \frac{\sigma_m}{N} = \frac{\sqrt{Npq}}{N} = \frac{\sqrt{N\varepsilon(1 - \varepsilon)}}{N} = \sqrt{\frac{\varepsilon(1 - \varepsilon)}{N}}$ (we will derive this later in the course)
The sample size (N) should be as large as possible to reduce the uncertainty in the probability measurement.
Let's relate the above result to Lab 2 where we throw darts to measure the value of π.
If we inscribe a circle inside a square with side = s then the ratio of the area of the circle to the area of the square is:
$\frac{\text{area of circle}}{\text{area of square}} = \frac{\pi r^2}{s^2} = \frac{\pi (s/2)^2}{s^2} = \frac{\pi}{4}$
So, if we throw darts at random at our square then the probability (ε) of a dart landing inside the circle is just the ratio of the two areas, π/4. Then we can determine π using:
$\pi = 4\varepsilon$
The error in π is related to the error in ε by:
$\sigma_\pi = 4\sigma_\varepsilon = 4\sqrt{\frac{\varepsilon(1 - \varepsilon)}{N}}$
We can estimate how well we can measure π by this method by assuming that ε = (3.14159…)/4:
$\sigma_\pi = 4\sqrt{\frac{\varepsilon(1 - \varepsilon)}{N}} \approx \frac{1.6}{\sqrt{N}}$ (using ε = π/4)
This formula "says" that to improve our estimate of π by a factor of 10 we have to throw 100 times as many darts (N → 100N)! Clearly, this is an inefficient way to determine π.
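The dart experiment is easy to simulate. The following is a minimal sketch, assuming a square of side s = 1 centred on the origin and Python's built-in generator in place of real darts; `estimate_pi` and the seed value are illustrative choices. It returns both the estimate 4ε and the binomial error 4σ_ε.

```python
import math
import random


def estimate_pi(n_darts: int, rng: random.Random) -> tuple[float, float]:
    """Throw n_darts at the unit square; return (pi estimate, its binomial error)."""
    hits = 0
    for _ in range(n_darts):
        # Dart position relative to the centre of a square with side s = 1.
        x = rng.random() - 0.5
        y = rng.random() - 0.5
        if x * x + y * y <= 0.25:  # inside the inscribed circle, r = s/2
            hits += 1
    efficiency = hits / n_darts
    sigma_eff = math.sqrt(efficiency * (1.0 - efficiency) / n_darts)
    return 4.0 * efficiency, 4.0 * sigma_eff


rng = random.Random(2007)  # fixed seed so the run is repeatable
pi_estimate, pi_error = estimate_pi(100_000, rng)
print(f"pi = {pi_estimate:.4f} +/- {pi_error:.4f}")
```

With 100,000 darts the quoted error is about 1.6/√N ≈ 0.005, so only the first two decimal places of π are pinned down, which illustrates how slowly the method converges.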



Example: Suppose a baseball player's batting average is 0.300 (3 for 10 on average).
• Consider the case where the player either gets a hit or makes an out (forget about walks here!).
prob. for a hit: p = 0.30
prob. for "no hit": q = 1 - p = 0.70
• On average how many hits does the player get in 100 at bats?
μ = Np = 100·0.30 = 30 hits
• What's the standard deviation for the number of hits in 100 at bats?
$\sigma = (Npq)^{1/2} = (100 \cdot 0.30 \cdot 0.70)^{1/2} \approx 4.6$ hits
we expect ≈ 30 ± 5 hits per 100 at bats
• Consider a game where the player bats 4 times:
probability of 0/4 = $(0.7)^4$ = 24%
probability of 1/4 = $[4!/(3!1!)](0.3)^1(0.7)^3$ = 41%
probability of 2/4 = $[4!/(2!2!)](0.3)^2(0.7)^2$ = 26%
probability of 3/4 = $[4!/(1!3!)](0.3)^3(0.7)^1$ = 8%
probability of 4/4 = $[4!/(0!4!)](0.3)^4(0.7)^0$ = 1%
probability of getting at least one hit = 1 - P(0) = 1 - 0.24 = 76%
Sidebar: Pete Rose's lifetime batting average: 0.303
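The batting numbers can be reproduced in a few lines. This sketch simply evaluates the binomial formula for p = 0.30; the variable names are illustrative.

```python
from math import comb, sqrt

p_hit = 0.30          # batting average from the example
q_out = 1.0 - p_hit   # probability of "no hit"

# Expected hits and standard deviation in 100 at bats.
mean_hits = 100 * p_hit
sigma_hits = sqrt(100 * p_hit * q_out)

# Probability of m hits (m = 0..4) in a game with 4 at bats.
game = [comb(4, m) * p_hit**m * q_out**(4 - m) for m in range(5)]
at_least_one = 1.0 - game[0]

print(mean_hits, round(sigma_hits, 1))
print([round(100 * prob) for prob in game], round(100 * at_least_one))
```

The second line of output lists the percentages for 0 to 4 hits followed by the chance of at least one hit.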



Poisson Probability Distribution
• The Poisson distribution is a widely used discrete probability distribution.
• Consider the following conditions:
  - p is very small and approaches 0
    example: a 100-sided die instead of a 6-sided die, p = 1/100 instead of 1/6
    example: a 1000-sided die, p = 1/1000
  - N is very large and approaches ∞
    example: throwing 100 or 1000 dice instead of 2 dice
  - The product Np is finite
• Example: radioactive decay
Suppose we have 25 mg of an element
very large number of atoms: N ≈ 10^20 (Avogadro's number is large!)
Suppose the lifetime of this element is τ ≈ 5x10^19 seconds (about 1.6x10^12 years)
probability of a given nucleus to decay in one second is very small: p = 1/τ = 2x10^-20/sec
BUT Np = 2/sec, finite!
The number of decays in a time interval is a Poisson process.

Sidebar: Siméon Denis Poisson, June 21, 1781-April 25, 1840. Examples of Poisson processes: radioactive decay; number of Prussian soldiers kicked to death by horses per year; quality control, failure rate predictions.

• The Poisson distribution can be derived by taking the appropriate limits of the binomial distribution:
$P(m, N, p) = \frac{N!}{m!(N - m)!}\, p^m q^{N-m}$
$\frac{N!}{(N - m)!} = \frac{N(N - 1) \cdots (N - m + 1)(N - m)!}{(N - m)!} \approx N^m$ (for N >> m)
$q^{N-m} = (1 - p)^{N-m} = 1 - p(N - m) + \frac{p^2 (N - m)(N - m - 1)}{2!} - \cdots \approx 1 - pN + \frac{(pN)^2}{2!} - \cdots \approx e^{-pN}$
using: $(1 - x)^a = 1 - xa + \frac{x^2 a(a - 1)}{2!} - \frac{x^3 a(a - 1)(a - 2)}{3!} + \cdots$
$P(m, N, p) \approx \frac{N^m}{m!}\, p^m e^{-pN}$
Let μ = Np:
$P(m, \mu) = \frac{e^{-\mu} \mu^m}{m!}$
note: $\sum_{m=0}^{\infty} P(m, \mu) = \sum_{m=0}^{\infty} \frac{e^{-\mu} \mu^m}{m!} = e^{-\mu} \sum_{m=0}^{\infty} \frac{\mu^m}{m!} = e^{-\mu} e^{\mu} = 1$

• m is always an integer ≥ 0
• μ does not have to be an integer
It is easy to show that:
μ = Np = mean of a Poisson distribution
σ² = Np = μ = variance of a Poisson distribution
The mean and variance of a Poisson distribution are the same number!

• Radioactivity example with an average of 2 decays/sec:

i) What's the probability of zero decays in one second?
$P(0, 2) = \frac{e^{-2} 2^0}{0!} = \frac{e^{-2} \cdot 1}{1} = e^{-2} = 0.135 \rightarrow 13.5\%$
ii) What's the probability of more than one decay in one second?
$P(>1, 2) = 1 - P(0, 2) - P(1, 2) = 1 - \frac{e^{-2} 2^0}{0!} - \frac{e^{-2} 2^1}{1!} = 1 - e^{-2} - 2e^{-2} = 0.594 \rightarrow 59.4\%$
iii) Estimate the most probable number of decays/sec.
$\left. \frac{\partial}{\partial m} P(m, \mu) \right|_{m = m^*} = 0$
To solve this problem it's convenient to maximize ln P(m, μ) instead of P(m, μ):
$\ln P(m, \mu) = \ln\left( \frac{e^{-\mu} \mu^m}{m!} \right) = -\mu + m \ln \mu - \ln m!$

• In order to handle the factorial when we take the derivative we use Stirling's Approximation:
$\ln m! \approx m \ln m - m$
(Accuracy check: ln 10! = 15.10 versus 10 ln 10 - 10 = 13.03, a 14% difference; ln 50! = 148.48 versus 50 ln 50 - 50 = 145.60, a 1.9% difference.)
$\frac{\partial}{\partial m} \ln P(m, \mu) = \frac{\partial}{\partial m} (-\mu + m \ln \mu - \ln m!)$
$\approx \frac{\partial}{\partial m} (-\mu + m \ln \mu - m \ln m + m)$
$= \ln \mu - \ln m - m \cdot \frac{1}{m} + 1$
$= 0$
$m^* = \mu$
The most probable value for m is just the average of the distribution.
• This is only approximate since Stirling's Approximation is only valid for large m.
• Strictly speaking m can only take on integer values while μ is not restricted to be an integer.

If you observed m events in a "counting" experiment, the error on m is
$\sigma = \sqrt{\mu} \approx \sqrt{m}$
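The radioactivity example and the Stirling check can be worked through numerically. This is a sketch under the lecture's assumption of μ = 2 decays per second; `poisson_pmf` is an illustrative name. It also shows what "only approximate" means for the most probable value: for an integer μ the probabilities at m = μ - 1 and m = μ are equal.

```python
from math import exp, factorial, lgamma, log


def poisson_pmf(m: int, mu: float) -> float:
    """P(m, mu) = exp(-mu) * mu**m / m! for integer m >= 0."""
    if m < 0:
        return 0.0
    return exp(-mu) * mu**m / factorial(m)


mu = 2.0  # average of 2 decays per second, as in the example
p_zero = poisson_pmf(0, mu)
p_more_than_one = 1.0 - poisson_pmf(0, mu) - poisson_pmf(1, mu)
print(round(p_zero, 3), round(p_more_than_one, 3))

# Scan integer m for the largest probability. For integer mu the values
# m = mu - 1 and m = mu tie, and max() reports the first of them.
probs = [poisson_pmf(m, mu) for m in range(20)]
most_probable = max(range(20), key=lambda m: probs[m])
print(most_probable, probs[1], probs[2])

# Stirling's approximation next to the exact ln m! (lgamma(m + 1) = ln m!).
for m in (10, 50):
    print(m, round(lgamma(m + 1), 2), round(m * log(m) - m, 2))
```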


Comparison of Binomial and Poisson distributions with mean = 1

[Figure: probability versus m in two panels. One panel compares the binomial distribution with N = 3, p = 1/3 to the Poisson distribution with the same mean; the other compares the binomial distribution with N = 10, p = 0.1 to the same Poisson distribution. The plot is annotated "Not much difference between them!"]

For N large and μ fixed: Binomial → Poisson
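The approach of the binomial distribution to the Poisson distribution can be quantified with a short sketch. The helper `max_difference` is a made-up name; it keeps μ = Np = 1 fixed, as in the comparison above, and reports the largest gap between the two distributions over m = 0..10.

```python
from math import comb, exp, factorial


def binomial_pmf(m: int, n_trials: int, p: float) -> float:
    """C(N, m) * p**m * (1 - p)**(N - m), zero outside 0 <= m <= N."""
    if not 0 <= m <= n_trials:
        return 0.0
    return comb(n_trials, m) * p**m * (1.0 - p) ** (n_trials - m)


def poisson_pmf(m: int, mu: float) -> float:
    """exp(-mu) * mu**m / m!"""
    return exp(-mu) * mu**m / factorial(m)


def max_difference(n_trials: int, mu: float = 1.0, m_max: int = 10) -> float:
    """Largest |binomial - Poisson| over m = 0..m_max, with p = mu / N."""
    p = mu / n_trials
    return max(
        abs(binomial_pmf(m, n_trials, p) - poisson_pmf(m, mu))
        for m in range(m_max + 1)
    )


for n_trials in (3, 10, 100, 1000):
    print(n_trials, round(max_difference(n_trials), 4))
```

The printed gap shrinks steadily as N grows, in line with the statement that the binomial distribution tends to the Poisson distribution for large N at fixed μ.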



Uniform distribution and Random Numbers
What is a uniform probability distribution p(x)?
p(x) = constant (c) for a ≤ x ≤ b
p(x) = zero everywhere else
Therefore $p(x_1)dx_1 = p(x_2)dx_2$ if $dx_1 = dx_2$ ⇒ equal intervals give equal probabilities

For a uniform distribution with a = 0, b = 1 we have p(x) = 1:
$\int_0^1 p(x)\,dx = 1 = \int_0^1 c\,dx = c \int_0^1 dx = c \Rightarrow c = 1$
[Figure: p(x) = 1 plotted for 0 ≤ x ≤ 1]

What is a random number generator?
A routine that returns a number picked at random from a uniform distribution with limits [0,1].
All major computer languages (FORTRAN, C) come with a random number generator.
FORTRAN: RAN(iseed)
The following FORTRAN program generates 5 random numbers:
      iseed=12345
      do I=1,5
        y=ran(iseed)
        type *, y
      enddo
      end
Output:
0.1985246
0.8978736
0.2382888
0.3679854
0.3817045
If we generate "a lot" of random numbers, all equal intervals should contain the same amount of numbers. For example:
generate: 10^6 random numbers
expect: 10^5 numbers in [0.0, 0.1]
        10^5 numbers in [0.45, 0.55]
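The equal-interval claim can be tested directly. The sketch below uses Python's `random` module (a different generator from the FORTRAN RAN function, so the individual numbers will not match the listing above); the seed is borrowed from the FORTRAN example purely for illustration. Each count is a binomial variable with p = 0.1, so it should sit within a few standard deviations of 10^5.

```python
import random

rng = random.Random(12345)   # seed value borrowed from the FORTRAN example
n_samples = 1_000_000
samples = [rng.random() for _ in range(n_samples)]

# Count how many numbers fall into two intervals of equal width 0.1.
in_first_tenth = sum(1 for x in samples if 0.0 <= x < 0.1)
in_middle_tenth = sum(1 for x in samples if 0.45 <= x < 0.55)

# Binomial expectation and standard deviation for an interval with p = 0.1.
expected = 0.1 * n_samples
sigma = (n_samples * 0.1 * 0.9) ** 0.5
print(in_first_tenth, in_middle_tenth, expected, round(sigma))
```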



A Fortran program to throw dice (see the fortran tutorial)
FORTRAN statements start in column 7 (column ruler: 1234567)
c     character: variables that are alphanumeric, e.g. A, a, osu
c     real: variables with a decimal point, e.g. 321.45, 0.0567
c     integer: variables without a decimal point, e.g. 5, 321
c     seed: an integer variable used by the random number generator RAN
      implicit none
      character key
      real count(6), RAN
      integer i, dice, roll, seed
      seed = 432211111
c     A continue statement is a place holder. 10 is the statement number.
 10   continue
c     This is a "do loop". It will repeat 6 times.
      do dice = 1, 6
        count(dice) = 0
      end do
c     This question appears on the screen (unit 6).
      write(6,*)'How many rolls of dice :'
c     Here you answer the question (unit 5).
      read(5,*)roll
c     Another "do loop". Here we are rolling the dice, using the random
c     number generator. We keep track of the results in the array "count".
      do i = 1, roll
        dice = INT(1 + 6*RAN(seed))
        count(dice) = count(dice) + 1
      end do
c     Another "do loop". Here we are writing the results to the screen.
      do i = 1, 6
        write(6,*)i, count(i)
      end do
c     Allows you to continue the program again or quit.
      write(6,*) 'Hit <return> to continue or type q or Q',
     > ' and <return> to quit the program.'
      read(5,20)key
 20   format(a1)
c     If you don't want to quit then we jump back up to line 10.
      if (key.ne.'q'.and.key.ne.'Q') go to 10
c     "end" tells the Fortran compiler that this is the end of the program.
      end
Throwing a die using a computer
When we throw an "honest" 6-sided die we expect each number to appear 1/6 of the time.
To simulate this on the computer we want a program that generates the integers [1, 2, 3, 4, 5, 6] in a way that each number is equally likely.
      DICE=INT(1+6*RAN(ISEED))
Read the expression from the right (the innermost call) to the left:
RAN(ISEED) → gives a number in [0,1), e.g. 0.33
6*RAN(ISEED) → 1.98 (multiply by 6)
1+6*RAN(ISEED) → 2.98 (add 1 to the number)
INT(1+6*RAN(ISEED)) → INT truncates 2.98 to an integer, 2
As a result of the above code DICE will be equal to "2": we just rolled a "2".
How would we roll two dice with a computer?
      DICE1= INT(1+6*RAN(ISEED))
      DICE2= INT(1+6*RAN(ISEED))
      TWODI=DICE1+DICE2
The variable TWODI is an integer in [2,12] which corresponds to the sum of the numbers on two independent rolls of the dice.
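For readers without a FORTRAN compiler that provides RAN, the same recipe can be sketched in Python. This is an illustrative translation, not the course's code: `roll_die` and `roll_two_dice` are hypothetical names, and `rng.random()` plays the role of RAN(ISEED), returning a number in [0, 1).

```python
import random
from collections import Counter


def roll_die(rng: random.Random) -> int:
    """Mirror DICE = INT(1 + 6*RAN(ISEED)): rng.random() lies in [0, 1)."""
    return int(1 + 6 * rng.random())


def roll_two_dice(rng: random.Random) -> int:
    """Sum of two independent rolls, an integer in [2, 12]."""
    return roll_die(rng) + roll_die(rng)


rng = random.Random(432211111)  # seed value borrowed from the Fortran listing
n_rolls = 60_000
single = Counter(roll_die(rng) for _ in range(n_rolls))
double = Counter(roll_two_dice(rng) for _ in range(n_rolls))

for face in range(1, 7):
    print(face, single[face] / n_rolls)     # each close to 1/6
for total in range(2, 13):
    print(total, double[total] / n_rolls)   # most likely sum is 7 (6/36)
```

Because the random number never reaches 1.0, the truncation can only produce 1 through 6; a generator that could return exactly 1.0 would occasionally roll a 7.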
