Probability : Random variables and distribution : Probability
distributions
TGT | mathematics | English
1. Random Variable
A random variable is a function that assigns a real number to each outcome in the sample space of a random
experiment. It is typically denoted by an uppercase letter, such as X, Y, or Z.
Definition: Let S be the sample space of a random experiment. A function $X: S \to \mathbb{R}$ is called a random variable.
Types of Random Variables:
Random variables can be classified into two types based on the nature of their possible values:
1.1 Discrete Random Variable
A discrete random variable is a random variable that can only take a finite number of values or a countably infinite
number of values. The values are typically integers or can be put into a one-to-one correspondence with the set of
natural numbers.
Examples:
The number of heads obtained when flipping a coin 3 times (possible values: 0, 1, 2, 3).
The number of defective items in a sample of 10 items (possible values: 0, 1, 2,..., 10).
The number of customers arriving at a store in an hour (possible values: 0, 1, 2, 3,...).
The outcome of rolling a fair six-sided die (possible values: 1, 2, 3, 4, 5, 6).
1.2 Continuous Random Variable
A continuous random variable is a random variable that can take any value within a given range or interval. The set of
possible values is uncountably infinite.
Examples:
The height of a person (can take any value within a range, e.g., 1.50 m to 2.00 m).
The temperature of a room (can take any value within a range, e.g., 20°C to 25°C).
The time taken for a student to complete an exam (can take any value within a range, e.g., 0 minutes to 180
minutes).
The exact weight of a packet of sugar (can be 1 kg, 1.001 kg, 1.0005 kg, etc., within a tolerance).
2. Probability Distribution
A probability distribution describes the likelihood of obtaining each possible value that a random variable can
assume. It is a fundamental concept in probability theory and statistics that helps in understanding and predicting
the behavior of random phenomena.
2.1 Probability Distribution of a Discrete Random Variable
For a discrete random variable X, the probability distribution is given by the Probability Mass Function (PMF).
Definition: The Probability Mass Function (PMF) of a discrete random variable X is a function that gives the
probability that X takes on a specific value x:
$p(x) = P(X = x)$
where:
1. $0 \le p(x) \le 1$ for all possible values x.
2. $\sum_x p(x) = 1$, where the sum is over all possible values of x.
Example: Consider rolling a fair six-sided die. Let X be the random variable representing the outcome. The
sample space is S = {1, 2, 3, 4, 5, 6}. The PMF is:
$p(x) = P(X = x) = \frac{1}{6}$ for $x \in \{1, 2, 3, 4, 5, 6\}$
The sum of probabilities is $6 \times \frac{1}{6} = 1$.
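The two PMF conditions can be checked mechanically for the die example. The following is a minimal sketch using exact fractions; the name `die_pmf` is chosen here for illustration only.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face has probability 1/6.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Condition 1: every probability lies in [0, 1].
all_valid = all(0 <= p <= 1 for p in die_pmf.values())

# Condition 2: the probabilities sum to exactly 1.
total = sum(die_pmf.values())

print(all_valid, total)  # True 1
```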
2.2 Probability Distribution of a Continuous Random Variable
For a continuous random variable X, the probability distribution is given by the Probability Density Function (PDF).
Since the probability of a continuous random variable taking any single specific value is zero, the PDF does not give
probabilities directly. Instead, the area under the PDF curve over a given interval represents the probability that the
variable falls within that interval.
Definition: The Probability Density Function (PDF) of a continuous random variable X is a function $f(x)$ such that:
1. $f(x) \ge 0$ for all x.
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
The probability that X lies between two values a and b (where $a \le b$) is given by the integral of the PDF from a to b:
$P(a \le X \le b) = \int_a^b f(x)\,dx$
Example: Suppose the height of adult males in a certain region is a continuous random variable X with a PDF
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$ for $-\infty < x < \infty$, where $\mu$ is the mean height and $\sigma$ is the standard deviation. The probability that a randomly selected male has a
height between 1.70 m and 1.80 m would be calculated by $\int_{1.70}^{1.80} f(x)\,dx$.
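As a rough numerical illustration of "probability = area under the PDF", the sketch below integrates a bell-shaped (normal) density between 1.70 m and 1.80 m with the midpoint rule. The values `MU = 1.75` and `SIGMA = 0.07` are assumptions made up for this sketch; the notes do not give numerical values for the mean and standard deviation.

```python
import math

MU = 1.75     # assumed mean height in metres (illustrative only)
SIGMA = 0.07  # assumed standard deviation in metres (illustrative only)

def normal_pdf(x, mu=MU, sigma=SIGMA):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# P(1.70 <= X <= 1.80) is the area under the PDF on that interval.
prob = integrate(normal_pdf, 1.70, 1.80)
print(round(prob, 4))
```

The single-point probability P(X = 1.75) corresponds to an interval of zero width, so its area (and hence its probability) is 0.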
3. Cumulative Distribution Function (CDF)
The Cumulative Distribution Function (CDF) is a function that gives the probability that a random variable X takes
on a value less than or equal to a specific value x. It is denoted by $F(x)$, that is, $F(x) = P(X \le x)$.
3.1 CDF for Discrete Random Variable
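For a discrete random variable X, the CDF is given by:
$F(x) = P(X \le x) = \sum_{t \le x} P(X = t)$
The sum is taken over all possible values t of X such that $t \le x$.
Properties of CDF for Discrete RV:
$0 \le F(x) \le 1$.
$F(x)$ is non-decreasing: If $x_1 < x_2$, then $F(x_1) \le F(x_2)$.
$\lim_{x \to -\infty} F(x) = 0$.
$\lim_{x \to \infty} F(x) = 1$.
Example: For the fair die roll example, the CDF is:
$F(x) = 0$ for $x < 1$
$F(x) = 1/6$ for $1 \le x < 2$
$F(x) = 2/6$ for $2 \le x < 3$
$F(x) = 3/6$ for $3 \le x < 4$
$F(x) = 4/6$ for $4 \le x < 5$
$F(x) = 5/6$ for $5 \le x < 6$
$F(x) = 1$ for $x \ge 6$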
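The step-function behaviour of a discrete CDF can be seen by summing the PMF directly. This is a small sketch for the fair die; the helper name `cdf` is illustrative.

```python
from fractions import Fraction

# PMF of a fair six-sided die.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x, pmf=die_pmf):
    """F(x) = P(X <= x): sum the PMF over all values t with t <= x."""
    return sum((p for t, p in pmf.items() if t <= x), Fraction(0))

# F is flat between integers and jumps by 1/6 at each face value.
for x in (0.5, 1, 3.7, 6, 10):
    print(x, cdf(x))
```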
3.2 CDF for Continuous Random Variable
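For a continuous random variable X, the CDF is given by:
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$
where $f(t)$ is the PDF of X.
Properties of CDF for Continuous RV:
$0 \le F(x) \le 1$.
$F(x)$ is non-decreasing: If $x_1 < x_2$, then $F(x_1) \le F(x_2)$.
$\lim_{x \to -\infty} F(x) = 0$.
$\lim_{x \to \infty} F(x) = 1$.
$F(x)$ is continuous.
$P(a \le X \le b) = F(b) - F(a)$.
The PDF can be obtained from the CDF by differentiation: $f(x) = \frac{d}{dx} F(x)$.
Example: Consider a continuous random variable X uniformly distributed over the interval [0, 2]. Its PDF is $f(x) = \frac{1}{2}$ for $0 \le x \le 2$,
and 0 otherwise. The CDF is:
$F(x) = 0$ for $x < 0$
$F(x) = \int_0^x \frac{1}{2}\,dt = \frac{x}{2}$ for $0 \le x \le 2$
$F(x) = 1$ for $x > 2$
So, $F(x) = x/2$ on [0, 2], rising from 0 at $x = 0$ to 1 at $x = 2$.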
4. Mathematical Expectation (Expected Value)
The expected value of a random variable is the weighted average of all possible values that the random variable can
take. It represents the mean or average value of the random variable over a large number of trials. It is denoted by $E[X]$ or
$\mu$.
4.1 Expected Value of a Discrete Random Variable
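For a discrete random variable X with PMF $p(x)$, the expected value is:
$E[X] = \sum_x x\,p(x)$
The sum is taken over all possible values x of X.
Example: For the fair die roll (X = outcome), the expected value is:
$E[X] = \sum_{x=1}^{6} x \cdot \frac{1}{6} = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = \frac{21}{6} = 3.5$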
This means that, over a large number of rolls, the average outcome approaches 3.5, even though 3.5 is not itself a possible outcome.
4.2 Expected Value of a Continuous Random Variable
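For a continuous random variable X with PDF $f(x)$, the expected value is:
$E[X] = \int_{-\infty}^{\infty} x\,f(x)\,dx$
Example: For the uniform distribution on [0, 2], the expected value is:
$E[X] = \int_0^2 x \cdot \frac{1}{2}\,dx = \left[\frac{x^2}{4}\right]_0^2 = 1$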
4.3 Expected Value of a Function of a Random Variable
If Y is a function of X, say $Y = g(X)$, then the expected value of Y is:
For discrete X: $E[g(X)] = \sum_x g(x)\,P(X = x)$
For continuous X: $E[g(X)] = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx$
This property is known as the Law of the Unconscious Statistician.
Linearity of Expectation: For constants a and b, and random variables X and Y:
$E[aX + bY] = a\,E[X] + b\,E[Y]$
(This holds regardless of whether X and Y are independent).
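The discrete form of the Law of the Unconscious Statistician can be tried out on the fair die. The sketch below computes E[X], E[X^2] and E[2X + 3] from the PMF alone; the last one illustrates linearity in the special case of one random variable plus a constant. The helper name `expect` is an illustrative choice.

```python
from fractions import Fraction

# PMF of a fair six-sided die.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expect(g, pmf=die_pmf):
    """E[g(X)] = sum of g(x) * P(X = x) over all x (discrete case)."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expect(lambda x: x)               # E[X]
second_moment = expect(lambda x: x * x)  # E[X^2]
linear = expect(lambda x: 2 * x + 3)     # E[2X + 3]

print(mean, second_moment, linear)  # 7/2 91/6 10
```

Note that E[2X + 3] = 2 E[X] + 3 = 10, while E[X^2] = 91/6 differs from (E[X])^2 = 49/4: expectation is linear, but it does not pass through non-linear functions.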
5. Variance and Standard Deviation
The variance measures the spread or dispersion of the probability distribution around its mean. It is the expected
value of the squared difference between the random variable and its mean.
5.1 Variance of a Discrete Random Variable
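The variance of a discrete random variable X with mean $\mu$ is denoted by $\mathrm{Var}(X)$ or $\sigma^2$:
$\mathrm{Var}(X) = E[(X - \mu)^2] = \sum_x (x - \mu)^2\,P(X = x)$
An alternative formula for variance is:
$\mathrm{Var}(X) = E[X^2] - (E[X])^2$
where $E[X^2] = \sum_x x^2\,P(X = x)$.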
5.2 Variance of a Continuous Random Variable
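The variance of a continuous random variable X with mean $\mu$ is:
$\mathrm{Var}(X) = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2\,f(x)\,dx$
The alternative formula is:
$\mathrm{Var}(X) = E[X^2] - (E[X])^2$
where $E[X^2] = \int_{-\infty}^{\infty} x^2\,f(x)\,dx$.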
Properties of Variance:
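$\mathrm{Var}(X) \ge 0$.
$\mathrm{Var}(a) = 0$ for a constant a.
$\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$ for constants a and b.
If X and Y are independent, $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$.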
The standard deviation is the square root of the variance. It is often preferred because it has the same units as the
random variable itself.
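Dimension of Variance: $[\text{unit of } X]^2$, Dimension of Standard Deviation: $[\text{unit of } X]$.
Example: For the uniform distribution on [0, 2]:
$E[X] = 1$
$E[X^2] = \int_0^2 x^2 \cdot \frac{1}{2}\,dx = \left[\frac{x^3}{6}\right]_0^2 = \frac{4}{3}$
$\mathrm{Var}(X) = E[X^2] - (E[X])^2 = \frac{4}{3} - 1 = \frac{1}{3}$
The standard deviation is $\sigma = \sqrt{1/3} \approx 0.577$.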
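The uniform [0, 2] figures can be cross-checked numerically. The sketch below approximates the integrals for E[X] and E[X^2] with the midpoint rule and then applies Var(X) = E[X^2] - (E[X])^2; the function names are illustrative.

```python
def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def pdf(x):
    """PDF of the uniform distribution on [0, 2]."""
    return 0.5 if 0 <= x <= 2 else 0.0

mean = integrate(lambda x: x * pdf(x), 0, 2)               # E[X]
second_moment = integrate(lambda x: x * x * pdf(x), 0, 2)  # E[X^2]
variance = second_moment - mean ** 2
std_dev = variance ** 0.5

print(round(mean, 4), round(variance, 4), round(std_dev, 4))  # 1.0 0.3333 0.5774
```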
6. Common Probability Distributions
There are several important probability distributions that model various random phenomena. These can be broadly
categorized into discrete and continuous distributions.
6.1 Discrete Probability Distributions
6.1.1 Bernoulli Distribution
A Bernoulli random variable represents the outcome of a single trial of an experiment that has only two possible
outcomes, typically labeled "success" and "failure".
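Parameters: $p$, the probability of success ($0 \le p \le 1$).
Possible Values: 0 (failure) and 1 (success).
PMF: $P(X = x) = p^x (1 - p)^{1 - x}$, $x \in \{0, 1\}$.
Mean: $E[X] = p$.
Variance: $\mathrm{Var}(X) = p(1 - p)$.
Example: Flipping a fair coin once, where X=1 for heads (success) and X=0 for tails (failure). Here, $p = 0.5$.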
6.1.2 Binomial Distribution
The Binomial distribution models the number of successes in a fixed number of independent Bernoulli trials.
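Parameters: $n$ (number of trials), $p$ (probability of success in each trial).
Notation: $X \sim B(n, p)$.
Possible Values: $k = 0, 1, 2, \dots, n$.
PMF: The probability of getting exactly k successes in n trials is given by:
$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$
where $\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$ is the binomial coefficient.
Mean: $E[X] = np$.
Variance: $\mathrm{Var}(X) = np(1 - p)$.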
Conditions for Binomial Distribution:
1. The experiment consists of a fixed number of trials, $n$.
2. Each trial has only two possible outcomes (success or failure).
3. The probability of success, $p$, is the same for each trial.
4. The trials are independent.
Example: The number of heads obtained when flipping a fair coin 10 times. Here, $n = 10$, $p = 0.5$. The probability of getting
exactly 7 heads is $P(X = 7) = \binom{10}{7} (0.5)^7 (0.5)^3 = \frac{120}{1024} \approx 0.1172$.
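The coin example can be reproduced with the standard library. This is a minimal sketch; `binomial_pmf` is an illustrative helper built on `math.comb`, not a library function.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of exactly 7 heads in 10 fair coin flips.
p7 = binomial_pmf(7, 10, 0.5)
print(round(p7, 4))  # 0.1172

# The PMF sums to 1 and its mean equals n * p = 5.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
mean = sum(k * binomial_pmf(k, 10, 0.5) for k in range(11))
print(round(total, 6), round(mean, 6))  # 1.0 5.0
```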
6.1.3 Poisson Distribution
The Poisson distribution models the number of events occurring in a fixed interval of time or space, given a known
average rate of occurrence. It is often used to model rare events.
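Parameters: $\lambda$ (lambda), the average number of events in the interval. $\lambda > 0$.
Notation: $X \sim \mathrm{Poisson}(\lambda)$.
Possible Values: $k = 0, 1, 2, \dots$ (countably infinite).
PMF: The probability of observing exactly k events is:
$P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}$
where $e \approx 2.71828$ is the base of the natural logarithm.
Mean: $E[X] = \lambda$.
Variance: $\mathrm{Var}(X) = \lambda$.
Approximation: The Poisson distribution can approximate the Binomial distribution when $n$ is large and $p$ is small,
such that $\lambda = np$ is moderate. For example, if $X \sim B(n, p)$ with such $n$ and $p$, then $X$ is approximately $\mathrm{Poisson}(np)$.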
Example: The number of phone calls received by a call center in an hour, given that the average rate is 15 calls
per hour ($\lambda = 15$). The probability of receiving exactly 10 calls in an hour is $P(X = 10) = \frac{e^{-15}\, 15^{10}}{10!} \approx 0.0486$.
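The call-center probability, and the quality of the Poisson approximation to the Binomial, can both be checked in a few lines. In this sketch the values `n = 1000` and `p = 0.015` are assumptions chosen only so that n * p = 15; they do not come from the notes.

```python
from math import comb, exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Call-center example: average of 15 calls per hour, exactly 10 calls.
p10 = poisson_pmf(10, 15)
print(round(p10, 4))  # 0.0486

# Approximation check with assumed values: large n, small p, n * p = 15.
n, p = 1000, 0.015
gap = abs(binomial_pmf(10, n, p) - p10)
print(gap < 0.001)  # True
```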
6.1.4 Geometric Distribution
The Geometric distribution models the number of Bernoulli trials needed to achieve the first success.
Parameters: $p$, the probability of success in each trial.
Notation: $X \sim \mathrm{Geom}(p)$.
Possible Values: $k = 1, 2, 3, \dots$ (number of trials until the first success).
PMF: The probability that the first success occurs on the k-th trial is:
$P(X = k) = (1 - p)^{k - 1}\, p$
Mean: $E[X] = \frac{1}{p}$.
Variance: $\mathrm{Var}(X) = \frac{1 - p}{p^2}$.
Memoryless Property: The Geometric distribution is memoryless, meaning that the probability of future
successes does not depend on the number of past failures. $P(X > m + n \mid X > m) = P(X > n)$.
Example: The number of times a fair coin needs to be tossed until the first head appears. Here, $p = 0.5$. The probability
that the first head appears on the 4th toss is $P(X = 4) = (0.5)^3 (0.5) = 0.0625$.
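Both the coin example and the memoryless property can be verified directly. The sketch below uses the tail probability P(X > k) = (1 - p)^k, which holds because X > k means the first k trials all failed; the values `m = 3` and `n = 2` are arbitrary illustrative choices.

```python
def geometric_pmf(k, p):
    """P(X = k): k - 1 failures followed by the first success."""
    return (1 - p) ** (k - 1) * p

def geometric_tail(k, p):
    """P(X > k): the first k trials are all failures."""
    return (1 - p) ** k

# First head on the 4th toss of a fair coin.
p4 = geometric_pmf(4, 0.5)
print(p4)  # 0.0625

# Memoryless property: P(X > m + n | X > m) equals P(X > n).
m, n = 3, 2
conditional = geometric_tail(m + n, 0.5) / geometric_tail(m, 0.5)
print(conditional == geometric_tail(n, 0.5))  # True
```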
6.1.5 Negative Binomial Distribution
The Negative Binomial distribution generalizes the Geometric distribution. It models the number of trials needed to
achieve a fixed number of successes (r successes).
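Parameters: $r$ (number of successes), $p$ (probability of success in each trial).
Notation: $X \sim \mathrm{NB}(r, p)$.
Possible Values: $k = r, r + 1, r + 2, \dots$
PMF: The probability that the r-th success occurs on the k-th trial is:
$P(X = k) = \binom{k - 1}{r - 1} p^r (1 - p)^{k - r}$
Mean: $E[X] = \frac{r}{p}$.
Variance: $\mathrm{Var}(X) = \frac{r(1 - p)}{p^2}$.
Note: When $r = 1$, the Negative Binomial distribution reduces to the Geometric distribution.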
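Example: The number of coin flips needed to get 3 heads, where the probability of heads is $p$. If we want to find
the probability that the 3rd head occurs on the 7th flip ($r = 3$, $k = 7$), then:
$P(X = 7) = \binom{6}{2} p^3 (1 - p)^4 = 15\, p^3 (1 - p)^4$
For a fair coin ($p = 0.5$), this gives $15/128 \approx 0.1172$.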
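A short sketch of the Negative Binomial PMF, evaluated for the "3rd head on the 7th flip" case. The value `p = 0.5` (a fair coin) is an assumption made for this sketch; the helper name `neg_binomial_pmf` is illustrative.

```python
from math import comb

def neg_binomial_pmf(k, r, p):
    """P(r-th success occurs on trial k): r - 1 successes in the first
    k - 1 trials, followed by a success on trial k."""
    return comb(k - 1, r - 1) * p**r * (1 - p)**(k - r)

# 3rd head on the 7th flip, assuming a fair coin (p = 0.5).
p_fair = neg_binomial_pmf(7, 3, 0.5)
print(p_fair)  # 0.1171875

# With r = 1 the formula reduces to the Geometric PMF (1 - p)^(k - 1) * p.
print(neg_binomial_pmf(4, 1, 0.5) == 0.5**3 * 0.5)  # True
```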
6.1.6 Hypergeometric Distribution
The Hypergeometric distribution models the probability of k successes in n draws, without replacement, from a finite
population of size N containing K successes.
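Parameters: $N$ (population size), $K$ (number of success states in the population), $n$ (number of draws).
Notation: $X \sim \mathrm{Hypergeometric}(N, K, n)$.
Possible Values: $\max(0,\, n + K - N) \le k \le \min(n, K)$.
PMF: The probability of getting exactly k successes in n draws is:
$P(X = k) = \frac{\binom{K}{k} \binom{N - K}{n - k}}{\binom{N}{n}}$
Mean: $E[X] = n \frac{K}{N}$.
Variance: $\mathrm{Var}(X) = n \frac{K}{N} \left(1 - \frac{K}{N}\right) \frac{N - n}{N - 1}$. The term $\frac{N - n}{N - 1}$ is the finite population correction factor.
Approximation: When $N$ is very large compared to $n$ (e.g., $n \le 0.05N$, a common rule of thumb), the Hypergeometric distribution can be approximated
by the Binomial distribution with $p = K/N$.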
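Example: An urn contains 10 red balls and 5 blue balls ($N = 15$, $K = 10$ for red). If 4 balls are drawn without replacement ($n = 4$), what
is the probability that exactly 3 are red?
$P(X = 3) = \frac{\binom{10}{3} \binom{5}{1}}{\binom{15}{4}} = \frac{120 \times 5}{1365} = \frac{40}{91} \approx 0.4396$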
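The urn calculation can be done exactly with integer binomial coefficients. This is a minimal sketch; `hypergeom_pmf` is an illustrative helper name.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(exactly k successes in n draws without replacement)
    from a population of N items containing K successes."""
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))

# Urn with 10 red and 5 blue balls; draw 4; exactly 3 red.
p3 = hypergeom_pmf(3, 15, 10, 4)
print(p3, round(float(p3), 4))  # 40/91 0.4396

# The mean matches n * K / N = 4 * 10 / 15 = 8/3.
mean = sum(k * hypergeom_pmf(k, 15, 10, 4) for k in range(5))
print(mean)  # 8/3
```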
6.2 Continuous Probability Distributions
6.2.1 Uniform Distribution
A continuous random variable is uniformly distributed if it has an equal probability density over a given interval.
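Parameters: $a$ (lower bound), $b$ (upper bound), $a < b$.
Notation: $X \sim U(a, b)$.
PDF: $f(x) = \frac{1}{b - a}$ for $a \le x \le b$, and $f(x) = 0$ otherwise.
CDF: $F(x) = 0$ for $x < a$; $F(x) = \frac{x - a}{b - a}$ for $a \le x \le b$; $F(x) = 1$ for $x > b$.
Mean: $E[X] = \frac{a + b}{2}$.
Variance: $\mathrm{Var}(X) = \frac{(b - a)^2}{12}$.
Example: The time until the next bus arrives, if the bus arrives every 15 minutes and its arrival is unpredictable
within that interval. Let the interval be $[0, 15]$ minutes. $X \sim U(0, 15)$. The PDF is $f(x) = \frac{1}{15}$ for $0 \le x \le 15$. The probability that the bus arrives within the
next 5 minutes is $P(X \le 5) = \int_0^5 \frac{1}{15}\,dx = \frac{5}{15} = \frac{1}{3}$.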
6.2.2 Normal Distribution (Gaussian Distribution)
The Normal distribution is perhaps the most important continuous distribution, widely used in statistics and natural
sciences. It is characterized by its bell-shaped curve.
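Parameters: $\mu$ (mean), $\sigma^2$ (variance), where $\sigma > 0$.
Notation: $X \sim N(\mu, \sigma^2)$.
PDF: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2 / (2\sigma^2)}$ for $-\infty < x < \infty$.
CDF: The CDF does not have a simple closed-form expression and is usually denoted by $\Phi(z)$ for the standard normal
distribution.
Mean: $E[X] = \mu$.
Variance: $\mathrm{Var}(X) = \sigma^2$.
Standard Normal Distribution: A special case where $\mu = 0$ and $\sigma = 1$. It is denoted by $Z \sim N(0, 1)$. The PDF is $\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$, and the CDF is $\Phi(z)$.
Standardization: Any normal random variable X can be converted to a standard normal variable Z using the
transformation: $Z = \frac{X - \mu}{\sigma}$.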
Central Limit Theorem (CLT): A fundamental theorem stating that the sum (or average) of a large number of
independent and identically distributed random variables with finite variance, regardless of their original distribution, will be
approximately normally distributed. This is why the normal distribution appears so frequently. For example, the
distribution of sample means from a population, especially if the population distribution is not strongly skewed, tends to
be close to normal for sample sizes $n \ge 30$ (a common rule of thumb).
Empirical Rule (68-95-99.7 Rule): For a normal distribution:
Approximately 68% of the data falls within 1 standard deviation of the mean ($\mu \pm \sigma$).
Approximately 95% of the data falls within 2 standard deviations of the mean ($\mu \pm 2\sigma$).
Approximately 99.7% of the data falls within 3 standard deviations of the mean ($\mu \pm 3\sigma$).
Example: The IQ scores of a population are often assumed to be normally distributed with a mean of 100 and a
standard deviation of 15 ($X \sim N(100, 15^2)$). The probability that a randomly selected person has an IQ between 85 and 115 is $P(85 \le X \le 115)$.
Since $85 = \mu - \sigma$ and $115 = \mu + \sigma$, this probability is approximately 68% according to the empirical rule.
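The empirical rule can be checked against the normal CDF, which the Python standard library exposes indirectly through the error function: $\Phi(z) = \frac{1}{2}\left(1 + \mathrm{erf}(z/\sqrt{2})\right)$. The sketch below standardizes first and then evaluates $\Phi$; the helper name `normal_cdf` is illustrative.

```python
from math import erf, sqrt

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), via standardization and erf."""
    z = (x - mu) / sigma
    return 0.5 * (1 + erf(z / sqrt(2)))

# IQ example: P(85 <= X <= 115) with mean 100 and standard deviation 15.
prob = normal_cdf(115, 100, 15) - normal_cdf(85, 100, 15)
print(round(prob, 4))  # 0.6827

# Probability of falling within 1, 2 and 3 standard deviations of the mean.
within = [normal_cdf(k) - normal_cdf(-k) for k in (1, 2, 3)]
print([round(w, 4) for w in within])  # [0.6827, 0.9545, 0.9973]
```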
6.2.3 Exponential Distribution
The Exponential distribution models the time until an event occurs in a Poisson process, i.e., a process where events
occur continuously and independently at a constant average rate. It is closely related to the Poisson distribution.
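Parameters: $\lambda$ (rate parameter), where $\lambda > 0$. Note that $\lambda$ is the same parameter as in the Poisson distribution.
Notation: $X \sim \mathrm{Exp}(\lambda)$.
PDF: $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$, and $f(x) = 0$ for $x < 0$.
CDF: $F(x) = 1 - e^{-\lambda x}$ for $x \ge 0$, and $F(x) = 0$ for $x < 0$.
Mean: $E[X] = \frac{1}{\lambda}$. (This is the average time between events).
Variance: $\mathrm{Var}(X) = \frac{1}{\lambda^2}$.
Memoryless Property: Like the Geometric distribution, the Exponential distribution is memoryless. The
probability that an event will occur in the next time interval is independent of how much time has already
elapsed. $P(X > s + t \mid X > s) = P(X > t)$.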
Example: The lifetime of an electronic component that fails randomly is often modeled by an exponential
distribution. If the average failure rate is $\lambda$ failures per hour, then the average lifetime is $1/\lambda$ hours. The probability that
the component lasts longer than 150 hours is $P(X > 150) = 1 - F(150) = e^{-150\lambda}$ (for instance, $e^{-1.5} \approx 0.2231$ if $\lambda = 0.01$).
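The survival probability and the memoryless property can be evaluated numerically. In this sketch the rate `lam = 0.01` failures per hour (mean lifetime 100 hours) is an assumed value for illustration, as are `s` and `t`; `exp_survival` is an illustrative helper name.

```python
from math import exp, isclose

def exp_survival(t, lam):
    """P(X > t) = 1 - F(t) = exp(-lam * t) for t >= 0."""
    return exp(-lam * t)

lam = 0.01  # assumed failure rate per hour (illustrative only)

# Probability that the component lasts longer than 150 hours.
p150 = exp_survival(150, lam)
print(round(p150, 4))  # 0.2231

# Memoryless property: P(X > s + t | X > s) equals P(X > t).
s, t = 100, 50
conditional = exp_survival(s + t, lam) / exp_survival(s, lam)
print(isclose(conditional, exp_survival(t, lam)))  # True
```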
6.2.4 Chi-Squared Distribution ($\chi^2$)
The Chi-Squared distribution is a continuous probability distribution that arises in statistics, particularly in
hypothesis testing and confidence interval estimation for a population variance.
Parameters: $k$ (degrees of freedom), $k > 0$.
Relationship to Normal Distribution: If $Z_1, Z_2, \dots, Z_k$ are independent standard normal random variables, then the sum of
their squares follows a chi-squared distribution with $k$ degrees of freedom: $\sum_{i=1}^{k} Z_i^2 \sim \chi^2(k)$.
PDF: The PDF is non-negative and defined for $x \ge 0$. It involves the Gamma function:
$f(x) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{k/2 - 1} e^{-x/2}$
Mean: $E[X] = k$.
Variance: $\mathrm{Var}(X) = 2k$.
Shapes: The shape of the distribution depends on the degrees of freedom ($k$). For small $k$, it is skewed to the right.
As $k$ increases, the distribution becomes more symmetric and approaches a normal distribution.
Uses: Used in goodness-of-fit tests, tests for independence in contingency tables (e.g., Pearson's chi-squared
test), and confidence intervals for variance.
Example: If we have 5 independent standard normal variables, their sum of squares follows a $\chi^2$ distribution with 5
degrees of freedom. The mean of this distribution is 5, and the variance is $2 \times 5 = 10$.
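The stated mean and variance can be sanity-checked by integrating the chi-squared PDF numerically for 5 degrees of freedom. This is a rough sketch using the midpoint rule; the cut-off `upper = 100` is an assumption that ignores a negligible tail, and the function names are illustrative.

```python
from math import exp, gamma

def chi2_pdf(x, k):
    """Chi-squared density with k degrees of freedom, for x > 0."""
    return x ** (k / 2 - 1) * exp(-x / 2) / (2 ** (k / 2) * gamma(k / 2))

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

k = 5
upper = 100  # assumed cut-off; the tail beyond it is negligible for k = 5

total = integrate(lambda x: chi2_pdf(x, k), 0, upper)
mean = integrate(lambda x: x * chi2_pdf(x, k), 0, upper)
var = integrate(lambda x: (x - mean) ** 2 * chi2_pdf(x, k), 0, upper)

print(round(total, 3), round(mean, 3), round(var, 3))  # 1.0 5.0 10.0
```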
6.2.5 Gamma Distribution
The Gamma distribution is a flexible continuous distribution that is often used to model waiting times, or the sum of
exponentially distributed random variables.
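Parameters: $\alpha$ (shape parameter, $\alpha > 0$), $\beta$ (rate parameter, $\beta > 0$). Sometimes a scale parameter $\theta = 1/\beta$ is used.
Notation: $X \sim \mathrm{Gamma}(\alpha, \beta)$.
PDF: $f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha - 1} e^{-\beta x}$ for $x > 0$,
where $\Gamma(\alpha)$ is the Gamma function ($\Gamma(n) = (n - 1)!$ for positive integer n).
Mean: $E[X] = \frac{\alpha}{\beta}$.
Variance: $\mathrm{Var}(X) = \frac{\alpha}{\beta^2}$.
Relationship to Exponential: If $\alpha = 1$, then X follows an Exponential distribution with rate $\beta$. The Gamma distribution
with shape parameter $\alpha = n$ (a positive integer) can be viewed as the distribution of the sum of $n$ independent exponential random
variables with rate $\beta$.
Relationship to Chi-Squared: If $\alpha = k/2$ and $\beta = 1/2$, then X follows a Chi-Squared distribution with $k$ degrees of freedom.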
Uses: Modeling lifetimes, waiting times, rainfall amounts, and financial data.
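The two special cases of the Gamma distribution can be confirmed by comparing density values pointwise. In the sketch below the test points `xs`, the rate `0.3`, and the choice of 5 degrees of freedom are arbitrary illustrative values; the function names are likewise illustrative.

```python
from math import exp, gamma, isclose

def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and rate beta, for x > 0."""
    return beta**alpha * x ** (alpha - 1) * exp(-beta * x) / gamma(alpha)

def exponential_pdf(x, lam):
    """Exponential density with rate lam, for x >= 0."""
    return lam * exp(-lam * x)

def chi2_pdf(x, k):
    """Chi-squared density with k degrees of freedom, for x > 0."""
    return x ** (k / 2 - 1) * exp(-x / 2) / (2 ** (k / 2) * gamma(k / 2))

xs = [0.5, 1.0, 2.5, 7.0]  # arbitrary test points

# Gamma(alpha = 1, beta) coincides with Exponential(rate = beta).
matches_exponential = all(
    isclose(gamma_pdf(x, 1, 0.3), exponential_pdf(x, 0.3)) for x in xs
)

# Gamma(alpha = k/2, beta = 1/2) coincides with Chi-Squared(k), here k = 5.
matches_chi2 = all(
    isclose(gamma_pdf(x, 5 / 2, 1 / 2), chi2_pdf(x, 5)) for x in xs
)

print(matches_exponential, matches_chi2)  # True True
```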
7. Key Concepts Summary
Random Variable: Maps outcomes of random experiments to real numbers.
Discrete: Takes countable values (e.g., integers).
Continuous: Takes values in an interval.
Probability Distribution: Describes the likelihood of a random variable's values.
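PMF (Discrete): $p(x) = P(X = x)$. $\sum_x p(x) = 1$.
PDF (Continuous): $f(x) \ge 0$. $\int_{-\infty}^{\infty} f(x)\,dx = 1$. $P(a \le X \le b) = \int_a^b f(x)\,dx$.
CDF: $F(x) = P(X \le x)$. Non-decreasing, $0 \le F(x) \le 1$.
Expected Value (Mean): $E[X] = \mu$. Weighted average of values.
Discrete: $E[X] = \sum_x x\,p(x)$.
Continuous: $E[X] = \int_{-\infty}^{\infty} x\,f(x)\,dx$.
Variance: $\mathrm{Var}(X) = E[(X - \mu)^2]$. Measures spread around the mean. $\mathrm{Var}(X) = E[X^2] - (E[X])^2$.
Standard Deviation: $\sigma = \sqrt{\mathrm{Var}(X)}$.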
Common Distributions:
Discrete: Bernoulli, Binomial, Poisson, Geometric, Negative Binomial, Hypergeometric.
Continuous: Uniform, Normal, Exponential, Chi-Squared, Gamma.
8. Key Points to Remember
The sum of probabilities for a discrete distribution must equal 1.
The total area under the PDF curve for a continuous distribution must equal 1.
The Normal distribution's shape is determined by its mean and variance.
The Central Limit Theorem is crucial for understanding the prevalence of the Normal distribution in applied
statistics.
Memoryless property is a key characteristic of Geometric and Exponential distributions.
Binomial distribution requires independent trials with constant probability of success.
Poisson distribution models counts of events over time/space at a constant average rate.
Hypergeometric distribution applies to sampling without replacement from a finite population.
9. Possible Exam Questions
1. Explain the difference between a discrete and a continuous random variable, providing examples for each.
2. Define Probability Mass Function (PMF) and Probability Density Function (PDF). What are their key properties?
3. Derive the formulas for the mean and variance of a Binomial distribution, $X \sim B(n, p)$.
4. Discuss the conditions under which a Poisson distribution can be used to model a random phenomenon. How
does it relate to the Binomial distribution?
5. Explain the properties of the Normal distribution. What is the significance of the Standard Normal distribution
and how is it used?
6. A manufacturer produces light bulbs with a lifetime that follows an Exponential distribution with a mean lifetime
of 1000 hours. Calculate the probability that a bulb lasts more than 1500 hours. Also, verify the memoryless
property for this distribution.
7. Suppose a batch of 20 items contains 4 defective items. If 5 items are randomly selected without replacement,
what is the probability that exactly 2 are defective? Identify the distribution used and state its parameters.
Prepared by Lokesh Sir Notes | Quality at its peak