Continuous Random Variable
Ashek Seum
Lecturer, Dept. of CSE, AUST
[Link]@[Link]
Outline
• Random Variables
• Probability Distributions
• Normal Distribution
• Covariance Matrix
• Correlation coefficient
Random Experiment / Event
• Experiment that produces outcomes in a way that is unpredictable
• Even if we know all the possible outcomes
• Example 1:
• Tossing a coin
• Known Outcomes: Heads (H) or Tails (T)
• But we can’t predict whether it will be H or T before tossing
• Example 2:
• Rolling a die
• Known Outcomes: 1, 2, 3, 4, 5, or 6
• For a fair die, each number has an equal chance of appearing
Random Variable
• Variable whose possible values are numerical outcomes of a random
phenomenon.
• For example:
• It takes the unpredictable outcomes of an experiment (like flipping a coin)
and assigns numbers to them (heads = 1, tails = 0).
• Mathematically:
• A random variable can take on a specific set of values.
• Each possible value of a random variable has a certain probability of
occurring.
• Usually denoted by capital letters such as X or Y
Random Variable Types
• Discrete Random Variable: A random variable that can take only a
countable number of distinct values.
• Example: Outcome of a coin toss, number of children in a random family, etc.
• Continuous Random Variable: A random variable that can take any value
within an interval, so it has uncountably many possible values. Such
variables are usually measurements.
• Example: Exact weight of a random animal, exact time to finish a
marathon, etc.
Probability Distributions
• Probability Mass Function: Gives the probability of each possible outcome
for a discrete random variable.
• Example:
• Let’s say discrete RV, X = number of ‘heads’ in three coin tosses
• Sample space = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
• P(X = 0) = 1/8; P(X = 1) = 3/8; P(X = 2) = 3/8; P(X = 3) = 1/8
• Distribution:
[Figure: bar chart of the PMF. Horizontal axis: values of X (0, 1, 2, 3); vertical axis: probabilities (1/8, 3/8, 3/8, 1/8).]
• So, the PMF: f(x) = P(X = x), where the discrete variable X takes the value x
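The PMF for the three-toss example can be reproduced by enumerating the sample space. This is a minimal sketch that assumes a fair coin; the names `sample_space` and `pmf` are illustrative and do not come from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the sample space of three tosses of a coin assumed to be fair.
sample_space = ["".join(outcome) for outcome in product("HT", repeat=3)]

# X = number of heads in an outcome; each of the 8 outcomes is equally likely.
counts = Counter(outcome.count("H") for outcome in sample_space)
pmf = {x: Fraction(n, len(sample_space)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P(X = {x}) = {p}")
```

Exact fractions are used so the probabilities can be compared directly with the values 1/8 and 3/8 listed in the example.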
Probability Distributions
• Probability Density Function: Describes the likelihood of a continuous
random variable taking on a range of values.
• Example:
• Let’s say, continuous RV, Y = exact amount of rain tomorrow.
• Now, what is the probability of raining exactly 2 inches tomorrow ?
Or, P( Y =2 ) = ?
• It’s 0 (zero).
• For a continuous RV, the probability of any single exact value is 0, because Y
can take uncountably many values and no single one of them carries positive probability.
• So, we measure the probability within a range of values.
• Instead of asking P(Y=2) = ?, we’ll ask for example, P(1.9 < Y < 2.1) = ?
• If we want the probability, we actually
need the area under the PDF curve between 1.9 and 2.1.
• So, $P(1.9 < Y < 2.1) = \int_{1.9}^{2.1} f(y)\,dy$
• In general terms, $P(a < Y < b) = \int_{a}^{b} f(y)\,dy$
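A probability over a range can be approximated numerically as an area under the density. The slides do not say how rainfall is distributed, so this sketch assumes, purely for illustration, that Y is normal with mean 2.0 inches and standard deviation 0.5 inches. `MU`, `SIGMA`, `pdf`, `area` and `cdf` are hypothetical names.

```python
import math

MU, SIGMA = 2.0, 0.5  # hypothetical parameters, not taken from the slides

def pdf(y):
    """Normal density with mean MU and standard deviation SIGMA."""
    z = (y - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2.0 * math.pi))

def area(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def cdf(y):
    """Normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf((y - MU) / (SIGMA * math.sqrt(2.0))))

p_numeric = area(pdf, 1.9, 2.1)   # P(1.9 < Y < 2.1) as an area
p_exact = cdf(2.1) - cdf(1.9)     # same probability from the CDF
p_point = area(pdf, 2.0, 2.0)     # zero-width interval, i.e. P(Y = 2)

print(p_numeric, p_exact, p_point)
```

The zero-width interval gives an area of 0, which matches the statement that P(Y = 2) = 0 for a continuous random variable.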
Probability Density Function (PDF)
• A famous PDF: Normal (or Gaussian) distribution.
• Why is this important ?
• It models many real-world quantities such as heights, weights, test scores,
measurement errors, etc.
• It describes data that tends to cluster around an average value (the
"center") and spreads out symmetrically on both sides.
Normal Distribution
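• Mathematically, the normal distribution:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
• Here,
$x$ = The value of the random variable we’re evaluating (e.g., a height or a test score).
$\mu$ = The mean of the distribution.
$\sigma$ = The standard deviation of the distribution ($\sigma^2$ is the variance).
• The normal distribution is often referred to as $N(\mu, \sigma^2)$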
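The normal density with mean mu and standard deviation sigma can be evaluated directly. This is a small sketch; the function name `normal_pdf` and the height parameters (mean 170 cm, standard deviation 10 cm) are assumptions made for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std. deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# Hypothetical example: heights with mean 170 cm and standard deviation 10 cm.
for height in (150, 160, 170, 180, 190):
    print(height, round(normal_pdf(height, 170.0, 10.0), 5))
```

The printed densities peak at the mean and are equal at equal distances on either side of it, which is the symmetric clustering around the center described earlier.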
Normal Distribution
• Mean ($\mu$)
• If you know every value of the population &
all values are equally likely, then mean: $\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$
• Example:
• If you have the ages of all students in a class: [15, 16, 17, 18], the
mean is: $\mu = (15 + 16 + 17 + 18) / 4 = 16.5$
Normal Distribution
• Mean ($\mu$)
• If all values are not equally likely, for a discrete
RV X with probabilities $P(x_i)$: $\mu = \sum_{i} x_i \, P(x_i)$
• Here, instead of treating all values equally, we give more importance (or
"weight") to the values that have a higher probability of happening.
• For random variables, the mean represents its “expected value”.
• If we repeat the random experiment many times, $\mu$ is the average
value we would expect to obtain.
• We can write: $\mu = E[X]$
• Example:
• Imagine rolling a weighted die where the outcomes - {1, 2, 3, 4, 5, 6 }
• Their probabilities - {0.1, 0.2, 0.3, 0.1, 0.2, 0.1}
• So, here mean, $\mu$ = (1*0.1) + (2*0.2) + (3*0.3) + (4*0.1) + (5*0.2) + (6*0.1) = 3.4
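The weighted-die calculation can be checked in a few lines, together with a simulation of the "repeat the experiment many times" reading of the mean. This is a sketch; the seed and the number of simulated rolls are arbitrary choices, not values from the slides.

```python
import random

outcomes = [1, 2, 3, 4, 5, 6]
probabilities = [0.1, 0.2, 0.3, 0.1, 0.2, 0.1]

# A valid PMF must sum to 1 (checked with a floating-point tolerance).
assert abs(sum(probabilities) - 1.0) < 1e-12

# Expected value: each outcome weighted by its probability.
mean = sum(x * p for x, p in zip(outcomes, probabilities))

# Simulate many rolls of the weighted die and average them.
random.seed(0)  # arbitrary seed so the run is repeatable
rolls = random.choices(outcomes, weights=probabilities, k=100_000)
simulated_mean = sum(rolls) / len(rolls)

print(round(mean, 10), round(simulated_mean, 3))
```

The simulated average should land close to the weighted sum, and the gap typically shrinks as the number of rolls grows.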
Normal Distribution
• Mean ($\mu$)
• For a continuous RV X with PDF $f(x)$: $\mu = E[X] = \int_{-\infty}^{\infty} x \, f(x)\,dx$
• Instead of probabilities P(x), we use a PDF f(x), which tells us the relative
likelihood of X being near a specific value.
• The integral in the continuous RV is analogous to summation in the discrete
case.
• It "sums up" the contributions of all possible values x over the range.
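The "integral as a sum" idea can be illustrated numerically. The density below, f(x) = 2x on [0, 1], is a hypothetical choice made only because its mean (2/3) is easy to verify by hand; `f` and `integrate` are illustrative names.

```python
def f(x):
    """Hypothetical PDF: f(x) = 2x on [0, 1], and 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint rule: add up g at the center of n thin slices of [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

total = integrate(f, 0.0, 1.0)                  # area under a PDF is 1
mean = integrate(lambda x: x * f(x), 0.0, 1.0)  # sum of x * f(x) * dx

print(total, mean)
```

Each slice contributes x * f(x) * h to the mean, which mirrors the x * P(x) terms of the discrete sum.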
Normal Distribution
• Variance ($\sigma^2$)
• Tells us how spread out the values of a random variable are from its mean
• For a random variable X, variance (or Var[X]): $\sigma^2 = \mathrm{Var}[X] = E[(X - \mu)^2]$
• Here,
$(X - \mu)$: The deviation of a value from the mean
$(X - \mu)^2$: The squared deviation ensures all differences are non-negative
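• For Discrete Random Variables: $\sigma^2 = \sum_{i} (x_i - \mu)^2 \, P(x_i)$
• For Continuous Random Variables: $\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 \, f(x)\,dx$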
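As a worked example of the discrete case, the variance of the weighted die from the mean example can be computed as the probability-weighted average of squared deviations. This is a sketch; the shortcut Var[X] = E[X^2] - (E[X])^2 is a standard identity that the slides do not state.

```python
import math

outcomes = [1, 2, 3, 4, 5, 6]
probabilities = [0.1, 0.2, 0.3, 0.1, 0.2, 0.1]

# Mean: probability-weighted sum of the outcomes.
mean = sum(x * p for x, p in zip(outcomes, probabilities))

# Variance: probability-weighted sum of squared deviations from the mean.
variance = sum((x - mean) ** 2 * p for x, p in zip(outcomes, probabilities))

# Equivalent shortcut: E[X^2] - (E[X])^2.
variance_alt = sum(x * x * p for x, p in zip(outcomes, probabilities)) - mean ** 2

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(round(variance, 6), round(variance_alt, 6), round(std_dev, 6))
```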
Normal Distribution
• Standard Deviation ($\sigma$): the square root of the variance, $\sigma = \sqrt{\mathrm{Var}[X]}$
• A small variance (thus std. deviation) means the data points are close to the
mean (less spread out).
• A large variance (thus std. deviation) means the data points are far from the
mean (more spread out).
Normal Distribution
• For a normal (Gaussian) distribution, about 68% of the data values are within one standard
deviation of the mean, and about 95% are within two standard deviations
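The 68% and 95% figures can be checked with the error function, since for any normal distribution the probability of falling within k standard deviations of the mean is erf(k / sqrt(2)). This is a brief sketch; `prob_within` is an illustrative name.

```python
import math

def prob_within(k):
    """P(|X - mu| < k * sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
```

The result does not depend on the particular mean or standard deviation, only on how many standard deviations wide the interval is.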
Covariance
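• Covariance is a statistical measure of how two random variables, say
X1 and X2, vary together.
• It tells us whether an increase in one variable corresponds to an increase or decrease in the other.
• Mathematically, $\mathrm{Cov}(X_1, X_2) = E[(X_1 - \mu_1)(X_2 - \mu_2)]$, where $\mu_1 = E[X_1]$ and $\mu_2 = E[X_2]$
• If X1 = X2, covariance becomes variance: $\mathrm{Cov}(X_1, X_1) = E[(X_1 - \mu_1)^2] = \mathrm{Var}[X_1]$
• If $\mathrm{Cov}(X_1, X_2)$ > 0: X1 and X2 tend to increase together (positive relationship).
• If < 0: One tends to decrease as the other increases (negative relationship).
• If = 0: No linear relationship between X1 and X2. They are uncorrelated, which does not by itself mean they are independent.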
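The sign behaviour of covariance can be seen on a few small data sets. This sketch uses the population form (dividing by the number of observations) to mirror the expectation-based definition; all data values are made up for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance: average of (x - mean_x) * (y - mean_y)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical paired observations.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 4.0, 5.0, 4.0, 6.0]  # tends to rise with x1
x3 = [9.0, 7.0, 6.0, 4.0, 1.0]  # tends to fall as x1 rises

cov_pos = covariance(x1, x2)   # positive
cov_neg = covariance(x1, x3)   # negative
var_x1 = covariance(x1, x1)    # covariance of a variable with itself = variance

# Covariance only captures linear association: here ys is fully determined
# by xs (ys = xs squared), yet the covariance comes out as zero.
xs = [-1.0, 0.0, 1.0]
ys = [x * x for x in xs]
cov_zero = covariance(xs, ys)

print(cov_pos, cov_neg, var_x1, cov_zero)
```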
Covariance Matrix
• A square matrix that contains the covariances between all pairs of random variables in a dataset.
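• If we have $n$ random variables, the covariance matrix is $n \times n$
• For the case of three random variables (i.e., features), the covariance matrix is:
$$\Sigma = \begin{pmatrix} \mathrm{Var}(X_1) & \mathrm{Cov}(X_1, X_2) & \mathrm{Cov}(X_1, X_3) \\ \mathrm{Cov}(X_2, X_1) & \mathrm{Var}(X_2) & \mathrm{Cov}(X_2, X_3) \\ \mathrm{Cov}(X_3, X_1) & \mathrm{Cov}(X_3, X_2) & \mathrm{Var}(X_3) \end{pmatrix}$$
• The diagonal entries are the variances of each variable: $\Sigma_{ii} = \mathrm{Var}(X_i)$
• For off-diagonal entries: $\Sigma_{ij} = \mathrm{Cov}(X_i, X_j) = \mathrm{Cov}(X_j, X_i)$, so the matrix is symmetric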
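A covariance matrix can be assembled by computing the covariance of every pair of variables. This is a minimal sketch in plain Python, using the population form (dividing by the number of observations); the data set and the function names are hypothetical.

```python
def covariance(xs, ys):
    """Population covariance of two equally long lists of observations."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def covariance_matrix(columns):
    """columns: list of n equally long lists, one list per random variable."""
    n = len(columns)
    return [[covariance(columns[i], columns[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical data: three features, five observations each.
data = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 5.0, 4.0, 6.0],
    [9.0, 7.0, 6.0, 4.0, 1.0],
]

sigma = covariance_matrix(data)
for row in sigma:
    print([round(v, 3) for v in row])
```

Entry (i, i) is the variance of feature i, and entries (i, j) and (j, i) are equal, so only the upper triangle strictly needs to be computed.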
Covariance Matrix Example
• Let’s consider three random variables (or features) X, Y and Z
• They represent height (X), weight (Y) and age (Z) of individuals in a dataset.
• Suppose we calculated the following -
• Variance:
• Var(X) = 4; variance of height (X)
• Var(Y) = 9; variance of weight (Y)
• Var(Z) = 16; variance of age (Z)
• Covariance:
• Cov(X,Y) = 2; Covariance between height and weight.
• Cov(X,Z) = -1; Covariance between height and age.
• Cov(Y,Z) = 3; Covariance between weight and age.
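• The covariance matrix Σ is:
$$\Sigma = \begin{pmatrix} \mathrm{Var}(X) & \mathrm{Cov}(X,Y) & \mathrm{Cov}(X,Z) \\ \mathrm{Cov}(Y,X) & \mathrm{Var}(Y) & \mathrm{Cov}(Y,Z) \\ \mathrm{Cov}(Z,X) & \mathrm{Cov}(Z,Y) & \mathrm{Var}(Z) \end{pmatrix}$$
or,
$$\Sigma = \begin{pmatrix} 4 & 2 & -1 \\ 2 & 9 & 3 \\ -1 & 3 & 16 \end{pmatrix}$$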
• Cov(X,Y) = 2 : means height and weight tend to increase together
• Cov(X,Z) = -1: means as height increases, age tends to decrease (possibly younger individuals are
taller in this group)
• Cov(Y,Z) = 3: means older individuals tend to have higher weights
Covariance Matrix Example
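• Correlation coefficient:
• Normalizes the covariance to the range -1 to +1
• by dividing it by the product of the standard deviations of the two variables: $\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \, \sigma_Y}$
• Correlation is always between −1 and 1, where:
• +1 means a perfect positive linear relationship
• −1 means a perfect negative linear relationship
• 0 means no linear relationship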
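Applying this normalization to the height, weight and age example gives the correlation coefficients directly. The sketch below uses the variances (4, 9, 16) and covariances (2, -1, 3) from the example; the variable names are illustrative.

```python
import math

names = ["X (height)", "Y (weight)", "Z (age)"]

# Covariance matrix built from the example's variances and covariances.
cov = [
    [4.0, 2.0, -1.0],
    [2.0, 9.0, 3.0],
    [-1.0, 3.0, 16.0],
]

# Standard deviations are the square roots of the diagonal entries.
std = [math.sqrt(cov[i][i]) for i in range(3)]

# Correlation: covariance divided by the product of the two std. deviations.
corr = [[cov[i][j] / (std[i] * std[j]) for j in range(3)] for i in range(3)]

for i in range(3):
    for j in range(i + 1, 3):
        print(f"corr({names[i]}, {names[j]}) = {corr[i][j]:.3f}")
```

Unlike the raw covariances, these values are unit-free, so the strength of the three relationships can be compared with one another.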
THE END