
Understanding the Normal Distribution

The document summarizes key aspects of the normal distribution including its bell-shaped curve defined by a mean and variance. It discusses how the normal distribution can model quantitative variables and provides examples. It also outlines how to calculate probabilities using the normal distribution, including using z-scores and standard normal tables. Finally, it introduces sampling distributions and the central limit theorem, noting how sample statistics like the sample mean are used to estimate population parameters.
Copyright
© Attribution Non-Commercial (BY-NC)

The Normal distribution

The normal probability distribution is the most common model for the relative frequencies of a quantitative variable. It is bell-shaped and described by the function

\[ f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{1}{2\sigma^2}(y - \mu)^2 \right\}, \]

where $y$ is the quantitative variable whose relative frequencies we are modeling, $\mu$ is the mean of the distribution, and $\sigma^2$ is the variance. For fixed values of $\mu$ and $\sigma$ (e.g., 0 and 1) we can evaluate $f(y)$ for a range of values of $y$. The plot of $f(y)$ against $y$ has the familiar bell shape. See figures.
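Evaluating $f(y)$ over a range of values can be sketched in a few lines of Python. This is only an illustration: the function name `normal_pdf` and the grid from -3 to 3 are choices made here, not part of the course material.

```python
import math

def normal_pdf(y, mu=0.0, sigma=1.0):
    """Density f(y) of a normal distribution with mean mu and standard deviation sigma."""
    coef = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coef * math.exp(-0.5 * ((y - mu) / sigma) ** 2)

# Evaluate f(y) on a grid from -3 to 3 (step 0.5) for mu = 0 and sigma = 1.
grid = [k / 2 for k in range(-6, 7)]
densities = [normal_pdf(y) for y in grid]

for y, f in zip(grid, densities):
    print(f"{y:5.1f}  {f:.4f}")
```

Plotting `densities` against `grid` would trace the bell shape: the values peak at $y = \mu$ and fall off symmetrically on either side.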

Normal distribution (cont'd)

Examples of possibly normally distributed variables:
- Sale price of houses in a given area.
- Interest rates offered by financial institutions in the US.
- Corn yields across the 99 counties in Iowa.

Examples of variables that are not normally distributed:
- Number of traffic accidents per week at each intersection in Ames.
- Proportion of US population who voted for Bush, Gore, or Nader in 2000.
- Proportion of consumers who prefer Coke or Pepsi.

Normal distribution (cont'd)

If the normal distribution is a good model for the relative frequencies of $y$, then we say that $y$ is distributed as a normal random variable and write $y \sim N(\mu, \sigma^2)$. We will be interested in estimating $\mu$ and $\sigma^2$ from sample data. Given estimates of $\mu$ and $\sigma^2$, we can also estimate the probability that $y$ is in the interval $(a, b)$. From calculus:

\[ \mathrm{Prob}(a < y < b) = \int_a^b f(y)\,dy. \]
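The integral has no closed form, but it can be approximated numerically. The sketch below uses the trapezoidal rule, which is a technique chosen here for illustration and not one the slides prescribe; the names `normal_pdf` and `prob_between` and the step count are assumptions.

```python
import math

def normal_pdf(y, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    coef = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coef * math.exp(-0.5 * ((y - mu) / sigma) ** 2)

def prob_between(a, b, mu, sigma, steps=10_000):
    """Approximate Prob(a < y < b) with the trapezoidal rule on `steps` subintervals."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for k in range(1, steps):
        total += normal_pdf(a + k * h, mu, sigma)
    return total * h

# Area within one standard deviation of the mean for a standard normal variable.
print(round(prob_between(-1.0, 1.0, mu=0.0, sigma=1.0), 4))
```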

No need to know calculus! Tables of probabilities under the normal distribution can be used to calculate any probability of interest. Example: Table C.1 in text. Entries in the table give the probability under the curve between 0 and $z$, where

\[ z = \frac{y - \mu}{\sigma}. \]

The variable $z$ is called a standard normal random variable, and its distribution is a normal distribution with mean $\mu = 0$ and variance $\sigma^2 = 1$.
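For readers without the table at hand, entries of this kind can be reproduced with the error function in Python's standard library, using the identity that the area between 0 and $z$ equals $\tfrac{1}{2}\,\mathrm{erf}(z/\sqrt{2})$. The helper names `standardize` and `area_0_to_z` are made up for this sketch.

```python
import math

def standardize(y, mu, sigma):
    """Convert y to a z-score: subtract the mean, divide by the standard deviation."""
    return (y - mu) / sigma

def area_0_to_z(z):
    """Area under the standard normal curve between 0 and z, for z >= 0."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

# Two lookups of the kind a standard normal table provides.
print(round(area_0_to_z(1.00), 4))
print(round(area_0_to_z(1.33), 4))
```

For a negative $z$, symmetry gives the area between $z$ and 0 as `area_0_to_z(-z)`.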

Normal distribution (cont'd)

Example 1: $y \sim N(50, 225)$. Then

\[ z = \frac{y - 50}{15} \sim N(0, 1). \]

To compute the probability that $y$ is between 35 and 70, we first standardize and then use the table:

1. Standardize by subtracting the mean and dividing by the standard deviation. For $y = 70$:
\[ z = \frac{70 - 50}{15} = 1.33, \]
and for $y = 35$:
\[ z = \frac{35 - 50}{15} = -1.00. \]
2. Get the area under the curve between 0 and 1.33 and between $-1.00$ and 0, and add them up: $0.4082 + 0.3413 = 0.7495$.
3. Interpretation in English: if $y$ is normal with mean 50 and standard deviation 15, then the probability that $y$ is between 35 and 70 is about 75%.
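Example 1 can be cross-checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a quick verification sketch, not the table-based method of the text. Because the software does not round $z$ to two decimals, the answer can differ from the table-based 0.7495 in the third decimal place.

```python
from statistics import NormalDist

# y ~ N(50, 225), so the standard deviation is sqrt(225) = 15.
y_dist = NormalDist(mu=50, sigma=15)

# Prob(35 < y < 70) as a difference of cumulative probabilities.
p_between = y_dist.cdf(70) - y_dist.cdf(35)
print(f"Prob(35 < y < 70) is about {p_between:.3f}")
```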

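Normal distribution (cont'd)

Example 2: $y \sim N(10, 64)$. Find the probability that $y$ is bigger than 15 and also the probability that $y$ is less than 7.

\[
\begin{aligned}
\mathrm{Prob}(y > 15) &= \mathrm{Prob}\left( \frac{y - 10}{8} > \frac{15 - 10}{8} \right) \\
&= \mathrm{Prob}(z > 0.625) \\
&= 1 - \mathrm{Prob}(z < 0.625) \\
&= 1 - (0.5 + 0.2357) = 0.264.
\end{aligned}
\]

See figure.

\[
\begin{aligned}
\mathrm{Prob}(y < 7) &= \mathrm{Prob}\left( \frac{y - 10}{8} < \frac{7 - 10}{8} \right) \\
&= \mathrm{Prob}(z < -0.375) \\
&= 1 - \mathrm{Prob}(z > -0.375) \\
&= 1 - (0.5 + 0.1480) = 0.352.
\end{aligned}
\]

See figure.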
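The two tail probabilities in Example 2 can be checked the same way. Again this is a sketch using `statistics.NormalDist`, not the table lookup from the text. The table requires rounding $z$ to two decimals (0.63 and 0.38), so the software values differ slightly from 0.264 and 0.352.

```python
from statistics import NormalDist

# y ~ N(10, 64), so the standard deviation is sqrt(64) = 8.
y_dist = NormalDist(mu=10, sigma=8)

p_above_15 = 1.0 - y_dist.cdf(15)   # upper tail: Prob(y > 15)
p_below_7 = y_dist.cdf(7)           # lower tail: Prob(y < 7)

print(f"Prob(y > 15) is about {p_above_15:.3f}")
print(f"Prob(y < 7)  is about {p_below_7:.3f}")
```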

Normal distribution (cont'd)

- For any value of $(\mu, \sigma^2)$, $y$ is equally likely to be above or below the mean: $\mathrm{Prob}(y < \mu) = \mathrm{Prob}(y > \mu) = 0.5$, because the normal distribution is symmetric about the mean.
- Because of symmetry, the mean, median and mode of a normal variable are the same.
- A normal random variable can take on any value on the real line, so that $\mathrm{Prob}(-\infty < y < \infty) = 1$.
- The probability that $y$ is within one standard deviation of the mean is approximately 0.68:
\[ \mathrm{Prob}(\mu - \sigma < y < \mu + \sigma) = \mathrm{Prob}(-1 < z < 1) \approx 0.68, \]
and also:
\[ \mathrm{Prob}(\mu - 2\sigma < y < \mu + 2\sigma) = \mathrm{Prob}(-2 < z < 2) \approx 0.95 \]
and
\[ \mathrm{Prob}(\mu - 3\sigma < y < \mu + 3\sigma) = \mathrm{Prob}(-3 < z < 3) \approx 0.997. \]
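The one-, two-, and three-standard-deviation probabilities are easy to confirm numerically. The short check below assumes Python 3.8 or later for `statistics.NormalDist`; the dictionary name `within` is arbitrary.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Prob(-k < z < k) for k = 1, 2, 3 standard deviations.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within {k} standard deviation(s): {p:.4f}")
```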

Sampling distributions
We use sample data to make inferences about populations. In particular, we compute sample statistics and use them as estimators of population parameters. The sample mean $\bar{y}$ is a good estimator of the population mean $\mu$, and the sample variance $S^2$ (or $\hat{\sigma}^2$) is a good estimator of the population variance $\sigma^2$. How reliable an estimator is a sample statistic? To answer the question, we need to know the sampling distribution of the statistic. This is one of the most difficult concepts in statistics!

Sampling distributions (cont'd)

Suppose that $y \sim N(\mu, 25)$ and we wish to estimate $\mu$. We proceed as follows:

1. Draw a sample of size $n$ from the population: $y_1, y_2, \ldots, y_n$.
2. Compute the sample mean: $\bar{y} = n^{-1} \sum_i y_i$.

Two things to note:

1. The sample is random! If I had collected more than one sample of size $n$ from the same population, I would have obtained different values of $\bar{y}$. Then, $\bar{y}$ is also a random variable.
2. The larger $n$, the more reliable is $\bar{y}$ as an estimator of $\mu$.

Example using simulation: pretend that we have $y \sim N(20, 25)$ and draw 30 random samples, each of size $n = 10$, from the population using the computer. With each sample, compute $\bar{y}$.
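The simulation described above might look like the following in Python. The seed and the variable names are arbitrary choices for this sketch; the population $N(20, 25)$, the 30 samples, and the sample size $n = 10$ come from the example.

```python
import random
import statistics

random.seed(1)  # arbitrary seed so the run is repeatable

MU, SIGMA = 20.0, 5.0        # N(20, 25): the variance is 25, so sigma is 5
N, NUM_SAMPLES = 10, 30      # 30 samples, each of size 10

sample_means = []
for _ in range(NUM_SAMPLES):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    sample_means.append(statistics.fmean(sample))

print("smallest sample mean:", round(min(sample_means), 2))
print("largest sample mean: ", round(max(sample_means), 2))
print("std. dev. of the 30 sample means:", round(statistics.stdev(sample_means), 2))
```

The 30 values of $\bar{y}$ differ from sample to sample, yet they cluster around 20 much more tightly than individual observations do.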

Sampling distributions (cont'd)

The sampling distribution of a statistic computed from a sample of size $n > 1$ has smaller variance than the distribution of the variable itself.

Theorem: If $y_1, y_2, \ldots, y_n$ are a random sample from some population, then:
- Mean of $\bar{y}$ equals mean of $y$: $E(\bar{y}) = \mu_{\bar{y}} = \mu$.
- Variance of $\bar{y}$ is smaller than variance of $y$: $\sigma^2_{\bar{y}} = \sigma^2 / n$. The larger the sample, the smaller the variance (or the higher the reliability) of $\bar{y}$.

Central Limit Theorem: For large $n$, the sample mean $\bar{y}$ has a distribution that is approximately normal, with mean $\mu$ and variance $\sigma^2/n$, regardless of the shape of the distribution from which we sample the $y$'s. The larger the sample, the better the approximation (see Figure 1.14). Example in lab.
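A small simulation can make the theorem concrete. The sketch below samples from an exponential population with mean 1 and variance 1, a deliberately skewed distribution chosen here for illustration (it is not the lab example), and compares the mean and variance of the simulated sample means with $\mu$ and $\sigma^2/n$. The seed, $n = 40$, and the 2000 replications are arbitrary.

```python
import random
import statistics

random.seed(7)  # arbitrary seed so the run is repeatable

N = 40                 # size of each sample
NUM_SAMPLES = 2000     # number of samples drawn
POP_MEAN = 1.0         # mean of an exponential population with rate 1
POP_VAR = 1.0          # variance of the same population

# One sample mean per simulated sample from the skewed population.
means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(N)])
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(means)
var_of_means = statistics.variance(means)

print("mean of sample means:    ", round(mean_of_means, 3), "vs mu =", POP_MEAN)
print("variance of sample means:", round(var_of_means, 4), "vs sigma^2/n =", POP_VAR / N)
```

A histogram of `means` would look roughly bell-shaped even though the population itself is strongly right-skewed.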

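Sampling distributions (cont'd)

Given the sampling distribution of $\bar{y}$, we can make probability statements such as:

\[ \mathrm{Prob}\left( \mu - 2\frac{\sigma}{\sqrt{n}} < \bar{y} < \mu + 2\frac{\sigma}{\sqrt{n}} \right) \approx 0.95, \]

\[ \mathrm{Prob}(\bar{y} > a) = \mathrm{Prob}\left( z > \frac{a - \mu}{\sigma / \sqrt{n}} \right), \]

and use the standard normal table to get an answer.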
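As a worked illustration of the second statement, suppose $y \sim N(20, 25)$ and $n = 10$ as in the earlier simulation example, and take $a = 22$. The threshold $a = 22$ is a hypothetical value chosen for this sketch, not a number from the slides.

```python
import math
from statistics import NormalDist

MU, SIGMA, N = 20.0, 5.0, 10   # population N(20, 25) and sample size 10
A = 22.0                       # hypothetical threshold for the sample mean

std_error = SIGMA / math.sqrt(N)          # standard deviation of the sample mean
z = (A - MU) / std_error                  # standardized threshold
p_exceeds = 1.0 - NormalDist().cdf(z)     # Prob(ybar > a) = Prob(z > (a - mu)/(sigma/sqrt(n)))

print(f"z = {z:.3f}, Prob(ybar > {A}) is about {p_exceeds:.3f}")
```

Note the divisor: the sample mean is standardized with $\sigma/\sqrt{n}$, not with $\sigma$.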

A preview of coming attractions: if it is true that

\[ \mathrm{Prob}\left( \mu - 2\frac{\sigma}{\sqrt{n}} < \bar{y} < \mu + 2\frac{\sigma}{\sqrt{n}} \right) \approx 0.95 \]

then we can estimate the following interval using sample data:

\[ \left( \bar{y} - 2\frac{\sigma}{\sqrt{n}},\; \bar{y} + 2\frac{\sigma}{\sqrt{n}} \right). \]

We call it a 95% confidence interval for the population mean. As in the case of $\bar{y}$, the interval is also random and varies from sample to sample. If we drew 100 samples of size $n$ from some population and computed 100 intervals like the one above, about 95 of them would cover the true but unknown value of $\mu$.
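The coverage claim can be checked by simulation. The sketch below assumes $\sigma$ is known and reuses the $N(20, 25)$ population with $n = 10$ from the earlier example. The seed and the 1000 replications (instead of 100, to reduce noise) are arbitrary choices.

```python
import math
import random
import statistics

random.seed(42)  # arbitrary seed so the run is repeatable

MU, SIGMA, N = 20.0, 5.0, 10   # population N(20, 25), samples of size 10
REPS = 1000                    # number of simulated samples (and intervals)

half_width = 2.0 * SIGMA / math.sqrt(N)   # 2 * sigma / sqrt(n), with sigma assumed known

covered = 0
for _ in range(REPS):
    ybar = statistics.fmean([random.gauss(MU, SIGMA) for _ in range(N)])
    # Does the interval (ybar - half_width, ybar + half_width) contain the true mean?
    if ybar - half_width < MU < ybar + half_width:
        covered += 1

coverage = covered / REPS
print(f"{covered} of {REPS} intervals cover mu (coverage {coverage:.3f})")
```

Each interval is centered at a different $\bar{y}$, so which intervals miss $\mu$ changes from run to run, but the long-run proportion that cover it stays near 95%.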
