
Statistical Analysis Questions Using R

This document outlines 5 questions that can be solved using R to analyze statistical data. Question 1 deals with the normal distribution of cream cheese package weights. Question 2 involves determining the necessary sample size and constructing a confidence interval for a population proportion. Question 3 requires calculating a confidence interval for the proportion of smokers among male students. Question 4 estimates the proportion of days with no car accidents before and after a new highway surveillance system. Question 5 constructs confidence intervals for a car's average mileage based on samples of drivers.

Uploaded by

MuraliManohar
Copyright
© All Rights Reserved

Few Questions to be Solved Using R

1. Suppose packages of cream cheese coming from an automated processor have
weights that are normally distributed. For one day’s production run, the mean is
8.2 ounces and the standard deviation is 0.1 ounce.
(a) If the packages of cream cheese are labelled 8 ounces, what proportion of the
packages weigh less than the labelled amount?
(b) If only 5% of the packages exceed a specified weight w, what is the value of
w?
(c) Suppose two packages are selected at random from the day’s production. What
is the probability that the average weight of the two packages is less than 8.3
ounces?

2. In a study to determine the percentage of children that are overdue for vaccination
in a small town, the researcher took a sample out of 580 children served by the
only hospital in town.
(i) What sample size would be necessary to estimate the proportion with 95%
confidence and a margin of error of 0.107?
(ii) Suppose a sample of 120 was taken, of whom 27 were not overdue for
vaccination. Give a 95% confidence interval for the percentage of children
not overdue for vaccination.

3. Suppose we are interested in estimating the proportion (p) of smokers among the
male students of IIMV. Suppose a sample of size 100 is chosen out of the male
students, and the sample proportion of smokers is found to be 0.2. Give an interval
based on your data so that you are 95% confident that the true value of the
unknown proportion lies inside it. Also, state the assumptions that you have to
make for finding this interval. Could you check these assumptions?

4. Suppose a new surveillance system is installed on highways to prevent the drivers
from speeding. Out of the next 100 days after installation, the number of days with
no major car accident is observed to be 60. Find an interval estimate of the
proportion of days with no major accident that is expected with the new system in
place with confidence level 95%. Before the installation of the system, this
percentage was 40.

(i) Based on the 95% confidence interval, would you believe that the
installation of the system has led to an improvement?
(ii) State the assumptions that you have made for drawing the conclusion.
Could you check these assumptions?

5. Suppose a car manufacturing company claims that the average mileage of model
M is 20 (the distance it can travel on one gallon of gas). The average and the
standard deviation of the mileages of 16 randomly chosen individuals who own a
car of model M are found to be 19.5 and 4, respectively.
(i) Find a 95% confidence interval of the average mileage.
(ii) State the assumptions that you have made for finding this interval.
(iii) Find the confidence interval if the number of individuals is increased to 36
(keeping the other values the same). Do you still need the assumptions you
stated in (ii) to find this confidence interval?

Common questions


For Question 1(a), the proportion of packages weighing less than 8 ounces follows from the Z-score \( Z = (X - \mu) / \sigma \) with \( X = 8 \), \( \mu = 8.2 \), and \( \sigma = 0.1 \). This gives \( Z = -2 \) and \( P(Z < -2) \approx 0.0228 \), so approximately 2.28% of the packages weigh less than 8 ounces.
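The same lower-tail probability can be obtained directly in R. This is a minimal sketch using base R's `pnorm`; the variable names are illustrative only.

```r
# Question 1(a): P(X < 8) for X ~ Normal(mean = 8.2, sd = 0.1)
mu <- 8.2      # mean package weight in ounces (from the question)
sigma <- 0.1   # standard deviation in ounces (from the question)

# pnorm returns the lower-tail probability by default
p_under <- pnorm(8, mean = mu, sd = sigma)
print(round(p_under, 4))
```

Passing `mean` and `sd` to `pnorm` avoids standardizing by hand; `pnorm(-2)` gives the same value.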

For Question 2(ii), the sample proportion of children not overdue is \( \hat{p} = 27 / 120 = 0.225 \). The standard error is \( SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{0.225 \times 0.775}{120}} \approx 0.0381 \). The confidence interval is \( \hat{p} \pm Z \times SE \), where \( Z \approx 1.96 \) for 95% confidence. The interval is \( 0.225 \pm 1.96 \times 0.0381 \), or about (0.150, 0.300), that is, roughly 15% to 30% of children.
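A short R sketch of this normal-approximation (Wald) interval follows. It assumes the simple formula \( \hat{p} \pm Z \times SE \) with no finite population correction, even though the town has only 580 children; the variable names are illustrative.

```r
# Question 2(ii): 95% Wald interval for the proportion not overdue
x <- 27    # children not overdue in the sample
n <- 120   # sample size

p_hat <- x / n
se <- sqrt(p_hat * (1 - p_hat) / n)   # standard error of p_hat
z <- qnorm(0.975)                     # two-sided 95% critical value

ci <- p_hat + c(-1, 1) * z * se       # lower and upper limits
print(round(se, 4))
print(round(ci, 3))
```

Evaluating the expression in R rather than by hand guards against rounding slips in the standard error.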

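For Question 5(i) and (iii), increasing the sample size decreases the standard error and so narrows the confidence interval. With 16 drivers the interval is \( 19.5 \pm t_{15} \times 4 / \sqrt{16} \), where \( t_{15} \approx 2.131 \), giving about (17.37, 21.63). With 36 drivers it becomes \( 19.5 \pm t_{35} \times 4 / \sqrt{36} \), where \( t_{35} \approx 2.030 \), giving about (18.15, 20.85). The margin of error falls from about 2.13 to about 1.35, so the larger sample estimates the mean more precisely. With 36 observations the normality assumption also matters less, because the central limit theorem makes the sample mean approximately normal.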
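Both t-based intervals can be computed from the summary statistics alone. The helper `t_ci` below is a hypothetical name introduced for illustration; it relies only on base R's `qt`.

```r
# Question 5: t-based confidence interval from summary statistics
# t_ci is an illustrative helper, not a built-in R function
t_ci <- function(xbar, s, n, conf = 0.95) {
  t_crit <- qt(1 - (1 - conf) / 2, df = n - 1)  # two-sided critical value
  xbar + c(-1, 1) * t_crit * s / sqrt(n)        # lower and upper limits
}

ci_16 <- t_ci(xbar = 19.5, s = 4, n = 16)
ci_36 <- t_ci(xbar = 19.5, s = 4, n = 36)

print(round(ci_16, 2))
print(round(ci_36, 2))
```

Both intervals contain the claimed mileage of 20, so neither sample size gives evidence against the manufacturer's claim at the 95% level.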

For Question 1(b), the weight \( w \) that only 5% of packages exceed is the 95th percentile of the weight distribution. For a standard normal distribution this corresponds to a Z-score of approximately 1.645. Using \( w = \mu + Z\sigma \) with \( \mu = 8.2 \) and \( \sigma = 0.1 \), the value of \( w \) is approximately 8.3645 ounces.
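In R the percentile comes straight from the quantile function `qnorm`. A minimal sketch, with illustrative variable names:

```r
# Question 1(b): weight exceeded by only 5% of packages (95th percentile)
mu <- 8.2
sigma <- 0.1

w <- qnorm(0.95, mean = mu, sd = sigma)
print(round(w, 4))

# Equivalent call using the upper tail directly
w_upper <- qnorm(0.05, mean = mu, sd = sigma, lower.tail = FALSE)
```

The `lower.tail = FALSE` form mirrors the wording of the question ("5% exceed") and returns the same value.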

For Question 2(i), the necessary sample size \( n \) can be calculated using the formula \( n = \frac{Z^2 \cdot p(1-p)}{E^2} \), where \( Z \approx 1.96 \) for 95% confidence, \( E = 0.107 \), and \( p = 0.5 \) (assuming maximum variability). This gives \( n = \frac{1.96^2 \times 0.25}{0.107^2} \approx 83.9 \), which rounds up to 84. A sample of at least 84 children is required to estimate the proportion with 95% confidence and the given margin of error. Because the population is only 580 children, a finite population correction would reduce this number further.
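The formula can be evaluated directly in R, as sketched below. The finite population correction in the last step is an optional extra that the answer above does not apply; it is included as an assumption about how one might use the stated population of 580.

```r
# Question 2(i): sample size for a proportion, worst case p = 0.5
z <- qnorm(0.975)  # two-sided 95% critical value
E <- 0.107         # margin of error (from the question)
p <- 0.5           # assumed planning value: maximum variability
N <- 580           # children served by the hospital (from the question)

n0 <- z^2 * p * (1 - p) / E^2
n_required <- ceiling(n0)   # always round up to a whole child
print(n_required)

# Optional (assumption): finite population correction for N = 580
n_fpc <- ceiling(n0 / (1 + (n0 - 1) / N))
print(n_fpc)
```

Rounding up rather than to the nearest integer guarantees the margin of error is not exceeded.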

For Question 1(c), the probability follows from the sampling distribution of the sample mean. Because individual weights are normally distributed, the mean of two packages is exactly normal, so the central limit theorem is not needed. The standard deviation of the sample mean is \( \sigma_{\bar{x}} = \sigma / \sqrt{n} = 0.1 / \sqrt{2} \approx 0.0707 \). The Z-score for 8.3 ounces is \( Z = (8.3 - 8.2) / 0.0707 \approx 1.414 \), which corresponds to a probability of about 0.9214, or a 92.14% chance that the average weight is less than 8.3 ounces.
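A minimal R sketch of this calculation, treating the mean of two packages as normal with standard deviation \( \sigma / \sqrt{n} \); variable names are illustrative.

```r
# Question 1(c): P(mean of 2 packages < 8.3)
mu <- 8.2
sigma <- 0.1
n <- 2

se_mean <- sigma / sqrt(n)   # standard deviation of the sample mean
p_mean_under <- pnorm(8.3, mean = mu, sd = se_mean)

print(round(se_mean, 4))
print(round(p_mean_under, 4))
```

Letting `pnorm` work from the unrounded standard error avoids the small error that a rounded Z-score and a printed normal table can introduce.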

For Question 4, the sample proportion is \( \hat{p} = 60/100 = 0.6 \). The 95% confidence interval using \( Z = 1.96 \) is \( 0.6 \pm 1.96 \cdot \sqrt{0.6\times0.4/100} = 0.6 \pm 0.096 \), or about (0.504, 0.696). Since the whole interval lies above the pre-installation proportion of 0.4, the data suggest that the system has led to an improvement.
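The interval and the comparison with the earlier rate can be sketched in R as follows. The exact (Clopper-Pearson) interval from `binom.test` is added only as a cross-check and is not part of the answer above; the variable names are illustrative.

```r
# Question 4: 95% interval for the proportion of accident-free days
x <- 60           # accident-free days after installation
n <- 100          # days observed
p_before <- 0.40  # proportion before installation (from the question)

p_hat <- x / n
se <- sqrt(p_hat * (1 - p_hat) / n)
ci <- p_hat + c(-1, 1) * qnorm(0.975) * se
print(round(ci, 3))

# TRUE when the whole interval lies above the old proportion
improved <- ci[1] > p_before
print(improved)

# Cross-check (assumption: exact binomial interval as an alternative method)
exact_ci <- binom.test(x, n)$conf.int
print(round(as.numeric(exact_ci), 3))
```

Both intervals treat the 100 days as independent trials with a constant success probability, which is the assumption part (ii) of the question asks about.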

Possible sources of error in these proportion estimates include sampling bias and measurement error. Sampling bias arises when the sample is not representative of the population and can be mitigated by proper random sampling. Measurement error stems from inaccurate data collection and can be mitigated by standardized data collection protocols. Both steps help the confidence interval reflect the true population proportion.

For Question 3, the assumptions are: 1) random sampling from the population, and 2) a sample large enough for the normal approximation to be valid, generally \( n\hat{p} \geq 5 \) and \( n(1-\hat{p}) \geq 5 \). Here, both \( 100 \times 0.2 = 20 \) and \( 100 \times 0.8 = 80 \) are greater than 5, satisfying the condition. The first assumption can be checked by reviewing how the sample was drawn, and the second by the arithmetic just shown.
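The sample-size condition and the resulting interval for Question 3 can be sketched in R. The interval uses the same normal-approximation formula as the other proportion questions; the variable names are illustrative.

```r
# Question 3: check the normal-approximation condition, then build the interval
n <- 100
p_hat <- 0.2

# Rule of thumb from the text: at least 5 expected successes and failures
condition_ok <- (n * p_hat >= 5) && (n * (1 - p_hat) >= 5)
print(condition_ok)

se <- sqrt(p_hat * (1 - p_hat) / n)
ci <- p_hat + c(-1, 1) * qnorm(0.975) * se
print(round(ci, 3))
```

The code can confirm only the sample-size condition; whether the sample was drawn at random has to be judged from how the students were selected.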

For Question 5(ii), the key assumptions are that mileage is normally distributed and that the sample was selected at random. Given the raw mileages, normality can be checked visually with a Q-Q plot or formally with a Shapiro-Wilk test; the data should show no strong departure from a normal distribution. Randomness can be assessed by reviewing how the car owners were sampled.
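The question reports only the mean and standard deviation, so these checks cannot be run on the real data. The sketch below uses simulated placeholder values purely to show the calls; the vector `mileage` and the seed are assumptions, not the survey results.

```r
# Normality checks on a mileage sample
# ASSUMPTION: 'mileage' is simulated placeholder data, not the 16 real values
set.seed(42)  # arbitrary seed, for reproducibility only
mileage <- rnorm(16, mean = 19.5, sd = 4)

# Shapiro-Wilk test: a small p-value suggests departure from normality
sw <- shapiro.test(mileage)
print(sw$p.value)

# Q-Q plot: points near the reference line are consistent with normality
if (interactive()) {
  qqnorm(mileage)
  qqline(mileage)
}
```

With real data, replace the simulated vector with the observed mileages. With only 16 observations the test has limited power, so the plot is usually worth inspecting as well.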
