Statistical Inference Exam Questions
A Type I error occurs when the null hypothesis is rejected although it is true; its probability is denoted by α. A Type II error occurs when the null hypothesis is not rejected although it is false; its probability is denoted by β. These errors quantify the risk of a wrong decision: a Type I error is a false positive, while a Type II error is a missed detection (false negative). For a fixed sample size the two trade off against each other: lowering the significance level α raises β and so lowers the power 1−β, and only a larger sample can reduce β without raising α.
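As a rough illustration of how α, β and power relate, the sketch below computes them for an upper-tailed z-test of H0: μ = μ0 against a single alternative μ = μ1 with known σ. The function name and the numbers (μ0 = 50, μ1 = 52, σ = 6, n = 36) are hypothetical and chosen only for the example.

```python
from statistics import NormalDist

def z_test_error_rates(mu0, mu1, sigma, n, alpha):
    """Cutoff, beta and power of the upper-tailed z-test of
    H0: mu = mu0 against H1: mu = mu1 (mu1 > mu0), sigma known."""
    se = sigma / n ** 0.5                        # standard error of the sample mean
    z_alpha = NormalDist().inv_cdf(1 - alpha)    # upper-alpha point of N(0, 1)
    cutoff = mu0 + z_alpha * se                  # reject H0 when the sample mean exceeds this
    beta = NormalDist(mu1, se).cdf(cutoff)       # P(do not reject H0 | mu = mu1)
    return cutoff, beta, 1 - beta

# Hypothetical numbers, for illustration only.
cutoff, beta, power = z_test_error_rates(mu0=50, mu1=52, sigma=6, n=36, alpha=0.05)
print(f"cutoff = {cutoff:.3f}, beta = {beta:.3f}, power = {power:.3f}")

# Lowering alpha with n fixed moves the cutoff up, so beta rises.
_, beta_strict, _ = z_test_error_rates(mu0=50, mu1=52, sigma=6, n=36, alpha=0.01)
print(f"beta at alpha = 0.01: {beta_strict:.3f}")
```

Rerunning the function with a larger n shows β falling while α stays at the chosen level.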
To estimate the parameter θ of a discrete distribution whose probabilities are functions of θ, write the likelihood of the observed data and maximize it. For example, take the observations x = {3,0,2,1,3,2,1,0,2,1} on the values 0, 1, 2, 3, with probabilities P(X=x) such as 2θ/3, θ/3, 2(1−θ)/3, (1−θ)/3, which sum to 1 for 0 ≤ θ ≤ 1. The likelihood is the product of P(X=x) over the observations, so each probability appears raised to the observed frequency of its value. Take the logarithm, differentiate with respect to θ, set the derivative to zero, and solve; the solution is the MLE. Here the values 0 and 1 occur 5 times in total and the values 2 and 3 occur 5 times, so lnL(θ) = constant + 5ln(θ) + 5ln(1−θ) and the MLE is θ = 5/10 = 0.5.
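The calculation can be checked numerically. The sketch below assumes the probability function P(X=0) = 2θ/3, P(X=1) = θ/3, P(X=2) = 2(1−θ)/3, P(X=3) = (1−θ)/3. This form is an assumption, chosen because the four probabilities then sum to 1. The code compares the closed-form answer with a simple grid search over the log-likelihood.

```python
from collections import Counter
from math import log

data = [3, 0, 2, 1, 3, 2, 1, 0, 2, 1]

def pmf(x, theta):
    """Assumed model: P(0)=2t/3, P(1)=t/3, P(2)=2(1-t)/3, P(3)=(1-t)/3."""
    return {0: 2 * theta / 3,
            1: theta / 3,
            2: 2 * (1 - theta) / 3,
            3: (1 - theta) / 3}[x]

def log_likelihood(theta, sample):
    """Sum of log P(X = x) over the observations."""
    return sum(log(pmf(x, theta)) for x in sample)

counts = Counter(data)

# ln L = const + (n0 + n1) ln(theta) + (n2 + n3) ln(1 - theta),
# so setting the derivative to zero gives theta_hat = (n0 + n1) / n.
theta_closed = (counts[0] + counts[1]) / len(data)

# Numerical check: maximise the log-likelihood over a grid in (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
theta_grid = max(grid, key=lambda t: log_likelihood(t, data))

print(theta_closed, theta_grid)
```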
To construct a confidence interval for the population variance σ² of a normal population when the population mean is unknown, use the chi-square distribution. The interval is based on (n−1)S²/σ², which follows a χ² distribution with n−1 degrees of freedom. The (1−α) confidence interval for σ² is [(n−1)S²/χ²_(α/2,n−1), (n−1)S²/χ²_(1−α/2,n−1)], where χ²_(p,n−1) denotes the value with upper-tail area p, so the larger critical value gives the lower limit. The method is appropriate because the χ² distribution is the exact sampling distribution of (n−1)S²/σ² for normally distributed data; the loss of one degree of freedom reflects estimating the mean by the sample mean. The interval is not symmetric about S², and it is sensitive to departures from normality.
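A minimal sketch of the interval calculation follows, with the two critical values read from a chi-square table, as one would do in an exam. The sample is hypothetical; the table values 19.023 and 2.700 are the upper 0.025 and upper 0.975 points for 9 degrees of freedom.

```python
from statistics import variance

def variance_ci(sample, chi2_upper, chi2_lower):
    """Confidence interval for sigma^2 from a normal sample.

    chi2_upper: chi-square value with upper-tail area alpha/2, n-1 df
    chi2_lower: chi-square value with upper-tail area 1 - alpha/2, n-1 df
    """
    n = len(sample)
    s2 = variance(sample)          # sample variance S^2 (divisor n - 1)
    ss = (n - 1) * s2              # sum of squared deviations from the sample mean
    return ss / chi2_upper, ss / chi2_lower

# Hypothetical sample of n = 10 measurements.
sample = [4.1, 5.2, 6.3, 4.8, 5.9, 5.0, 6.1, 4.6, 5.4, 5.6]

# Table values for 9 df and 95% confidence.
lower, upper = variance_ci(sample, chi2_upper=19.023, chi2_lower=2.700)
print(f"S^2 = {variance(sample):.4f}, 95% CI for sigma^2: ({lower:.4f}, {upper:.4f})")
```

Taking square roots of both endpoints gives the corresponding interval for σ.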
A simple hypothesis completely specifies the population distribution. For example, H0: μ = 5 is simple when every other parameter, such as the variance, is known. A composite hypothesis specifies the distribution only partially, such as H0: μ > 5, or H0: μ = 5 with the variance unknown, so it corresponds to a set of distributions rather than a single one. Under a simple null hypothesis the distribution of the test statistic is fully determined, so the significance level and p-value can be calculated directly; when both hypotheses are simple, the Neyman-Pearson lemma gives the most powerful test. Under a composite hypothesis the error probabilities depend on the unknown parameter value: the size of the test is the largest Type I error probability over the null set, and power becomes a function of the parameter, which calls for tools such as likelihood ratio tests.
Showing detailed work and using a scientific calculator in statistical examinations supports clarity, precision, and accuracy in the numerical calculations and reasoning steps that statistical inference problems require. It lets examiners follow the student's reasoning, verify correctness and understanding, and award partial credit. This protocol matters in examinations such as those at Bahir Dar University, where systematic calculation and documentation standards support fair evaluation of each candidate's competency and grasp of statistical concepts.
The t-distribution is used instead of the standard normal distribution when constructing a confidence interval for the difference between two population means because the population standard deviations are unknown and must be estimated from the samples, which matters most when the samples are small. The t-distribution accounts for the additional uncertainty introduced by this estimation. It has heavier tails than the normal distribution, so it gives wider confidence intervals that achieve the stated coverage probability. The t-based interval assumes the populations are approximately normal; with large samples the t critical values approach the normal ones and the Central Limit Theorem justifies the normal approximation.
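One common version of this interval is the pooled two-sample t interval, sketched below. It assumes two independent normal samples with equal population variances; that equal-variance assumption and the data are illustrative and not taken from the text above. The critical value 2.228 is the table value of t with upper-tail area 0.025 and 10 degrees of freedom.

```python
from statistics import mean, variance

def pooled_t_interval(x, y, t_crit):
    """Confidence interval for mu_x - mu_y, assuming equal variances.

    t_crit: t-table value with upper-tail area alpha/2
            and len(x) + len(y) - 2 degrees of freedom.
    """
    n1, n2 = len(x), len(y)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    se = (sp2 * (1 / n1 + 1 / n2)) ** 0.5    # standard error of the difference
    diff = mean(x) - mean(y)
    return diff - t_crit * se, diff + t_crit * se

# Hypothetical samples, n1 = n2 = 6, so df = 10.
x = [12.1, 11.8, 12.6, 12.3, 11.9, 12.4]
y = [11.2, 11.7, 11.5, 11.0, 11.6, 11.4]

lower, upper = pooled_t_interval(x, y, t_crit=2.228)
print(f"95% CI for mu_x - mu_y: ({lower:.3f}, {upper:.3f})")
```

Replacing 2.228 with the normal value 1.96 would give a narrower interval, which shows the extra width the t-distribution adds for small samples.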
A pivotal quantity is a function of the sample data and the unknown parameter whose probability distribution is known and does not depend on the parameter. It is central to constructing confidence intervals: a probability statement about the pivot, made with known quantiles, can be inverted into an interval for the parameter. For instance, for a normal sample (n−1)S²/σ² follows a χ² distribution with n−1 degrees of freedom whatever the value of σ², so P(a ≤ (n−1)S²/σ² ≤ b) = 1−α rearranges to the interval [(n−1)S²/b, (n−1)S²/a] for σ². Because the pivot's distribution is free of the parameter, the interval attains its stated coverage probability for every value of σ².
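The parameter-free behaviour of a pivot can be seen in a small simulation. The sketch below draws normal samples of size n = 5 for two different values of σ and summarizes (n−1)S²/σ² in each case. The sample size, number of replications, seeds and the mean of 10 are arbitrary choices for the illustration; 9.488 is the chi-square table value with upper-tail area 0.05 and 4 degrees of freedom.

```python
import random

def simulate_pivot(sigma, n=5, reps=20000, seed=0):
    """Simulated values of (n - 1) S^2 / sigma^2 for normal samples."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        xs = [rng.gauss(10.0, sigma) for _ in range(n)]
        m = sum(xs) / n
        ss = sum((x - m) ** 2 for x in xs)   # equals (n - 1) * S^2
        values.append(ss / sigma ** 2)
    return values

summary = {}
for sigma, seed in ((1.0, 11), (5.0, 22)):
    q = simulate_pivot(sigma, seed=seed)
    avg = sum(q) / len(q)                                 # chi-square(4) has mean 4
    below = sum(1 for v in q if v < 9.488) / len(q)       # should be close to 0.95
    summary[sigma] = (avg, below)
    print(f"sigma = {sigma}: mean = {avg:.2f}, P(Q < 9.488) = {below:.3f}")
```

Both values of σ should give nearly the same summaries, up to simulation noise, which is what allows one set of table quantiles to serve for any σ².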
The sample variance plays a central role in constructing confidence intervals for the variance or standard deviation because it is an unbiased estimator of the population variance when the mean is unknown. Dividing the sum of squared deviations from the sample mean by n−1 rather than n corrects the bias introduced by estimating the mean from the same data. The quantity (n−1)S², the sum of squared deviations itself, is what enters the chi-square statistic (n−1)S²/σ²; replacing the unknown mean by the sample mean costs one degree of freedom, which is why the statistic has n−1 rather than n degrees of freedom. For normal data the resulting interval has exactly the stated confidence level, and taking square roots of its endpoints gives an interval for the standard deviation.
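The effect of the n−1 divisor can be illustrated with a short simulation that averages both versions of the variance estimate over many samples. The settings (n = 5, σ = 2 so σ² = 4, 20000 replications, seed 1) are arbitrary choices for this sketch.

```python
import random

def average_variance_estimates(n=5, sigma=2.0, reps=20000, seed=1):
    """Average of the n-1 and n divisor variance estimates over many samples."""
    rng = random.Random(seed)
    total_unbiased = 0.0
    total_biased = 0.0
    for _ in range(reps):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        m = sum(xs) / n
        ss = sum((x - m) ** 2 for x in xs)   # squared deviations from the sample mean
        total_unbiased += ss / (n - 1)       # S^2, divisor n - 1
        total_biased += ss / n               # divisor n
    return total_unbiased / reps, total_biased / reps

avg_unbiased, avg_biased = average_variance_estimates()
print(f"divisor n-1: {avg_unbiased:.3f}   divisor n: {avg_biased:.3f}   true: 4.0")
```

The n−1 version should average close to σ² = 4, while the n version should average close to (n−1)σ²/n = 3.2.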
To derive the MLE for p in a Bernoulli distribution, consider a random sample x1, ..., xn from the probability mass function f(x;p) = p^x(1−p)^(1−x), where x is 0 or 1. The likelihood function is L(p) = p^Σx(1−p)^(n−Σx), where Σx is the sum of the observed values, that is, the number of successes. Taking the natural logarithm yields the log-likelihood lnL(p) = Σx*ln(p) + (n−Σx)*ln(1−p). Differentiating with respect to p gives Σx/p − (n−Σx)/(1−p); setting this to zero and multiplying through by p(1−p) gives Σx − np = 0, leading to the MLE p̂ = Σx/n, the sample proportion.
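The result can be verified numerically on a small hypothetical sample (7 successes in 10 trials, invented for this example): the score, that is, the derivative of the log-likelihood, vanishes at Σx/n, and a grid search over the log-likelihood lands on the same value.

```python
from math import log

def bernoulli_log_likelihood(p, xs):
    """ln L(p) = sum(x) * ln(p) + (n - sum(x)) * ln(1 - p)."""
    s, n = sum(xs), len(xs)
    return s * log(p) + (n - s) * log(1 - p)

def score(p, xs):
    """Derivative of the log-likelihood with respect to p."""
    s, n = sum(xs), len(xs)
    return s / p - (n - s) / (1 - p)

# Hypothetical sample: 7 successes in 10 trials.
xs = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]

p_hat = sum(xs) / len(xs)                    # closed-form MLE, the sample proportion

# Numerical check: maximise ln L over a grid in (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
p_grid = max(grid, key=lambda p: bernoulli_log_likelihood(p, xs))

print(p_hat, p_grid, score(p_hat, xs))
```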
The Maximum Likelihood Estimator (MLE) is the parameter value that maximizes the likelihood function, which is the probability (or density) of the observed data regarded as a function of the parameter. Maximum likelihood provides a general method for estimating parameters of probability distributions, such as a mean, a variance, or a success probability, in a wide range of statistical models. It is especially significant because, under certain regularity conditions, MLEs are consistent, asymptotically normal, and asymptotically efficient.