Statistics Exam Format and Guidelines

Q: How is the correlation coefficient calculated, and what do its properties indicate about the relationship between two variables?

The correlation coefficient is calculated as r = (nΣXY - ΣXΣY) / √((nΣX² - (ΣX)²)(nΣY² - (ΣY)²)) and quantifies the strength and direction of the linear relationship between two variables. It always lies between -1 and +1: a value close to +1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little or no linear relationship (a nonlinear relationship may still exist). Its properties include symmetry (r(X,Y) = r(Y,X)), independence of origin and unit of measurement (a change of scale leaves r unchanged, except that the sign flips if exactly one variable is multiplied by a negative constant), and sensitivity to outliers. Computed from sample summary statistics, it indicates the degree to which the two variables move together.
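As an illustration, the formula can be applied to the summary statistics given in Q. No. 5(b) of the paper (20 pairs, mean of X = 2, mean of Y = 8, ΣX² = 180, ΣY² = 1424, ΣXY = 404). The function name `pearson_r_from_sums` is a hypothetical helper chosen for this sketch, not something taken from the paper.

```python
import math

def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson r from raw sums, using the computational formula."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Summary statistics from Q. No. 5(b); the sums are recovered from the means.
n = 20
sum_x = n * 2  # mean of X = 2, so sum of X = 40
sum_y = n * 8  # mean of Y = 8, so sum of Y = 160

r = pearson_r_from_sums(n, sum_x, sum_y, sum_x2=180, sum_y2=1424, sum_xy=404)
print(round(r, 4))  # 0.7
```

Here the numerator is 8080 - 6400 = 1680 and the denominator is √(2000 × 2880) = 2400, so r = 0.7, a fairly strong positive linear relationship.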

Q: Describe the process and statistical significance of randomization in a completely randomized design experiment.

Randomization in a completely randomized design assigns experimental units to treatments entirely by chance, so that every unit has the same probability of receiving any treatment. This guards against selection bias and tends to balance unmeasured confounding factors across treatments on average, so that treatment effects are estimated without systematic bias. A significant F value in the analysis of variance implies that at least one treatment mean differs from the others; it does not identify which one, so further comparisons are needed.
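One possible way to carry out the randomization for the setting in Q. No. 8(a), with 3 varieties of wheat and 18 plots, is sketched below. Equal replication (6 plots per variety), the helper name `randomize_crd`, and the seed value are assumptions made for illustration only.

```python
import random

def randomize_crd(treatments, n_units, seed=None):
    """Assign n_units plots to treatments with equal replication, purely by chance."""
    if n_units % len(treatments) != 0:
        raise ValueError("n_units must be a multiple of the number of treatments")
    reps = n_units // len(treatments)
    # One label per plot, e.g. six copies each of V1, V2 and V3.
    labels = [t for t in treatments for _ in range(reps)]
    # Shuffle the labels; the seed only makes the layout reproducible.
    random.Random(seed).shuffle(labels)
    # Plots are numbered from 1; plot i receives the i-th shuffled label.
    return {plot: label for plot, label in enumerate(labels, start=1)}

layout = randomize_crd(["V1", "V2", "V3"], n_units=18, seed=2025)
for plot, variety in layout.items():
    print(plot, variety)
```

Drawing numbered slips or using a random number table achieves the same thing by hand; the essential point is that no plot characteristic influences which variety it receives.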

Q: Under what conditions does the use of a finite correction factor become important, and when can it be safely ignored?

The finite population correction factor, √((N - n)/(N - 1)), matters when sampling without replacement from a finite population of size N, particularly when the sample size n exceeds about 5% of N. It multiplies the usual standard error σ/√n, which would otherwise overstate the sampling variability. When n is less than about 5% of N the factor is close to 1 and can be ignored with negligible effect on precision; it is also unnecessary when sampling with replacement or from an effectively infinite population.
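The effect of the correction can be seen numerically in a short sketch. The function names are hypothetical, and the two cases (a sample of 2 from 6 units, and a sample of 100 from 10,000 units) are illustrative choices rather than figures from the paper.

```python
import math

def finite_population_correction(N, n):
    """Factor sqrt((N - n) / (N - 1)) for sampling without replacement."""
    return math.sqrt((N - n) / (N - 1))

def standard_error_of_mean(sigma, N, n):
    """Standard error of the sample mean with the finite correction applied."""
    return sigma / math.sqrt(n) * finite_population_correction(N, n)

# Sampling fraction 2/6 = 33%: the correction is substantial.
fpc_small_population = finite_population_correction(N=6, n=2)

# Sampling fraction 100/10000 = 1%: the correction is nearly 1.
fpc_large_population = finite_population_correction(N=10_000, n=100)

print(round(fpc_small_population, 4), round(fpc_large_population, 4))
```

In the first case the factor is √0.8, about 0.89, so ignoring it would overstate the standard error by roughly 12%; in the second it differs from 1 by less than half a percent.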

Q: How does the residual analysis in regression contribute to the validation of the regression model?

Residual analysis examines the residuals (observed minus predicted values) to check whether the fitted model is appropriate. For a least-squares line with an intercept, the residuals sum to zero by construction (apart from rounding), so a zero sum confirms the arithmetic rather than the model. What validates the model is the absence of patterns: curvature in a residual plot suggests a nonlinear relationship, a funnel shape suggests heteroscedasticity, and marked departures from normality cast doubt on the usual tests and intervals.
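A minimal sketch of the computation for the data in Q. No. 5(a) follows. It assumes a simple least-squares regression of Y on X and uses the divisor n - 2 for the standard error of estimate; some textbooks use n instead, so the divisor is an assumption to be checked against the syllabus text.

```python
import math

# Data from Q. No. 5(a).
x = [3.2, 2.7, 4.5, 1.0, 2.0, 1.7, 0.6, 1.9]
y = [6.5, 5.3, 8.6, 1.2, 4.2, 2.9, 1.1, 3.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope b and intercept a for the line Y = a + bX.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b = s_xy / s_xx
a = y_bar - b * x_bar

# Residuals: observed minus fitted values.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
residual_sum = sum(residuals)

# Standard error of estimate with divisor n - 2 (assumed convention).
s_yx = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

print(f"a = {a:.4f}, b = {b:.4f}")
print(f"sum of residuals = {residual_sum:.2e}")
print(f"s_y.x = {s_yx:.4f}")
```

The printed sum of residuals is zero up to floating-point rounding, which is the verification the question asks for.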

Q: Explain the purpose of frequency distribution and how the mode can be approximately derived using a known mean and median.

A frequency distribution organizes raw data into a summary showing how often each value or class of values occurs, which reveals the pattern of the data. For a moderately skewed unimodal distribution, the mode can be approximated from the empirical relation Mode ≈ 3(Median) - 2(Mean). With a mean of 40.5 and a median of 36, this gives Mode ≈ 3(36) - 2(40.5) = 108 - 81 = 27.

Q: What are the main characteristics and applications of Poisson distribution?

The Poisson distribution has a single parameter λ, which is both its mean and its variance. It models the number of times an event occurs in a fixed interval of time or space when events occur independently and at a constant average rate, for example the number of emails received in an hour or customer arrivals at a service point. It is the limiting form of the binomial distribution when the number of trials n is large, the probability of the event p is small, and np = λ stays fixed.
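The binomial limit can be checked numerically. The sketch below compares the two probability functions for λ = 2 with n = 1000 trials; these values and the function names are illustrative assumptions, not taken from the paper.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 2.0
n = 1000
p = lam / n  # small p, large n, with np = lam held fixed

# Largest pointwise difference between the two pmfs over k = 0, ..., 10.
max_gap = max(
    abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11)
)

for k in range(5):
    print(k, round(binomial_pmf(k, n, p), 5), round(poisson_pmf(k, lam), 5))
```

Increasing n while keeping np = λ fixed shrinks `max_gap` further, which is what the limiting relationship asserts.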

Q: In the context of statistics, what is the significance of skewness, and how can it be measured for a probability density function?

Skewness measures the asymmetry of a probability distribution about its mean and is essential for describing its shape. A suitable measure is the third standardized moment, μ₃/σ³ (for sample data, the adjusted Fisher-Pearson coefficient). For the probability density function f(x) = k(x - x²) on [0, 1], first find k from the requirement that the density integrates to 1 (k = 6), then compute the third moment about the mean and divide by the cube of the standard deviation. Negative skewness indicates a longer left tail and positive skewness a longer right tail; this density is symmetric about x = 1/2, so its skewness is zero.
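The moments of this density can be worked out exactly, since each integral of a power of x over [0, 1] is a simple fraction. The sketch below uses exact rational arithmetic; the helper name `raw_moment` is hypothetical.

```python
from fractions import Fraction

def raw_moment(r, k):
    """E[X^r] for f(x) = k(x - x^2) on [0, 1].

    The integral of k(x^(r+1) - x^(r+2)) over [0, 1] is k(1/(r+2) - 1/(r+3)).
    """
    return k * (Fraction(1, r + 2) - Fraction(1, r + 3))

# Normalising constant: the integral of (x - x^2) over [0, 1] is 1/2 - 1/3 = 1/6.
k = 1 / (Fraction(1, 2) - Fraction(1, 3))

mean = raw_moment(1, k)
variance = raw_moment(2, k) - mean ** 2
# Third central moment from raw moments: E[X^3] - 3*mean*E[X^2] + 2*mean^3.
third_central = raw_moment(3, k) - 3 * mean * raw_moment(2, k) + 2 * mean ** 3

skewness = float(third_central) / float(variance) ** 1.5

print(k, mean, variance, third_central, skewness)
```

The result is k = 6, mean 1/2, variance 1/20 and a third central moment of exactly 0, so the distribution is symmetric and its skewness is zero.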

Q: What role does the sampling distribution of the mean play in statistical inference, and how can it be derived from a given sample?

The sampling distribution of the mean underlies statistical inference: it describes how sample means vary from sample to sample, and so supports estimating population parameters and attaching a standard error to the estimate. It is obtained by listing all possible samples of a given size from the population, calculating each sample mean, and tabulating the resulting values with their probabilities. Its mean equals the population mean, and by the Central Limit Theorem its shape approaches a normal distribution as the sample size increases, whatever the shape of the population, provided the population variance is finite.
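For a small population the enumeration can be done directly. The sketch below uses the population from Q. No. 6(b), 2, 4, 8, 8, 10, 10, with samples of size 2 drawn without replacement; repeated values are treated as distinct units, which is the usual convention and is assumed here.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [2, 4, 8, 8, 10, 10]
n = 2

# combinations() works on positions, so the two 8s (and the two 10s)
# count as different units, giving C(6, 2) = 15 possible samples.
samples = list(combinations(population, n))
total = len(samples)

# Exact sample means and how often each value occurs.
means = [Fraction(sum(s), n) for s in samples]
distribution = Counter(means)

mean_of_means = sum(means) / total
population_mean = Fraction(sum(population), len(population))

print("number of samples:", total)
for value in sorted(distribution):
    print(value, Fraction(distribution[value], total))
print("mean of sample means:", mean_of_means)
```

The mean of the 15 sample means equals the population mean of 7, illustrating that the sample mean is an unbiased estimator.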

Q: How is the paired t-test utilized to evaluate changes in weight before and after a specific treatment, and what does a 0.05 significance level imply?

The paired t-test compares two related samples, such as weights before and after treatment, to determine whether the mean difference is zero. It works on the differences d between paired observations: t = d̄ / (s_d/√n), which follows a t-distribution with n - 1 degrees of freedom under the null hypothesis, assuming the differences are approximately normal. A 0.05 significance level means that, when the null hypothesis is true, there is a 5% chance of rejecting it incorrectly; the null hypothesis is rejected only if |t| exceeds the corresponding critical value.
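Applied to the weights in Q. No. 7(b), the calculation might look like the sketch below. It assumes a two-sided test and takes the critical value 3.182 (t with 3 degrees of freedom, 2.5% in each tail) from standard t tables rather than computing it.

```python
import math
import statistics

# Weights from Q. No. 7(b).
before = [148, 176, 153, 116]
after = [154, 176, 151, 121]

# Differences (after minus before) for each person.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

d_bar = statistics.mean(diffs)   # mean difference
s_d = statistics.stdev(diffs)    # sample standard deviation (divisor n - 1)
t_stat = d_bar / (s_d / math.sqrt(n))
df = n - 1

# Two-sided 5% critical value for 3 degrees of freedom, from t tables.
T_CRIT = 3.182
reject_null = abs(t_stat) > T_CRIT

print(f"d_bar = {d_bar}, s_d = {s_d:.4f}, t = {t_stat:.3f}, df = {df}")
print("reject H0:", reject_null)
```

The differences are 6, 0, -2 and 5, giving t of about 1.165, well below 3.182, so under these assumptions the data do not provide evidence that giving up smoking changes weight.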

Q: What is a test statistic, and how is it related to hypothesis testing and Type I and Type II errors?

A test statistic is a value computed from sample data, standardized so that its distribution under the null hypothesis is known. It is compared with a critical value to decide whether to reject the null hypothesis; equivalently, the p-value gives the probability, under the null hypothesis, of a result at least as extreme as the one observed. A Type I error is rejecting a true null hypothesis, and its probability is the chosen significance level α. A Type II error is failing to reject a false null hypothesis; its probability β is not set by α alone but also depends on the true parameter value and the sample size.

The document outlines the structure and requirements for the Competitive Examination-2025 for recruitment to BS-17 posts under the Federal Government in Statistics. It includes instructions for two parts: Part-I consists of multiple-choice questions (MCQs) with a maximum of 20 marks, while Part-II involves answering four questions from two sections for a total of 80 marks. The document also specifies rules regarding answer sheets, scoring, and the use of calculators.

Uploaded by

Muhammad Rafique Kalhoro


FEDERAL PUBLIC SERVICE COMMISSION

Roll Number
COMPETITIVE EXAMINATION-2025 FOR RECRUITMENT
TO POSTS IN BS-17 UNDER THE FEDERAL GOVERNMENT
STATISTICS
TIME ALLOWED: THREE HOURS
PART-I (MCQs): MAXIMUM 30 MINUTES, MAXIMUM MARKS: 20
PART-II: MAXIMUM MARKS: 80
NOTE: (i) First attempt PART-I (MCQs) on separate OMR Answer Sheet which shall be taken back
after 30 minutes.
(ii) Overwriting/cutting of the options/answers will not be given credit.
(iii) There is no negative marking. All MCQs must be attempted.
PART-I (MCQs)(COMPULSORY)
Q.1. (i) Select the best option/answer and fill in the appropriate Box on the OMR Answer Sheet.(20x1=20)
(ii) Answers given anywhere else, other than OMR Answer Sheet, will not be considered.
1. Statistics deals with:
(A) Individuals (B) Particular facts (C) Isolated Items (D) Aggregative facts
2. The branch of Statistics that deals with procedure and methodology for obtaining valid conclusion
is called:
(A) Descriptive (B) Advance (C) Inferential (D) All of these
3. Sum of the absolute deviation is least when deviation is taken from:
(A) Mean (B) Median (C) Mode (D) Geometric Mean
4. In t-distribution, which one is true?
(A) Mean = Median = Mode (B) Mean > Median > Mode
(C) Mean < Median < Mode (D) All of these
5. What is the probability of sure event?
(A) 1 (B) 0 (C) 0.5 (D) 0.2
6. Which formula represents the probability of the complement of event A:
(A) 1 + P (A) (B) 1 - P (A) (C) P (A) (D) P (A) -1
7. In regression analysis, the variable that is being predicted is:
(A) Dependent variable (B) Independent variable (C) Intervening variable (D) None of these
8. Paired t-test is applicable when the observations in the two samples are:
(A) Equal in number (B) Paired (C) Correlation (D) All of these
9. The degree to which numerical data tend to spread about an average, is called:
(A) The Dispersion (B) Regression (C) Correlation (D) None of these
10. The types of estimates are:
(A) Point estimate (B) Interval estimates (C) Estimation of confidence region (D) All of these
11. Ranking scale also include the properties of ______________ scale.
(A) Nominal (B) Interval (C) Ratio (D) All of these
12. The difference between a statistic and the parameter is called:
(A) Sampling error (B) Random error (C) Non-random error (D) Probability
13. The standard deviation of any sampling distribution is called:
(A) Standard error (B) Non-sampling error (C) Type- I error (D) Type- II error
14. A survey conducted by a sampling design, is called:
(A) Sample survey (B) Population survey (C) Systematic survey (D) None of these
15. The sum of the frequencies of the frequency distribution of a statistic is equal to:
(A) Sample size (B) Population size (C) Possible samples (D) Sum of X values
16. Sampling error can be reduced by:
(A) Non-probability sampling (B) Increasing the population
(C) Decreasing the sample size (D) Increasing the sample size
17. The range of test statistic-t is:
(A) 0 to ∞ (B) 0 to 1 (C) -∞ to +∞ (D) -1 to +1
18. The probability associated with committing Type-I error is:
(A) β (B) α (C) 1 – β (D) 1 – α
19. The degree of freedom for paired t-test based on n pairs of observations is:
(A) 2n - 1 (B) n – 2 (C) 2(n - 1) (D) n – 1
20. Experimental error is due to:
(A) Experimenter’s mistakes (B) Extraneous factors
(C) Variation in treatment effects (D) None of these
**********
PART-II
NOTE: (i) Part-II is to be attempted on the separate Answer Book.
(ii) Attempt FOUR questions in all by selecting TWO Questions from EACH SECTION.
ALL questions carry EQUAL marks.
(iii) All the parts (if any) of each Question must be attempted at one place instead of at different
places.
(iv) Write Q. No. in the Answer Book in accordance with Q. No. in the Q.Paper.
(v) No Page/Space be left blank between the answers. All the blank pages of Answer Book
must be crossed.
(vi) Extra attempt of any question or any part of the question will not be considered.
(vii) Use of Calculator is allowed.

SECTION-A
Q. No.2. (a) What is the purpose of frequency distribution and what are its desirable qualities? For a (10)
certain frequency distribution, the mean was 40.5 and the median 36. Find the mode
approximately using the formula connecting the three.

(b) Define Statistics. Discuss the importance of study of statistics by giving examples. How (10) (20)
it can help the extension of scientific knowledge?

Q. No.3. (a) Explain what is meant by the skewness of the distribution, and define a suitable measure (10)
of Skewness. For given p.d.f.
f(x) = k(x – x²), 0 ≤ x ≤ 1
Find the Skewness and discuss.

(b) Define the normal distribution and obtain its mean and variance. Show that for the (10) (20)
normal distribution, the mean, mode and median are the same.

Q. No.4. (a) If X has a binomial distribution, then show that E(X) = np, Var(X) = npq. Derive the (10)
m.g.f of the binomial distribution and explain its uses?

(b) What are the main characteristics of the Poisson distribution? Explain with the help of (10) (20)
examples. Also give its properties, applications and relationship with other distributions.

Q. No. 5.(a) From the following set of values: (10)

Y 6.5 5.3 8.6 1.2 4.2 2.9 1.1 3.0

X 3.2 2.7 4.5 1.0 2.0 1.7 0.6 1.9

1) Compute the residuals and verify that they add to zero and draw the conclusion
about the results.
2) Compute the standard error of estimate, s_y.x.

(b) Under what conditions the correlation among the random variables exists. Also describe (10) (20)
the properties of correlation coefficient. Calculate the correlation co-efficient for a
sample of 20 pairs of observations, given that
Mean of X = 2, Mean of Y = 8, ΣX² = 180, ΣY² = 1424 and ΣXY = 404
Also interpret its results.

SECTION-B

Q. No.6. (a) What is finite-correction factor? When is it appropriately used in sampling applications (10)
and when can it, without the great undesirable consequences, be ignored?

(b) Given the population 2, 4, 8, 8, 10, 10. (10) (20)

1) How many samples of size n = 2 can be drawn without replacement from this
population?
2) Compute and tabulate the sampling distribution of the mean for samples of size
n = 2.
Q. No.7. (a) Explain what is meant by: (10)
(i) a statistical hypothesis, (ii) test-statistic, (iii) test of significance,
(iv) level of significance, (v) type-I error and type-II error

(b) The weights of 4 persons before they stopped smoking and 5 weeks after they stopped (10) (20)
smoking are as follows:
Person 1 2 3 4
Before 148 176 153 116
After 154 176 151 121

Use the t-test for paired observations to test the hypothesis at the 0.05 level of
significance, that giving up smoking has no effect on a person’s weight.

Q. No.8. (a) Explain the procedure of randomization in a completely randomized design where we (10)
have 3 varieties of wheat and 18 experimental plots available. Also explain what is
explained by significant F value in an experiment.

(b) Four varieties of wheat were tried in a randomized complete block design in four (10) (20)
replications. Yield in kilogram per plot is shown in the table given below. Test the
hypothesis that there is no difference in the means of four varieties. α = 0.05.

Replicates          Varieties
              V1    V2    V3    V4
I              2     5     4     1
II             2     3     3     1
III            4     6     6     2
IV             1     4     2     3

***************
