
Statistics Assignment on Central Tendency and Probability

The document is an assignment worksheet for a Basic Statistics course at Addis Ababa University, containing various statistical problems and questions. Topics covered include measures of central tendency, skewness and kurtosis, probability calculations, and linear regression analysis. Students are required to solve specific questions related to these topics, demonstrating their understanding of statistical concepts and methods.

Uploaded by

Dems Zed Bami

ADDIS ABABA UNIVERSITY

SCHOOL OF COMMERCE
BASIC STATISTICS – Stat2131
Worksheet II/Assignment II
Assignment Questions: 2, 3, 5, 7, 9, 14, 18, 24, 25, 30 and 31

1. Why are measures of central tendency not sufficient to study a frequency distribution?
2. Briefly explain the meaning, purpose and types of Skewness and Kurtosis and give practical
examples.
3. In a final exam in a Statistics course, the mean mark of a group of 150 students was 78 and
the standard deviation was 8. In the Mathematics exam of the same group, the mean was 73 and
the standard deviation was 7.6.
a) In which exam was there greater dispersion?
b) Suppose a student scored 75 in Statistics and 71 in Mathematics. In which exam was
his relative standing better?
c) Suppose student “A” scored 80 in Statistics and 75 in Mathematics while student “B”
scored 75 in Statistics and 80 in Mathematics. Whose performance is better, A or B?
4. Some characteristics of the annual family income distribution (in Birr) in two regions are as
follows:
Region    Mean    Median    Standard Deviation
A         6250    5100      960
B         6980    5500      940
a) Calculate coefficient of skewness for each region
b) For which region is the income distribution more skewed? Give your interpretation
for this region.
c) For which region is the income more consistent?
5. Let E, F, and G be three events. Describe the following events using set notation.
a. Only E occurs
b. Both E and G but not F occur
c. At most one of them occurs
d. At most two of them occur
e. Exactly two of them occur
6. If a permutation of the letters of the word WHITE is selected at random, find the probability that the
permutation
a) Begins with a consonant
b) Ends with a vowel
c) Has consonants and vowels alternating
7. A lot consists of 10 good articles, 4 with minor defects, 2 with major defects. One article is
chosen at random. Find the probability that:
a) It has no defects.
b) It has no major defects.
c) It is either good or has major defects.
8. For a certain population of employees, the percentages passing and failing a job
competency exam, listed according to sex, were as shown in the accompanying table. That
is, of all the people taking the exam, 24% were in the male-pass category, 16% were in the
male-fail category, and so forth. An employee is to be selected randomly from this
population. Let A be the event that the employee scores passing grade on the exam and M
the event that a male is selected.
a) Are the events A and M independent?
b) Are the events A’ and F independent?
c) Find P(A’ ∪ M)
d) Find P(A | F)
                Sex
Outcome      Male (M)    Female (F)    Total
Pass (A)     24          36            60
Fail (A’)    16          24            40
Total        40          60            100

9. Let A and B be two events associated with a random experiment. Suppose that P(A) = 0.4
and P(A ∪ B) = 0.7. Let P(B) = α.
a. For what choice of α are A and B mutually exclusive?
b. For what choice of α are A and B independent?
c. What is P(A | A ∪ B)?
10. The probability that a freshman entering the School of Commerce will survive the first semester is
0.92 (from 2021/22 academic year statistics). Assuming this pattern remains unchanged
over the subsequent years, what is the probability that, among 100 randomly selected
first-semester freshmen,
a. None will survive?
b. Exactly 97 will survive?
c. At least three will survive?
11. The following data were collected from a certain household on the monthly income (X) and
consumption (Y) for the past 10 months.
X: 650 654 720 456 536 853 735 650 536 666
Y: 450 523 235 398 500 632 500 635 450 360

a) Fit a linear regression line and comment on the result.
b) Calculate the correlation coefficient and comment on it.
c) Estimate the amount of consumption for monthly income amount of 900.

Common questions

The coefficient of variation (CV) is the ratio of the standard deviation to the mean, expressed as a percentage, and is a standardized measure of dispersion. A higher CV indicates greater relative variability regardless of the units used. Comparing the CV of two groups allows for a direct comparison of their relative variability. In the provided data, the CV for the Statistics exam is (8/78)*100 ≈ 10.26%, and for the Mathematics exam, it's (7.6/73)*100 ≈ 10.41%. Although both are similar, the Mathematics scores exhibit slightly more relative dispersion.
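
The arithmetic above can be reproduced with a short sketch. The function name `coefficient_of_variation` is an illustrative choice, not something the worksheet defines; the inputs are the means and standard deviations given in question 3.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation as a percentage."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean * 100


# Figures from question 3 of the worksheet.
cv_statistics = coefficient_of_variation(8, 78)
cv_mathematics = coefficient_of_variation(7.6, 73)

print(f"Statistics CV:  {cv_statistics:.2f}%")
print(f"Mathematics CV: {cv_mathematics:.2f}%")
```

Because the CV divides by the mean, it is only meaningful for data measured on a ratio scale with a nonzero mean.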

The correlation coefficient, ranging from -1 to 1, quantifies the degree of linear relationship between two variables. A value close to 1 indicates a strong positive relationship, meaning as income increases, consumption also increases. Conversely, a value close to -1 indicates a strong negative relationship, with consumption decreasing as income increases. A value around 0 suggests no linear relationship. In this specific context, calculating the correlation shows the direction and strength of the linear association in the sample, and its magnitude speaks to how predictably changes in income are accompanied by changes in consumption. Whether the association is statistically significant is a separate question that requires a hypothesis test on the coefficient.
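
As a minimal sketch of the calculation, the Pearson coefficient can be computed from sums of deviations. The helper `pearson_r` is a hypothetical name written for this illustration; the data are the income (X) and consumption (Y) values from question 11.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations about the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for constant data")
    return s_xy / math.sqrt(s_xx * s_yy)


# Monthly income (X) and consumption (Y) from question 11.
income = [650, 654, 720, 456, 536, 853, 735, 650, 536, 666]
consumption = [450, 523, 235, 398, 500, 632, 500, 635, 450, 360]

r = pearson_r(income, consumption)
print(f"r = {r:.3f}")
```

The sign of `r` gives the direction of the linear association and its absolute value the strength; the sketch does not test significance.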

Identifying whether two events are independent is crucial because it affects probability calculations and statistical modeling. Independent events suggest that the occurrence of one event does not affect the probability of the other. This simplifies the calculation of joint probabilities, P(A and B) = P(A)P(B), and has implications for causal inference, interpretation of interactions in regression models, and experimental design. Ensuring independence helps maintain model validity, avoiding overestimation or underestimation of relationships in data analyses.
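
The product rule gives a direct numerical check for independence. The sketch below applies it to the four cell percentages from the table in question 8, expressed as probabilities; the dictionary layout and the helper names `marginal` and `are_independent` are assumptions made for illustration.

```python
import math

# Joint probabilities from the four cells of the question 8 table.
joint = {
    ("pass", "male"): 0.24,
    ("pass", "female"): 0.36,
    ("fail", "male"): 0.16,
    ("fail", "female"): 0.24,
}


def marginal(joint, index, value):
    """Sum the joint probabilities whose key matches value at position index."""
    return sum(p for key, p in joint.items() if key[index] == value)


def are_independent(joint, outcome, sex):
    """Check P(outcome and sex) == P(outcome) * P(sex) up to rounding error."""
    p_outcome = marginal(joint, 0, outcome)
    p_sex = marginal(joint, 1, sex)
    return math.isclose(joint[(outcome, sex)], p_outcome * p_sex, abs_tol=1e-12)


print(are_independent(joint, "pass", "male"))
print(are_independent(joint, "fail", "female"))
```

Floating-point sums are compared with `math.isclose` rather than `==`, since values such as 0.6 * 0.4 are not always represented exactly.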

Mutually exclusive events cannot occur simultaneously. An example could be flipping a coin, where 'heads' and 'tails' are mutually exclusive because the coin cannot land on both sides at once. In terms of probability, for mutually exclusive events A and B, P(A ∩ B) = 0, meaning the probability that both occur is zero. To illustrate, one can construct a Venn diagram representing these events as non-overlapping circles, clearly visualizing the impossibility of their simultaneous occurrence.
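
Representing events as sets makes the definition concrete. The sketch below uses one roll of a fair die, a hypothetical example that is not taken from the worksheet, and checks that the addition rule reduces to P(A ∪ B) = P(A) + P(B) when the events do not overlap.

```python
from fractions import Fraction

# Sample space for one roll of a fair die (hypothetical example).
sample_space = {1, 2, 3, 4, 5, 6}


def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


def mutually_exclusive(a, b):
    """Two events are mutually exclusive when their intersection has probability 0."""
    return prob(a & b) == 0


even = {2, 4, 6}
odd = {1, 3, 5}
at_least_five = {5, 6}

print(mutually_exclusive(even, odd))            # no shared outcomes
print(mutually_exclusive(even, at_least_five))  # both contain 6
print(prob(even | odd) == prob(even) + prob(odd))
```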

The binomial distribution models the number of successes in a fixed number of binary trials, such as surviving a semester. Assuming a constant survival probability per trial (p=0.92) and independent trials, this distribution can estimate the probability of various survival scenarios among 100 students. For example, the probability of exactly 97 surviving can be calculated using the formula P(X=k) = C(n, k) p^k (1-p)^(n-k), where C(n, k) is the combination of n items taken k at a time. Key implications include the assumption of identical survival probability for each student and independent survival trials, which may not hold if influencing factors vary, such as changes in curriculum difficulty or student preparedness.
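
The formula above translates directly into code. This is a sketch under the stated binomial assumptions (independent trials, constant p = 0.92, n = 100); `binomial_pmf` is an illustrative helper name, and `math.comb` supplies C(n, k).

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 100, 0.92  # values from question 10

p_none = binomial_pmf(0, n, p)
p_exactly_97 = binomial_pmf(97, n, p)
# "At least three" is the complement of 0, 1 or 2 survivors.
p_at_least_3 = 1 - sum(binomial_pmf(k, n, p) for k in range(3))

print(f"P(X = 0)  = {p_none:.3e}")
print(f"P(X = 97) = {p_exactly_97:.4f}")
print(f"P(X >= 3) = {p_at_least_3:.6f}")
```

Computing "at least three" through the complement avoids summing 98 terms and keeps the intent of the question visible in the code.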

Skewness and kurtosis describe the shape of a probability distribution. Skewness measures the asymmetry of a distribution around its mean, indicating whether data tail to the left or right. Positive skewness means a long right tail, negative skewness a long left tail. Kurtosis describes the 'tailedness' of a distribution, indicating how outlier-prone it is. High kurtosis implies heavy tails or outliers, while low kurtosis indicates light tails. Unlike mean and standard deviation, which focus on data's central tendency and spread, skewness and kurtosis offer deeper insights into data distribution, indicating potential deviations from normality and guiding data transformations or statistical modeling approaches.
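
One common way to quantify these shape measures is through central moments. The sketch below uses the population (divide-by-n) moment definitions, which is an assumption; other texts apply sample-size corrections. The two small data sets are made up for illustration.

```python
def central_moment(data, order):
    """Population central moment of the given order."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** order for x in data) / len(data)


def skewness(data):
    """Moment coefficient of skewness: m3 / m2^(3/2)."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Moment coefficient of kurtosis minus 3 (0 for a normal distribution)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric data
right_tailed = [1, 1, 1, 2, 10]   # hypothetical data with one large value

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_tailed), excess_kurtosis(right_tailed))
```

A positive skewness value corresponds to the long right tail described above, and negative excess kurtosis to tails lighter than those of a normal distribution.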

Using Bayesian statistics, the belief (prior probability) about the failure rate can be updated using sample data (likelihood). The posterior probability P(F|Data) can be calculated as P(Data|F)P(F)/P(Data), where P(Data|F) is the likelihood of observing the data given F (failure), P(F) is the prior probability of failure, and P(Data) is the marginal likelihood of the data. This approach allows incorporation of new information to refine the assessment of failure probability dynamically, crucial in environments where conditions and observations are subject to change.

Standard deviation measures the dispersion of data points from their mean, acting as an indicator of data consistency. A lower standard deviation implies that data points are closely clustered around the mean, indicating high consistency or reliability. Conversely, a higher standard deviation means data is more spread out, suggesting less consistency. In comparative analyses, datasets with lower standard deviations are deemed more predictable and stable, crucial in fields like quality control or when forecasting where reliable predictions are necessary.

In income distributions, skewness affects the relationship between mean and median. In a positively skewed distribution, the mean exceeds the median because high incomes shift the mean rightward. Conversely, in negatively skewed data, the median exceeds the mean as low incomes pull the mean leftward. Analyzing these differences helps identify skewness directionality. For instance, in Region A, with a mean of 6250 and a median of 5100, higher mean suggests positive skewness. Understanding these metrics aids in selecting appropriate statistical methods, such as log transformations for normalization or interpreting central tendency meaningfully.
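
When only the mean, median and standard deviation are available, as in question 4, one option is Pearson's median-based coefficient, 3(mean - median) / standard deviation. The worksheet does not say which coefficient it expects, so the choice of formula here is an assumption, and `pearson_median_skewness` is an illustrative name.

```python
def pearson_median_skewness(mean, median, std_dev):
    """Pearson's second (median) coefficient of skewness."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return 3 * (mean - median) / std_dev


# (mean, median, standard deviation) in Birr, from question 4.
regions = {"A": (6250, 5100, 960), "B": (6980, 5500, 940)}

skew = {name: pearson_median_skewness(*values) for name, values in regions.items()}
for name, value in skew.items():
    print(f"Region {name}: {value:.3f}")
```

A positive result means the mean lies above the median, which matches the right-skew reading given above for Region A.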

Measures of central tendency, such as mean, median, and mode, provide information about the central value of a dataset but do not account for the spread or variability within the data. For instance, two datasets could have the same mean but drastically different distributions. To understand the distribution's shape and spread, measures such as variance, standard deviation, skewness, and kurtosis are necessary. Variability measures give insights into data consistency, outliers, and overall data distribution that central tendencies alone can't provide.
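
A small sketch illustrates the point with two made-up data sets that share a mean but differ widely in spread. The values are hypothetical and `statistics.pstdev` computes the population standard deviation.

```python
import statistics

# Hypothetical data: identical means, very different variability.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

for name, data in (("tight", tight), ("wide", wide)):
    mean = statistics.mean(data)
    spread = statistics.pstdev(data)
    print(f"{name}: mean = {mean}, population std dev = {spread:.2f}")
```

Reporting the mean alone would make the two data sets look identical, which is why a measure of dispersion is reported alongside it.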
