
Mann-Whitney Test for Medians

This document discusses nonparametric statistics (NPS) which are used for data that is not normally distributed. It covers several nonparametric tests including the sign test, Mann-Whitney test, and Kruskal-Wallis test. Examples are provided for both the sign test and Mann-Whitney test. The sign test is used to test if the median of a single sample differs from a hypothesized value, while the Mann-Whitney test compares the medians of two independent samples to determine if they are identical.

CHAPTER 3: STATISTICAL INFERENCES (WEEK 10)
3.1 Introduction
3.2 Sampling distribution WEEK 8

3.3 Inference for single population


3.4 Inference for two populations WEEK 9

3.5 Nonparametric statistics WEEK 10


NONPARAMETRIC STATISTICS (NPS)
INTRODUCTION TO NPS

NPS is an alternative for testing data from a non-normal distribution, such as:

• flat distribution
• peaked distribution
• skewed distribution
(refer to Figure 3.9, page 77)
NPS tests are also referred to as distribution-free tests.

NPS uses the median to make inferences about a population (parametric tests use the mean).

NPS is also used for inference on data that require a ranking approach, such as:
1. Nominal data
2. Ordinal data
3. Interval-scale or ratio-scale data for which no assumption is made about the probability distribution of the population from which the sample is selected.
Normally distributed data ➔ Parametric test (CI, HT, t-test, F-test, ANOVA)
Non-normally distributed data ➔ Nonparametric test.

The NPS tests are:
❑ Sign Test
❑ Mann-Whitney Test
❑ Kruskal-Wallis Test
❑ Wilcoxon Signed Rank Test
❑ Spearman’s Rank Correlation Test
Sign Test
➔ The simplest NPS test.
➔ Tests the value of the median from a single sample.
➔ Converts each data value into a + or - sign.
Sign Test Procedures
1. State the hypotheses (determine the type of test: two-tailed, left-tailed, or right-tailed).

H0: median = Mo
H1: median ≠ Mo (two-tailed), median < Mo (left-tailed), or median > Mo (right-tailed)

Note: Mo is the hypothesized median.

2. Determine the signs

Put a + sign for each value greater than the hypothesized median
Put a - sign for each value less than the hypothesized median
Put a 0 for each value equal to the hypothesized median

3. Compute the test statistic, k

Two-tailed test: k = the smaller of the number of + signs and the number of - signs
Right-tailed test: k = number of - signs
Left-tailed test: k = number of + signs

4. Find the critical value from the sign test table.
Information needed:
1. significance level, α
2. sample size, n
where n = total number of + and - signs (values marked 0 are not counted)

5. Make a decision
Reject H0 if test statistic, k ≤ critical value

6. Conclusion
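Steps 2 and 3 of the procedure can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `sign_test_statistic` and its `tail` argument are hypothetical, and the left-tailed branch follows the usual convention by symmetry with the right-tailed case. The input is the octane data from Example 1.

```python
def sign_test_statistic(data, m0, tail):
    """Count signs relative to the hypothesized median m0 and return
    (number of + signs, number of - signs, n, k).

    tail is "two", "right" or "left" (hypothetical argument names).
    """
    plus = sum(1 for x in data if x > m0)   # values above m0 get a + sign
    minus = sum(1 for x in data if x < m0)  # values below m0 get a - sign
    n = plus + minus                        # values equal to m0 are not counted
    if tail == "right":
        k = minus
    elif tail == "left":
        k = plus
    elif tail == "two":
        k = min(plus, minus)
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return plus, minus, n, k


# Octane ratings from Example 1, tested against a hypothesized median of 98.
octane = [99.0, 102.3, 99.8, 100.5, 99.7, 96.2, 99.1, 102.5,
          103.3, 97.4, 100.4, 98.9, 98.3, 98.0, 101.6]
plus, minus, n, k = sign_test_statistic(octane, 98, "right")
print(plus, minus, n, k)  # 12 2 14 2
```

The critical value itself still comes from the sign test table; the sketch only automates the counting.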
Example 1:
The following data constitute a random sample of 15 measurements of the octane rating of a
certain kind of gasoline:
99.0 102.3 99.8 100.5 99.7 96.2 99.1 102.5
103.3 97.4 100.4 98.9 98.3 98.0 101.6

Test the null hypothesis median = 98 against the alternative hypothesis median > 98 at 0.05
level of significance.
Solution:
1. H0: median = 98
H1: median > 98 (Claim) ➔ Right-tailed test
2. Signs relative to the hypothesized median of 98:
99.0 102.3 99.8 100.5 99.7 96.2 99.1 102.5
+ + + + + - + +
103.3 97.4 100.4 98.9 98.3 98.0 101.6
+ - + + + 0 +
Number of + signs = 12
Number of - signs = 2
Total number of + and - signs = 14

3. This is a right-tailed test, so the test statistic, k = number of - signs = 2

4. Significance level, α = 0.05
Sample size, n = 14
➔ critical value = 3

5. Since the test statistic, k = 2 ≤ critical value = 3, we reject H0.

6. There is enough evidence to support the claim that the median octane rating of this
kind of gasoline is greater than 98.
Example 2:
An owner of a souvenir shop hypothesizes that the median number of items sold per day is 40.
A random sample of 20 days yields the following data for the number of items sold each day.
At α = 0.05, test the owner’s hypothesis.
18 43 40 16 22
30 29 32 37 36
39 34 39 45 28
36 40 34 39 52

Solution:
1. H0: median = 40 (Claim)
H1: median ≠ 40 ➔ two-tailed test
2. Signs relative to the hypothesized median of 40:
- + 0 - -
- - - - -
- - - + -
- 0 - - +
Number of + signs = 3
Number of - signs = 15
Total number of + and - signs = 18
3. This is a two-tailed test, so the test statistic, k = the smaller of the number of + signs and the number of - signs = 3

4. Significance level, α = 0.05
Sample size, n = 18
➔ critical value = 4

5. Since the test statistic, k = 3 ≤ critical value = 4, we reject H0.

6. There is enough evidence to reject the claim that the median number of items sold per
day is 40.
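The table-based decisions in Examples 1 and 2 can be cross-checked with an exact binomial calculation, since under H0 each counted value is equally likely to fall above or below the hypothesized median. This check is not part of the slide procedure; it is a sketch that assumes the number of signs follows a Binomial(n, 0.5) distribution, and the helper name `binom_lower_tail` is hypothetical.

```python
from math import comb


def binom_lower_tail(k, n):
    """P(X <= k) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n


# Example 1: right-tailed test, k = 2 minus signs out of n = 14.
p_ex1 = binom_lower_tail(2, 14)

# Example 2: two-tailed test, k = 3 out of n = 18, so the tail is doubled.
p_ex2 = min(1.0, 2 * binom_lower_tail(3, 18))

print(round(p_ex1, 4), round(p_ex2, 4))  # 0.0065 0.0075
```

Both p-values fall below α = 0.05, which agrees with the decision to reject H0 in each example.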
Mann-Whitney Test (MWT)
• Used to determine whether a difference exists between the medians of two populations with non-normal distributions.
• Sometimes called the Wilcoxon rank sum test.
• The equivalent parametric test is the t-test for two independent samples.

Mann-Whitney Test Procedures


1. State the hypotheses (determine the type of test)
2. Rank the data values
   1. Combine all the data from the two samples (regard them as one sample).
   2. Rank the data from smallest to largest (starting from 1; if there are tied values, each of them gets the average of the ranks they occupy).
3. Calculate the test statistic
   1. Label sample 1 and sample 2. Let:
      - sample 1 ➔ the sample with the smaller size of the two independent samples.
      - if both samples have the same size, either one can be regarded as sample 1.
   2. List the ranks assigned in step 2 for both sample 1 and sample 2.
   3. Calculate the sum of ranks for both samples.
Test statistic, T is based on:
T1 = ƩR1
T1* = n1(n1 + n2 + 1) - T1
where ƩR1 = sum of ranks from sample 1
n1 = sample size of sample 1
n2 = sample size of sample 2

Summary of test statistic for Mann-Whitney test
Two-tailed test: T = min(T1, T1*)


4. Find and calculate critical value, Tcv
Tcv = [TL , TU ]
where TL ➔ find from table of MWT for given α, n1 and n2.
TU = n1(n1+ n2+1) - TL

5. Make a decision based on:
For a two-tailed test, reject H0 when T ∉ [TL, TU].

Note:
∉ means not included.

6. Conclusion
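The ranking and rank-sum steps can be sketched as follows. This is an illustration under stated assumptions: the function names `average_ranks`, `rank_sum_statistics` and `upper_critical_value` are hypothetical, T1* is taken to be n1(n1 + n2 + 1) - T1 as in the worked example, and the lower critical value TL must still be read from the Mann-Whitney table. The input is the marks data from the example that follows.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest.
    Tied values each receive the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + 1 + j + 1) / 2  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def rank_sum_statistics(sample1, sample2):
    """Return (T1, T1*) where T1 is the rank sum of sample 1."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))
    t1 = sum(ranks[:n1])
    t1_star = n1 * (n1 + n2 + 1) - t1
    return t1, t1_star


def upper_critical_value(t_lower, n1, n2):
    """TU = n1(n1 + n2 + 1) - TL, with TL read from the table."""
    return n1 * (n1 + n2 + 1) - t_lower


male = [60, 62, 78, 83]          # sample 1 (smaller sample)
female = [40, 65, 70, 88, 92]    # sample 2
t1, t1_star = rank_sum_statistics(male, female)
print(t1, t1_star, min(t1, t1_star))   # 18.0 22.0 18.0
print(upper_critical_value(13, 4, 5))  # 27
```

Sample 1 is passed first so that the first n1 ranks in the combined list belong to it, which keeps the rank sum a simple slice.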
Example:
Data below show the marks obtained by electrical engineering students in an
examination:
Gender Marks
Male 60
Male 62
Male 78
Male 83
Female 40
Female 65
Female 70
Female 88
Female 92

Can we conclude


= 0.1
the achievements of male and female students identical at significance
level ?
Solution:
1. H0: There is no difference in the achievements of male and female students
H1: There is a difference in the achievements of male and female students ➔ two-tailed test
2.
Gender Marks Rank
Male 60 2
Male 62 3
Male 78 6
Male 83 7
Female 40 1
Female 65 4
Female 70 5
Female 88 8
Female 92 9

The male sample has n = 4 and the female sample has n = 5. Thus n1 is the sample from male students and n2 is the sample from female students.

3. Since n1 = 4, n2 = 5:
T1 = ƩR1 = 2 + 3 + 6 + 7 = 18
T1* = n1(n1 + n2 + 1) - T1 = 4(4 + 5 + 1) - 18 = 22
This is a two-tailed test, thus the test statistic is T = minimum(T1, T1*):
T = min(T1, T1*) = min(18, 22) = 18
4. Critical value, Tcv = [TL , TU ]
α = 0.1 ➔ α/2 = 0.05 ; n1 = 4, n2 = 5, thus from table TL = 13
Calculate TU = n1(n1+ n2+1) - TL = 4(4+5+1) – 13 = 27
➔ Tcv = [13,27]

5. Make a decision
For a two-tailed test, we reject H0 when T ∉ [TL, TU].
Since T = 18 ∈ Tcv = [13,27], we fail to reject H0.

6. Conclusion
There is not enough evidence to support the claim that there is a difference in the
achievement between male and female students.
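As a cross-check on the table-based decision, the exact null distribution of the rank sum can be enumerated for samples this small. The sketch below is not part of the slide procedure: it assumes no ties and that, under H0, every choice of n1 ranks out of 1..(n1 + n2) is equally likely. The function name `rank_sum_lower_tail` is hypothetical.

```python
from itertools import combinations


def rank_sum_lower_tail(t, n1, n2):
    """P(T1 <= t) under H0, enumerating every possible set of n1 ranks
    drawn from 1..(n1 + n2). Practical only for small samples."""
    sums = [sum(c) for c in combinations(range(1, n1 + n2 + 1), n1)]
    return sum(1 for s in sums if s <= t) / len(sums)


# Marks example: n1 = 4, n2 = 5, observed T = 18.
p_lower = rank_sum_lower_tail(18, 4, 5)
p_two_sided = min(1.0, 2 * p_lower)  # doubled for the two-tailed test
print(round(p_two_sided, 2))  # 0.73
```

A two-sided p-value of about 0.73 is far above α = 0.1, which is consistent with failing to reject H0.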
