Introduction to Advanced Statistics

The document provides an overview of statistics, defining it as the science of collecting, organizing, analyzing, and interpreting data to aid decision-making. It discusses the two main divisions of statistics: Descriptive and Inferential, along with basic concepts such as variables, constants, and types of data. Additionally, it covers methods for data collection, presentation, and the importance of statistical measures like central tendency in various fields.

ADVANCED STATISTICS

BY: DR. RAMON ROMANO

LESSON 1

PREVIEW OF STATISTICS

One of the peculiar characteristics that set man apart from other animals is his
capacity to capitalize on the achievements of his ancestors. Statistics offers a striking
example of man’s attempt to employ the records of his past in solving the problems of the
present and plotting the course of the future.

There is a peculiar logic associated with the study of statistics, but ultimately it is
really common sense. Although it is difficult to dissociate the purely mathematical
aspects from statistical activities, there is no reason why it should involve complex and
sophisticated mathematical skills. A basic background in mathematics is sufficient.
Moreover, with the availability of calculators, manual computation, wherein lie most of
the errors, has largely been eliminated.

Definition and Divisions of Statistics

Statistics may be defined as the science that deals with the collection,
organization, presentation, analysis and interpretation of collections of data in order to be
able to draw judgments or conclusions that help in the decision-making process.

There are two main divisions of statistics. These are: Descriptive Statistics and
Inferential Statistics.

Descriptive Statistics deals with procedures that organize, summarize and
describe quantitative data. It seeks merely to describe the data. The procedure basically
involves summarizing the data in various categories and then indicating the quantities and
percentages corresponding to each category. Data organized in this way become
much easier to describe than if they were simply presented in raw or unorganized form. A
basic tool used in Descriptive Statistics is the Frequency Distribution.

Inferential Statistics deals with making a judgment or drawing a conclusion based on the
findings from a sample that is taken from the population. The members of the sample,
chosen so as to be representative of the population, consist of only a small part of the entire
population. Provided the sample is taken carefully, whatever conclusions are made about
the sample may also be considered true of the entire population.

Basic Concepts of Statistics

The following are basic terms used in statistics:

Variable is a quantity that may assume any of a set of values. Examples are
monthly income, average grade, volume, price and so forth.

Constant is a quantity that does not change its value. For example, the
mathematical symbol π (the Greek letter pi) is a constant because its value does not
change; it is always approximately 3.1416. Likewise, the equivalence of an inch to 2.54
centimeters is constant.

Ungrouped Data are data which are not organized in any specific way. They are
simply the collection of data as they are gathered.

Grouped Data are data organized into groups or categories with corresponding
frequencies. Organized in this manner, the data are referred to as a Frequency Distribution.

Population is the entire collection of all possible observations of a particular
characteristic of interest. An example is the population of the grades of all students.

Sample is a representative set of observations that reflects the characteristic of the
whole, that is, the population from which it is taken.

Parameter is any statistical characteristic of a population, for example, the Mean and
the Standard Deviation. Thus we say the population mean is a parameter of the
population.

Statistic is any statistical characteristic of a sample, such as the Mean and Standard
Deviation. The Sample Mean is a sample statistic.

Uses and Importance of the Science of Statistics

Knowing the present state of affairs and what is happening, as derived from the mass of
information gathered, is useful enough to many fields of human endeavor such as
business, medicine, politics, and others. But the ultimate utility of statistics is as an aid in
planning and decision making. The businessman may want to know how he fares in
relation to his competitors, what goods are most salable, and how many shoes of which sizes to
order in anticipation of a peak season. These he can decide with the aid of statistics.
Some notable business tycoons insist that they rely more on gut feeling, intuition and hunches,
but it is doubtful whether they are really sincere about this; for all we know, they may have
unconsciously digested and analyzed statistical information in their heads.

Admittedly, a decision-maker, in arriving at a decision, is influenced by many
factors, such as exigencies, government regulations and even his personal likes and
dislikes. But the greater part of the reason why he decides in a given manner is that
he expects it to be a good decision based upon his inferences from the information at
hand. In short, he should not be making a blind guess but an “educated guess”. A layman
may scoff at the idea that future events can be predicted with a reasonable amount of
certainty by mathematical means, but this has been demonstrated many times and found to
be anchored on sound principles.

In fact, in the field of medicine many discoveries have been made, and causes and
cures found, by the experimental method, which relies heavily on statistical formulas.

The Nature of Statistical Data

Statistics can handle only quantitative data, that is, numbers. The quantity may be
derived from counting, which will give exact values, or from measurement, which will
give approximate values because a measured quantity can never be exact.

The values may be continuous (allowing values in between) or dichotomous
(allowing only two sides: absent-present; wrong-correct; male-female). In
several instances the study will involve qualitative or descriptive data which by
themselves cannot be subjected to mathematical treatment. The procedure is to translate
these qualitative data to quantitative data by assigning weights or values to the
corresponding quality or description. For example: Always = 4; Often = 3; Seldom = 2;
Never = 1; or Excellent = 5; Very Good = 4; Good = 3; Fair = 2; and Poor = 1.
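
The translation of descriptive answers into numbers can be sketched in a few lines of Python. This is only an illustration: the two scales are copied from the examples above, while the function name `encode` and the list of responses are hypothetical and invented for the sketch.

```python
# Weights copied from the two scales given in the text.
FREQUENCY_SCALE = {"Always": 4, "Often": 3, "Seldom": 2, "Never": 1}
RATING_SCALE = {"Excellent": 5, "Very Good": 4, "Good": 3, "Fair": 2, "Poor": 1}

def encode(responses, scale):
    """Translate qualitative responses into their numeric weights."""
    return [scale[r] for r in responses]

# Hypothetical responses, invented for illustration only.
answers = ["Often", "Always", "Seldom", "Often", "Never"]
codes = encode(answers, FREQUENCY_SCALE)
weighted_mean = sum(codes) / len(codes)
print(codes, weighted_mean)
```

Once coded in this way, the responses can be summarized with the same measures used for any other quantitative data.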

Inferences Derived from Statistical Data

A mass of raw data would be meaningless unless systematically arranged and
presented; once so arranged, it yields valuable inferences such as:

1. central tendencies
2. degree of variability, dispersion or scatter
3. proportions and percentages
4. trends and drifts and tendencies
5. skewness, kurtosis
6. degree of reliability
7. variations, fluctuations, cycles

These can be further used for projections, predictions and prognosis applying the
theories of regression analysis, which in turn can be the basis of decision making.

Methods of Generating Data

There are several methods of generating or collecting data which are to be
organized, analyzed and interpreted. These are the following:
• Registration Method. Under this method, data may be gathered easily from both
government and private offices. Examples are registration data on births, deaths,
marriages, motor vehicles, OFWs, etc.
• Questionnaire Method. In this method, data are acquired by means of a
questionnaire which consists of a number of carefully prepared questions aimed at
eliciting answers from respondents who may have been selected randomly.
• Interview Method. This method employs a person-to-person encounter between
the interviewer and interviewee. In this method, the questions are outlined before
an interview is arranged. The advantage of this method is that the interviewer can
modify or repeat questions to suit the interviewee and to elicit better responses.
• Observation Method. This method is used when the data that the
investigator/researcher wishes to gather pertain more to the behavior of an
individual or group of individuals. It is used especially when the subjects
observed are unable to convey information by talking or writing.
• Experiment Method. This method is used when experiments are performed in a
laboratory where conditions are controlled.

Methods of Data Presentation

• Textual Form. Findings are described and verbally explained, although figures
may be cited in the text.
• Tabular Form. This method uses statistical tables in presenting data.
The table consists of a number of columns with headings and several
rows of figures. Usually, the three basic columns are those of
category, quantity and percentage.
• Graphical Form. This method usually goes together with the tabular
presentation of data. The graphs employed are usually a combination
of bar graphs, pictographs and pie charts.

Classification of Data

After the information is gathered, it will be sorted or classified in a systematic
manner. There are a number of ways in which statistical data may be arranged, but the
following are some of the types:

1. Classification based upon difference of kind
2. Classification based upon difference of degree of given characteristics
3. Geographical divisions
4. Time series

These data are best recorded in tabular form for easy reference and treatment.

LESSON 2

THE FREQUENCY DISTRIBUTION

Data collected from different sources are usually unorganized and in a
form unsuitable for immediate interpretation. In any statistical investigation, once
pertinent data are gathered, the next step is to present such data in organized form
using appropriate tables and graphs.

Steps:

1. Determine the highest and the lowest scores.

2. Get the value of the range. The range, denoted by R, is the difference
between the highest and the lowest value in the distribution.

Range = Highest Score – Lowest Score

3. Determine the interval size by dividing the range by the desired number of classes.
Divide the range by 10 and by 20 so that the number of class intervals is not less
than 10 and not more than 20, provided that the classes cover the total range of the
observations. This will meet the requirements of most sets of data. As a rule,
we should prefer not fewer than 10 and not more than 20 class intervals, and the ideal
number is between 10 and 15, inclusive. In choosing the size of the class interval, an odd
number is preferable.

4. Determine the class limits of the class intervals. The bottom interval must include the
lowest score.

5. Tally the frequencies for each class interval. The tally should be carefully checked to see that
the sum is equal to the total number of scores (cases). At the bottom of the Frequency column (f), write the
symbol N or ∑f, in which ∑ (capital Greek sigma) stands for the “sum of”, followed by the total
number of cases (N).

6. Get the sum of the frequency column and check it against the total number of
observations or cases.

Illustrative Example:

The following are the marks obtained by a group of 40 university students on a
Statistics examination:

80 85 55 75 61 64 66 89
77 56 53 72 82 57 70 96
76 54 60 84 77 52 62 95
75 84 88 59 75 84 65 87
60 63 76 62 92 72 90 92

Solution:

Highest Score = 96     Lowest Score = 52

Range = Highest Score – Lowest Score
      = 96 – 52
      = 44

Interval size (i) = Range / desired number of classes = 44 / 16 = 2.75, or 3

_______________________________________________________

Class Interval     Tally      Frequency     <CF
_______________________________________________________

95-97              II             2          40
92-94              I              1          38
89-91              II             2          37
86-88              II             2          35
83-85              IIII           4          33
80-82              II             2          29
77-79              II             2          27
74-76              IIIII          5          25
71-73              III            3          20
68-70              I              1          17
65-67              II             2          16
62-64              IIII           4          14
59-61              IIII           4          10
56-58              II             2           6
53-55              III            3           4
50-52              I              1           1
_______________________________________________________

Total                          n = 40
_______________________________________________________
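
The tallying steps above can be sketched in Python as a check on the hand count. This is a minimal illustration, not part of the original lesson: the function name `frequency_distribution` and its parameters are hypothetical, and the starting limit of 50 and the interval size of 3 are taken from the table. Note that tallying the 40 marks exactly as listed yields a frequency of 2 for 71-73 and 2 for 92-94, while the printed table shows 3 and 1, so one of the listed marks may have been transcribed differently.

```python
from collections import Counter

marks = [
    80, 85, 55, 75, 61, 64, 66, 89,
    77, 56, 53, 72, 82, 57, 70, 96,
    76, 54, 60, 84, 77, 52, 62, 95,
    75, 84, 88, 59, 75, 84, 65, 87,
    60, 63, 76, 62, 92, 72, 90, 92,
]

def frequency_distribution(scores, start, width):
    """Return (lower, upper, frequency, cumulative frequency) rows, lowest class first."""
    # Class index 0 is the bottom interval, which must include the lowest score.
    counts = Counter((s - start) // width for s in scores)
    rows, running = [], 0
    for k in range(max(counts) + 1):
        lower = start + k * width
        running += counts.get(k, 0)
        rows.append((lower, lower + width - 1, counts.get(k, 0), running))
    return rows

score_range = max(marks) - min(marks)            # highest score minus lowest score
table = frequency_distribution(marks, start=50, width=3)
for lower, upper, f, cf in reversed(table):      # highest class first, as in the table
    print(f"{lower}-{upper}  f={f}  <CF={cf}")
```

The last cumulative frequency equals the total number of cases, which is the check called for in Step 6.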

LESSON 3

THE MEASURES OF CENTRAL TENDENCY

A measure of central tendency is popularly known as an average. It is
a single number used in some definite way to indicate the
central value of an entire group of observations or individuals, where this central value
represents all the figures in the group of which it is a part. In other words, an average is a
measure of central tendency in which a single central value stands for the entire group of
figures as typical of all the values in the group.

There are three measures of central tendency, namely, the Mean, the Median
and the Mode.

The Mean

The mean is the most frequently used measure of central tendency because it is
subject to less error, it is rigidly defined, and it is easily calculated. Moreover, it
lends itself to algebraic manipulation, its standard error is less than that of the median, and the
sum of the deviations of the cases about the mean is zero.

The Computation of the Mean from Ungrouped Data:

The mean of ungrouped data is determined as the sum of all scores divided by the
number of cases. Consider the following scores: 8, 5, 10, 9, 7, 10, 8, 11, 15, 14, 10, 5, 4, 19, and
7. The mean of these scores is 9.47.

In general, if the scores are represented by the symbols X1, X2, X3, . . . Xk, the
mean in algebraic language is

$$\bar{X} = \frac{X_1 + X_2 + X_3 + \cdots + X_k}{N} = \frac{\sum X}{N}$$

The formula for finding the mean is simply written as $\bar{X} = \sum X / N$, where

$\bar{X}$ = mean

$\sum X$ = sum of all scores

$N$ = number of cases

The fifteen scores (X) are 8, 5, 10, 9, 7, 10, 8, 11, 15, 14, 10, 5, 4, 19 and 7, so that
∑X = 142 and N = 15.

$$\bar{X} = \frac{\sum X}{N} = \frac{142}{15} = 9.47$$

The mean is 9.47. It is well to note that ∑X is equal to N × $\bar{X}$. This information is useful
in a variety of situations.

Moreover, ungrouped data may mean the number of cases is less than 30.
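
As a quick check of the arithmetic, the computation can be sketched in Python. The variable names are illustrative only; the fifteen scores are those of the worked example.

```python
scores = [8, 5, 10, 9, 7, 10, 8, 11, 15, 14, 10, 5, 4, 19, 7]

total = sum(scores)        # sum of all scores (ΣX)
n = len(scores)            # number of cases (N)
mean = total / n           # mean = ΣX / N

# The deviations of the cases about the mean sum to zero
# (up to floating-point rounding).
deviation_sum = sum(x - mean for x in scores)

print(total, n, round(mean, 2))
```

The vanishing sum of deviations is the property of the mean mentioned earlier in this lesson.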

Finding the Mean of the Grouped Data

There are two methods of determining the mean of grouped data, namely, (a)
the Midpoints Method, and (b) the Class-Deviation Method.

a) The computation of the mean of grouped data using the Midpoints Method

Steps:

1. Compute the midpoints of all class intervals by averaging the Lower
Limits and the Upper Limits.

$$\text{Midpoint} = \frac{LL + UL}{2}$$

2. Multiply the Midpoint by the Frequency.
3. Sum the products of Midpoints times Frequencies.
4. Divide the sum by the total number of cases (N) to obtain the mean.
5. Apply the formula

$$\bar{X} = \frac{\sum fM}{N}$$

Computation of the Mean Using the Midpoint Method

_______________________________________________________

Class Interval    Midpoint (M)    Frequency (f)     Mf
_______________________________________________________

95-97                  96               2           192
92-94                  93               1            93
89-91                  90               2           180
86-88                  87               2           174
83-85                  84               4           336
80-82                  81               2           162
77-79                  78               2           156
74-76                  75               5           375
71-73                  72               3           216
68-70                  69               1            69
65-67                  66               2           132
62-64                  63               4           252
59-61                  60               4           240
56-58                  57               2           114
53-55                  54               3           162
50-52                  51               1            51
_______________________________________________________

Total                              N = 40      ∑fM = 2904
_______________________________________________________

$$\bar{X} = \frac{\sum fM}{N} = \frac{2904}{40} = 72.6$$
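
The Midpoints Method can be sketched in Python as follows. This is a minimal illustration under the assumption that each class is stored as a (lower limit, upper limit, frequency) triple; the class limits and frequencies are copied from the table, and the variable names are hypothetical.

```python
# (lower limit, upper limit, frequency) for each class, copied from the table.
classes = [
    (95, 97, 2), (92, 94, 1), (89, 91, 2), (86, 88, 2),
    (83, 85, 4), (80, 82, 2), (77, 79, 2), (74, 76, 5),
    (71, 73, 3), (68, 70, 1), (65, 67, 2), (62, 64, 4),
    (59, 61, 4), (56, 58, 2), (53, 55, 3), (50, 52, 1),
]

n = sum(f for _, _, f in classes)                     # total number of cases (N)
# Midpoint = (LL + UL) / 2, multiplied by the frequency and summed (ΣfM).
sum_fm = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
grouped_mean = sum_fm / n

print(n, sum_fm, grouped_mean)
```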

b) Computation of the mean of grouped data using the Class-Deviation
Method. This method gives a shorter way of computing the mean of data in the form of a
frequency distribution. The mean obtained by this method is the same as in the
Midpoint Method. It is known as the class-deviation method because it
works with the deviations of the classes from an arbitrary origin, placed at any one of
the class intervals, instead of with the raw scores. The class that we
arbitrarily choose as the origin is given a deviation of zero. If the class intervals are arranged from
highest to lowest, the deviations listed above the zero are positive and those below, negative; if they
are arranged from lowest to highest, the deviations listed above the zero are negative and those
below are positive.

Steps:

1. Choose a temporary arbitrary origin from any of the class intervals, either
at the center, at the bottom or at the top.

2. Assign coded values to the class intervals, starting with zero at the origin,
with positive values for the classes higher than the origin and negative values for
the classes lower than it. These deviations appear in Column d.

3. Multiply each d by the corresponding class frequency f to get fd. These
products are shown in Column fd.

4. Sum the fd products algebraically. The symbol is ∑fd.

5. Compute the mean by using the formula

$$\bar{X} = M_0 + C\left(\frac{\sum fd}{N}\right)$$

where $M_0$ is the midpoint of the class chosen as origin, $C$ is the size of the class
interval, and $N$ is the number of cases.

Computation of the Mean Using the Deviation Method

________________________________________________________________

Class Interval    Midpoint (M)    Frequency (f)    Deviation (d)     fd
________________________________________________________________

95-97                  96               2                7            14
92-94                  93               1                6             6
89-91                  90               2                5            10
86-88                  87               2                4             8
83-85                  84               4                3            12
80-82                  81               2                2             4
77-79                  78               2                1             2
74-76                  75               5                0             0
71-73                  72               3               -1            -3
68-70                  69               1               -2            -2
65-67                  66               2               -3            -6
62-64                  63               4               -4           -16
59-61                  60               4               -5           -20
56-58                  57               2               -6           -12
53-55                  54               3               -7           -21
50-52                  51               1               -8            -8
________________________________________________________________

Total                                  40                            -32
________________________________________________________________

The positive fd values sum to 56 and the negative fd values sum to -88, so ∑fd = -32.

With M0 = 75, ∑fd = -32, C = 3 and N = 40:

$$
\begin{aligned}
\bar{X} &= M_0 + C\left(\frac{\sum fd}{N}\right) \\
        &= 75 + 3\left(\frac{-32}{40}\right) \\
        &= 75 + \left(\frac{-96}{40}\right) \\
        &= 72.6
\end{aligned}
$$
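
The deviation computation can be sketched in Python. This is only an illustration: the function name `class_deviation_mean` is hypothetical, the midpoints and frequencies are copied from the table, and the second call with an origin of 60 is an added check, not part of the original example, to show that the choice of origin does not change the mean.

```python
midpoints   = [96, 93, 90, 87, 84, 81, 78, 75, 72, 69, 66, 63, 60, 57, 54, 51]
frequencies = [ 2,  1,  2,  2,  4,  2,  2,  5,  3,  1,  2,  4,  4,  2,  3,  1]

def class_deviation_mean(midpoints, frequencies, origin, width):
    """Return (mean, sum of fd) using mean = M0 + C * (sum of fd / N)."""
    # d = number of class widths between each class midpoint and the origin.
    deviations = [(m - origin) // width for m in midpoints]
    sum_fd = sum(f * d for f, d in zip(frequencies, deviations))
    n = sum(frequencies)
    return origin + width * sum_fd / n, sum_fd

mean_75, sum_fd_75 = class_deviation_mean(midpoints, frequencies, origin=75, width=3)
mean_60, sum_fd_60 = class_deviation_mean(midpoints, frequencies, origin=60, width=3)
print(mean_75, sum_fd_75, mean_60, sum_fd_60)
```

Both origins lead to the same mean; only the size of ∑fd changes.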

The mean value of 72.6 is the same whether the Midpoint Method or the Class-Deviation
Method is used.

The Computation of the Median

Another measure of central tendency that is commonly used is the median. The
median is defined as a point on a scale such that 50 percent of the cases lie above it and
50 percent lie below it. It may or may not coincide with an actual score.

a) The median from ungrouped data. The median of a set of ungrouped data is
obtained by arranging the scores from highest to lowest and picking out the middlemost score
in order of magnitude if the number of scores is odd. When the number of scores is even, the
median is obtained by computing the midpoint of the two middle scores. To illustrate,
consider the set of scores below:

97
95
92
90 Median
88
85
80

In the above example, there are 7 scores. Locate a point such that 3 scores fall
above the median and 3 scores below. Thus, the median is 90.

When the number of scores is even, compute the median by getting the average of the two
middlemost scores in order of magnitude. To illustrate, consider the even
set of scores below:

85
84
82
80
75
73
72
70

$$\text{Median} = \frac{80 + 75}{2} = \frac{155}{2} = 77.5$$

The above set contains 8 scores. The median lies between 80 and 75. Thus, 80 plus 75 equals
155, which divided by 2 equals 77.5. Hence, the median of the foregoing set of scores is 77.5.
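
The two cases can be sketched in Python. The function `median_by_position` is a hypothetical helper that follows the steps described above, and the result is compared with `median` from Python's standard `statistics` module.

```python
from statistics import median

def median_by_position(scores):
    ordered = sorted(scores, reverse=True)       # highest to lowest, as in the text
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                               # odd count: the middlemost score
        return ordered[middle]
    # even count: average of the two middlemost scores
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_set = [97, 95, 92, 90, 88, 85, 80]
even_set = [85, 84, 82, 80, 75, 73, 72, 70]

print(median_by_position(odd_set), median(odd_set))
print(median_by_position(even_set), median(even_set))
```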

b) The Median from Grouped Data

The median from grouped data in the form of a frequency distribution is determined
by following these steps:

Step 1. Compute the cumulative frequencies.

Step 2. Find N/2, or one-half of the number of cases in the distribution.

Step 3. Determine the class interval in which the N/2-th case (the 50 percent point) falls.
This is the median class.

Step 4. Compute the median by using the formula

$$\text{Median} = L + C\left(\frac{N/2 - \sum Cf_{<}}{F_c}\right)$$

in which L = the exact lower limit of the median class

N = the total number of cases

∑Cf < = the cumulative frequency “less than”, that is, the sum of the frequencies of all
classes below the median class

Fc = the frequency of the median class

C = the size of the class interval

The Computation of the Median from Grouped Data

_______________________________________________________

Class Interval    Frequency     CF
_______________________________________________________

95-97                 2         40
92-94                 1         38
89-91                 2         37
86-88                 2         35
83-85                 4         33
80-82                 2         29
77-79                 2         27
74-76                 5         25
71-73                 3         20     (median class, Fc = 3)
68-70                 1         17     (∑Cf < = 17)
65-67                 2         16
62-64                 4         14
59-61                 4         10
56-58                 2          6
53-55                 3          4
50-52                 1          1
_______________________________________________________

Total                40
_______________________________________________________

$$
\begin{aligned}
\text{Median} &= L + C\left(\frac{N/2 - \sum Cf_{<}}{F_c}\right) \\
              &= 70.5 + 3\left(\frac{20 - 17}{3}\right) \\
              &= 70.5 + 3(1) \\
              &= 70.5 + 3 \\
              &= 73.5
\end{aligned}
$$
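
The steps can be sketched in Python as follows. This is a minimal illustration with two stated assumptions: the scores are whole numbers, so the exact lower limit is taken as the lower class limit minus 0.5, and the median class is taken to be the first class whose cumulative frequency reaches N/2. The function name `grouped_median` is hypothetical.

```python
# (lower limit, upper limit, frequency), lowest class first, copied from the table.
classes = [
    (50, 52, 1), (53, 55, 3), (56, 58, 2), (59, 61, 4),
    (62, 64, 4), (65, 67, 2), (68, 70, 1), (71, 73, 3),
    (74, 76, 5), (77, 79, 2), (80, 82, 2), (83, 85, 4),
    (86, 88, 2), (89, 91, 2), (92, 94, 1), (95, 97, 2),
]

def grouped_median(classes):
    n = sum(f for _, _, f in classes)
    half = n / 2                                 # N/2
    below = 0                                    # cumulative frequency below the current class
    for lower, upper, f in classes:
        if below + f >= half:                    # median class found
            exact_lower = lower - 0.5            # L (assumes whole-number scores)
            width = upper - lower + 1            # C, the size of the class interval
            return exact_lower + width * (half - below) / f
        below += f
    raise ValueError("empty distribution")

print(grouped_median(classes))
```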

The Computation of the Mode


The mode is another measure of central tendency. It may be defined as the value in
a set of scores that occurs most frequently, and it can be found by mere inspection.

The mode from ungrouped data. In ungrouped data, the mode can easily be
determined by inspection. A set of scores is classified as unimodal, bimodal, trimodal, or
polymodal.

Illustrative Example:

Given these ungrouped scores, find the mode.

95, 89, 91, 89, 80, 90, 93, 89, 92, 87


The score that occurs most often is 89. This is the mode (unimodal).

In the set of scores 90, 95, 89, 89, 80, 92, 87, 91, 89, 92, 93, 92, the scores that
appear three times are 89 and 92. These scores are the modes, having an equal number of
occurrences. Such a set is called bimodal.

In the group of scores 89, 92, 89, 90, 87, 92, 91, 92, 91, 89, 90, 90, there are
three scores having equal frequencies: 89, 90 and 92. Thus, they are the modal scores and
we call the set trimodal.
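
Finding the mode by inspection can be checked with `multimode` from Python's standard `statistics` module (available in Python 3.8 and later), which returns every value tied for the highest frequency. The three lists are the sets of scores from the examples; the variable names are illustrative.

```python
from statistics import multimode

unimodal = [95, 89, 91, 89, 80, 90, 93, 89, 92, 87]
bimodal = [90, 95, 89, 89, 80, 92, 87, 91, 89, 92, 93, 92]
trimodal = [89, 92, 89, 90, 87, 92, 91, 92, 91, 89, 90, 90]

# Collect the modes of each set, sorted for easier reading.
modes = {
    name: sorted(multimode(data))
    for name, data in [("unimodal", unimodal), ("bimodal", bimodal), ("trimodal", trimodal)]
}
print(modes)
```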

SAMPLING

Sampling Design

This refers to the scheme of arriving at the sample, which involves specification of
the target population, the respondent population and the method of selecting them.

Basic Concepts of Sampling

Sampling is the process of choosing a representative portion from a population.

Samples are the representatives taken from the target population.

Population refers to the entire group or set of individuals or items which is the
focus of an investigation. It is also called the universe. A population is further
distinguished by its role in the study; thus, there are the following types:
a. Topic Population – may be people, things, plants or animals. It is the group or
set about which generalization will be made.
b. Respondent Population – refers to a group or set of individuals who will
furnish the needed information on which the generalization is based.
c. Target Population – refers to the group or set of individuals or items from
which or about which representative information is originally desired.

There are basically two types of sampling, namely: (1) probability sampling and
(2) non-probability sampling.

Probability Sampling

1. Probability sampling is a type of sampling wherein the selection of samples is
done with the members of the population having an equal chance of being selected as part of the
representative group or sample. It is further classified into the following groups:

a. Simple Random Sampling. In this type of sampling every member of the
population has an equal chance of being chosen for inclusion in the sample. It
is the simplest probability sampling, usually done by the lottery or
raffle method of getting the samples. This method is done by listing all the
members of the population from the first to the last.
Write their individual names/numbers on small pieces of paper, place
these in a box, and draw them, after shaking the box very well, until the total
number of samples has been drawn.

b. Stratified Random Sampling. It is the selection of samples from the different
classes or strata of the population involved in the research. Each class is
treated as a different population. Simple random sampling is then used in each
class, with a proportionate and equal percentage of representation from each
stratum.

c. Systematic Sampling. This technique involves the selection of the desired
number of samples from a list arranged systematically or logically, either in
alphabetical order or in any other acceptable organization.

d. Cluster Sampling. This sampling technique involves the selection of the
samples in groups and is usually applied on a geographical basis in a
heterogeneous population. An example of this is selecting a sample of
teachers from the different regions/cities which are involved in the study.
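
Three of the probability sampling techniques above can be sketched in Python with the standard `random` module. Everything in this sketch is assumed for illustration: the population of 200 numbered members, the two strata, the seed and the function names are hypothetical. The systematic sample is drawn by taking every k-th member of the list after a random start, which is one common way of carrying out the technique and is not spelled out in the description above.

```python
import random

rng = random.Random(2024)      # fixed seed (an arbitrary choice) so the draw is repeatable

# Hypothetical population of 200 numbered members in two strata.
population = [{"id": i, "stratum": "A" if i <= 120 else "B"} for i in range(1, 201)]

def simple_random(pop, n):
    """Lottery method: every member has the same chance of being drawn."""
    return rng.sample(pop, n)

def systematic(pop, n):
    """Take every k-th member of the ordered list after a random start."""
    k = len(pop) // n
    start = rng.randrange(k)
    return pop[start::k][:n]

def stratified(pop, fraction):
    """Simple random sampling inside each stratum, same percentage from each."""
    chosen = []
    for name in sorted({m["stratum"] for m in pop}):
        members = [m for m in pop if m["stratum"] == name]
        chosen += rng.sample(members, round(len(members) * fraction))
    return chosen

srs = simple_random(population, 20)
sys_sample = systematic(population, 20)
strat = stratified(population, 0.10)
print(len(srs), len(sys_sample), len(strat))
```

With a 10 percent fraction, stratum A (120 members) contributes 12 respondents and stratum B (80 members) contributes 8.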

2. Non-Probability Sampling. This is a type of sampling wherein no system of
selection is employed, and the samples may not be a proportion of the population and may
depend upon the situation as presented in the portion of the sampling design. Samples are
taken by judgment and are not derived through a procedure that will guarantee equal
chances of representation; hence, this is also called non-random sampling. It is further
classified into the following types:

a. Purposive Sampling. This is otherwise called deliberate sampling. In this
design, the respondents are selected based on a judgment of who best
meet the objectives of the research. For example, a researcher is interested in
finding out the students’ perception of the performance of the school officials
in a college/university. Instead of conducting random sampling, the
researcher can purposely involve just the student leaders as respondents.

b. Quota Sampling. This method involves the taking of the desired number of
respondents with the required characteristics proportionate to the population
under study. An example is when a researcher would like to document the
experience of male and female scientists who have been involved in the
establishment of a marine station. He/she should look for these scientists until
the desired number of respondents is met.

c. Convenience or Accidental Sampling. This sampling technique involves
selecting respondents based on the
convenience of the researcher. If an investigator conducts a study
among some Metro Manila residents to find out the pros and cons of the
implementation of the Expanded Value Added Tax, he/she may use the interview
technique to gather the data. He/she can just stay in one place and ask anyone
whom he/she meets about the issue. This conduct of research uses convenience.

Formula:

$$n = \frac{N}{1 + N e^2}$$

where n = sample size

N = population size

e = desired margin of error

Illustrative Example:

In your study, the size of the population is 10,000. What is the sample size if you
allow a 5% margin of error? Using the above formula, the sample size can be computed as
follows:

$$
\begin{aligned}
n &= \frac{N}{1 + N e^2} \\
  &= \frac{10{,}000}{1 + (10{,}000)(0.05)^2} \\
  &= \frac{10{,}000}{1 + (10{,}000)(0.0025)} \\
  &= \frac{10{,}000}{1 + 25} \\
  &= \frac{10{,}000}{26} \\
  &= 384.615 \text{ or } 385
\end{aligned}
$$

Note: The sample size must be expressed as a whole number.
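
The sample-size formula above (often cited as Slovin's formula) can be sketched in Python. The function name `sample_size` is hypothetical, and rounding up to the next whole number is an assumption made here so that the margin of error is not exceeded; the worked example only shows 384.615 becoming 385.

```python
import math

def sample_size(population_size, margin_of_error):
    """n = N / (1 + N * e**2); returns the raw value and the whole-number sample size."""
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return n, math.ceil(n)          # rounding up is an assumption of this sketch

raw, rounded = sample_size(10_000, 0.05)
print(round(raw, 3), rounded)
```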

Spearman’s Coefficient of Rank Correlation (ρ)

The measure of disarray, ∑d², is used in the definition of Spearman’s coefficient
of rank correlation. A coefficient of rank correlation is a statistic defined in such a way as
to take a value of +1 when the paired ranks are in the same order, a value of –1 when the
ranks are in an inverse order, and an expected value of 0 when the ranks are arranged at
random with respect to each other.

Steps:

1. Rank the scores of X and write the corresponding ranks in column Rx
2. Rank the scores of Y and write the corresponding ranks in column Ry
3. Get the difference between Rank X and Rank Y and write the answers in
column d
4. Square the deviations (the differences in ranks X and Y) and write the
answers in column d²
5. Find the sum of the squared deviations (∑d²)
6. Apply the formula:

$$\rho = 1 - \frac{6 \sum d^2}{N(N^2 - 1)}$$

 X      Y      Rx      Ry      d = (Rx – Ry)      d² = (Rx – Ry)²

24     18      1       1            0                  0
21     16      2       2.5         -0.5                0.25
18     10      3       6           -3                  9
15     16      5       2.5          2.5                6.25
15     12      5       4            1                  1
15     10      5       6           -1                  1
12     10      7.5     6            1.5                2.25
12      8      7.5     8           -0.5                0.25
 9      6      9.5     9            0.5                0.25
 9      4      9.5    10           -0.5                0.25
                                              ----------------
                                               ∑d² = 20.5

Solution:

$$
\begin{aligned}
\rho &= 1 - \frac{6 \sum d^2}{N(N^2 - 1)} \\
     &= 1 - \frac{6(20.5)}{10(10^2 - 1)} \\
     &= 1 - \frac{123}{10(100 - 1)} \\
     &= 1 - \frac{123}{10(99)} \\
     &= 1 - \frac{123}{990} \\
     &= 1 - 0.12 \\
     &= 0.88
\end{aligned}
$$

DF = N – 2 = 10 – 2 = 8
Critical value of ρ at .05 = .643

The computed ρ of 0.88 is greater than the critical value of .643; thus it is significant. The
null hypothesis is rejected.
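
The ranking and the formula can be sketched in Python. The helper names `average_ranks` and `spearman_rho` are hypothetical; tied scores are given the average of the ranks they occupy, as in the table (for example, the three scores of 15 share rank 5).

```python
def average_ranks(values):
    """Rank from highest (rank 1) to lowest; tied values share the mean of their ranks."""
    ordered = sorted(values, reverse=True)
    first = {}                      # first position of each value in the ordered list
    for position, v in enumerate(ordered, start=1):
        first.setdefault(v, position)
    # A value occupying positions p .. p+c-1 gets the average rank p + (c - 1) / 2.
    return [first[v] + (ordered.count(v) - 1) / 2 for v in values]

def spearman_rho(x, y):
    rx, ry = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))     # sum of squared rank differences
    n = len(x)
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1)), sum_d2

x = [24, 21, 18, 15, 15, 15, 12, 12, 9, 9]
y = [18, 16, 10, 16, 12, 10, 10, 8, 6, 4]
rho, sum_d2 = spearman_rho(x, y)
print(sum_d2, round(rho, 2))
```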

Pearson Product-Moment Correlation

When there are two sets of scores and the researcher would like to find out whether the
two sets are correlated, the Pearson Product-Moment correlation is used. The
correlation is also called co-variation because the analysis concentrates mainly on
how the two sets of scores vary together; it determines the relationship between two
variables with interval-type data. An example is finding out how scores in an achievement test
correlate with scores in a mental ability test.

Correlation may be either positive or negative. It is positive when the items,
cases or subjects that got low in one variable also got low in the other
variable and those that got high in one variable also got high in
the other variable. When the reverse holds, the correlation is negative; that is, those
that got high in one factor are the ones that got low in the other factor, and
those that got low in one factor got high in the other factor.

Steps:

1. Square the scores of X and write the answers in column X²
2. Square the scores of Y and write the answers in column Y²
3. Multiply the scores of X and Y and write the products in column XY
4. Find the sum of each column
5. Apply the formula:

$$r = \frac{N \sum XY - \sum X \sum Y}{\sqrt{\left[N \sum X^2 - (\sum X)^2\right]\left[N \sum Y^2 - (\sum Y)^2\right]}}$$


  X         Y         X²         Y²         XY

 24        18        576        324        432
 21        16        441        256        336
 18        10        324        100        180
 15        16        225        256        240
 15        12        225        144        180
 15        10        225        100        150
 12        10        144        100        120
 12         8        144         64         96
  9         6         81         36         54
  9         4         81         16         36
------------------------------------------------------------------------------------------
∑X = 150   ∑Y = 110   ∑X² = 2466   ∑Y² = 1396   ∑XY = 1824

Apply the formula:

$$
\begin{aligned}
r &= \frac{N \sum XY - \sum X \sum Y}{\sqrt{\left[N \sum X^2 - (\sum X)^2\right]\left[N \sum Y^2 - (\sum Y)^2\right]}} \\
  &= \frac{10(1824) - (150)(110)}{\sqrt{\left[10(2466) - (150)^2\right]\left[10(1396) - (110)^2\right]}} \\
  &= \frac{18240 - 16500}{\sqrt{[24660 - 22500][13960 - 12100]}} \\
  &= \frac{1740}{\sqrt{[2160][1860]}} \\
  &= \frac{1740}{\sqrt{4017600}} \\
  &= \frac{1740}{2004.40} \\
  &= 0.868
\end{aligned}
$$

DF = N – 2 = 10 – 2 = 8
Tabular value of r at .05 = .549

The computed r of 0.868 is greater than the required tabular value of .549 at the .05 level
of significance; thus it is significant. The null hypothesis is rejected in this aspect.
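
The raw-score formula can be sketched in Python. The function name `pearson_r` is hypothetical; the column sums are built exactly as in the steps above, and the denominator takes the square root of the product of the two bracketed terms, as in the worked solution.

```python
import math

def pearson_r(x, y):
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(a * a for a in x)               # column X squared
    sum_y2 = sum(b * b for b in y)               # column Y squared
    sum_xy = sum(a * b for a, b in zip(x, y))    # column XY
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

x = [24, 21, 18, 15, 15, 15, 12, 12, 9, 9]
y = [18, 16, 10, 16, 12, 10, 10, 8, 6, 4]
r = pearson_r(x, y)
print(round(r, 3))
```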

1. T-Test for Independent (Uncorrelated) Means

The t-test for Independent Sample Means is used to determine whether the observed
difference between the means of two groups is statistically significant. It is,
therefore, a test of the observed difference between two sample means that are not
correlated with each other. It is used to compare the
averages of control and experimental groups and to determine whether there
is a difference between the averages of two intact groups.

2. T-Test for Dependent (Correlated) Means

The t-test for Dependent Sample Means is a more precise test, with its use
limited to scores that are correlated, such as those involving a pre-test and a post-test.
The critical t-value is obtained from the table of critical t-values using the appropriate
degrees of freedom. If the computed t is greater than the tabular t, the
hypothesis of no difference between the pre-test and post-test is rejected.

3. The Biserial Correlation

Biserial correlation is used to correlate a continuous
variable with a dichotomous variable. A dichotomous variable is one that
can have only two sides, such as male-female, present-absent, right-wrong.

7. F-Test

The F-Test is a one-way analysis of variance, or one-way ANOVA. It is used
when the study compares the means of two or more groups. F is a ratio of
two variances or mean squares and is expected to be equal to one if the two
population variances are equal. Critical F values for varying degrees of freedom are
found in the F-Table, usually at the .05 and .01 levels of
significance. In a two-group comparison the obtained F is equal to the square of t.

A. NON-PARAMETRIC STATISTICS
