AE114 (Statistical Analysis and Software Application)
Instructor: Zandro W. Payocong
Introduction:
Topic 1:
Overview of statistics: Basic concepts and terms
Modules:
1. Basic concepts and terms Part 1
2. Basic concepts and terms Part 2
Module 1:
Basic concepts and terms Part 1
Learning Outcomes:
At the end of the module, the students are expected to:
● Define and explain the concepts of statistics.
● Differentiate descriptive and inferential statistics.
● Differentiate population and sample.
● Define and explain the concepts of sampling and randomization.
● Identify how statistics is used in daily life.
Teaching – Learning Activity (Lesson Proper):
Statistics
In the broadest sense, “statistics” refers to a range of techniques and procedures for
analyzing, interpreting, displaying, and making decisions based on data.
[Link]
definition-of-statistics/2315
Descriptive and inferential statistics
Descriptive statistics are numbers that are used to summarize and describe data. The
word “data” refers to the information that has been collected from an experiment, a
survey, an historical record, etc. (By the way, “data” is plural. One piece of information
is called a “datum.”) If we are analyzing birth certificates, for example, a descriptive
statistic might be the percentage of certificates issued in New York State, or the
average age of the mother. Any other number we choose to compute also counts as a
descriptive statistic for the data from which the statistic is computed. Several descriptive
statistics are often used at one time to give a full picture of the data.
Descriptive statistics are just descriptive. They do not involve generalizing beyond the
data at hand. Generalizing from our data to another set of cases is the business of
inferential statistics, which you'll be studying in another section.
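The two descriptive statistics mentioned above (a percentage and an average) can be sketched in a few lines of Python. The records below are made-up values used only for illustration; they are not real birth-certificate data.

```python
from statistics import mean

# Hypothetical birth-certificate records: (state of issue, mother's age).
records = [
    ("New York", 28),
    ("New York", 34),
    ("Ohio", 25),
    ("Texas", 31),
    ("New York", 22),
]

# Descriptive statistic 1: percentage of certificates issued in New York State.
ny_count = sum(1 for state, _ in records if state == "New York")
pct_new_york = 100 * ny_count / len(records)

# Descriptive statistic 2: average age of the mother.
avg_mother_age = mean(age for _, age in records)

print(f"Issued in New York State: {pct_new_york:.1f}%")
print(f"Average age of mother: {avg_mother_age:.1f}")
```

Both numbers only summarize these five records. They say nothing, by themselves, about certificates that are not in the data.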
Source: [Link]
With inferential statistics, you are trying to reach conclusions that extend beyond the
immediate data alone. For instance, we use inferential statistics to try to infer from the
sample data what the population might think. Or, we use inferential statistics to make
judgments of the probability that an observed difference between groups is a
dependable one or one that might have happened by chance in this study. Thus, we
use inferential statistics to make inferences from our data to more general conditions;
we use descriptive statistics simply to describe what’s going on in our data.
One of the simplest inferential tests is used when you want to compare the average
performance of two groups on a single measure to see if there is a difference. You
might want to know whether eighth-grade boys and girls differ in math test scores or
whether a program group differs on the outcome measure from a control group.
Whenever you wish to compare the average performance between two groups you
should consider the t-test for differences between groups.
[Link]
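As a rough sketch of the comparison described above, the code below computes a two-sample t statistic by hand. It assumes Welch's form of the statistic (which does not require equal variances in the two groups); the text only says "t-test", so that choice is an assumption. The scores and the function name welch_t are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Two-sample t statistic (Welch's form): difference in means over its standard error."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se

# Hypothetical math test scores for two small groups of eighth graders.
girls = [78, 85, 90, 72, 88]
boys = [70, 75, 80, 68, 77]

t_stat = welch_t(girls, boys)
print(f"t statistic: {t_stat:.2f}")
```

A t statistic far from zero suggests that the difference between the group averages is larger than chance alone would usually produce. Turning the statistic into a probability requires the t distribution, which statistical software normally handles.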
Population and sample
In statistics, we often rely on a sample (that is, a small subset of a larger set of data)
to draw inferences about the larger set. The larger set is known as the population from
which the sample is drawn.
Example #1: You have been hired by the Commission on Elections to examine how the
Filipino people feel about the fairness of the voting procedures in the Philippines. Who
will you ask?
It is not practical to ask every single Filipino how he or she feels about the fairness of the
voting procedures. Instead, we query a relatively small number of Filipinos, and draw
inferences about the entire country from their responses. The Filipinos actually queried
constitute our sample of the larger population of all Filipinos. The mathematical
procedures whereby we convert information about the sample into intelligent guesses
about the population fall under the rubric of inferential statistics.
A sample is typically a small subset of the population. In the case of voting attitudes, we
would sample a few thousand Filipinos drawn from the more than 100 million who make
up the country. In choosing a sample, it is therefore crucial that it not over-represent
one kind of citizen at the expense of others. For example, something would be wrong
with our sample if it happened to be made up entirely of Baguio residents. If the sample
held only those from Baguio, it could not be used to infer the attitudes of other Filipinos.
Inferential statistics are based on the assumption that sampling is random. We trust a
random sample to represent different segments of society in close to the appropriate
proportions.
Sampling and randomization
Simple Random Sampling
Researchers adopt a variety of sampling strategies. The most straightforward is simple
random sampling. Such sampling requires every member of the population to have an
equal chance of being selected into the sample. In addition, the selection of one
member must be independent of the selection of every other member. That is, picking
one member from the population must not increase or decrease the probability of
picking any other member (relative to the others). In this sense, we can say that simple
random sampling chooses a sample by pure chance.
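Simple random sampling can be sketched with Python's standard random module. The population of voter ID numbers and the sample size below are hypothetical, and the fixed seed is there only so that the sketch gives the same result each time it is run.

```python
import random

# Hypothetical population: ID numbers for 10,000 registered voters.
population = list(range(1, 10_001))

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# rng.sample draws without replacement; every member of the population
# has the same chance of ending up in the sample.
sample = rng.sample(population, k=100)

print(sample[:10])
```

Because the draw is left entirely to the random number generator, no member of the population is favored over any other.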
Sample size matters
Recall that the definition of a random sample is a sample in which every member of the
population has an equal chance of being selected. This means that the sampling
procedure, rather than the results of the procedure, defines what it means for a sample to
be random. Random samples, especially if the sample size is small, are not necessarily
representative of the entire population.
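The effect of sample size can be seen in a small simulation. The sketch below assumes a hypothetical population in which exactly half of the members hold a given opinion, draws many random samples of two different sizes, and measures how much the sample proportion varies from one sample to the next.

```python
import random
from statistics import mean, pstdev

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Hypothetical population of 10,000 people; exactly half hold the opinion coded as 1.
population = [1] * 5_000 + [0] * 5_000

def sample_proportions(n, repetitions=200):
    """Draw `repetitions` simple random samples of size n and return each sample's proportion of 1s."""
    return [mean(rng.sample(population, n)) for _ in range(repetitions)]

# The spread (standard deviation) of the sample proportions for each sample size.
spread_small = pstdev(sample_proportions(10))
spread_large = pstdev(sample_proportions(1_000))

print(f"Spread of sample proportion, n=10:   {spread_small:.3f}")
print(f"Spread of sample proportion, n=1000: {spread_large:.3f}")
```

Every sample here is random, yet the proportions from samples of 10 scatter much more widely around the true value of 0.5 than those from samples of 1,000.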
Random assignment
In experimental research, populations are often hypothetical. For example, in an
experiment comparing the effectiveness of a new anti-depressant drug with a
placebo, there is no actual population of individuals taking the drug. In this case, a
specified population of people with some degree of depression is defined and a
random sample is taken from this population. The sample is then randomly divided into
two groups; one group is assigned to the treatment condition (drug) and the other
group is assigned to the control condition (placebo). This random division of the sample
into two groups is called random assignment. Random assignment is critical for the
validity of an experiment. For example, consider the bias that could be introduced if
the first 20 subjects to show up at the experiment were assigned to the experimental
group and the second 20 subjects were assigned to the control group. It is possible that
subjects who show up late tend to be more depressed than those who show up early,
thus making the experimental group less depressed than the control group even before
the treatment was administered.
In experimental research of this kind, failure to assign subjects randomly to groups is
generally more serious than having a non-random sample. Failure to randomize (the
former error) invalidates the experimental findings. A non-random sample (the latter
error) simply restricts the generalizability of the results.
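Random assignment, as described above, can be sketched by shuffling the list of subjects before splitting it in two. The 40 subject IDs are hypothetical, and the fixed seed is used only to make the sketch repeatable.

```python
import random

# Hypothetical subject IDs, listed in order of arrival at the experiment.
subjects = [f"S{i:02d}" for i in range(1, 41)]

rng = random.Random(7)  # fixed seed only so the sketch is repeatable

# Shuffle a copy so that arrival order has no influence on group membership.
shuffled = subjects[:]
rng.shuffle(shuffled)

treatment_group = shuffled[:20]  # receives the drug
control_group = shuffled[20:]    # receives the placebo

print("Treatment:", sorted(treatment_group))
print("Control:  ", sorted(control_group))
```

Splitting the unshuffled list instead would reproduce the "first 20 to arrive" bias discussed above.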
Stratified Sampling
Since simple random sampling often does not ensure a representative sample, a
sampling method called stratified random sampling is sometimes used to make the
sample more representative of the population. This method can be used if the
population has a number of distinct “strata” or groups. In stratified sampling, you first
identify the members of the population who belong to each group. Then you randomly
sample from each of those subgroups in such a way that the sizes of the subgroups in
the sample are proportional to their sizes in the population.
Let's take an example: Suppose you were interested in views of capital punishment at
an urban university. You have the time and resources to interview 200 students. The
student body is diverse with respect to age; many older people work during the day
and enroll in night courses (average age of 39), while younger students generally enroll
in day classes (average age of 19). It is possible that night students have different views
about capital punishment than day students. If 70% of the students are day students, it
makes sense to ensure that 70% of the sample consists of day students. Thus, your
sample of 200 students would consist of 140 day students and 60 night students. The
proportion of day students in the sample and in the population (the entire university)
would be the same. Inferences to the entire population of students at the university
would therefore be more secure.
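The university example can be sketched as follows. The student body size of 10,000 and the student ID labels are assumptions made for illustration; only the 70% day-student share and the sample size of 200 come from the example.

```python
import random

rng = random.Random(1)  # fixed seed only so the sketch is repeatable

# Hypothetical student body of 10,000: 70% day students, 30% night students.
strata = {
    "day": [f"D{i}" for i in range(7_000)],
    "night": [f"N{i}" for i in range(3_000)],
}
total_students = sum(len(members) for members in strata.values())
sample_size = 200

stratified_sample = {}
for name, members in strata.items():
    # Proportional allocation: each stratum's share of the sample
    # matches its share of the population.
    k = round(sample_size * len(members) / total_students)
    stratified_sample[name] = rng.sample(members, k)

print({name: len(chosen) for name, chosen in stratified_sample.items()})
# prints {'day': 140, 'night': 60}
```

With other proportions, rounding can make the stratum counts add up to slightly more or less than the intended sample size, so a real study would need a rule for handling the leftover.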
Applications of Statistics
Like most people, you probably feel that it is important to “take control of your life.” But
what does this mean? Partly, it means being able to properly evaluate the data and
claims that bombard you every day. If you cannot distinguish good from faulty
reasoning, then you are vulnerable to manipulation and to decisions that are not in
your best interest. Statistics provides tools that you need in order to react intelligently to
information you hear or read. In this sense, statistics is one of the most important things
that you can study.
To be more specific, here are some claims that we have heard on several occasions.
(We are not saying that each one of these claims is true!)
• 4 out of 5 dentists recommend Dentine.
• Almost 85% of lung cancers in men and 45% in women are tobacco-related.
• Condoms are effective 94% of the time.
• Native Americans are significantly more likely to be hit crossing the street than are
people of other ethnicities.
• People tend to be more persuasive when they look others directly in the eye and
speak loudly and quickly.
• Women make 75 cents to every dollar a man makes when they work the same job.
• A surprising new study shows that eating egg whites can increase one's life span.
• People predict that it is very unlikely there will ever be another baseball player with a
batting average over .400.
• There is an 80% chance that, in a room full of 30 people, at least two people will
share the same birthday.
• 79.48% of all statistics are made up on the spot.
All of these claims are statistical in character. We suspect that some of them sound
familiar; if not, we bet that you have heard other claims like them. Notice how diverse
the examples are. They come from psychology, health, law, sports, business, etc.
Indeed, data and data interpretation show up in discourse from virtually every facet of
contemporary life.
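Some statistical claims can be checked with a short calculation. As an illustration, the sketch below works out the birthday claim from the list above. It assumes that birthdays are independent and equally likely on each of 365 days (leap years and seasonal patterns are ignored); the function name is hypothetical.

```python
def prob_shared_birthday(n_people, days=365):
    """Probability that at least two of n_people share a birthday,
    assuming birthdays are independent and equally likely on each of `days` days."""
    prob_all_distinct = 1.0
    for k in range(n_people):
        # The (k+1)-th person must avoid the k birthdays already taken.
        prob_all_distinct *= (days - k) / days
    return 1.0 - prob_all_distinct

print(f"{prob_shared_birthday(30):.3f}")
```

Under these assumptions the probability for 30 people comes out at about 71%, somewhat lower than the 80% quoted in the claim. This is a small reminder that quoted figures deserve a second look.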
Statistics are often presented in an effort to add credibility to an argument or advice.
You can see this by paying attention to television advertisements. Many of the numbers
thrown about in this way do not represent careful statistical analysis. They can be
misleading and push you into decisions that you might find cause to regret. For these
reasons, learning about statistics is a long step towards taking control of your life. (It is
not, of course, the only step needed for this purpose.)
Example business application: Using statistics in financial analysis (watch the video)
[Link]
Enhancement Activity:
Application of statistics in college business research and feasibility studies: List and
explain the applications of statistics in college business research and feasibility studies.
Paper format:
Century Gothic, 11 pt
Double-spaced
Full block style
Narrow margin
8.5” x 11”
Maximum of 2 pages
Portable document format
Other References:
[Link]
Introduction to Statistics (OSE)