
Introduction to Statistics and Data Analysis

This document provides an introduction to statistics. It discusses how statistics is used widely in various fields like politics, medicine, business and law. Statistics involves collecting and analyzing quantitative data. The document then discusses the objectives of learning statistics and provides definitions of key statistical concepts like descriptive statistics, inferential statistics, data collection, presentation and analysis. It distinguishes between descriptive and inferential statistics, and discusses why data is needed for research studies.

Uploaded by

Alexxa Diaz
Copyright © All Rights Reserved

Chapter 1
INTRODUCTION TO STATISTICS
Today, statistics and its application are an integral part of our life. In
such diverse settings as politics, medicine, education, business, and the legal
arena, human activities are both measured and guided by statistics.
We begin the module with some basic analysis. Since statistics
involves the collection and interpretation of data, we must first know how to
understand, display, and summarize large amounts of quantitative
information, before undertaking a more sophisticated analysis. Statistical
analysis of quantitative data is important throughout the pure and social
sciences.

General Objectives:
At the end of the chapter, you should be able to:
1. appreciate the use and the beauty of statistics in research,
management, and daily life;
2. define statistics, descriptive statistics, inferential statistics, and
other basic statistical terms;
3. identify the need for data in conducting research; and
4. determine the importance of measurement level in identifying
appropriate methods for data collection and analysis.
LESSON I. WHAT IS STATISTICS
As we embark on our journey into the study of statistics, we must begin
with the definition of statistics and expand on the details involved.
Statistics has become the universal language of the sciences. As
potential users of statistics, we need to master both the “science” and the
“art” of using statistical methodology correctly. Careful use of statistical
methods will enable us to obtain accurate information from data. These
methods include (1) carefully defining the situation, (2) gathering data,
(3) accurately summarizing the data, and (4) deriving and communicating
meaningful conclusions.

At the end of the lesson, you should be able to:


1. define statistics;
2. distinguish between descriptive and inferential statistics;
3. appreciate the need for data in conducting research; and
4. state the reasons for obtaining data.

Statistics is a branch of applied mathematics which deals with the
collection, organization, presentation, analysis, and interpretation of data.
Statisticians develop and apply appropriate methods in collecting and
analyzing data. They guide the design of a research study and then analyze
the results. The interpretation of the results is the statisticians' basis for
making inferences about the population.
a. Data Gathering or Collection. May be done through interviews,
questionnaires, tests, observation, registration, and experiments.
b. Presentation of Data. Refers to the organization of data into tables,
graphs, charts, or paragraphs. It may be tabular, graphical, or textual.
c. Analysis of Data. Pertains to the process of extracting relevant and
noteworthy information from the given data using statistical tools
or techniques.
d. Interpretation of Data. Refers to the drawing of conclusions or
inferences from the analyzed data.
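
The four steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a prescribed procedure: the quiz scores are made-up values, and every variable name is hypothetical.

```python
from collections import Counter
from statistics import mean

# a. Collection: hypothetical quiz scores (out of 10) gathered from ten students.
scores = [7, 8, 8, 9, 6, 8, 10, 7, 9, 8]

# b. Presentation: organize the raw scores into a frequency table.
frequency = Counter(scores)
for score in sorted(frequency):
    print(f"score {score}: {frequency[score]} student(s)")

# c. Analysis: extract summary information with simple statistical tools.
average = mean(scores)
most_common_score, count = frequency.most_common(1)[0]

# d. Interpretation: state a conclusion drawn from the analyzed data.
print(f"The typical student scored about {average} out of 10; "
      f"the most frequent score was {most_common_score}.")
```

In practice each step is far more involved, but the order stays the same: data are gathered, organized, analyzed, and only then interpreted.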

TYPES OF STATISTICS
As we have seen, statistics can refer to a set of individual numbers or
numerical facts, or to general or specific statistical techniques. A further
breakdown of the subject is possible, depending on whether the emphasis is
on (1) simply describing the characteristics of a set of data or (2) proceeding
from data characteristics to making generalizations, estimates, forecasts, or
judgments based on the data. The former is referred to as descriptive
statistics, while the latter is called inferential statistics.

FIGURE 1.1 Types of Statistics
(Diagram: Statistics branches into Descriptive Statistics and Inferential Statistics.)


Descriptive Statistics. It relates to the gathering, classification, and
presentation of data and the computation of summarizing values that describe
group characteristics of data. The most commonly used summarizing values
are percentages, measures of central tendency and location, measures of
variability, skewness, and kurtosis. For example, upon looking around your
class, you may find that 35% of your fellow students are wearing Casio
watches. If so, the figure “35%” is a descriptive statistic.
Chapters 3 and 4 will present several popular visual and statistical approaches
to expressing the data we or others have collected. For now, however, just
remember that descriptive statistics are used only to summarize or describe
data.
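
As a rough sketch of how such summarizing values are computed, the Python snippet below uses the standard `statistics` module. The class of 20 students and the five heights are invented for illustration; only the 35% figure echoes the example above.

```python
from statistics import mean, median, pstdev

# Hypothetical class of 20 students: True means the student wears a Casio watch.
wears_casio = [True] * 7 + [False] * 13

# Percentage: a descriptive statistic for a category.
percent_casio = 100 * sum(wears_casio) / len(wears_casio)

# Hypothetical heights in centimeters for five students.
heights_cm = [150, 155, 157, 160, 168]

center = mean(heights_cm)      # measure of central tendency
middle = median(heights_cm)    # measure of location
spread = pstdev(heights_cm)    # measure of variability (population standard deviation)

print(f"{percent_casio:.0f}% of the class wear Casio watches")
print(f"mean = {center}, median = {middle}, standard deviation = {spread:.2f}")
```

Each value describes only the students actually observed; nothing here is claimed about any larger group.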

FIGURE 1.2 Data Analysis and Descriptive Statistics
(Flowchart with three stages. Data Collection and Preparation: Collect Data,
Prepare Codebook, Set up Structure of Data, Enter Data, Screen Data for Errors.
Exploration of Data: Descriptive Statistics, Graphs. Analysis: Explore
Relationship between Variables, Compare Groups.)


Inferential Statistics. Pertains to the methods for making inferences,
estimates, or predictions about a large set of data using the information
gathered from a sample. Commonly used inferential statistical tools or
techniques are hypothesis testing using the z-test and t-test, simple linear
correlation, analysis of variance (ANOVA), chi-square tests, regression, and
time series analysis. For example, observing a sample of nurses and other
healthcare workers who were likely infected with COVID-19, researchers found
that only half routinely wore PPE when dealing with patients. Chapters 5 and 6
will present several popular statistical approaches for making predictions
from the data collected. For now, however, just remember that inferential
statistics draws conclusions about a population based on data observed in a
sample.
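
To make the step from sample to population concrete, here is a minimal Python sketch of one common inferential technique, a normal-approximation confidence interval for a proportion. The sample counts (100 of 200 workers) are assumed for illustration and are not taken from any study; the function name `proportion_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n                              # sample proportion
    z = NormalDist().inv_cdf(0.5 + confidence / 2)     # critical z value
    margin = z * sqrt(p_hat * (1 - p_hat) / n)         # margin of error
    return p_hat - margin, p_hat + margin

# Hypothetical sample: 100 of 200 observed workers routinely wore PPE.
low, high = proportion_interval(100, 200)
print(f"Estimated population proportion: between {low:.3f} and {high:.3f}")
```

The sample proportion (one half) is a descriptive statistic; the interval around it is the inferential statement about the whole population, and it is only as trustworthy as the sampling behind it.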

FIGURE 1.3 Data Analysis and Inferential Statistics
(Flowchart with three stages. Data Collection and Preparation: Collect Data,
Prepare Codebook, Set up Structure of Data, Enter Data, Screen Data for Errors.
Exploration of Data: Descriptive Statistics, Graphs. Analysis: Explore
Relationship between Variables, Compare Groups.)

WHY DATA ARE NEEDED


Whichever industry you work in, or whatever your interests, you will
almost certainly have come across a story about how “data” is changing the
face of our world. It might be part of a study helping to cure a disease, boost a
company’s revenue, make a building more efficient or be responsible for
those targeted ads you keep seeing. Data are among the most important and
vital aspects of any research study. Research conducted in different fields
of study can differ in methodology, but all research is based on data
that are analyzed and interpreted to obtain information. Data are the basic
unit in statistical studies. Statistical information such as censuses,
population variables, health statistics, and road accident records is all
developed from data.
Data contain the information needed to make a more informed decision in
a situation. There are many instances in which data are needed:
• A market researcher needs to assess product characteristics to
distinguish one product from another.
• An operations manager wants to monitor an assembly process on a
regular basis to find out whether it follows generally accepted
accounting principles.
• A potential investor wants to determine what firms within what
industries are likely to have accelerated growth in a period of economic
recovery.
• A student wants to get data on classmates’ favorite rock groups to
satisfy a curiosity.

TABLE 1.1 Six Main Reasons for Data Collection

Reason for Obtaining Data

1. Data are needed to provide the necessary input to a survey.


2. Data are needed to provide the necessary input to the study.
3. Data are needed to measure performance of an ongoing service or
production process.
4. Data are needed to evaluate conformance to standards.
5. Data are needed to assist in formulating alternative courses of action
in a decision-making process.
6. Data are needed to satisfy our curiosity.

Key Data Collection Sources


1. Data may already be published by governmental, industrial, or
individual sources. The Philippine Statistics Authority is responsible
for collecting and compiling data on the economic, social, demographic,
political, and general affairs of the people of the Philippines.
2. An experiment may be designed to obtain the necessary data.
Strict control is exercised over the treatments. For example, in a study
testing the effectiveness of laundry detergent, the researcher
determines which brands in the study are most effective in cleaning
soiled clothes by actually washing dirty laundry instead of asking
customers which brand they believe to be most effective.
3. A survey may be conducted. In this data collection source, no
control is exercised over the behavior of the people being surveyed.
They are merely asked questions about their beliefs, attitudes,
behaviors, and other characteristics. Responses are then edited,
coded, and tabulated for analysis.
4. An observational study may be conducted. A researcher observes
the behavior directly, usually in its natural setting. Most knowledge of
animal behavior is developed in this way, as is our scientific knowledge
in other fields, such as astronomy and geology, in which experimentation
and surveys are impractical if not impossible.
Two Types of Data Collection Sources
1. Primary Sources. The data are measured and gathered by the researcher
who publishes them. Primary sources are the data collectors.
2. Secondary Sources. The data are republished by another researcher or
agency. Secondary sources are the data compilers.

EXERCISES
CONCEPTS AND PROCEDURES

1.1. Briefly describe the meaning of statistics.
1.2. Briefly explain the types of statistics.
1.3. Why are data needed?

APPLICATIONS

1.4. Given the following situation, give a statement for descriptive and
inferential statistics.

a. Of 350 randomly selected people in the Province of Sultan
Kudarat, 100 people had the last name Dela Cruz.
Descriptive: ____________________________________
Inferential: _____________________________________

b. On the last 3 Sundays, Jose Dela Cruz sold 2, 1, and 0 new
cars, respectively.
Descriptive: ____________________________________
Inferential: _____________________________________

LESSON II. TYPES OF VARIABLES AND SCALES OF MEASUREMENT


Statisticians develop surveys to deal with a variety of random variables.
The data, which are the observed outcomes of those variables, will virtually
always differ from item to item (or person to person), since no two things are
exactly alike.
The scale of measurement of your variables is important because each
level of measurement provides a different level of detail. Nominal provides
the least amount of detail, ordinal provides the next highest amount of
detail, and interval and ratio provide the greatest amount of detail. Data
can also be classified by the level of measurement attained.
In this lesson, you will learn about the types of data (qualitative and
quantitative variables) and the four scales of measurement (nominal, ordinal,
interval, and ratio scales) and why they are important.
At the end of the lesson, you should be able to:
1. differentiate qualitative and quantitative variables;
2. determine what types of variables are present in a survey questionnaire;
3. identify the different levels of measurement;
4. state the importance of the measurement level; and
5. determine the measurement level of each item in a survey questionnaire.

TYPES OF VARIABLES
As illustrated in Figure 1.4, there are two types of variables that yield
the observed outcomes or data: qualitative and quantitative.
FIGURE 1.4 Types of Variables
Qualitative Variables. Some of the variables associated with people
or objects are qualitative in nature, indicating that the person or object
belongs to a category. Qualitative variables, also referred to as attributes,
typically involve counting how many people or objects fall into each category.
In expressing results involving qualitative variables, we describe the
percentage or the number of persons or objects falling into each of the
possible categories. For example, we may find that 35% of grade school children
interviewed recognize a photograph of McDonald, while 65% do not. Likewise,
some of the children may have eaten a Big Mac hamburger at one time or another
while others have not.
Quantitative Variables. Yield numerical responses representing an
amount or quantity. Examples are weight, height, and number of children.
There are two types of quantitative variables: discrete and continuous.
a. Discrete Quantitative Variables. Produce numerical responses
that arise from a counting process. For example, “number of
children” is a discrete numerical variable because the response is
one of a finite number of integers (0, 1, 2, 3, …).
b. Continuous Quantitative Variables. Produce numerical responses
that arise from a measuring process.
Example:
Height (5’4”, 157 cm, 1.5 m)
Weight (130.42 kilos, 210 lbs, 432 grams)
Temperature (32.5°C, 112°F)
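
The distinction can be illustrated with a small Python sketch. The function `variable_type` is a hypothetical helper, and it relies on an assumption that does not always hold: that categories are stored as text, counts as whole numbers, and measurements as decimals. Numeric codes such as ZIP codes, for instance, are stored as whole numbers but are qualitative.

```python
def variable_type(values):
    """Rough guess at a variable's type from how its values are recorded."""
    def is_int(v):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(v, int) and not isinstance(v, bool)

    if all(isinstance(v, str) for v in values):
        return "qualitative"
    if all(is_int(v) for v in values):
        return "quantitative, discrete"      # arises from counting
    if all(is_int(v) or isinstance(v, float) for v in values):
        return "quantitative, continuous"    # arises from measuring
    return "mixed or unknown"

# Hypothetical survey responses.
print(variable_type(["Eraserheads", "Parokya ni Edgar"]))   # favorite rock group
print(variable_type([0, 2, 3, 1]))                          # number of children
print(variable_type([157.0, 162.5, 150.2]))                 # height in cm
```

The real test is the one given in the definitions: ask whether the value comes from naming a category, from counting, or from measuring.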

SCALES OF MEASUREMENT
Assigning a numerical value to a variable is a process called
measurement. For example, we might look at the thermometer and observe a
reading of 72.5 degrees Fahrenheit or examine a box of lightbulbs and find
that 3 are broken. The numbers 72.5 and 3 would constitute measurements.
When a variable is measured, the result will be in one of the four levels, or
scales, of measurement: nominal, ordinal, interval, and ratio. These are
summarized in Figure 1.5. The scale to which the measurements belong will be
important in determining appropriate methods for data description and analysis.
Nominal: Each number represents a category
Ordinal: Nominal properties, and greater than and less than relationships
Interval: Ordinal properties, and units of measurement
Ratio: Interval properties, and absolute zero point

FIGURE 1.5 Scales of Measurement

Nominal Level. Classifies data into various distinct categories in which
no ordering is implied. It is the weakest form of measurement because no
attempt can be made to account for differences within a category or to specify
any ordering or direction across the various categories.

FIGURE 1.6 Examples of Nominal Scale

Ordinal Level. Classifies data into distinct categories in which ordering
is implied. Data are ranked in a “bottom to top” or “low to high” manner.
Statements of the kind “greater than” or “less than” may be made.
FIGURE 1.7 Examples of Ordinal Scale
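
The practical difference between the nominal and ordinal levels can be sketched in Python. The blood types, the survey ratings, and the rank numbers assigned to the ratings are all invented for illustration.

```python
from collections import Counter

# Nominal: categories with no order, so only counting (and the mode) make sense.
blood_types = ["O", "A", "B", "O", "AB", "O"]
mode_type = Counter(blood_types).most_common(1)[0][0]

# Ordinal: categories with an order, so ranks allow "greater than" statements.
rating_rank = {"poor": 1, "fair": 2, "good": 3}
ratings = ["good", "poor", "fair", "fair", "good"]
ranked = sorted(ratings, key=rating_rank.get)
median_rating = ranked[len(ranked) // 2]     # middle value of an odd-length list

print("Most common blood type:", mode_type)
print("Median rating:", median_rating)
print("good > poor?", rating_rank["good"] > rating_rank["poor"])
```

The rank numbers only record order. The gap between "poor" and "fair" is not claimed to equal the gap between "fair" and "good", which is why an average of the ranks would not be meaningful at this level.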

Interval Level. It is an ordered scale in which the difference between
measurements is a meaningful quantity but which does not involve a true zero
point.
Ratio Level. It is an ordered scale in which the difference between the
measurements is a meaningful quantity and which involves a true zero point,
as in height, weight, age, or salary measurements.

FIGURE 1.8 Examples of Interval and Ratio Scale
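
One way to see why the true zero point matters is to change units and check whether ratios survive. The short Python sketch below does this for temperature (interval scale) and weight (ratio scale); the specific values are arbitrary and the kilogram-to-pound factor is approximate.

```python
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

def kg_to_lb(kg):
    return kg * 2.20462   # approximate conversion factor

# Interval scale: 0 degrees Celsius is not "no temperature", so ratios are not meaningful.
ratio_c = 20 / 10
ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

# Ratio scale: 0 kg really means "no weight", so ratios survive a change of units.
ratio_kg = 60 / 30
ratio_lb = kg_to_lb(60) / kg_to_lb(30)

print("Temperature ratios:", ratio_c, ratio_f)   # the two ratios disagree
print("Weight ratios:", ratio_kg, ratio_lb)      # the two ratios agree
```

Saying that 20°C is "twice as hot" as 10°C is therefore unjustified, while saying that 60 kg is twice as heavy as 30 kg is valid in any unit.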

EXERCISES
CONCEPTS AND PROCEDURES

1.1. Explain the meaning of the following terms.


a. Quantitative variable
b. Qualitative Variable
c. Discrete Variable
d. Continuous Variable

APPLICATIONS

1.2. For each of the following random variables, determine whether the
variable is qualitative or quantitative. If the variable is quantitative,
determine whether it is discrete or continuous. In addition, determine
the level of measurement.
a. Number of telephones per household.
b. Type of telephone primarily used.
c. Number of long-distance calls made per month.
d. Length (in minutes) of longest long-distance call made per month.
e. Color of telephone primarily used.
f. Monthly charge for long distance calls made.
g. Ownership of a cellular phone.
h. Number of local calls made per month.
1.3. Identify the following quantitative variables as discrete or continuous.
a. Number of Foreigners migrating to the Philippines every year.
b. Length of hair of female students.
c. The boiling point of water is 100°C.
d. Number of students per class.
e. John’s height is 168 cm.
f. The number of children in Barangay A with missing/ decayed teeth is
2,000.
g. The following data are the densities, in g/cc, of sample substances taken
from Tabing-ilog River: 23.6, 19.8, 15.0, 7.8, 1.6, and 2.4.
h. The average speed of motorboats cruising in Manila Bay every day is
50m/s.
i. Number of Grade one pupils in MCU Elementary School.
j. Number of Job Applicants at YYY Company.
1.4. Three different beverages are sold at a fast-food restaurant: soft drinks,
tea, and coffee.
a. Explain why the type of beverage sold is an example of a categorical
variable.
b. Explain why the type of beverage sold is an example of a nominal-scaled
variable.
1.5. Soft drinks are sold in three sizes in a fast-food restaurant: small,
medium, and large. Explain why the size of the soft drinks is an
ordinal-scaled variable.
1.6. Suppose that you measure the time it takes to download an MP3 file
from the internet.
a. Explain why the download time is a numerical variable.
b. Explain why the download time is a ratio scaled variable.

SUMMARY

Statistics can be defined as the collection, organization, presentation,
analysis, and interpretation of data relevant to a decision or situation. There
are two branches of statistics: descriptive and inferential. Descriptive statistics
focuses on summarizing and describing data that have been collected.
Inferential statistics goes beyond mere description and, based on sample
data, seeks to reach conclusions or make predictions regarding the
population from which the sample was drawn.
As businesspersons and citizens, we are involved with statistics either
as practitioners, as researchers, or as consumers of statistical claims and
findings offered by others. Very early statistical efforts primarily involved
counting people. More recently, statistical methods have been applied in all
facets of business and individual life as a tool for analysis and reporting,
for reaching conclusions based on observed data, and as an aid to decision
making.
Variables can be either qualitative or quantitative. Qualitative variables
indicate whether a person or object possesses a given attribute, while
quantitative variables express how much of an attribute is possessed.
Discrete quantitative variables can take on only certain values along an
interval, with gaps between the possible values, while continuous
quantitative variables can take on a value at any point along an interval.
When a variable is measured, a numerical value is assigned to it, and the
result will be in one of four levels of measurement: nominal, ordinal,
interval, and ratio. The level to which the measurements belong will be
important in determining appropriate methods for data description and analysis.
By helping to reduce the uncertainty posed by largely uncontrollable
factors, such as competitors, government, technology, the social and
economic environment, and often unpredictable consumers and voters,
statistics plays a vital role in decision making.