Business Statistics for Decision Making

Statistics_Research-1

Business statistical standards for decision making

Research about

Statistical terminology
By

Doaa Abul Magd Ibrahim

Israa Khaliefa Ali

Remon Awad Younan

Kerolous Tharwat Fawzy

Ahmed Mahmoud Abdl Aziz


Under the guidance of

Dr. Amira El-desokey


1. Definition of Statistics
Statistics is a branch of applied mathematics that involves the collection, description,
analysis, and interpretation of data drawn from a sample of a larger population. Statistical
sampling is used in medicine, finance, marketing, and many other fields to increase
understanding and inform decision-making.

2. Types of Statistics
The two major areas of statistics are descriptive statistics and inferential
statistics. Descriptive statistics describes the properties of sample and population data.
Inferential statistics uses those properties to test hypotheses and draw conclusions.

Descriptive statistics

- Summarizes and describes data.

- Examples include mean, median, mode, tables, and charts.

Inferential statistics

- Uses data from a sample to make conclusions about a population.

- Examples include hypothesis testing and confidence intervals.
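
The contrast between the two areas can be sketched in a few lines of Python. This is only an illustration: the sales figures are hypothetical, and the confidence interval uses a normal approximation, which is an assumption (a t-interval would usually suit a sample this small better).

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample: units sold on 10 observed days (illustrative values).
sample = [12, 15, 15, 18, 20, 21, 15, 19, 22, 23]

# Descriptive statistics: summarize the observed data only.
mean = statistics.mean(sample)
median = statistics.median(sample)
mode = statistics.mode(sample)

# Inferential statistics: estimate the population mean from the sample.
# Assumption: normal-approximation 95% confidence interval.
z = NormalDist().inv_cdf(0.975)
std_error = statistics.stdev(sample) / math.sqrt(len(sample))
ci_low, ci_high = mean - z * std_error, mean + z * std_error

print(f"mean={mean}, median={median}, mode={mode}")
print(f"95% CI for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The first three values describe only the ten observed days; the interval is a statement about the wider population those days were drawn from.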

3. Importance of Statistics
Statistics is important because it helps in decision making, simplifies large amounts of
data, predicts future trends, and is widely used in business, medicine, education,
economics, and scientific research.

4. Expressions of Statistics

• Aggregate data: statistical summaries of data, meaning that the data have been
analyzed in some way.

• Association: the concept that there is a relationship between two or more
variables that can be defined statistically.

• Big data: a popular term used across academia, industry, and other arenas to
describe the increased availability of all types of data. Big data is typically
described as being huge in volume, high in velocity (how fast it is created), and
diverse in variety.

• Categorical variable: an observable characteristic that describes subjects by
categories. Also called a discrete or nominal variable. Example: Female vs. Male.

• Causality: generally, the concept that outcomes are the direct result of certain
events or actions that have taken place. It is often discussed alongside
"association" and "correlation," which are used to describe whether there is a
relationship between variables and the strength of that relationship.

• Correlation: the degree to which two variables demonstrate a linear relationship
to one another (e.g., positive and negative correlation). Correlation is a form of
association. The term is often used very casually, but it is important to note that
correlation between two variables does not mean that the presence of one causes
a change in the other.

• Data: fundamentally, data = information. We typically use the term to refer to
numeric files that are created and organized for analysis. There are two types of
data: aggregate and microdata.

• Data aggregation: a collection of datapoints and datasets.

• Data analytics: generally used to refer to the techniques and tools required to
analyze massive amounts of data.

• Database: a collection of data organized for research and retrieval.

• Data point or datum: singular of data, generally refers to a single data value.

• Dataset: a term used loosely to refer to a collection of related data items.

• Derived statistics: statistics calculated on the basis of other statistics. Example:
the crime rate is a derived statistic based on the number of crimes committed in
relation to the population of the area under investigation.

• Descriptive statistics: counts, averages (means), percentages, and so on that
summarize the quantitative information obtained during the data collection effort.
These simplify raw, observed data points into understandable and meaningful
information, but do not state anything beyond the observed data points.
Descriptive statistics are usually contrasted with inferential statistics.

• Indicator: typically used as a synonym for statistics that describe something about
the socioeconomic environment of a society, such as per capita income,
unemployment rate, or median years of education.

• Inferential statistics: statistics used to draw inferences about a population based
on information collected on a sample of the population.

• Median value: the "middle" value of a dataset when all values are sorted from
lowest to highest. The median is valuable because, unlike the mean, it is barely
affected by very low or very high outliers in the dataset.

• Microdata: individual response data obtained in surveys and censuses; these are
data points directly observed or collected from a specific unit of observation. Also
known as raw data. ICPSR is an excellent resource for obtaining microdata files.

• Percent: a proportional measure that compares a portion of a total to the actual
total. This helps you understand the relationship of a slice to the whole. It is
often used when you want to assess how significant a portion or amount is to an
established total.

• Quantitative data/variables: information that can be handled
numerically. Example: the number of US consumers who purchased personal care
products and services. Virtually all the data in Sage Data would be considered
quantitative.

• Ratio: a proportional measure that compares the difference between numbers from
different categorical variables. It is often used when you want to understand and
compare the relationship between two distinct groups.

• Secondary data: information or data collected for others to analyze and use for
their own research purposes. This is the most common form of data people
encounter or use in day-to-day life.

• Survey: a data collection method in which a population of people is studied or
interviewed at a particular point in time for the purposes of making inferences or
conclusions about the population.

• Time series data: data points recorded in chronological order.
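
Several of the terms above (correlation, median value, derived statistics, percent, and ratio) are simple calculations, and a short Python sketch may make them concrete. Every figure below is hypothetical and chosen only for illustration, and `pearson_r` is a helper written for this sketch, not a library function.

```python
import math
import statistics

# All figures below are hypothetical, chosen only to illustrate the terms.

def pearson_r(xs, ys):
    """Pearson correlation: strength of a linear relationship, from -1 to 1."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Correlation: a positive value means the two variables tend to rise together.
# It says nothing about whether one causes the other.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]
r = pearson_r(ad_spend, sales)

# Median vs. mean: one extreme income pulls the mean up but not the median.
incomes = [30, 32, 35, 38, 40, 500]
mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

# Derived statistic: crime rate per 100,000 residents.
crimes, residents = 450, 150_000
crime_rate = crimes * 100_000 / residents

# Percent: a portion compared with its own total.
percent = 45 / 180 * 100

# Ratio: one group compared with a different group.
ratio = 120 / 60

print(f"r={r:.2f}, mean={mean_income}, median={median_income}")
print(f"crime rate={crime_rate}, percent={percent}, ratio={ratio}")
```

The income example shows why the median is often reported for skewed data: the single value of 500 moves the mean far from where most observations sit, while the median stays among them.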

5. Fields of Statistics

Statistics is prominent in finance, investing, business, and a wide scope of sectors. Much
of the information you see and the data you’re given is derived from statistics used in all
facets of a business.

• Statistics in investing include average trading volume, 52-week low, 52-week high,
beta, and correlation between asset classes or securities.

• Statistics in economics include gross domestic product (GDP), unemployment,
consumer pricing, inflation, and other economic growth metrics.

• Statistics in marketing include conversion rates, click-through rates, search
quantities, and social media metrics.

• Statistics in accounting include liquidity, solvency, and profitability metrics across
time.

• Statistics in information technology include bandwidth, network capabilities, and
hardware logistics.

• Statistics in human resources include employee turnover, employee satisfaction,
and average compensation relative to the market.

6. Constants and Variables

A constant is a fixed value used in algebraic expressions and equations. It does not
change over time. For example, the size of a shoe, a piece of cloth, or any item of
apparel will not change at any point.

Variables are terms whose values can change or vary over time. For example, the height
and weight of a person do not always remain constant, and hence they are variables.

7. Types of Variables

Qualitative variables are specific attributes that are often non-numeric. Examples of
qualitative variables in statistics include gender, eye color, or city of birth. Qualitative
data is most often used to determine what percentage of an outcome occurs for any
given qualitative variable. Qualitative analysis often doesn't rely on numbers.

The second type of variable in statistics is the quantitative variable. Quantitative
variables are studied numerically and are meaningful only when paired with a
non-numerical descriptor. This information is rooted in numbers.

Quantitative variables can be further broken into two categories.

• Discrete variables have limitations in statistics and imply that there are gaps between
potential discrete variable values.

• Continuous variables take values that run along a scale. Discrete values have
limitations, but continuous variables are often measured into decimals. Any value
within possible limits can be obtained.
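
As a rough illustration of how the variable type shapes the analysis, the Python sketch below treats one qualitative, one discrete, and one continuous variable. The records and field names (`eye_color`, `children`, `height_cm`) are hypothetical.

```python
from collections import Counter
import statistics

# Hypothetical survey records (illustrative values only).
records = [
    {"eye_color": "brown", "children": 2, "height_cm": 170.2},
    {"eye_color": "blue", "children": 0, "height_cm": 165.5},
    {"eye_color": "brown", "children": 1, "height_cm": 180.3},
    {"eye_color": "green", "children": 3, "height_cm": 158.0},
]

# Qualitative variable: count the categories and report percentages.
counts = Counter(r["eye_color"] for r in records)
percentages = {color: 100 * n / len(records) for color, n in counts.items()}

# Discrete quantitative variable: whole-number counts with gaps between values.
children = [r["children"] for r in records]

# Continuous quantitative variable: any value within a range, often a decimal.
heights = [r["height_cm"] for r in records]
mean_height = statistics.mean(heights)

print(percentages)
print(f"children={children}, mean height={mean_height:.1f} cm")
```

Averaging eye colors would be meaningless, which is why the qualitative variable is summarized with percentages while the quantitative ones can be averaged.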

8. Types of Constants

• Fixed values inherent in mathematical principles.

• Values representing fundamental properties of nature.

• Apply across various scientific and mathematical fields.

• Used in computing and statistics to represent fixed data points.

9. Population and Sample


A population refers to the entire group of interest and is usually large.

A sample is a part of the population selected for study. It is usually smaller and easier to study.

| Comparison | Population | Sample |
|---|---|---|
| Meaning | Collection of all the units or elements that possess common characteristics | A subgroup of the members of the population |
| Includes | Each and every element of a group | Only a handful of units of the population |
| Characteristics | Parameter | Statistic |
| Data Collection | Complete enumeration or census | Sampling or sample survey |
| Focus on | Identification of the characteristics | Making inferences about the population |
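
The parameter/statistic distinction in the comparison above can be illustrated with a small simulation. The population here is randomly generated and purely hypothetical; the fixed seed only makes the sketch reproducible.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: exam scores of all 1,000 students in a school.
population = [random.randint(40, 100) for _ in range(1000)]

# Parameter: computed from every element (complete enumeration or census).
population_mean = statistics.mean(population)

# Statistic: computed from a sample of 100 students.
sample = random.sample(population, 100)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The sample mean will usually land close to the population mean without matching it exactly, and inferential statistics is concerned with quantifying that gap.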

10. Relation between population and samples

A sample is always a subset of a population. For example, if the population is "all students
in a school," a sample might be "100 students selected for a survey."

How to choose samples

1. Clearly identify the entire group you want to study (e.g., all
customers, students in a school).

2. Do you need results that generalize broadly (probability sampling)
or specific insights into unique groups (non-probability sampling)?

3. Choose a sampling method:

a) Probability Sampling (Random): Every member has a known chance of
selection, ideal for generalizability.

o Simple Random: Everyone gets an equal chance (e.g., random
number generator).

o Systematic: Select from a random start at fixed intervals (e.g., every
10th person).

o Stratified: Divide into subgroups (strata) and sample from each
proportionally (e.g., by age, gender).

o Cluster: Divide into clusters (e.g., geographic areas) and randomly
select clusters to survey.

b) Non-Probability Sampling (Non-Random): Selection is not random; useful
for exploratory studies.

o Convenience: Easy-to-reach individuals.

o Purposive: Selecting participants for specific knowledge.

o Snowball: Participants refer others.

4. Determine the sample size using formulas that consider population size, desired
confidence level (e.g., 95%), margin of error, and expected variability to ensure
reliability without waste.

5. Implement your plan using your chosen method (e.g., surveys,
interviews).
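
Three of the probability sampling methods and the sample-size step can be sketched in Python as follows. The population, the `group` field, and all function names are hypothetical. The text does not name a sample-size formula, so the use of Cochran's formula with a finite population correction is an assumption, and cluster sampling is left out for brevity.

```python
import math
import random
from statistics import NormalDist

random.seed(7)  # fixed seed for reproducibility

# Hypothetical population: 1,000 customers, each tagged with an age group.
population = [{"id": i, "group": "under_30" if i % 4 == 0 else "30_plus"}
              for i in range(1000)]

def simple_random(pop, n):
    """Every member has an equal chance of selection."""
    return random.sample(pop, n)

def systematic(pop, n):
    """Pick a random start, then take every k-th member."""
    k = len(pop) // n
    start = random.randrange(k)
    return pop[start::k][:n]

def stratified(pop, n, key):
    """Sample each stratum in proportion to its share of the population.
    Rounding may make the total differ slightly from n for other inputs."""
    strata = {}
    for member in pop:
        strata.setdefault(member[key], []).append(member)
    chosen = []
    for members in strata.values():
        share = round(n * len(members) / len(pop))
        chosen.extend(random.sample(members, share))
    return chosen

def sample_size(pop_size, confidence=0.95, margin=0.05, p=0.5):
    """Assumed formula: Cochran's, with finite population correction.
    p=0.5 is the most conservative guess for expected variability."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / pop_size))

srs = simple_random(population, 100)
sys_sample = systematic(population, 100)
strat = stratified(population, 100, "group")
n_needed = sample_size(10_000)

print(len(srs), len(sys_sample), len(strat), n_needed)
```

With a 95% confidence level and a 5% margin of error, this formula suggests about 370 respondents for a population of 10,000, which shows why a well-chosen sample can stand in for a much larger group.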

11. Why We Use a Sample Instead of the Population


We use a sample because studying the whole population is often expensive,
time-consuming, and sometimes impossible. A well-chosen sample can give accurate and
reliable results.

12. Examples of applying statistics

• Market Research: Analyzing consumer behavior to predict sales.

• Quality Control: Monitoring manufacturing to find defective products.

• Finance: Forecasting stock market trends and assessing investment risk.

• Retail: Using sales data (mean, median, mode) to manage inventory and demand.

• Medical Trials: Designing studies to test new treatments and understand disease.

• Epidemiology: Tracking disease prevalence and public health trends.

• Environmental Science: Monitoring pollution and climate change.

• Urban Planning: Analyzing population growth to plan infrastructure.

• Resource Allocation: Deciding where to build schools or provide services.

• Elections: Conducting polls to predict voting patterns.

• Weather: Using probability for rain forecasts ("90% chance of rain").

• Sports Analytics: Evaluating player performance and strategy.

• Personal Finance: Budgeting and risk assessment.

