Understanding Inferential Statistics

Mind Map (mapa mental)

Uploaded by

ALFREDO ADCO
Copyright
© All Rights Reserved

Inferential Statistics

Introduction
Definition and purpose
○ Inferential statistics refers to the branch of statistics that allows us to make inferences or
generalizations about a population based on a sample of data. Its primary purpose is to
estimate population parameters, test hypotheses, and make predictions while considering the
inherent uncertainty involved in statistical sampling.

Differences between descriptive and inferential statistics
○ Descriptive statistics summarizes and describes the main features of a data set, providing
simple summaries such as means and standard deviations. In contrast, inferential statistics
goes further by using sample data to make predictions or inferences about a larger population,
often involving hypothesis testing and confidence intervals to quantify uncertainty.

Sampling Techniques
Simple random sampling
○ This technique selects a subset of individuals from a larger population so that every
individual, and every possible sample of the chosen size, has an equal chance of being
selected. It guards against selection bias and is fundamental for ensuring the
representativeness of the sample.

Stratified sampling
○ Stratified sampling divides the population into distinct subgroups, known as strata, that share
similar characteristics. A random sample is then taken from each stratum, ensuring that the
sample reflects the population's diversity and that key subgroups are adequately represented.
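
A minimal sketch of stratified sampling with proportional allocation, using only the Python standard library. The population, the `stratum_of` callback, and the 10% sampling fraction are hypothetical choices made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw the same fraction at random from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one unit per stratum
        sample.extend(rng.sample(members, k))       # simple random sample within the stratum
    return sample

# Hypothetical population: 80 urban and 20 rural respondents.
population = [("urban", i) for i in range(80)] + [("rural", i) for i in range(20)]
sample = stratified_sample(population, stratum_of=lambda unit: unit[0], fraction=0.1)
counts = {label: sum(1 for s, _ in sample if s == label) for label in ("urban", "rural")}
print(counts)  # {'urban': 8, 'rural': 2}
```

Because each stratum is sampled separately, the 80/20 split of the population is reproduced exactly in the sample, which a simple random sample of 10 would only match on average.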

Cluster sampling
○ In cluster sampling, the population is divided into clusters (often geographically), and entire
clusters are randomly selected for analysis. This method is cost-effective and useful when the
population is widely dispersed, but because individuals within a cluster tend to resemble one
another, it usually yields less precise estimates than a simple random sample of the same
size, especially when the selected clusters are not representative of the whole population.

Systematic sampling
○ Systematic sampling involves selecting every nth individual from a list of the population after
a random starting point. This method is straightforward to implement but can lead to bias if
there is an underlying pattern in the list.
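
Systematic sampling can be sketched in a few lines. The function name, the roster of 100 unit IDs, and the seed are assumptions made for illustration.

```python
import random

def systematic_sample(units, n, seed=0):
    """Pick every k-th unit after a random start inside the first interval."""
    if not 0 < n <= len(units):
        raise ValueError("n must be between 1 and len(units)")
    k = len(units) // n                        # sampling interval
    start = random.Random(seed).randrange(k)   # random starting point in [0, k)
    return units[start::k][:n]

roster = list(range(1, 101))                   # hypothetical list of 100 unit IDs
picked = systematic_sample(roster, n=10)
print(picked)
```

The bias risk mentioned above shows up when the list has a cycle whose length matches the interval k: every selected unit then falls at the same position in the cycle, whatever the starting point.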

Estimation
Point estimation
○ Definition

■ Point estimation provides a single value estimate of a population parameter, such as
the population mean or proportion, based on sample data. It serves as a best guess of
the parameter, without providing information about its precision.
○ Common methods (e.g., Maximum Likelihood Estimation)

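■ Methods such as Maximum Likelihood Estimation (MLE) are commonly used for
point estimation. MLE selects the parameter values under which the observed data are
most probable. Under standard regularity conditions its estimates are consistent and
asymptotically efficient, although they can be biased in small samples (the MLE of a
normal variance, which divides by n rather than n - 1, is a familiar example).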
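
As a small illustration of the idea, the sketch below finds the maximum likelihood estimate of a success probability by searching a grid of candidate values. The data (7 successes in 20 trials) and the grid resolution are hypothetical; in this simple case the MLE is known to equal the sample proportion, which the search reproduces.

```python
import math

def bernoulli_log_likelihood(p, successes, n):
    """Log-likelihood of success probability p given the observed counts."""
    return successes * math.log(p) + (n - successes) * math.log(1 - p)

successes, n = 7, 20                          # hypothetical: 7 successes in 20 trials
grid = [i / 1000 for i in range(1, 1000)]     # candidate values of p in (0, 1)
p_hat = max(grid, key=lambda p: bernoulli_log_likelihood(p, successes, n))
print(p_hat, successes / n)  # 0.35 0.35
```

A grid search is used here only to make the "most probable" idea visible; real problems usually rely on a closed-form solution or a numerical optimizer.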

Interval estimation
○ Confidence intervals

■ A confidence interval is a range of values, computed from the sample, that is likely to
contain the population parameter. The confidence level (e.g., 95%) describes the
procedure: across repeated samples, about 95% of intervals built this way would cover
the true value. The interval quantifies the uncertainty associated with a point estimate
and is crucial for making informed decisions.

○ Margin of error

■ The margin of error is the half-width of a confidence interval: the largest distance
expected between the estimate and the true parameter at the chosen confidence level.
A smaller margin of error indicates a more precise estimate; it shrinks as the sample
size grows and widens with greater variability in the data or a higher confidence level.
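
The relationship between a point estimate, its margin of error, and the resulting confidence interval can be sketched as follows. This assumes a normal approximation (mean ± z · s / √n); the function name and the data are hypothetical, and for small samples a t-based critical value would normally replace z.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(data, confidence=0.95):
    """Normal-approximation interval for a mean: centre ± z * s / sqrt(n)."""
    n = len(data)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    margin = z * stdev(data) / math.sqrt(n)          # margin of error (half-width)
    centre = mean(data)                              # point estimate
    return centre - margin, centre + margin, margin

data = [2, 4, 4, 4, 5, 5, 7, 9]                      # hypothetical measurements
low, high, margin = mean_confidence_interval(data)
print(f"{low:.2f} to {high:.2f} (margin of error {margin:.2f})")
# 3.52 to 6.48 (margin of error 1.48)
```

Raising the confidence level to 99% widens the interval, while collecting more data with similar variability narrows it.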

Hypothesis Testing
Null and alternative hypotheses
○ The null hypothesis (H0) is a statement of no effect or no difference, while the
alternative hypothesis (H1) states the effect or difference for which the test seeks evidence.
A test can reject H0 or fail to reject it; it never proves H1 outright. Formulating these
hypotheses clearly is fundamental to statistical testing and guides the analysis.

Type I and Type II errors
○ A Type I error occurs when we reject a null hypothesis that is actually true (a false
positive), whereas a Type II error occurs when we fail to reject a null hypothesis that is
actually false (a false negative). The probabilities of these errors are denoted alpha and
beta, and 1 - beta is the power of the test. Understanding these errors is crucial for
evaluating the reliability of hypothesis tests.
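
Both error rates can be estimated by simulation. The sketch below repeatedly draws samples and applies a two-sided z-test with known standard deviation; the sample size, the true mean of 0.3 used for the Type II case, the number of trials, and the seed are all hypothetical choices.

```python
import math
import random
from statistics import NormalDist, mean

def rejects_null(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test with known sigma; True means H0: mu == mu0 is rejected."""
    z = (mean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha

def rejection_rate(true_mu, mu0=0.0, sigma=1.0, n=30, trials=2000, seed=1):
    """Fraction of simulated samples in which H0 is rejected."""
    rng = random.Random(seed)
    hits = sum(
        rejects_null([rng.gauss(true_mu, sigma) for _ in range(n)], mu0, sigma)
        for _ in range(trials)
    )
    return hits / trials

type_1_rate = rejection_rate(true_mu=0.0)       # H0 true: every rejection is a Type I error
type_2_rate = 1 - rejection_rate(true_mu=0.3)   # H0 false: every non-rejection is a Type II error
print(type_1_rate, type_2_rate)
```

The first rate should land close to the chosen significance level of 0.05. The second depends on how far the true mean is from the hypothesized one and on the sample size; larger effects or larger samples reduce it.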

p-values and significance levels
○ The p-value measures the strength of the evidence against the null hypothesis: it is the
probability of obtaining a result at least as extreme as the one observed, assuming the null
hypothesis is true. It is not the probability that the null hypothesis is true. The significance
level (alpha) is the threshold, conventionally 0.05, below which the p-value leads us to reject
the null hypothesis.
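
To make "at least as extreme" concrete, the sketch below computes an exact two-sided p-value for a fair-coin null hypothesis by summing binomial probabilities. The scenario (60 heads in 100 flips) and the function name are hypothetical.

```python
from math import comb

def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for H0: p = 0.5.

    Counts every outcome that lies at least as far from n/2 as the observed k.
    """
    distance = abs(k - n / 2)
    extreme = sum(comb(n, x) for x in range(n + 1) if abs(x - n / 2) >= distance)
    return extreme / 2 ** n

p_value = binomial_two_sided_p(60, 100)   # hypothetical: 60 heads in 100 flips
alpha = 0.05
print(round(p_value, 4), "reject H0" if p_value < alpha else "fail to reject H0")
```

Here the p-value comes out at roughly 0.057, slightly above 0.05, so the null hypothesis of a fair coin is not rejected at that level even though 60 heads looks unbalanced.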

Common tests (e.g., t-test, chi-square test)
○ Various statistical tests are used to evaluate hypotheses, including t-tests for comparing
means between two groups and chi-square tests for examining the association between
categorical variables. Each test has specific conditions and applications depending on the data
type.
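
As one example, a chi-square test of independence compares observed counts with the counts expected if the two variables were unrelated. The sketch below computes the statistic for any contingency table and, as a simplifying assumption, the p-value only for the 2×2 case (1 degree of freedom), where it has a closed form. The table values are hypothetical.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total   # count expected under independence
            stat += (observed - expected) ** 2 / expected
    return stat

def p_value_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

# Hypothetical 2x2 table: rows are two groups, columns are outcome yes / no.
table = [[30, 10], [20, 20]]
stat = chi_square_statistic(table)
p_value = p_value_df1(stat)
print(round(stat, 3), round(p_value, 3))
```

The statistic is about 5.33 with a p-value near 0.02, so at the 0.05 level the association would be judged significant. Larger tables need the chi-square distribution with (rows - 1) × (columns - 1) degrees of freedom, which a statistics library provides.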

Regression Analysis
Simple linear regression
○ Simple linear regression analyzes the relationship between two continuous variables by
fitting a linear equation to the observed data. This technique helps in predicting the dependent
variable based on the independent variable's values.
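
A least-squares line can be fitted by hand with the standard formulas slope = Sxy / Sxx and intercept = mean(y) - slope · mean(x). The data below are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

x = [1, 2, 3, 4, 5]                    # hypothetical independent variable
y = [2.1, 3.9, 6.2, 7.8, 10.1]         # hypothetical dependent variable
slope, intercept = fit_line(x, y)
predicted = intercept + slope * 6      # prediction for a new value x = 6
print(round(slope, 2), round(intercept, 2), round(predicted, 2))  # 1.99 0.05 11.99
```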

Multiple regression
○ Multiple regression extends simple linear regression by including two or more independent
variables to predict a dependent variable. This approach allows for a more comprehensive
analysis of the factors influencing outcomes.

Interpretation of coefficients
○ In regression analysis, coefficients represent the estimated change in the dependent variable
for a one-unit change in the independent variable, holding other variables constant. Proper
interpretation of these coefficients is essential for understanding the strength and direction of
relationships among variables.
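
The "holding other variables constant" reading can be checked on a small example. The sketch below fits a two-predictor model by solving the normal equations with a hand-written Gaussian elimination; the helper names are hypothetical, and the data are constructed to follow y = 1 + 2·x1 - 3·x2 exactly so that the recovered coefficients are known in advance.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):                      # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + list(row) for row in rows]      # prepend a column of ones
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]   # hypothetical (x1, x2) pairs
y = [1 + 2 * x1 - 3 * x2 for x1, x2 in rows]              # noise-free by construction
intercept, b1, b2 = fit_ols(rows, y)
print(round(intercept, 6), round(b1, 6), round(b2, 6))
```

The fitted b1 of about 2 means the predicted outcome rises by 2 for each one-unit increase in x1 when x2 is held fixed, and the fitted b2 of about -3 means it falls by 3 per unit of x2 when x1 is held fixed. Real data contain noise, so the coefficients would then be estimates with standard errors rather than exact values.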

Analysis of Variance (ANOVA)
One-way ANOVA
○ One-way ANOVA tests differences in means among three or more independent groups based
on one independent variable. It assesses whether at least one group mean differs significantly
from the others, helping to understand group effects.
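
The F statistic behind one-way ANOVA is the ratio of between-group to within-group variability. A minimal sketch with hypothetical data for three groups:

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic: mean square between groups divided by mean square within groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

groups = [[1, 2, 3], [3, 4, 5], [5, 6, 7]]   # hypothetical scores for three groups
f_stat = one_way_anova_f(groups)
print(f_stat)  # 12.0
```

A large F suggests that at least one group mean differs. Turning F into a p-value requires the F distribution with (2, 6) degrees of freedom in this example, which is available in statistical tables or a statistics library.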

Two-way ANOVA
○ Two-way ANOVA evaluates the effects of two independent variables on a dependent
variable and can assess interaction effects. This method provides more detailed insights into
how different factors affect outcomes.

Assumptions and applications
○ ANOVA assumes that the observations are independent, that the dependent variable is
approximately normally distributed within each group, and that the groups have equal
variances. Its applications include comparing means across different groups in various fields,
such as medicine, psychology, and marketing.

Non-parametric Methods
Wilcoxon test
○ The Wilcoxon signed-rank test is a non-parametric alternative to the paired t-test, used to
assess whether there is a significant difference between paired or related samples when the
data do not meet normality assumptions. It ranks the absolute differences between pairs and
compares the rank sums of the positive and negative differences.
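
The ranking step of the Wilcoxon signed-rank procedure can be sketched as below. The function name and the before/after measurements are hypothetical; zero differences are dropped and tied absolute differences share an average rank, which is a common convention.

```python
def signed_rank_sums(before, after):
    """Return (W+, W-): rank sums of the positive and negative paired differences."""
    diffs = [a - b for b, a in zip(before, after) if a != b]   # zero differences are dropped
    ordered = sorted(diffs, key=abs)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        ranks[abs(ordered[i])] = (i + 1 + j) / 2               # average of ranks i+1 .. j
        i = j
    w_plus = sum(ranks[abs(d)] for d in diffs if d > 0)
    w_minus = sum(ranks[abs(d)] for d in diffs if d < 0)
    return w_plus, w_minus

before = [10, 12, 9, 11, 15, 8]     # hypothetical paired measurements
after = [13, 11, 13, 13, 10, 14]
w_plus, w_minus = signed_rank_sums(before, after)
print(w_plus, w_minus)  # 15.0 6.0
```

The test statistic is usually taken as the smaller of the two sums (6 here) and compared with a critical value or converted to a p-value by a statistics library.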

Kruskal-Wallis test
○ The Kruskal-Wallis test extends the Wilcoxon rank-sum (Mann-Whitney U) test to more than
two independent groups. It assesses whether samples from different groups originate from the
same distribution, making it a non-parametric counterpart to one-way ANOVA for non-normal
data.
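
A minimal sketch of the Kruskal-Wallis H statistic, H = 12 / (N(N + 1)) · Σ(R_i² / n_i) - 3(N + 1), where R_i is the rank sum of group i in the pooled ranking. The helper names and data are hypothetical, and the tie correction factor is omitted for brevity.

```python
def average_ranks(values):
    """Map each distinct value to its rank in the pooled data, averaging over ties."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        ranks[ordered[i]] = (i + 1 + j) / 2    # average of ranks i+1 .. j
        i = j
    return ranks

def kruskal_wallis_h(groups):
    """H statistic without tie correction."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    rank_term = sum(sum(ranks[v] for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

groups = [[12, 15, 11], [18, 20, 17], [25, 22, 24]]   # hypothetical independent samples
h_stat = kruskal_wallis_h(groups)
print(round(h_stat, 2))  # 7.2
```

For reasonably large groups, H is compared with a chi-square distribution with (number of groups - 1) degrees of freedom; with very small groups such as these, exact tables are more appropriate.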

Advantages and disadvantages
○ Non-parametric methods often require fewer assumptions than parametric tests, making them
suitable for small sample sizes or non-normal distributions. However, they may be less
powerful than parametric tests when the assumptions of the latter are satisfied.
