Hypothesis Testing and Model Assessment

The document discusses hypothesis testing, focusing on statistical inference, model assessment, and decision-making under uncertainty. It explains key concepts such as null and alternative hypotheses, test statistics, and p-values, using examples like Minecraft and genetic models to illustrate the application of these concepts. The document emphasizes the importance of comparing observed data with model predictions to determine the validity of hypotheses.

Testing Hypotheses

By
Dr. Yang Zhang

1
Overview
● Models that involve chance
● Assessing models
● Comparing distributions
● Hypothesis testing and p-values
● Making decisions with incomplete information
● Error probabilities

2
Review: Distributions
● Any random quantity—aka, random variable (RV)—has a probability
distribution, which captures
○ all the possible values the RV can take; and
○ the probability that the RV takes each value.
● After repeated draws, the RV has an empirical distribution, which
captures
○ all the observed values the RV took; and
○ the proportion of times the RV took each value.
● Law of Large Numbers: With increasing number of independent
draws, the empirical distribution looks more and more like the probability
distribution.
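
As a rough illustration of the Law of Large Numbers, the sketch below rolls a fair six-sided die (an assumed example, not taken from the slides) and measures how far the empirical distribution is from the probability distribution of 1/6 per face. The helper names `empirical_distribution` and `max_gap` and the seed are illustrative choices.

```python
import random
from collections import Counter

def empirical_distribution(num_draws, rng):
    """Proportion of times each face 1..6 appears in num_draws rolls of a fair die."""
    counts = Counter(rng.randint(1, 6) for _ in range(num_draws))
    return [counts[face] / num_draws for face in range(1, 7)]

def max_gap(num_draws, rng):
    """Largest absolute gap between the empirical proportions and the true 1/6."""
    return max(abs(p - 1 / 6) for p in empirical_distribution(num_draws, rng))

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 1_000, 100_000):
    # The gap tends to shrink as the number of independent draws grows
    print(n, round(max_gap(n, rng), 4))
```

The gap is not guaranteed to shrink at every step, but with many draws it is very likely to be small.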

3
Inference

4
Terminology
● Parameter
○ A number that characterizes an aspect of the population
○ Generally, impractical to determine directly

○ Example:

■ A coin of unknown, but fixed (deterministic), bias b in favor of a Head:

■ b = Probability of a Head

■ 1-b = Probability of a Tail

5
Inference
● Statistical Inference:
Draw conclusions (draw inferences) based on data in random
samples
● Example:
Use the data to guess the value of an unknown, but fixed (deterministic), number (parameter).

Create an estimate of the unknown number using a statistic, which depends on the random sample and is, therefore, itself random.

6
Terminology
● Statistic is a number
○ calculated from the sample
○ descriptive of the entire sample
○ serves as an estimate of the unknown parameter
● Example
○ Flip the coin n times. Count the number H of Heads in those n flips.
○ An estimate of the coin’s bias b is the statistic given by H/n.

Note: H is random (capital letter). Hence, so is H/n.

But b and n are deterministic (small letters).
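
A minimal sketch of the coin example, assuming a hypothetical bias of 0.6 chosen only so that flips can be simulated (in a real study b is unknown). The function name `estimate_bias` is illustrative. It shows that the estimate H/n changes from one random sample to the next.

```python
import random

def estimate_bias(n, b, rng):
    """Flip a coin with P(Head) = b a total of n times; return the statistic H / n."""
    heads = sum(1 for _ in range(n) if rng.random() < b)  # H, the random count of Heads
    return heads / n

rng = random.Random(42)
true_b = 0.6  # hypothetical value; in practice the parameter is not known
for _ in range(3):
    # Each repetition uses a new random sample, so the estimate varies
    print(estimate_bias(100, true_b, rng))
```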

7
Probability Distribution of a Statistic
● Values of a statistic vary, because random samples vary
● “Sampling distribution” or “probability distribution” of the
statistic captures
○ all possible values of the statistic; and
○ all the corresponding probabilities of those values.
● Typically hard to calculate:
○ must either do the math (often intractable); or
○ must generate all possible samples and calculate the
statistic based on each sample (impractical).

8
Empirical Distribution of a Statistic
● Empirical distribution of a statistic:
○ Based on simulated and/or sampled values (i.e.,
observed values) of the statistic;
○ Consists of all the observed values of the statistic; and
○ Shows proportion of times each value appeared (was
observed).
● Good approximation to the probability distribution of the
statistic
○ IF the number of repetitions in the simulation is large.
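
To make the comparison concrete, the sketch below simulates the number of Heads H in n flips many times and sets the empirical distribution next to the exact probability distribution, which happens to be tractable in this simple case (a binomial formula). The coin, sample size, and function names are assumptions for illustration.

```python
import math
import random
from collections import Counter

def simulate_heads(n, b, rng):
    """Number of Heads in n flips of a coin with P(Head) = b."""
    return sum(1 for _ in range(n) if rng.random() < b)

def empirical_dist(n, b, repetitions, rng):
    """Proportion of repetitions in which each Head count 0..n was observed."""
    counts = Counter(simulate_heads(n, b, rng) for _ in range(repetitions))
    return [counts[h] / repetitions for h in range(n + 1)]

def binomial_dist(n, b):
    """Exact probability of each Head count 0..n."""
    return [math.comb(n, h) * b**h * (1 - b)**(n - h) for h in range(n + 1)]

n, b = 10, 0.5  # hypothetical sample size and coin
approx = empirical_dist(n, b, 20_000, random.Random(1))
exact = binomial_dist(n, b)
for h in range(n + 1):
    print(h, round(approx[h], 3), round(exact[h], 3))
```

With 20,000 repetitions the two columns should agree closely; with only a few dozen repetitions they usually would not.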

9
Assessing Models

10
Models
● A model is a set of assumptions about the data

● In data science, many models involve assumptions about processes that involve randomness

○ “Chance models”

● Key question: Does the model fit the data?

11
Approach to Assessment
● If we can simulate data according to the model’s
assumptions, we can learn what the model predicts.

● Can then compare the predictions with the observed data.

● If data and model’s predictions are inconsistent, we have evidence against the model.

12
Minecraft (!?)
(An Example)

13
Scene
● Minecraft is a video game
● People called “Speed Runners” race to win the game as fast as
possible
● One famous speed runner has the username “Dream”
○ 3.7 million followers on Twitter
○ 5 million followers on the video game streaming site Twitch
○ $$$$$
● October 2020
○ 6 live streams on Twitch
○ Dream is trying to win the game as quickly as possible
○ Accused of cheating

14
Scene Continued
● In the game, there are characters called
“Piglins” who may give you items that help
you win faster. They may give you an
item when you initiate a trade with them.
● Every time you initiate a trade, they
have a 5% chance to give you an item
called an “Ender Pearl” that helps you
win.

15
Summary

16
The data
● Data: across the 6 streams, Dream initiated trades 262
times, and received an “Ender Pearl” 42 times.
● Question: is 42/262 (16%) a realistic outcome, if
Dream was playing an unmodified version of the game?
● Recall: in the unmodified game, there is a 5% chance
of receiving an “Ender Pearl” in a trade.

17
Sampling from a Distribution
● Sample at random from a categorical distribution

sample_proportions(sample_size, pop_distribution)

● Samples at random from the population

○ Returns an array containing the distribution of the
categories in the sample

(Demo)
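
The slides use `sample_proportions` from the course library. As a sketch of the same idea without that library, the block below defines a hypothetical stand-in, `sample_proportions_sketch` (it takes an extra random-generator argument, so it is not the course function), and uses it to simulate the Minecraft model: 262 trades, each with a 5% chance of an Ender Pearl.

```python
import random

def sample_proportions_sketch(sample_size, pop_distribution, rng):
    """Draw sample_size items at random from the categories and return the
    proportion of the sample that falls in each category."""
    categories = range(len(pop_distribution))
    draws = rng.choices(categories, weights=pop_distribution, k=sample_size)
    return [draws.count(c) / sample_size for c in categories]

rng = random.Random(0)
model = [0.05, 0.95]  # [Ender Pearl, anything else] in the unmodified game
num_trades = 262

# Simulate the number of pearls in 262 trades, many times, under the model
simulated_pearls = [
    round(sample_proportions_sketch(num_trades, model, rng)[0] * num_trades)
    for _ in range(5_000)
]
print("largest simulated count:", max(simulated_pearls))
print("observed count: 42")
```

Comparing the largest simulated count with the observed 42 gives a first impression of whether 42 is a realistic outcome under the model.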
18
19
A Genetic Model

20
Gregor Mendel, 1822-1884

21
A Model
● Pea plants of a particular kind
● Each one has either purple flowers or white flowers

● Mendel’s model:
○ Each plant is purple-flowering with chance 75%,
○ regardless of the colors of the other plants

● Question:
○ Is the model good, or not?

22
Choosing a Statistic
● Take a sample, see what percent are purple-flowering
● If that percent is much larger or much smaller than 75,
that is evidence against the model
● Distance from 75 is the key

● Statistic:
| sample percent of purple-flowering plants - 75 |

● If the statistic is large, that is evidence against the model
(Demo)
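
A possible sketch of simulating this statistic under Mendel’s model. The slides do not state the sample size here, so `sample_size = 500` is a placeholder to be replaced with the size of the actual sample; the function name `distance_from_75` is also illustrative.

```python
import random

def distance_from_75(sample_size, rng, purple_chance=0.75):
    """One simulated value of |sample percent purple - 75| under the model."""
    purple = sum(1 for _ in range(sample_size) if rng.random() < purple_chance)
    return abs(100 * purple / sample_size - 75)

rng = random.Random(7)
sample_size = 500  # hypothetical; use the size of the actual sample
simulated = [distance_from_75(sample_size, rng) for _ in range(5_000)]

# Most simulated distances are small; a large observed distance would be unusual
simulated.sort()
print("median distance:", simulated[len(simulated) // 2])
print("95th percentile:", simulated[int(0.95 * len(simulated))])
```

Because the statistic is a distance, only large values count as evidence against the model, which keeps the comparison one-sided.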
23
Two Viewpoints

24
Model and Alternative
● Minecraft:
○ Model: Each trade has a 5% chance of returning an
ender pearl
○ Alternative viewpoint: The chance was higher than
5% in the version Dream played
● Genetics:
○ Model: Each plant has a 75% chance of having
purple flowers
○ Alternative viewpoint: No, it doesn’t
25
Steps in Assessing a Model
● Choose a statistic to measure discrepancy between
model and data
● Simulate the statistic under the model’s assumptions
● Compare the data to the model’s predictions:
○ Draw a histogram of simulated values of the statistic
○ Compute the observed statistic from the real sample
● If the observed statistic is far from the histogram, that is
evidence against the model

26
Comparing Distributions

27
Example: Haribo Goldbears
● There are 5 different flavors in a
pack of Haribo Goldbears (gummy
bears)
● Let’s say that Haribo claims that
each pack has the same proportion
of each flavor of gummy bears
● Yanay buys a pack of gummy bears
and notices that the proportions of
the flavors seem different from
what he expected. Is this difference
due to chance or is there something
else going on?
(Demo)
28
A New Statistic:
Total Variation Distance
(TVD)

29
Distance Between Distributions
● Distribution of flavors is categorical

● To see whether the distribution of flavors in the bag is
close to what Haribo claims, we have to measure
the distance between two categorical distributions

30
Total Variation Distance
Every distance has a computational recipe
Total Variation Distance (TVD):
● For each category, compute the difference in
proportions between two distributions
● Take the absolute value of each difference
● Sum, and then divide the sum by 2
(Demo)
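
The three steps of the recipe can be written out one line per step. In this sketch the function name `total_variation_distance` and the observed proportions are made up for illustration; only the claimed equal proportions come from the Haribo example.

```python
def total_variation_distance(dist1, dist2):
    """TVD between two categorical distributions, given as lists of proportions
    over the same categories in the same order."""
    differences = [a - b for a, b in zip(dist1, dist2)]  # step 1: differences
    absolute = [abs(d) for d in differences]             # step 2: absolute values
    return sum(absolute) / 2                             # step 3: sum, divide by 2

claimed = [0.2, 0.2, 0.2, 0.2, 0.2]          # five flavors, equal proportions
observed = [0.30, 0.10, 0.25, 0.15, 0.20]    # hypothetical bag, for illustration only
print(total_variation_distance(claimed, observed))  # about 0.15
```

The division by 2 is there because the positive differences and the negative differences each add up to the same total, so the sum of absolute values counts the discrepancy twice.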
31
Summary of the Method
To assess whether a sample was drawn randomly from a known
categorical distribution:
● Use TVD as the statistic to measure the distance between categorical
distributions
● Sample at random from the population
● Compute the TVD between the random sample’s distribution and the known distribution
● Repeat numerous times (e.g., 1,000 times)
● Compare:
○ Empirical distribution of simulated TVDs
○ Actual TVD from the sample in the study

32
Testing Hypotheses

33
Testing Hypotheses
● A test chooses between two views of how data were
generated

● The views are called hypotheses

● The test picks the hypothesis that is better supported by the observed data

34
Null and Alternative
The method only works if we can simulate data under one
of the hypotheses.
● Null Hypothesis
○ A well-defined chance model about how the data
were generated
○ We can simulate data under the assumptions of this
model – “under the null hypothesis”
● Alternative Hypothesis
○ A different view about the origin of the data
35
Haribo Example
Null Hypothesis: The distribution of flavors of gummy
bears is uniform, with probability ⅕ for each flavor. Any
difference is due to chance alone.

Alternative Hypothesis: The difference is not due to
chance. The gummy bears are not evenly
distributed among the flavors, with some flavors being
more prevalent than others.

36
Test Statistic
● Test Statistic: Statistic we choose to simulate and
decide between the two hypotheses

Questions before choosing the statistic:

● What values of the statistic will make us lean toward the
null hypothesis?

● What values will make us lean toward the alternative?

○ Preferably, the answer should be just “high”.
Try to avoid “both high and low,” if possible.
37
Haribo Example
● Test Statistic: Total variation distance

def tvd(dist1, dist2):
    # dist1, dist2: NumPy arrays of category proportions, in the same order
    return sum(abs(dist1 - dist2)) / 2

38
Prediction Under the Null Hypothesis
● Simulate the test statistic under the null hypothesis many times
(e.g., 1,000 times)
● Draw the histogram of the simulated values
● This displays the empirical distribution of the statistic under
the null hypothesis
● It’s a prediction about the statistic, made by the null hypothesis
○ It shows all the likely values of the statistic
○ Shows how likely they are (if the null hypothesis is true)
● The probabilities are approximate, because we can’t generate
all the possible random samples

39
Haribo Example
● Simulate the test statistic under the null hypothesis many times
(e.g., 1,000 times)
● Draw the histogram of the simulated values

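# Assumes `from datascience import *` and `import numpy as np`;
# simulated_tvd() returns one TVD simulated under the null hypothesis
tvds = make_array()
num_simulations = 10000
for i in np.arange(num_simulations):
    new_tvd = simulated_tvd()
    tvds = np.append(tvds, new_tvd)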

40
Conclusion of the Test
Resolve choice between null and alternative hypotheses
● Compare the observed test statistic and its empirical
distribution under the null hypothesis
● If the observed value is not consistent with the distribution,
then the test favors the alternative (“data more consistent with
the alternative”)
Whether a value is consistent with a distribution:
● A visualization may be sufficient
● If not, there are conventions about “consistency” (stay tuned)

41
Hypothesis Testing with
Python

42
Defining Hypotheses
● First of all, we should understand which scientific question we are
looking for an answer to, and it should be formulated in the form of
the Null Hypothesis (H₀) and the Alternative Hypothesis (H₁ or Hₐ).
● Remember that H₀ and H₁ must be mutually exclusive, and H₁
shouldn’t contain equality:
○ H₀: μ=x, H₁: μ≠x
○ H₀: μ≤x, H₁: μ>x
○ H₀: μ≥x, H₁: μ<x

43
Assumption Check
● To decide whether to use the parametric or nonparametric version
of the test, we should check the specific requirements listed below:
○ Observations in each sample are independent and identically
distributed (IID).
○ Observations in each sample are normally distributed.
○ Observations in each sample have the same variance.

44
Selecting the Proper Test
● Then we select the appropriate test to be used.
● When choosing the proper test, it is essential to analyze
how many groups are being compared and whether the
data are paired or not.
● To determine whether the data are paired (matched),
consider whether the measurements were collected
from the same individuals.

45
Selecting the Proper Test
● Accordingly, you can decide on the appropriate test
using the chart below.

46
Decision and Conclusion
● After performing the hypothesis test, we obtain a
p-value that measures how strong the evidence against H₀ is.

● If the p-value is smaller than alpha (the significance
level), there is enough evidence against H₀, and you can
reject H₀.

47
Decision and Conclusion
● Otherwise, you fail to reject
H₀. Please remember that
rejecting H₀ supports H₁ but does not prove it.

● However, failing to reject H₀
does not mean H₀ is valid,
nor does it mean H₁ is wrong.
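
The decision rule itself is short enough to write as a function. This is only a sketch: the name `conclude` and the default alpha of 0.05 are assumptions, and the wording of the return values follows the “reject” / “fail to reject” language of the slides.

```python
def conclude(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when the p-value is below alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"

print(conclude(0.003))       # reject H0
print(conclude(0.20))        # fail to reject H0
print(conclude(0.03, 0.01))  # fail to reject H0 at the stricter 1% level
```

Note that the function never returns “accept H0”: failing to reject is a statement about the evidence, not about which hypothesis is true.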

(Demo)
48
Decisions and Uncertainty

49
Incomplete Information
● Try to choose between two worldviews (hypotheses) based on
data in samples (we rarely have access to the entire population).
● Not always clear whether the data are consistent with one
hypothesis or the other.
● Easier (More Obvious) Decision:
Observed data can turn out quite extreme.
Unlikely, but possible.
● Harder (Less Obvious) Decision:
Observed data can turn out in the proverbial ‘gray area’ —
within reach of each of the two hypotheses.

50
Another Example
(“Gray Area” Type)

51
The Problem
● Large(ish) Statistics class divided into 12 discussion
sections

● Graduate Student Instructors (GSIs) lead the sections

● After the midterm, students in Sec. 3 notice that the average score
in their section is lower than in the others!

52
The GSI’s Defense
Sec. 3 GSI Position (Null Hypothesis):
● Had we picked my section at random from the whole
class, we could’ve gotten an average like this one.

Alternative Hypothesis:
● No! Sec. 3’s average score too low.
Randomness not the only reason for lower scores.
(Demo)
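
One way the GSI’s claim could be checked by simulation is sketched below. All numbers are hypothetical (class size, score scale, and the Section 3 average are not given in the slides); the point is the procedure: draw a section-sized group at random without replacement, record its average, and repeat.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical midterm scores for a class of 360 students (not the real data)
all_scores = [min(25, max(0, round(rng.gauss(15, 5)))) for _ in range(360)]
section_size = 30          # 360 students split into 12 sections
observed_average = 13.5    # hypothetical Section 3 average

def random_section_average(scores, size, rng):
    """Average score of a section drawn at random without replacement."""
    return statistics.mean(rng.sample(scores, size))

averages = [random_section_average(all_scores, section_size, rng) for _ in range(5_000)]

# The alternative says the average is too low, so look at the left tail
share_as_low = sum(a <= observed_average for a in averages) / len(averages)
print("share of random sections with an average this low:", share_as_low)
```

Sampling without replacement matters here because a real section cannot contain the same student twice.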

53
Statistical Significance

54
Tail Areas
● Minecraft Ender Pearls: Observed Number (42)
● Haribo Goldbears: Observed TVD (0.13)
● Mendel’s Pea Plants: Observed Distance (1.32)

To quantify reasonableness of observation relative to the
random samples, look at tail probabilities.
55
Conventions About Inconsistency
● “Inconsistent with the null”: The test statistic is in the tail
of the empirical distribution—under the null hypothesis
○ The farther out in the tail the test statistic lies, the more
inconsistent it is with the null hypothesis
● “In the tail,” first convention:
○ The area in the tail is less than 5%
○ The result is “statistically significant”
● “In the tail,” second convention:
○ The area in the tail is less than 1%
○ The result is “highly statistically significant”

56
The p-Value as an Area
[Figure: Distribution under the Null Hypothesis]
● Empirical distribution of the test statistic under the null
hypothesis.
● Red dot denotes the observed statistic.
● Yellow area denotes the tail probability (p-value).
(Demo)
57
Definition of the p-value
Formal name: observed significance level

The p-value is the chance (probability),

● under the null hypothesis,
● that the test statistic
● is equal to the value that was observed in the data
● or is even further in the direction of the alternative.
● Last two bullets mean: “test statistic is at least as
extreme as the observed value.”
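
In a simulation, this definition becomes a count: the share of simulated statistics that are at least as extreme as the observed one. The sketch below assumes a hypothetical helper `empirical_p_value` with a `direction` argument saying which tail points toward the alternative.

```python
def empirical_p_value(simulated_stats, observed_stat, direction="high"):
    """Share of simulated statistics at least as extreme as the observed one.

    direction="high": large values favor the alternative (right tail).
    direction="low":  small values favor the alternative (left tail).
    """
    if direction == "high":
        extreme = sum(s >= observed_stat for s in simulated_stats)
    elif direction == "low":
        extreme = sum(s <= observed_stat for s in simulated_stats)
    else:
        raise ValueError("direction must be 'high' or 'low'")
    return extreme / len(simulated_stats)

# Tiny hand-checkable example: 3 of the 10 simulated values are >= 7
simulated = [1, 2, 3, 4, 5, 6, 7, 8, 9, 5]
print(empirical_p_value(simulated, 7))         # 0.3
print(empirical_p_value(simulated, 2, "low"))  # 0.2
```

The comparison uses `>=` (or `<=`) rather than a strict inequality because the definition includes values equal to the observed one.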

58
P-Values and Error Probabilities

59
Can the Conclusion be Wrong?
Yes.
                              Null is true    Alternative is true
Test favors the null          Correct         Wrong
Test favors the alternative   Wrong           Correct

60
An Error Probability
● The cutoff for the P-value is an error probability.

● If:
○ your cutoff is 5%
○ and the null hypothesis happens to be true

● then there is about a 5% chance that your test will
reject the null hypothesis.
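
This claim can be checked by simulating many studies in which the null hypothesis really is true and counting how often the test rejects it. The sketch assumes a hypothetical setup (a fair coin flipped 100 times, with large Head counts favoring the alternative) and computes each p-value exactly from the binomial formula rather than by a nested simulation.

```python
import math
import random

def binomial_right_tail(n, p, k):
    """Exact P(X >= k) for X ~ Binomial(n, p): the p-value when large counts
    favor the alternative."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def rejection_rate(n, p, cutoff, num_studies, rng):
    """Run many studies in which the null is true; return the share rejected."""
    # Precompute the p-value for every possible count once
    p_values = [binomial_right_tail(n, p, k) for k in range(n + 1)]
    rejections = 0
    for _ in range(num_studies):
        count = sum(1 for _ in range(n) if rng.random() < p)  # data generated by the null
        if p_values[count] < cutoff:
            rejections += 1
    return rejections / num_studies

rate = rejection_rate(n=100, p=0.5, cutoff=0.05, num_studies=5_000, rng=random.Random(5))
print("share of true nulls rejected:", rate)
```

Because Head counts are whole numbers, the rejection rate in this setup lands somewhat below the 5% cutoff rather than exactly on it, which is why the slide says “about”.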

61
P-value cutoff vs P-value
● P-value cutoff
○ Does not depend on observed data or simulation
○ Decide on it before seeing the results
○ Conventional values at 5% and 1%
○ Probability that the test rejects the null hypothesis when the null is actually true
● P-value (empirical)
○ Depends on the observed data and simulation
○ Probability under the null hypothesis that the test
statistic is the observed value or further towards the
alternative
62
How We’ve Tested Thus Far

63
Hypothesis Testing Review
● One Category (e.g. percent of flowers that are purple)
○ Test Statistic (1): empirical_percentage
○ Test Statistic (2): abs(empirical_percentage - null_percentage)
○ How to Simulate: sample_proportions(n, null_dist)
● Multiple Categories (e.g. flavor distribution of gummy bears)
○ Test Statistic: tvd(empirical_dist, null_dist)
○ How to Simulate: sample_proportions(n, null_dist)
● Numerical Data (e.g. scores in a lab section)
○ Test Statistic: empirical_mean
○ How to Simulate: population_data.sample(n, with_replacement=False)
64
