Chapter 4
Supervised Learning
Contents
Types of Machine learning
3.1 Regression:
    Linear Regression
3.2 Classification:
    K-nearest neighbors
    Decision Tree
    Naive Bayesian
    SVM
    Ensemble Learning
Linear Regression
● Simple linear regression
● Multiple linear regression
● Examples
● Advantages of linear regression
● Limitations of linear regression
Linear Regression
● Linear regression is a linear model, i.e. a model that
assumes a linear relationship between the input
variables x and the single output variable y.
Simple linear regression
● Linear regression with one input variable
● The target variable y and the input
variable x1 have the following linear
relationship:
    ŷ = w0 + w1x1
● w0, w1 are unknown constants (the intercept and the slope)
=> We estimate their values from the training data
● y is the actual value of the outcome (as observed
in the training data set),
● ŷ is the value that the model predicts.
Simple linear regression
● How do we estimate the coefficients (“fit the model”)?
● How do we evaluate the model fit on the observed data?
Prediction error
● We want the difference between
the true value y and the predicted value ŷ
to be as small as possible
⇒ The sum of squared errors over the N training examples
(x_i, y_i) is calculated by the formula:
    L(w|X) = Σ_i (y_i − ŷ_i)² = Σ_i (y_i − w0 − w1 x_i)²
L(w|X): the loss function
The parameters w = (w0, w1) are estimated by
minimizing the sum of squared errors
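As a minimal sketch of the loss described above, the following Python function computes the sum of squared errors for a candidate pair (w0, w1). The name `sse_loss` and the toy data are illustrative assumptions, not part of the slides.

```python
def sse_loss(w0, w1, xs, ys):
    """Sum of squared errors of the line y_hat = w0 + w1 * x on the data."""
    return sum((y - (w0 + w1 * x)) ** 2 for x, y in zip(xs, ys))


# Toy data (hypothetical): the points lie exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

perfect = sse_loss(1.0, 2.0, xs, ys)  # the true line: every residual is 0
worse = sse_loss(0.0, 2.0, xs, ys)    # intercept off by 1: every residual is 1
print(perfect, worse)
```

The better the line fits, the smaller the loss; fitting the model means searching for the (w0, w1) that makes this value as small as possible.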
Linear Regression with one variable
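● Setting the derivatives of the loss function (the sum of squared
errors over the training examples (x_i, y_i)) to zero:
    ∂L/∂w0 = −2 Σ_i (y_i − w0 − w1 x_i) = 0
    ∂L/∂w1 = −2 Σ_i x_i (y_i − w0 − w1 x_i) = 0
● We have the solution:
    w1 = Σ_i (x_i − x̄)(y_i − ȳ) / Σ_i (x_i − x̄)²
    w0 = ȳ − w1 x̄
  where x̄ and ȳ are the means of x and y over the training data.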
Linear Regression with one variable
Example:
● Given a data set with the population of 15 cities and the profit earned by
restaurants opened in them, distributed as follows:
Population   Profit   |   Population   Profit
1.7          139      |   4            162
2.1          150      |   5            160
3.5          160      |   1            135
3.9          162      |   6.6          171
5            161      |   4.5          157
6.5          170      |   5.5          160
2.5          152      |   3            150
3.5          154      |
● Task: predict the profit of a restaurant, given the population of the city
in which the restaurant is located.
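One way to work through this example is sketched below in plain Python. It fits the line with the standard closed-form least-squares solution for one variable. The function names and the query population of 4.2 are assumptions made for illustration; the slides do not specify the units of either column.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (w0, w1) minimizing the sum of squared errors."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    w1 = s_xy / s_xx
    # Intercept: the fitted line passes through the point of means.
    w0 = y_mean - w1 * x_mean
    return w0, w1


# The 15 (population, profit) pairs from the table.
population = [1.7, 2.1, 3.5, 3.9, 5, 6.5, 2.5, 3.5, 4, 5, 1, 6.6, 4.5, 5.5, 3]
profit = [139, 150, 160, 162, 161, 170, 152, 154, 162, 160, 135, 171, 157, 160, 150]

w0, w1 = fit_simple_linear_regression(population, profit)


def predict(x):
    """Predicted profit for a city with population x."""
    return w0 + w1 * x


print(f"w0 = {w0:.2f}, w1 = {w1:.2f}")
# 4.2 is a hypothetical query value, not taken from the slides.
print(f"predicted profit for population 4.2: {predict(4.2):.1f}")
```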
Multiple linear regression
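● Multivariable linear regression (linear regression with
multiple variables): a model with more than one input variable is
used to predict the target variable (output):
ŷ = f(w,x) = w0 + w1x1 + w2x2 + … + wnxn
● Loss function (sum of squared errors over the training examples,
where ŷ_i is the prediction for the i-th example):
    L(w|X) = Σ_i (y_i − ŷ_i)²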
Multiple linear regression
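● Setting the derivative (gradient) of the loss function with respect to w
to zero. In matrix form, X is the matrix whose i-th row is (1, x1, …, xn)
for the i-th training example and y is the vector of target values:
    ∇w L = −2 Xᵀ(y − Xw) = 0
● We have the solution (provided XᵀX is invertible):
    w = (XᵀX)⁻¹ Xᵀ y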
Multiple linear regression
● Example: Predicting House Prices
Area (m2)   Number of bedrooms   Number of floors   Sale price ($1000)
2100        5                    1                  460
1416        3                    2                  232
1534        3                    2                  315
852         2                    1                  178
1600        3                    2                  329
1985        5                    1                  420
1535        4                    2                  330
1050        2                    1                  195
2300        4                    2                  450
1200        3                    2                  250
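The house-price table can be fitted with a multiple linear regression as sketched below. This is a minimal illustration using NumPy's `np.linalg.lstsq`, which solves the same least-squares problem as the closed-form solution; the variable names are assumptions for the example.

```python
import numpy as np

# Columns: area (m2), number of bedrooms, number of floors.
X = np.array([
    [2100, 5, 1],
    [1416, 3, 2],
    [1534, 3, 2],
    [852, 2, 1],
    [1600, 3, 2],
    [1985, 5, 1],
    [1535, 4, 2],
    [1050, 2, 1],
    [2300, 4, 2],
    [1200, 3, 2],
], dtype=float)
# Sale price in $1000.
y = np.array([460, 232, 315, 178, 329, 420, 330, 195, 450, 250], dtype=float)

# Prepend a column of ones so that w[0] plays the role of the intercept w0.
X_design = np.column_stack([np.ones(len(X)), X])

# Least-squares solution of X_design @ w ≈ y.
w, *_ = np.linalg.lstsq(X_design, y, rcond=None)

y_hat = X_design @ w
sse = float(np.sum((y - y_hat) ** 2))
print("w =", np.round(w, 3))
print("sum of squared errors =", round(sse, 1))
```

With only 10 rows and 4 parameters the fit is easy to compute but should not be expected to generalize well; the example is only meant to show the mechanics.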
Data normalization
● A house has an area of 2100 m2, number of bedrooms: 5,
number of floors: 1. House price forecast:
    ŷ = w0 + w1·2100 + w2·5 + w3·1
● Large differences in the range between variables can
reduce the accuracy of the model in some cases.
● Solution: normalize the data to the same range.
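Data normalization
Normalize the data to the same range
● Min-max scale: x' = (x − min(x)) / (max(x) − min(x))
● Standard scale: x' = (x − μ) / σ, where μ is the mean and σ the
standard deviation of x
● Unit length scale: x' = x / ‖x‖, where ‖x‖ is the Euclidean length
of the vector x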
Apply normalization in linear regression:
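A minimal sketch of the three scalings and of using one of them before fitting a linear regression on the house-price data. The function names are illustrative assumptions. The unit length scale is applied column by column here, which is also an assumption; it is commonly applied to each sample (row) vector instead.

```python
import numpy as np


def min_max_scale(X):
    """Rescale each column to the range [0, 1] (assumes no constant column)."""
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    return (X - x_min) / (x_max - x_min)


def standard_scale(X):
    """Shift each column to mean 0 and scale it to standard deviation 1."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def unit_length_scale(X):
    """Divide each column by its Euclidean norm (column-wise by assumption)."""
    return X / np.linalg.norm(X, axis=0)


# House-price table: area (m2), bedrooms, floors -> sale price ($1000).
X = np.array([[2100, 5, 1], [1416, 3, 2], [1534, 3, 2], [852, 2, 1],
              [1600, 3, 2], [1985, 5, 1], [1535, 4, 2], [1050, 2, 1],
              [2300, 4, 2], [1200, 3, 2]], dtype=float)
y = np.array([460, 232, 315, 178, 329, 420, 330, 195, 450, 250], dtype=float)

X_mm = min_max_scale(X)
X_unit = unit_length_scale(X)
X_std = standard_scale(X)

# Fit the regression on the standardized features (intercept column added).
A = np.column_stack([np.ones(len(X_std)), X_std])
w, *_ = np.linalg.lstsq(A, y, rcond=None)

# A new house must be scaled with the mean and std of the training data.
mean, std = X.mean(axis=0), X.std(axis=0)
x_new = (np.array([2100.0, 5.0, 1.0]) - mean) / std
y_new = w[0] + x_new @ w[1:]
print(f"forecast: {y_new:.1f} ($1000)")
```

With an exact least-squares solve the forecast is the same as on unscaled data; scaling mainly helps iterative or regularized fitting, where features with large ranges would otherwise dominate.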
Advantages of linear regression
● Simple model, easy to understand
● Can predict continuous target variables
● The optimal solution has a simple closed form
● Easy to interpret the model through the regression
coefficients
Limitations of linear regression
● The model is simple, so it is not flexible enough to represent
complex data relationships.
● Very sensitive to outliers (noise)
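The sensitivity to outliers can be seen in a small sketch: a single corrupted observation is enough to change the fitted line substantially. The data below are hypothetical and chosen so that the arithmetic is easy to follow; `fit_line` is an illustrative helper using the closed-form least-squares solution for one variable.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit; returns (w0, w1)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    w1 = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
          / sum((x - x_mean) ** 2 for x in xs))
    return y_mean - w1 * x_mean, w1


# Hypothetical data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]
w0_clean, w1_clean = fit_line(xs, ys)

# Corrupt a single observation: the last y becomes 30 instead of 10.
ys_outlier = ys[:-1] + [30.0]
w0_out, w1_out = fit_line(xs, ys_outlier)

print("clean fit:       ", w0_clean, w1_clean)
print("with one outlier:", w0_out, w1_out)
```

Because the errors are squared, the one outlier moves the slope from 2 to 6 and the intercept from 0 to -8, even though the other four points are unchanged.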
Practical exercises