Intro to Machine Learning
Part – 1
Dr. Oybek Eraliev,
Department of Computer Engineering
Inha University In Tashkent.
Email: oybekeraliev7@[Link]
Dr. Oybek Eraliyev Class: Artificial Intelligence SOC4040 1
Content
• What is Machine Learning?
• Supervised Learning
• Unsupervised Learning
• Linear Regression with One Variable
• Linear Regression with Multiple Variables
What is Machine Learning?
Machine Learning (ML) – the use and development of computer systems that
are able to learn and adapt without following explicit instructions, by using
algorithms and statistical models to analyse and draw inferences from patterns
in data.
Machine Learning (ML) – the field of study that gives computers the ability to
learn without being explicitly programmed (Arthur Samuel, 1959).
What is Machine Learning?
Machine Learning is a part of Artificial Intelligence.
[Figure: nested circles, from outermost to innermost: Artificial Intelligence, Machine Learning, Deep Learning]
What is Machine Learning?
Machine learning:
• Supervised learning
• Unsupervised learning
• Reinforcement learning
• Recommender system
Supervised Learning
A. Classification problem
• Logistic Regression
• Decision Tree
• Naive Bayes
• K-Nearest Neighbor
• Support Vector Machine
Example:
• Email spam detection
• Speech recognition
B. Regression problem
• Linear Regression
• Ridge Regression
• Stepwise Regression
Example:
• Stock market prediction
• Rainfall prediction
Supervised Learning
Classification problem
[Figure: Input → Model (Algorithm) → Output, where the output is a class label such as Apple or Banana]
Supervised Learning
Regression problem
Temperature (°C)    Temperature (F)
10                  50
13                  55.4
22                  71.6
35                  95
Input (45 °C) → Model (Algorithm) → Output (113 F)
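The regression example above can be reproduced with a least-squares line fit. This is a minimal sketch, assuming NumPy is available and that `np.polyfit` stands in for the "Model (Algorithm)" box; the slide itself does not say which fitting method is used.

```python
import numpy as np

# Training pairs copied from the temperature table above.
celsius = np.array([10.0, 13.0, 22.0, 35.0])
fahrenheit = np.array([50.0, 55.4, 71.6, 95.0])

# Fit a straight line (degree-1 polynomial) by least squares.
# np.polyfit returns coefficients from the highest degree down: [slope, intercept].
slope, intercept = np.polyfit(celsius, fahrenheit, deg=1)

# Predict the output for the unseen input of 45 degrees Celsius.
prediction = intercept + slope * 45.0
print(round(slope, 2), round(intercept, 2), round(prediction, 1))
```

Because the four points lie on one line, the fitted slope and intercept should match the conversion rule for these temperatures, and the prediction should match the 113 F shown as the output.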
Unsupervised Learning
Clustering problem
• K-Means
• MeanShift
Dimensionality reduction
• Principal Component Analysis (PCA)
• Linear Discriminant Analysis (LDA)
• Autoencoders (AEs)
Unsupervised Learning
Clustering problem
[Figure: Input → Model (Algorithm) → Output]
Dataset
Temp (°C) (x)    Temp (F) (y)
10               50
15               59
20               68
25               77
30               86
35               ?
F = 32 + 1.8·t
[Figure: the dataset plotted as y (Temp F, 50 to 100) against x (Temp °C)]
House prices (Dataset)
Size (feet²) (x)    Price (in $1000s) (y)
2104                460
1416                232
1534                315
852                 178
…                   …
[Figure: scatter plot of Prices against Size (feet²)]
Training set of Temp. Dataset (m examples)
#    Temp (°C) (x)    Temp (F) (y)
1    10               50
2    15               59
3    20               68
4    25               77
…    …                …
x = “input” variable/feature
y = “output” variable/target
m = number of training examples
(x, y) – one training example
(x(i), y(i)) – the i-th training example
(x(1), y(1)) = (10, 50)
(x(2), y(2)) = (15, 59)
(x(3), y(3)) = (20, 68)
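The notation above maps directly onto indexed data. The sketch below is only an illustration: the helper name `example` is hypothetical, and only the four rows listed explicitly in the table are used.

```python
# Training set as two parallel lists (the four rows listed in the table above).
x = [10, 15, 20, 25]  # "input" variable/feature, Temp (°C)
y = [50, 59, 68, 77]  # "output" variable/target, Temp (F)

m = len(x)  # number of training examples listed here


def example(i):
    """Return the i-th training example (x(i), y(i)), counting from 1 as on the slide."""
    return x[i - 1], y[i - 1]


for i in range(1, m + 1):
    print(f"(x({i}), y({i})) = {example(i)}")
```

Note that the superscript (i) is an index into the training set, not an exponent.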
How do we represent h?
Training Set → Learning Algorithm → h (Hypothesis)
x, Temp (°C) → h → y, Temp (F)
hθ(x) = θ0 + θ1x, also written y = b + wx
Linear Regression with one variable
For the temperature data, F = 32 + 1.8·t, so:
y = hθ(x), x = t
θ0 = b = 32
θ1 = w = 1.8
Hypothesis: hθ(x) = θ0 + θ1 x
θi : Parameters
[Figure: three plots of hθ(x) against x, for x from 0 to 3, one per parameter choice]
• θ0 = 1.5, θ1 = 0: h(x) = 1.5 + 0·x
• θ0 = 0, θ1 = 0.5: h(x) = 0 + 0.5·x
• θ0 = 1, θ1 = 0.5: h(x) = 1 + 0.5·x
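To see how the parameters change the line, the three parameter choices above can be evaluated at x = 0, 1, 2, 3. This is a small sketch; the names `h`, `settings`, and `table` are chosen here for illustration.

```python
def h(x, theta0, theta1):
    """Hypothesis for linear regression with one variable."""
    return theta0 + theta1 * x


# (theta0, theta1) pairs taken from the three plots above.
settings = [(1.5, 0.0), (0.0, 0.5), (1.0, 0.5)]
xs = [0, 1, 2, 3]

# For each parameter pair, list the hypothesis values at every x.
table = {params: [h(x, *params) for x in xs] for params in settings}

for (theta0, theta1), values in table.items():
    print(f"theta0={theta0}, theta1={theta1}: {values}")
```

With θ1 = 0 the line is flat at θ0; with the same slope 0.5, changing θ0 from 0 to 1 only shifts the line upward.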
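Idea: Choose θ0, θ1 so that hθ(x) is close to y for our training examples (x, y).
$$\min_{\theta_0, \theta_1} \; \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$
where $h_\theta(x^{(i)}) = \theta_0 + \theta_1 x^{(i)}$.
Cost function (squared error function):
$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$
[Figure: training examples (x(i), y(i)) plotted as y against x]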
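The squared error cost function can be written as a short function. The sketch below assumes NumPy and uses the temperature dataset from earlier in the deck; the function name `cost` is chosen here for illustration.

```python
import numpy as np


def cost(theta0, theta1, x, y):
    """Squared error cost J(theta0, theta1) = (1 / 2m) * sum((h(x) - y)^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = x.size
    residuals = theta0 + theta1 * x - y  # h(x(i)) - y(i) for every example
    return float(np.sum(residuals ** 2) / (2 * m))


# Temperature dataset from the slides.
x = [10, 15, 20, 25, 30]
y = [50, 59, 68, 77, 86]

cost_at_zero = cost(0.0, 0.0, x, y)  # a poor guess gives a large cost
cost_at_fit = cost(32.0, 1.8, x, y)  # the true conversion rule gives (almost) zero
print(cost_at_zero, cost_at_fit)
```

A smaller value of J means the line is closer to the training examples, which is why the goal is to minimize it.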
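Hypothesis:
$$h_\theta(x) = \theta_0 + \theta_1 x$$
Parameters: θ0, θ1
Cost function:
$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$
Goal:
$$\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$$
Simplified (θ0 = 0):
$$h_\theta(x) = \theta_1 x$$
Parameter: θ1
Cost function:
$$J(\theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$
Goal:
$$\min_{\theta_1} J(\theta_1)$$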
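[Figure: left, hθ(x) = θ1 x plotted against x for θ1 = 1, θ1 = 0.5 and θ1 = 0, with the training examples (1, 1), (2, 2), (3, 3) marked ×; right, J(θ1) plotted against θ1]
$$J(\theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$
$$J(1) = \frac{1}{2m} \left( (1-1)^2 + (2-2)^2 + (3-3)^2 \right) = \frac{1}{2 \cdot 3} \cdot 0 = 0$$
$$J(0.5) = \frac{1}{2m} \left( (0.5-1)^2 + (1-2)^2 + (1.5-3)^2 \right) = \frac{1}{2 \cdot 3} \cdot 3.5 \approx 0.58$$
$$J(0) = \frac{1}{2m} \left( (0-1)^2 + (0-2)^2 + (0-3)^2 \right) = \frac{1}{2 \cdot 3} \cdot 14 \approx 2.33$$
θ1       1      0.5     0       1.5     2
J(θ1)    0      0.58    2.33    0.58    2.33
Goal:
$$\min_{\theta_1} J(\theta_1)$$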
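The hand calculations above can be checked with a few lines of code. This sketch assumes the three training examples are (1, 1), (2, 2), (3, 3), which is what the worked sums use.

```python
# Training examples used in the worked sums above.
xs = [1, 2, 3]
ys = [1, 2, 3]


def J(theta1):
    """Simplified cost with theta0 = 0, so h(x) = theta1 * x."""
    m = len(xs)
    return sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Same theta1 values as in the table above.
for theta1 in [1, 0.5, 0, 1.5, 2]:
    print(theta1, round(J(theta1), 2))
```

The cost is symmetric around θ1 = 1 for this data, which gives the bowl shape of the J(θ1) plot.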
Cost function:
$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$
Goal:
$$\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$$
Outline:
• Start with some (θ0, θ1), for example θ0 = 0, θ1 = 0
• Keep changing θ0, θ1 to reduce J(θ0, θ1)
until we hopefully end up at a minimum.
[Figure: surface plot of J(θ0, θ1) over θ0 and θ1]
Source: Machine learning course (Andrew Ng)
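Gradient descent algorithm
Repeat until convergence {
$$\theta_j = \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1) \quad \text{(for } j = 0 \text{ and } j = 1\text{)}$$
}
α: learning rate
$\frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$: derivative part
Simplified case:
$$\min_{\theta_1} J(\theta_1), \quad \theta_1 \in \mathbb{R}$$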
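The update rule can be sketched for the simplified one-parameter case. The code below is an illustration under assumptions: it uses the toy examples (1, 1), (2, 2), (3, 3), the derivative of J(θ1) written out by hand, and a stopping test on the step size as one possible meaning of "until convergence". The names `dJ`, `tol`, and `max_iters` are not from the slides.

```python
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]


def dJ(theta1):
    """Derivative of J(theta1) = (1 / 2m) * sum((theta1 * x - y)^2) with respect to theta1."""
    m = len(xs)
    return sum((theta1 * x - y) * x for x, y in zip(xs, ys)) / m


def gradient_descent(theta1, alpha, tol=1e-9, max_iters=10_000):
    """Repeat theta1 = theta1 - alpha * dJ(theta1) until the step becomes tiny."""
    for _ in range(max_iters):
        step = alpha * dJ(theta1)
        theta1 = theta1 - step
        if abs(step) < tol:  # assumed convergence test: the update no longer changes theta1
            break
    return theta1


theta1 = gradient_descent(theta1=0.0, alpha=0.1)
print(round(theta1, 6))
```

Starting from θ1 = 0, the iterates move toward θ1 = 1, where J(θ1) is smallest for this data.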
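[Figure: two plots of J(θ1) against θ1, one with the current point on a positive slope and one on a negative slope]
Positive slope:
$$\theta_1 = \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_1), \qquad \frac{\partial}{\partial \theta_1} J(\theta_1) \ge 0$$
θ1 = θ1 − α · (positive value), so θ1 decreases and moves left toward the minimum.
Negative slope:
$$\theta_1 = \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_1), \qquad \frac{\partial}{\partial \theta_1} J(\theta_1) \le 0$$
θ1 = θ1 − α · (negative value), so θ1 increases and moves right toward the minimum.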
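$$\theta_1 = \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_1)$$
If α is too small, gradient descent can be slow.
If α is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge.
[Figure: two plots of J(θ1) against θ1, one for a small α and one for a large α]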
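The effect of the learning rate can be seen by running the same number of updates with different values of α. This is a sketch on the toy examples (1, 1), (2, 2), (3, 3); the three α values and the 20-step budget are assumptions picked for illustration.

```python
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]


def dJ(theta1):
    """Derivative of the simplified cost J(theta1) for h(x) = theta1 * x."""
    m = len(xs)
    return sum((theta1 * x - y) * x for x, y in zip(xs, ys)) / m


def run(alpha, steps=20, theta1=0.0):
    """Apply a fixed number of gradient descent updates and return the final theta1."""
    for _ in range(steps):
        theta1 = theta1 - alpha * dJ(theta1)
    return theta1


# Too small, reasonable, and too large for this particular cost function.
results = {alpha: run(alpha) for alpha in (0.001, 0.1, 0.5)}
for alpha, theta1 in results.items():
    print(f"alpha={alpha}: theta1={theta1:.4f}, distance from minimum={abs(theta1 - 1.0):.4f}")
```

In this sketch the smallest α has barely moved after 20 steps, the middle value is very close to the minimum at θ1 = 1, and the largest value overshoots further on every step.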
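[Figure: J(θ1) against θ1, with the current value of θ1 at a local optimum]
At a local optimum, slope = 0:
$$\theta_1 = \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_1), \qquad \frac{\partial}{\partial \theta_1} J(\theta_1) = 0$$
θ1 = θ1 − α · 0
θ1 = θ1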
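Gradient descent can converge to a local minimum, even with the learning rate α fixed.
$$\theta_1 = \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_1)$$
As we approach a local minimum, gradient descent will automatically take smaller steps, because the derivative shrinks toward zero. So there is no need to decrease α over time.
[Figure: J(θ1) against θ1, with two points a and b marked on the curve]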
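The shrinking steps can be observed directly by recording the size of each update while α stays fixed. This is a sketch on the toy examples (1, 1), (2, 2), (3, 3), with α = 0.1 chosen as an assumption.

```python
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]


def dJ(theta1):
    """Derivative of the simplified cost J(theta1) for h(x) = theta1 * x."""
    m = len(xs)
    return sum((theta1 * x - y) * x for x, y in zip(xs, ys)) / m


def step_sizes(alpha=0.1, steps=8, theta1=0.0):
    """Run gradient descent with a fixed alpha and record |alpha * dJ| at every update."""
    sizes = []
    for _ in range(steps):
        step = alpha * dJ(theta1)
        theta1 = theta1 - step
        sizes.append(abs(step))
    return sizes


sizes = step_sizes()
print([round(s, 4) for s in sizes])
```

Each recorded step is smaller than the one before it, even though α never changes, because the slope flattens near the minimum.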
Temperature (Dataset)
Temp (°C) (x)    Temp (F) (y)
10               50
15               59
20               68
25               77
30               86
F = 32 + 1.8·t
hθ(x) = θ0 + θ1 x
θ0 = 32
θ1 = 1.8
hθ(x) = 32 + 1.8 · x
[Figure: the dataset plotted as y (50 to 100) against x (10 to 30), with the line hθ(x)]
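Gradient descent algorithm
Repeat until convergence {
$$\theta_j = \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1) \quad \text{(for } j = 0 \text{ and } j = 1\text{)}$$
}
Linear Regression Model
$$h_\theta(x) = \theta_0 + \theta_1 x$$
$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$
$$\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$$
Derivative part:
$$\frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1) = \frac{\partial}{\partial \theta_j} \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 = \frac{\partial}{\partial \theta_j} \frac{1}{2m} \sum_{i=1}^{m} \left( \theta_0 + \theta_1 x^{(i)} - y^{(i)} \right)^2$$
j = 0:
$$\frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1) = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$$
j = 1:
$$\frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1) = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x^{(i)}$$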
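One way to gain confidence in the two derivative formulas is to compare them with a numerical estimate. The sketch below assumes NumPy, uses the temperature dataset, and approximates each partial derivative with a central finite difference; the function names and the step `eps` are choices made here, not part of the slides.

```python
import numpy as np

# Temperature dataset from the slides.
x = np.array([10.0, 15.0, 20.0, 25.0, 30.0])
y = np.array([50.0, 59.0, 68.0, 77.0, 86.0])


def cost(theta0, theta1):
    """Squared error cost J(theta0, theta1)."""
    residuals = theta0 + theta1 * x - y
    return np.sum(residuals ** 2) / (2 * x.size)


def analytic_gradient(theta0, theta1):
    """Partial derivatives from the formulas for j = 0 and j = 1."""
    residuals = theta0 + theta1 * x - y
    return np.array([np.mean(residuals), np.mean(residuals * x)])


def numeric_gradient(theta0, theta1, eps=1e-4):
    """Central finite-difference estimate of the same two partial derivatives."""
    d0 = (cost(theta0 + eps, theta1) - cost(theta0 - eps, theta1)) / (2 * eps)
    d1 = (cost(theta0, theta1 + eps) - cost(theta0, theta1 - eps)) / (2 * eps)
    return np.array([d0, d1])


analytic = analytic_gradient(0.0, 0.0)
numeric = numeric_gradient(0.0, 0.0)
print(analytic, numeric)
```

The two results should agree closely; a large gap would point to a mistake in the derivation or in the code.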
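Gradient descent algorithm
Repeat until convergence {
$$\theta_0 = \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$$
$$\theta_1 = \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x^{(i)}$$
}
Update θ0 and θ1 (for j = 0 and j = 1)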
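The two update rules can be put together in a short loop. This is a minimal sketch, not a definitive implementation: it assumes NumPy, replaces "until convergence" with a fixed number of iterations, computes both gradients before changing either parameter, and is run on hypothetical toy data generated from y = 1 + 2x.

```python
import numpy as np


def gradient_descent(x, y, alpha, iterations, theta0=0.0, theta1=0.0):
    """Gradient descent for linear regression with one variable."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    for _ in range(iterations):
        residuals = theta0 + theta1 * x - y  # h(x(i)) - y(i) for all examples
        grad0 = np.mean(residuals)           # derivative with respect to theta0
        grad1 = np.mean(residuals * x)       # derivative with respect to theta1
        # Both parameters are updated from the same residuals (assumed simultaneous update).
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return float(theta0), float(theta1)


# Hypothetical toy data generated from y = 1 + 2x (not from the slides).
x_toy = [0.0, 1.0, 2.0, 3.0]
y_toy = [1.0, 3.0, 5.0, 7.0]

theta0, theta1 = gradient_descent(x_toy, y_toy, alpha=0.1, iterations=2000)
print(round(theta0, 4), round(theta1, 4))
```

The learning rate and iteration count here are assumptions that happen to work for this small dataset; other data may need different values.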
[Figure: left, the dataset (y against x, x from 10 to 30) with the line hθ(x) at iterations 1 to 5; right, the cost J(θ0, θ1), drawn as J(θ1) against θ1, with the points for iterations 1 to 5]
Iterations:
1. θ0 = 0, θ1 = 0
2. θ0 = …, θ1 = …
3. θ0 = …, θ1 = …
4. θ0 = …, θ1 = …
5. θ0 = 32, θ1 = 1.8
Cost function:
$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$
Gradient descent: α = 0.01
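The run sketched in the figure can be tried on the temperature dataset. The code below is an illustration under assumptions: it uses NumPy, starts from θ0 = 0, θ1 = 0, and feeds the raw Celsius values to the update rules. In this particular sketch a step of α = 0.01 on the unscaled inputs makes the cost grow instead of shrink, so the converging run uses a smaller value (α = 0.004, chosen here) and many more than five iterations.

```python
import numpy as np

# Temperature dataset from the slides.
x = np.array([10.0, 15.0, 20.0, 25.0, 30.0])
y = np.array([50.0, 59.0, 68.0, 77.0, 86.0])


def cost(theta0, theta1):
    """Squared error cost J(theta0, theta1)."""
    residuals = theta0 + theta1 * x - y
    return float(np.sum(residuals ** 2) / (2 * x.size))


def run(alpha, iterations):
    """Gradient descent from theta0 = theta1 = 0 with a fixed learning rate."""
    theta0 = theta1 = 0.0
    for _ in range(iterations):
        residuals = theta0 + theta1 * x - y
        # Both updates use the residuals computed before either parameter changes.
        theta0, theta1 = (theta0 - alpha * np.mean(residuals),
                          theta1 - alpha * np.mean(residuals * x))
    return float(theta0), float(theta1)


diverged = run(alpha=0.01, iterations=50)        # the cost grows in this sketch
converged = run(alpha=0.004, iterations=50_000)  # assumed smaller step, many iterations

print("alpha=0.01 :", cost(*diverged))
print("alpha=0.004:", converged, cost(*converged))
```

The converging run approaches θ0 = 32 and θ1 = 1.8, the values listed for the final iteration. Rescaling the inputs would be another way to make a larger α usable, but that goes beyond what the slides cover.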