Understanding Machine Learning Types

The document discusses the concept of learning in agents, emphasizing the importance of adaptability in unpredictable environments and the limitations of pre-programmed solutions. It outlines various forms of learning, including supervised, unsupervised, and reinforcement learning, detailing their methodologies, advantages, and disadvantages. Additionally, it covers techniques like gradient descent and Hebbian learning, highlighting their roles in optimizing machine learning models.

Uploaded by

prabhatgt421

LEARNING

➢ An agent is learning if it improves its performance on future tasks
after making observations about the world.

Why learning?

Why would we want an agent to learn? If the design of the agent can
be improved, why wouldn’t the designers just program in that
improvement to begin with?

There are three main reasons:

➢ First, the designers cannot anticipate all possible situations that
the agent might find itself in. For example, a robot designed to
navigate mazes must learn the layout of each new maze it
encounters.
➢ Second, the designers cannot anticipate all changes over time.
For example, a program designed to predict tomorrow’s stock
market prices must learn to adapt when conditions change from
boom to bust.
➢ Third, sometimes human programmers have no idea how to
program a solution themselves. For example, most people are
good at recognizing the faces of family members, but even the
best programmers are unable to program a computer to
accomplish that task, except by using learning algorithms.
FORMS OF LEARNING

➢ Any component of an agent can be improved by learning from data.
The improvements, and the techniques used to make them, depend
on four major factors:
• Which component is to be improved.
• What prior knowledge the agent already has.
• What representation is used for the data and the component.
• What feedback is available to learn from.
➢ Most current machine learning research covers inputs that form a
factored representation (a vector of attribute values) and outputs
that can be either a continuous numerical value or a discrete value.
➢ Learning a (possibly incorrect) general function or rule from
specific input–output pairs is called inductive learning.
➢ Analytical or deductive learning: going from a known general rule
to a new rule that is logically entailed, but is useful because it allows
more efficient processing.

Feedback to learn from

The type of feedback available determines the four main types
of learning:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
4. Semi-supervised learning

1. Supervised Learning
➢ Supervised learning is an ML method in which a model learns
from a labeled dataset containing input-output pairs.
➢ Each input in the dataset has a corresponding correct output (the
label), and the model's task is to learn the relationship between the
inputs and outputs.
➢ This enables the model to make predictions on new, unseen data
by applying the learned mapping.

Example of Supervised Learning

Predicting house prices: The input might be house features such as
size, location, and number of bedrooms, and the output would be the
house price. The supervised learning model would learn the
relationship between these features and house prices from historical
data, and then it could predict prices for new houses entering the
market.
The task of supervised learning is this:

Given a training set of N example input–output pairs

$(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$

where each $y_j$ was generated by an unknown function $y = f(x)$,
discover a function h that approximates the true function f.

Here x and y can be any value; they need not be numbers. The
function h is a hypothesis. Learning is a search through the space of
possible hypotheses for one that will perform well, even on new
examples beyond the training set.

We say a hypothesis generalizes well if it correctly predicts the value
of y for new examples that were not in the training set. Sometimes the
function f is stochastic (i.e. it is not strictly a function of x), and
what we have to learn is a conditional probability
distribution, P(Y | x).
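
As an illustration of searching for a hypothesis h and checking how well it generalizes, here is a minimal sketch. It assumes the hypothesis space is straight lines h(x) = w*x + b and uses a least-squares fit; the data values and the names `fit_line` and `mean_squared_error` are hypothetical and not from the original notes.

```python
# Hypothetical data, roughly following y = 2x + 1 with small noise.
train = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8)]
test = [(4.0, 9.1), (5.0, 10.9)]  # examples beyond the training set

def fit_line(pairs):
    """Least-squares fit of the hypothesis h(x) = w*x + b to (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    w = cov / var
    b = mean_y - w * mean_x
    return w, b

def mean_squared_error(pairs, w, b):
    """Average squared difference between h(x) and the true y."""
    return sum((w * x + b - y) ** 2 for x, y in pairs) / len(pairs)

w, b = fit_line(train)
train_error = mean_squared_error(train, w, b)
test_error = mean_squared_error(test, w, b)  # low value suggests h generalizes
```

A small error on `test`, which the fit never saw, is the sense in which the hypothesis "generalizes well".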
Categories of Supervised Learning

• Regression: When y is a number (such as tomorrow’s
temperature), the learning problem is called regression.

When dealing with real-valued output variables like "price" or
"temperature," several popular Regression algorithms come into
play, such as the Simple Linear Regression Algorithm, Multivariate
Regression Algorithm, Decision Tree Algorithm, and Lasso
Regression.

• Classification: When the output y is one of a finite set of values
(such as sunny, cloudy or rainy), the learning problem is called
classification, and is called Boolean or binary classification if there
are only two values.

When the output variable is a category, as in distinguishing between
'spam' and 'not spam' in email filtering, widely used classification
algorithms include Random Forest, Decision Tree, Logistic
Regression, and Support Vector Machine.
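
To make binary classification concrete, the following is a small sketch of logistic regression trained with plain gradient descent. The single feature (a count of suspicious words), the data, and the hyperparameters are all assumptions made up for illustration.

```python
import math

# Hypothetical 1-D data: feature = number of suspicious words, label 1 = spam.
data = [(0, 0), (1, 0), (2, 0), (5, 1), (6, 1), (7, 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(data, learning_rate=0.1, epochs=5000):
    """Fit P(y = 1 | x) = sigmoid(w*x + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            error = sigmoid(w * x + b) - y  # derivative of log loss w.r.t. w*x + b
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / len(data)
        b -= learning_rate * grad_b / len(data)
    return w, b

def predict(x, w, b):
    """Return the class label: 1 (spam) or 0 (not spam)."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

w, b = train_logistic(data)
predictions = [predict(x, w, b) for x, _ in data]
```

The output is one of two discrete labels rather than a number, which is what separates this from the regression case.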
Advantages

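• It learns from previously collected labeled data, so past
mistakes can be used to improve the model.
• It is a powerful tool of AI that can perform plenty of business
functions single-handedly.
• Its predictions can be checked against known labels, which
makes the results easier to validate and trust.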

Disadvantages

• Difficult to classify huge data sets.
• It requires a certain level of expertise to operate.
• It is time intensive.

2. Unsupervised Learning
➢ In unsupervised learning the agent learns patterns in the input even
though no explicit feedback is supplied.
➢ These algorithms discover hidden patterns or data groupings
without the need for human intervention.
➢ Unsupervised learning builds a concise representation of the data,
which can then be used to generate new content.
Types of Unsupervised Learning
Unsupervised learning can be broken down into three main tasks:
i. Clustering
ii. Association rules
iii. Dimensionality reduction.

Clustering
➢ Clustering is a data mining technique which groups unlabeled
data based on their similarities or differences.
➢ Clustering algorithms are used to process raw, unclassified data
objects into groups represented by structures or patterns in the
information.
➢ Clustering algorithms can be categorized into different types of
clustering; for example:
• Exclusive clustering: Data is grouped such that a single
data point exclusively belongs to one cluster.
• Overlapping clustering: A soft cluster in which a single
data point may belong to multiple clusters with varying
degrees of membership.
• Hierarchical clustering: A type of clustering that builds a
nested hierarchy of clusters, either by repeatedly merging
the most similar groups (agglomerative) or by repeatedly
splitting larger groups (divisive).
• Probabilistic clustering: Data points are assigned to
clusters according to a probability distribution.
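
As a rough illustration of exclusive clustering, where each point belongs to exactly one cluster, here is a minimal k-means sketch. k-means is not named in the notes; it is swapped in here as a common example, and the points and starting centroids are hypothetical.

```python
def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical unlabeled 2-D points forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

No labels are supplied anywhere; the grouping comes only from distances between the points.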

Association Rule Mining


➢ Association rule mining algorithms have been popularized
through market basket analyses, leading to different
recommendation engines for music platforms and online
retailers.
➢ They are used within transactional datasets to identify frequent
item sets, or collections of items, to identify the likelihood of
consuming a product given the consumption of another
product.
➢ The most widely used algorithm for association rule learning is
the Apriori algorithm. However, other algorithms are used for
this type of unsupervised learning, such as the Eclat and FP-
growth algorithms.
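
The "likelihood of consuming a product given the consumption of another" is usually measured with support and confidence. The sketch below computes both directly on a toy transaction list; the transactions and function names are hypothetical, and it does not implement Apriori's candidate pruning.

```python
# Hypothetical market-basket transactions (illustrative only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) from the transactions."""
    joint = support(antecedent | consequent, transactions)
    return joint / support(antecedent, transactions)

bread_support = support({"bread"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)  # rule: bread -> milk
```

Here bread appears in 3 of 4 baskets, and 2 of those 3 also contain milk, so the rule "bread -> milk" has confidence 2/3.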

Dimensionality reduction
➢ While more data generally yields more accurate results, it can
also impact the performance of machine learning algorithms
(e.g. overfitting) and it can also make it difficult to visualize
datasets.
➢ Dimensionality reduction is a technique used when the number
of features, or dimensions, in a given dataset is too high.
➢ It reduces the number of data inputs to a manageable size while
also preserving the integrity of the dataset as much as possible.
It is commonly used in the preprocessing data stage.
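
One way to picture dimensionality reduction is projecting 2-D data onto a single direction that keeps most of its variance. The sketch below assumes principal component analysis (PCA), which the notes do not name, and finds the top component by power iteration; the data and the function name are hypothetical.

```python
import math

def top_principal_component(rows, iterations=100):
    """Return the direction of greatest variance and the 1-D projection of rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0] * d
    for _ in range(iterations):
        v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    projected = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, projected

# Hypothetical 2-feature dataset lying close to the line y = 2x.
rows = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0)]
direction, projected = top_principal_component(rows)
```

Each row is reduced from two numbers to one (`projected`) while losing very little information, because the points lie almost on a line.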

Advantages of Unsupervised Learning

• Uncovering hidden patterns and structures in data without
needing labeled examples.
• Ability to explore and discover insights from large and
complex datasets.
• Flexibility in handling diverse data types and domains.
• Useful for exploratory data analysis and feature engineering.
• Can be applied in scenarios where labeled data is scarce or
unavailable.

Disadvantages of Unsupervised Learning

• Lack of clear objective metrics for evaluating model
performance.
• Difficulty in interpreting and validating the learned patterns or
clusters.
• Sensitivity to noise and outliers in the data, leading to
potentially misleading results.
• Potential scalability issues with large datasets and high-
dimensional feature spaces.

3. Reinforcement Learning
➢ In reinforcement learning the agent learns from a series of
reinforcements (rewards or punishments).
➢ Reinforcement learning problems involve learning what to
do—how to map situations to actions—so as to maximize a
numerical reward signal.
➢ Moreover, the learner is not told which actions to take, as in
many forms of machine learning, but instead must discover
which actions yield the most reward by trying them out.

Markov decision process

➢ The reinforcement learning agent learns about a problem by
interacting with its environment. The environment provides
information on its current state. The agent then uses that
information to determine which action(s) to take.
➢ If that action obtains a reward signal from the surrounding
environment, the agent is encouraged to take that action again
when in a similar future state. This process repeats for every
new state thereafter.
➢ The task of reinforcement learning is to use observed rewards
to learn an optimal (or nearly optimal) policy for the
environment. An optimal policy is a policy that maximizes the
expected total reward.

For example, the lack of a tip at the end of the journey gives the
taxi agent an indication that it did something wrong. The two points
for a win at the end of a chess game tell the agent it did something
right. It is up to the agent to decide which of the actions prior to the
reinforcement were most responsible for it.

Exploration-exploitation trade-off

➢ One of the challenges that arise in reinforcement learning, and
not in other kinds of learning, is the trade-off between
exploration and exploitation.
➢ To obtain a lot of reward, a reinforcement learning agent must
prefer actions that it has tried in the past and found to be
effective in producing reward.
➢ But to discover such actions, it has to try actions that it has not
selected before. The agent has to exploit what it already knows
in order to obtain reward, but it also has to explore in order to
make better action selections in the future.
➢ The dilemma is that neither exploration nor exploitation can be
pursued exclusively without failing at the task.
➢ The agent must try a variety of actions and progressively favor
those that appear to be best.
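
A common way to balance the two is an epsilon-greedy rule: explore with a small probability, otherwise exploit the action with the best estimate so far. The sketch below assumes a three-armed bandit with made-up reward probabilities; epsilon-greedy itself is not named in the notes and is used here only as an illustration.

```python
import random

def epsilon_greedy_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Estimate the value of each action while mostly choosing the best one."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)
    estimates = [0.0] * len(reward_probs)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(reward_probs))   # explore: try any action
        else:
            action = estimates.index(max(estimates))    # exploit: best so far
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

# Hypothetical reward probabilities, unknown to the agent.
estimates, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8])
best_action = estimates.index(max(estimates))
```

With `epsilon=0` the agent could lock onto the first action that happens to pay off; with `epsilon=1` it would never use what it has learned.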

Components of reinforcement learning


Beyond the agent, the environment, and the goal, four principal
sub-elements characterize reinforcement learning problems.
- Policy. This defines the RL agent’s behavior by mapping
perceived environmental states to specific actions the agent must
take when in those states.

- Reward signal. This designates the RL problem’s goal. Each of
the RL agent’s actions either receives a reward from the
environment or not. The agent’s only objective is to maximize its
cumulative rewards from the environment.

- Value function. Reward signal differs from value function in that
the former denotes immediate benefit while the latter specifies long-
term benefit. Value refers to a state’s desirability in light of all the
states (with their incumbent rewards) that are likely to follow.

- Model. This is an optional sub-element of reinforcement learning
systems. Models allow agents to predict environment behavior for
possible actions.
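
The difference between immediate reward and long-term value can be seen in a tiny example. The sketch below assumes a deterministic four-state corridor and a discount factor `gamma`; both are illustrative assumptions that do not appear in the notes.

```python
# Hypothetical corridor: the agent moves one state to the right per step and
# receives reward 1 only for the step that enters the terminal state 3.
rewards = {0: 0.0, 1: 0.0, 2: 1.0}   # reward for the step taken from each state
next_state = {0: 1, 1: 2, 2: 3}
gamma = 0.9                           # discount factor (an assumption)

def state_values(rewards, next_state, gamma, terminal, sweeps=50):
    """Compute V(s) = reward(s) + gamma * V(next state) by repeated sweeps."""
    values = {s: 0.0 for s in list(next_state) + [terminal]}
    for _ in range(sweeps):
        for s in next_state:
            values[s] = rewards[s] + gamma * values[next_state[s]]
    return values

values = state_values(rewards, next_state, gamma, terminal=3)
```

State 0 earns no immediate reward, yet its value is about 0.81 because it leads toward the rewarding state.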

Benefits

• Ability to Learn Optimal Strategies Through Trial and Error
• Scalability to Complex Decision-Making Problems
• Flexibility in Adapting to New Information
• Potential for High Autonomy and Reduced Human Supervision
• Efficiency in Handling Long-Term Sequential Decision-Making

Limitations

• Susceptibility to High Variance and Instability
• Dependency on Large Amounts of Environmental Interaction Data
• Difficulty in Specifying Reward Functions
• Limited Transferability Between Different Tasks
• Ethical and Safety Concerns in Autonomous Decision-Making
Gradient Descent Learning
➢ Gradient descent is an optimization algorithm used when
training a machine learning model. It iteratively adjusts the
model's parameters to move a given function toward a local
minimum; when the function is convex, that local minimum is
also the global minimum.
➢ It trains machine learning models by minimizing the error
between predicted and actual results.

What is a Gradient?

➢ A gradient measures how much the error changes in response to
a change in each of the weights. You can also think of a gradient
as the slope of a function.
➢ The higher the gradient, the steeper the slope and the faster
a model can learn. But if the slope is zero, the model stops
learning. In mathematical terms, the gradient is the vector of
partial derivatives of a function with respect to its inputs.

How Does Gradient Descent Work?


➢ Instead of climbing up a hill, think of gradient descent as
hiking down to the bottom of a valley. The equation below
describes what the gradient descent algorithm does:

$b = a - \gamma \nabla f(a)$

Where,

$a$ = current position of the climber

$b$ = next position of the climber

$\gamma$ = the learning rate

$\nabla f(a)$ = the gradient of the loss function with respect to
the parameters, evaluated at $a$

➢ This formula tells us the next position to move to: a step from the
current position in the direction of steepest descent, scaled by the
learning rate.
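
The update rule can be sketched in a few lines. This example assumes a simple convex function f(a) = (a - 3)^2, whose gradient is 2(a - 3); the function, starting point, and learning rate are hypothetical choices for illustration.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly apply b = a - learning_rate * grad(a)."""
    a = start
    for _ in range(steps):
        a = a - learning_rate * grad(a)  # step in the direction of steepest descent
    return a

# Gradient of the assumed loss f(a) = (a - 3)**2, which has its minimum at a = 3.
minimum = gradient_descent(lambda a: 2 * (a - 3), start=0.0)
```

Each step shrinks the distance to the minimum by a constant factor here; a learning rate that is too large would instead overshoot and diverge.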

Types of Gradient Descent

Batch Gradient Descent

➢ Batch gradient descent, also called vanilla gradient descent,
calculates the error for each example within the training
dataset, but the model parameters are updated only after all
training examples have been evaluated.
Stochastic Gradient Descent

➢ Stochastic gradient descent (SGD) computes the gradient and
updates the parameters for each training example within the
dataset, one example at a time:

$b_{t+1} = b_t - \gamma \nabla f(b_t; x_i)$

Mini-Batch Gradient Descent

➢ Mini-batch gradient descent is the go-to method since it
combines the concepts of SGD and batch gradient
descent. It splits the training dataset into small batches
and performs an update for each of those batches.
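
The three variants differ only in how many examples feed each parameter update, which a single `batch_size` parameter can express. The sketch below assumes a one-parameter model y ≈ w*x with squared error; the data, learning rate, and function name are hypothetical.

```python
import random

def train_linear(data, batch_size, learning_rate=0.05, epochs=200, seed=0):
    """Fit y ≈ w*x by gradient descent, performing one update per batch."""
    rng = random.Random(seed)
    data = list(data)
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Average gradient of (w*x - y)**2 over the examples in this batch.
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= learning_rate * grad
    return w

# Hypothetical noiseless data generated with true w = 2.
data = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
w_batch = train_linear(data, batch_size=len(data))  # batch: one update per epoch
w_sgd = train_linear(data, batch_size=1)            # SGD: one update per example
w_mini = train_linear(data, batch_size=2)           # mini-batch: in between
```

On this easy problem all three reach the same answer; in practice they trade off update cost against the noisiness of each step.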
Hebbian learning
➢ The neuroscientific concept of Hebbian learning was
introduced by Donald Hebb in his 1949 publication The
Organization of Behavior. It is also known as Hebb’s Rule or
Cell Assembly Theory.
➢ The basis of the theory is that when our brains learn something
new, neurons are activated and connected with other neurons,
forming a neural network. These connections start off weak,
but each time the stimulus is repeated, the connections grow
stronger and stronger, and the action becomes more intuitive.
➢ The Hebb or Hebbian learning rule is used in Artificial Neural
Networks (ANNs), architectures made up of a large number of
interconnected elements called neurons.
➢ These neurons process the input received to give the desired
output. The nodes or neurons are linked by inputs
($x_1, x_2, x_3, \ldots, x_n$), connection weights ($w_1, w_2, w_3, \ldots, w_n$),
and activation functions (a function that defines the output of a
node).
➢ This network is suitable for bipolar data. The Hebbian learning
rule is generally applied to logic gates.
The weights are updated as:

$w_i(\text{new}) = w_i(\text{old}) + x_i y$

Training Algorithm For Hebbian Learning Rule


The training steps of the algorithm are as follows:

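1. Initially, the weights and the bias are set to zero, i.e. $w_i = 0$
for all inputs i = 1 to n, where n is the total number of input
neurons, and $b = 0$.
2. For each training pair (s, t), where s is the input vector and t is
the target output, set the input activations to $x_i = s_i$. The
activation function for inputs is generally set as an identity function.
3. The activation for the output is set to y = t.
4. The weights and the bias are adjusted to:

$w_i(\text{new}) = w_i(\text{old}) + x_i y$
$b(\text{new}) = b(\text{old}) + y$

5. Steps 2 to 4 are repeated for each input vector and target output.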
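
The training steps can be sketched on a logic gate with bipolar data. The example below trains an AND gate in a single pass; the bias update b(new) = b(old) + y is the standard form of the rule and is assumed here, and the function names are hypothetical.

```python
def hebb_train(samples):
    """One pass of the Hebb rule over (inputs, target) pairs with bipolar values."""
    n = len(samples[0][0])
    weights = [0.0] * n          # all weights start at zero
    bias = 0.0                   # bias starts at zero
    for inputs, target in samples:
        y = target               # output activation is set to the target
        weights = [w + x * y for w, x in zip(weights, inputs)]  # w += x * y
        bias += y                # assumed bias rule: b += y
    return weights, bias

# Bipolar AND gate: output is 1 only when both inputs are 1.
and_gate = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
weights, bias = hebb_train(and_gate)

def classify(inputs):
    """Bipolar threshold on the net input using the trained weights."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if net > 0 else -1
```

After the four updates the weights are (2, 2) with bias -2, which separates the single positive case from the three negative ones.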
Competitive Learning
➢ Competitive learning is a form of unsupervised learning in
artificial neural networks, in which nodes compete for the right
to respond to a subset of the input data.
➢ Models and algorithms based on the principle of competitive
learning include winner-take-all nets, vector quantization, self-
organizing maps, etc.

Architecture of Competitive Learning


Implementation of Competitive Learning
• Competitive learning is usually implemented with neural
networks that contain a hidden layer, commonly
known as the "competitive layer".
• Every competitive neuron is described by a vector of weights
$w_i = (w_{i1}, \ldots, w_{id})^T$, $i = 1, \ldots, M$, and calculates the similarity
measure between the input data $x^n = (x_{n1}, \ldots, x_{nd})^T \in \mathbb{R}^d$
and the weight vector $w_i$.
• For every input vector, the competitive neurons "compete"
with each other to see which one of them is the most similar to
that particular input vector.
• The winner neuron m sets its output $o_m = 1$ and all other
competitive neurons set their output $o_i = 0$, $i = 1, \ldots, M$, $i \neq m$.
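
The competition step can be sketched as follows. This example assumes squared Euclidean distance as the similarity measure and adds a simple "move the winner toward the input" update; neither choice is specified in the notes, and the weights and input are hypothetical.

```python
def winner_take_all(weight_vectors, x):
    """Return the winning neuron's index m and the outputs o (1 for m, else 0)."""
    distances = [sum((w_k - x_k) ** 2 for w_k, x_k in zip(w, x))
                 for w in weight_vectors]
    m = distances.index(min(distances))   # smallest distance = most similar
    outputs = [1 if i == m else 0 for i in range(len(weight_vectors))]
    return m, outputs

def update_winner(weight_vectors, x, m, learning_rate=0.5):
    """Assumed learning step: only the winner's weights move toward the input."""
    weight_vectors[m] = [w_k + learning_rate * (x_k - w_k)
                         for w_k, x_k in zip(weight_vectors[m], x)]

# Hypothetical competitive layer with M = 3 neurons and d = 2 inputs.
weights = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
x = [0.9, 0.8]
m, outputs = winner_take_all(weights, x)
update_winner(weights, x, m)
```

Because only the winner is updated, each neuron gradually specializes in one region of the input space.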

FIXED-WEIGHT COMPETITIVE NETS


➢ Many neural nets use the idea of competition among neurons
to enhance the contrast in activations of the neurons.
➢ In the most extreme situation, often called Winner-Take-All,
only the neuron with the largest activation is allowed to remain
"on".
➢ Examples of fixed-weight competitive nets include Maxnet,
Mexican Hat, Hamming net, etc.
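
As a sketch of a fixed-weight competitive net, the following iterates a Maxnet until a single neuron remains active. It assumes the usual formulation in which each neuron is inhibited by `epsilon` times the sum of the other activations, with `epsilon` smaller than 1 divided by the number of neurons; these details and the starting activations are not given in the notes.

```python
def maxnet(activations, epsilon=0.2, max_iterations=100):
    """Mutual inhibition with fixed weights until at most one neuron stays on."""
    a = list(activations)
    for _ in range(max_iterations):
        if sum(1 for v in a if v > 0) <= 1:
            break
        total = sum(a)
        # Each neuron keeps its own activation minus epsilon times the others',
        # clipped at zero.
        a = [max(0.0, v - epsilon * (total - v)) for v in a]
    return a

final = maxnet([0.2, 0.4, 0.6, 0.8])   # hypothetical initial activations
winner = max(range(len(final)), key=lambda i: final[i])
```

No weights are learned here; the competition alone picks out the neuron that started with the largest activation.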
