
Understanding Reinforcement Learning Basics

Reinforcement Learning (RL) is a machine learning approach where an agent learns to take actions in an environment to maximize rewards through feedback, without needing labeled data. It differs from supervised learning as it relies on trial and error, allowing the agent to learn from its experiences. RL has applications in various fields including robotics, game playing, and finance, and can be implemented through value-based, policy-based, or model-based approaches.

Uploaded by

negiarpit2005
Copyright
© All Rights Reserved

Reinforcement Learning

What is Reinforcement Learning?
• Reinforcement learning is an area of machine learning.
• It is about taking suitable actions to maximize reward in a particular situation.
• It is employed by various software and machines to find the best possible behavior or path to take in a specific situation.
• Reinforcement learning differs from supervised learning: in supervised learning, the training data comes with an answer key, so the model is trained on the correct answers, whereas in reinforcement learning there is no answer key and the agent itself decides what to do to perform the given task.
• In the absence of a training dataset, the agent is bound to learn from its experience.
• Reinforcement learning is a feedback-based machine learning technique in which an agent learns to behave in an environment by performing actions and observing their results.
• For each good action, the agent gets positive feedback, and for each bad action, it gets negative feedback or a penalty.
• In reinforcement learning, the agent learns automatically from feedback without any labeled data, unlike supervised learning.
• Since there is no labeled data, the agent is bound to learn from its experience only.
• RL solves a specific type of problem where decision
making is sequential, and the goal is long-term, such
as game-playing, robotics, etc.
• The agent interacts with the environment and explores it by
itself.
• The primary goal of an agent in reinforcement learning is to
improve the performance by getting the maximum positive
rewards.
• The agent learns through trial and error, and based on that experience, it learns to perform the task in a better way.
• Hence, we can say that "Reinforcement learning is a type of machine learning method where an intelligent agent (computer program) interacts with the environment and learns to act within that."
• How a robotic dog learns to move its limbs is an example of reinforcement learning.
• It is a core part of artificial intelligence, and many AI agents build on the concept of reinforcement learning.
• Here we do not need to pre-program the agent's behavior, as it learns from its own experience rather than from step-by-step human instruction.
• Example:
• Suppose there is an AI agent in a maze environment, and its goal is to find the diamond.
• The agent interacts with the environment by performing actions; based on those actions, the state of the agent changes, and it receives a reward or penalty as feedback.
• The agent keeps doing these three things (take an action, change state or remain in the same state, and get feedback), and in doing so it learns and explores the environment.
• The agent learns which actions lead to positive feedback (rewards) and which lead to negative feedback (penalties).
• As a reward, the agent gets a positive point, and as a penalty, it gets a negative point.
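The take-action, change-state, get-feedback loop from the maze example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the slides' own implementation: the maze is reduced to a hypothetical one-dimensional corridor, and the names `step`, `run_episode`, `N_CELLS`, and `DIAMOND`, as well as the reward values, are invented for the example. The agent here acts at random and does not learn yet.

```python
import random

# Hypothetical 1-D "maze": cells 0..4, with the diamond in cell 4.
N_CELLS = 5
DIAMOND = 4
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_CELLS - 1)
    if next_state == DIAMOND:
        return next_state, 1, True    # positive point: diamond found
    if next_state == state:
        return next_state, -1, False  # negative point: walked into a wall
    return next_state, 0, False       # neutral move


def run_episode(rng, max_steps=100):
    """Let a random (untrained) agent interact with the maze once."""
    state, total_reward, trajectory = 0, 0, []
    for _ in range(max_steps):
        action = rng.choice(ACTIONS)                    # 1. take an action
        next_state, reward, done = step(state, action)  # 2. state changes or stays
        trajectory.append((state, action, reward, next_state))  # 3. feedback
        total_reward += reward
        state = next_state
        if done:
            break
    return trajectory, total_reward


trajectory, total_reward = run_episode(random.Random(0))
print(len(trajectory), total_reward)
```

Each tuple in `trajectory` records one pass through the loop: the state, the action taken, the feedback received, and the resulting state.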
Terms Used in Reinforcement Learning
Agent: An entity that can perceive and explore the environment and act upon it.
Environment: The situation in which the agent is present or by which it is surrounded. In RL, the environment is generally assumed to be stochastic, meaning it is random in nature.
Action: A move taken by the agent within the environment.
State: The situation returned by the environment after each action taken by the agent.
Reward: Feedback returned to the agent from the environment to evaluate the agent's action.
Policy: The strategy the agent applies to choose the next action based on the current state.
Value: The expected long-term return, discounted by the discount factor, as opposed to the short-term reward.
Q-value: Similar to the value, but it takes one additional parameter, the current action (a).
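To make the "long-term return with the discount factor" concrete, the sketch below computes the discounted return of one reward sequence: each reward is weighted by the discount factor raised to the number of steps before it arrives. The function name `discounted_return` and the symbol `gamma` for the discount factor are assumptions made for this example. The value of a state is the expectation of this quantity over the rewards the agent may collect from that state, and the Q-value is the same expectation when the first action is fixed.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards, each weighted by gamma**t for its time step t."""
    total = 0.0
    # Work backwards: G_t = r_t + gamma * G_{t+1}
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# A reward of 1 that arrives two steps in the future is worth less today.
print(discounted_return([0, 0, 1], gamma=0.5))  # 0 + 0.5*0 + 0.25*1 = 0.25
```

With `gamma` close to 1 the agent weighs distant rewards almost as heavily as immediate ones; with `gamma` close to 0 it mostly cares about the short-term reward.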
Key Features of Reinforcement Learning
• In RL, the agent is not told about the environment or which actions need to be taken.
• It is based on a trial-and-error process.
• The agent takes the next action and changes state according to the feedback from the previous action.
• The agent may get a delayed reward.
• The environment is stochastic, and the agent needs to explore it to obtain the maximum positive reward.
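The need to explore a stochastic environment by trial and error can be illustrated with an epsilon-greedy rule on a two-action problem. Epsilon-greedy is not named in the slides; it is used here only as one common, simple exploration strategy. The names `epsilon_greedy`, `run_bandit`, and `win_probs`, and the probabilities chosen, are hypothetical.

```python
import random


def epsilon_greedy(estimates, epsilon, rng):
    """With probability epsilon explore at random, else pick the best-known action."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))
    return max(range(len(estimates)), key=lambda a: estimates[a])


def run_bandit(win_probs, steps=2000, epsilon=0.1, seed=0):
    """Estimate each action's average reward purely from sampled feedback."""
    rng = random.Random(seed)
    estimates = [0.0] * len(win_probs)
    counts = [0] * len(win_probs)
    for _ in range(steps):
        action = epsilon_greedy(estimates, epsilon, rng)
        # Stochastic feedback: the same action does not always pay off.
        reward = 1.0 if rng.random() < win_probs[action] else 0.0
        counts[action] += 1
        # Incremental mean: move the estimate toward the observed reward.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


estimates, counts = run_bandit([0.2, 0.8])
print(estimates, counts)
```

The agent is never told that the second action is better. It has to try both, and the occasional random choice is what lets it discover and then prefer the higher-paying one.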
Approaches to Implement Reinforcement Learning
• There are mainly three ways to implement reinforcement learning in ML:
• Value-based:
• The value-based approach aims to find the optimal value function, which is the maximum value achievable at a state under any policy.
• The agent therefore expects the long-term return at any state (s) under policy π.
• Policy-based:
• The policy-based approach aims to find the optimal policy for maximum future reward without using a value function.
• In this approach, the agent tries to apply a policy such that the action performed at each step helps to maximize the future reward.
• Model-based:
• In the model-based approach, a virtual model of the environment is created, and the agent explores that model to learn the environment.
• There is no single solution or algorithm for this approach, because the model representation is different for each environment.
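As an illustration of the value-based approach, the sketch below uses tabular Q-learning, one well-known value-based algorithm, on a hypothetical five-state corridor with a reward at the right end. The slides do not name a specific algorithm, so the choice of Q-learning, the environment, and the hyperparameters `alpha`, `gamma`, and `epsilon` are all assumptions made for the example.

```python
import random

N_STATES, GOAL = 5, 4   # hypothetical corridor: states 0..4, reward at state 4
ACTIONS = (-1, +1)      # index 0 = left, index 1 = right


def step(state, action):
    """Deterministic move; reaching GOAL pays 1 and ends the episode."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)                      # explore
            else:
                best = max(q[state])                      # exploit, random tie-break
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Target: immediate reward plus discounted best value of the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q = q_learning()
policy = ["left" if row[0] > row[1] else "right" for row in q[:GOAL]]
print(policy)
```

The learned table `q` approximates the optimal action values, and the policy is read off by taking the highest-valued action in each state. A policy-based method would instead adjust the policy's parameters directly, without this table.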
Types of Reinforcement Learning

• There are mainly two types of reinforcement learning, which are:


• Positive Reinforcement
• Negative Reinforcement
Positive Reinforcement
Positive reinforcement means adding something to increase the tendency that the expected behavior will occur again.
It has a positive impact on the agent's behavior and increases the strength of that behavior.
This type of reinforcement can sustain changes for a long time, but too much positive reinforcement may lead to an overload of states, which can diminish the results.

Negative Reinforcement
Negative reinforcement is the opposite of positive reinforcement in that it increases the tendency that a specific behavior will occur again by avoiding a negative condition.
It can be more effective than positive reinforcement, depending on the situation and behavior, but it provides reinforcement only to meet the minimum behavior.
Difference between Reinforcement Learning and Supervised Learning
Reinforcement learning and supervised learning are both part of machine learning, but the two types of learning differ sharply from each other.
RL agents interact with the environment, explore it, take actions, and get rewarded.
Supervised learning algorithms, in contrast, learn from a labeled dataset and, on the basis of that training, predict the output.
Reinforcement Learning | Supervised Learning
RL works by interacting with the environment. | Supervised learning works on an existing dataset.
The RL algorithm works the way the human brain works when making decisions. | Supervised learning works the way a human learns under the supervision of a guide.
No labeled dataset is present. | A labeled dataset is present.
No previous training is provided to the learning agent. | Training is provided to the algorithm so that it can predict the output.
RL helps to make decisions sequentially. | In supervised learning, a decision is made when an input is given.
RL can be used in large environments in the following situations:
1. A model of the environment is known, but an analytic solution is not available.
2. Only a simulation model of the environment is given (the subject of simulation-based optimization).
3. The only way to collect information about the environment is to interact with it.
Reinforcement Learning Applications
• Robotics:
• RL is used in robot navigation, Robo-soccer, walking, juggling, etc.
• Control:
• RL can be used for adaptive control, such as factory processes and admission control in telecommunication; piloting a helicopter is another example.
• Game Playing:
• RL can be used in game playing, such as tic-tac-toe, chess, etc.
• Chemistry:
• RL can be used for optimizing chemical reactions.
• Business:
• RL is now used for business strategy planning.
• Manufacturing:
• In various automobile manufacturing companies, robots use deep reinforcement learning to pick goods and put them in containers.
• Finance Sector:
• RL is currently used in the finance sector for evaluating trading strategies.
