
Ece ML PDF

Artificial Neural Networks (ANNs) are computational systems that simulate human brain functions to process data and make predictions through interconnected artificial neurons. The document discusses various models of ANNs, including single-layer and multi-layer perceptrons, and outlines their architectures, advantages, limitations, and learning algorithms. It also covers key terminologies related to ANNs, such as weights, biases, and learning rates, as well as supervised and unsupervised learning methods.

Uploaded by

chinthanikhil26
Copyright
© All Rights Reserved

Introduction to Artificial Neural Networks (ANNs)

Artificial Neural Networks (ANNs) are computer systems designed to
mimic how the human brain processes information. Just like the brain
uses neurons to process data and make decisions, ANNs use
artificial neurons to analyze data, identify patterns and make
predictions. These networks consist of layers of interconnected
neurons that work together to solve complex problems. The key idea
is that ANNs can "learn" from the data they process, just as our brain
learns from experience.

McCulloch-Pitts Model of Neuron


One of the earliest models of artificial neurons was the McCulloch-Pitts
model, introduced in 1943. It is also known as the linear threshold gate.
• The neuron takes multiple inputs, each associated with a weight.
• A weighted sum of the inputs is calculated.
• If the weighted sum exceeds a threshold, the neuron fires (output = 1);
otherwise it does not fire (output = 0).
This model laid the foundation for modern neural networks, though a
single unit of this kind can represent only linearly separable functions.
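The firing rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the hand-picked weights and thresholds for the AND and OR gates are assumptions made for the example, since the model itself has no learning rule.

```python
def threshold_neuron(inputs, weights, threshold):
    """Return 1 (fire) when the weighted sum of the inputs exceeds the threshold."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0


# Hypothetical, hand-chosen parameters: both gates use weights of 1
# and differ only in the threshold.
def and_gate(a, b):
    return threshold_neuron([a, b], [1, 1], threshold=1.5)


def or_gate(a, b):
    return threshold_neuron([a, b], [1, 1], threshold=0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", and_gate(a, b), "OR:", or_gate(a, b))
```

Changing only the threshold turns the same unit from an AND gate into an OR gate, which is why a single unit is often described as a linear threshold gate.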

Single-Layer Neural Networks (Perceptron)


A single-layer perceptron consists of an input layer and an output
neuron that applies a threshold function to determine the final output.
The perceptron is a fundamental model for binary classification
problems.
The perceptron follows this rule:
$$ y = \begin{cases} 1, & \text{if } \sum_i w_i I_i \ge t \\ 0, & \text{if } \sum_i w_i I_i < t \end{cases} $$
where $w_i$ is the weight on input $I_i$ and $t$ is the threshold.
However, a single-layer perceptron cannot solve non-linearly
separable problems like XOR. This limitation led to the development
of multi-layer perceptrons (MLPs).
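The XOR limitation can be seen by training a perceptron on AND and on XOR. The sketch below uses the classic perceptron learning rule, which the text above does not spell out, so treat the update step, the function names, and the choice of learning rate as assumptions for illustration.

```python
def predict(weights, threshold, inputs):
    """Perceptron rule: output 1 when the weighted sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def train_perceptron(samples, learning_rate=1.0, max_epochs=100):
    """Classic perceptron learning rule (assumed here, not taken from the text).

    Returns (weights, threshold, converged), where converged is True if one
    full pass over the samples produced no misclassification.
    """
    weights = [0.0] * len(samples[0][0])
    threshold = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in samples:
            error = target - predict(weights, threshold, inputs)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                # Raising the threshold makes firing harder, so it moves
                # in the opposite direction to the weights.
                threshold -= learning_rate * error
        if mistakes == 0:
            return weights, threshold, True
    return weights, threshold, False


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_weights, and_threshold, and_converged = train_perceptron(AND)
xor_weights, xor_threshold, xor_converged = train_perceptron(XOR)
print("AND converged:", and_converged)
print("XOR converged:", xor_converged)
```

For AND a separating line exists, so training stops with no mistakes. For XOR no single line separates the two classes, so every pass contains at least one mistake and the loop runs out of epochs.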
Figure: Logic gates implemented using perceptrons
Figure: General structure of a perceptron model with bias and weighted inputs

Multi-Layer Neural Networks (MLPs)

A Multi-Layer Perceptron (MLP) consists of:
• Input Layer: receives raw data
• One or more Hidden Layers: extract features
• Output Layer: produces final predictions

Figure: MLP
Activation functions introduce non-linearity, allowing ANNs to learn
complex patterns. Common activation functions include:

| Function | Formula | Range |
|---|---|---|
| Sigmoid | $\sigma(x) = \frac{1}{1 + e^{-x}}$ | $(0, 1)$ |
| Tanh | $\tanh(x) = 2\sigma(2x) - 1$ | $(-1, 1)$ |
| ReLU | $f(x) = \max(0, x)$ | $[0, \infty)$ |
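The three formulas translate directly into code. The following is a small sketch using only the standard library; the tanh version is deliberately written through the sigmoid identity from the table rather than calling `math.tanh`, so the relationship between the two is visible.

```python
import math


def sigmoid(x):
    """Sigmoid: squashes any real input into (0, 1).

    Note: math.exp can overflow for very large negative x; this simple
    version does not guard against that.
    """
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Tanh expressed through the sigmoid: tanh(x) = 2 * sigmoid(2x) - 1."""
    return 2.0 * sigmoid(2.0 * x) - 1.0


def relu(x):
    """ReLU: passes positive inputs through and clips negatives to 0."""
    return max(0.0, x)


for value in (-2.0, 0.0, 2.0):
    print(value, sigmoid(value), tanh(value), relu(value))
```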

Artificial Neural Networks Algorithm


1. Initialize Weights and Bias

The algorithm starts by initializing the weights (the strengths of the
connections between neurons) and the biases (additional parameters
that shift each neuron's output). These values are usually
initialized randomly.
You also set the learning rate (α), which controls how much the
weights are adjusted at each update during training.

2. Feed Input Data

The input data is fed into the input layer of the network. Each input is
a feature, such as an image pixel or a value from a dataset.

3. Forward Propagation (Calculate Output)

The data is passed through the network from the input layer through
the hidden layers and finally to the output layer. At each layer the
input is multiplied by the weights, the bias is added, and the result is
passed through an activation function such as sigmoid or ReLU to
produce the output of that layer. The final result is a prediction that is
compared to the actual target value.

4. Calculate Error

Once the network has made a prediction, the next step is to calculate
the error, i.e. the difference between the predicted output and the
actual target. This error is usually measured with a loss function such
as Mean Squared Error or Cross-Entropy.

5. Backpropagation (Update Weights)

Backpropagation uses the chain rule of calculus to compute the
gradients, i.e. how the error changes with respect to each weight and
bias. The weights and biases are then updated to reduce the error. The
update is done using an optimization algorithm such as Gradient
Descent:
$$ w = w - \alpha \times \frac{\partial \text{Error}}{\partial w} $$

6. Repeat (Epochs)

Steps 2 to 5 are repeated for multiple epochs, where one epoch is one
pass over the entire training dataset. During each epoch the weights
are adjusted to reduce the error gradually.

7. Test the Network

After training, the network is tested on data it did not see during
training to evaluate its performance. If the accuracy is acceptable,
training is considered complete. If not, more training or adjustments
may be needed.
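The seven steps can be sketched end to end on the XOR problem. Everything specific here is an assumption for illustration: one hidden layer of 4 sigmoid neurons, Mean Squared Error, full-batch gradient descent, α = 0.5, 5000 epochs, and a fixed random seed. XOR has only four points, so the last step simply re-evaluates the training inputs instead of using separate test data.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, W2, b2):
    """Step 3: weighted sum plus bias, then activation, layer by layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)
    return hidden, y


def train(data, hidden_size=4, alpha=0.5, epochs=5000, seed=0):
    rng = random.Random(seed)
    n_in, n = len(data[0][0]), len(data)
    # Step 1: random weights, zero biases (all sizes and ranges are assumptions).
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(hidden_size)]
    b1 = [0.0] * hidden_size
    W2 = [rng.uniform(-1, 1) for _ in range(hidden_size)]
    b2 = 0.0
    losses = []
    for _ in range(epochs):                          # Step 6: repeat for many epochs
        gW1 = [[0.0] * n_in for _ in range(hidden_size)]
        gb1 = [0.0] * hidden_size
        gW2 = [0.0] * hidden_size
        gb2 = 0.0
        loss = 0.0
        for x, target in data:                       # Step 2: feed each input
            hidden, y = forward(x, W1, b1, W2, b2)
            loss += (y - target) ** 2 / n            # Step 4: mean squared error
            # Step 5: chain rule through the output sigmoid, then the hidden layer.
            d_out = 2.0 * (y - target) * y * (1.0 - y) / n
            gb2 += d_out
            for j in range(hidden_size):
                gW2[j] += d_out * hidden[j]
                d_hid = d_out * W2[j] * hidden[j] * (1.0 - hidden[j])
                gb1[j] += d_hid
                for i in range(n_in):
                    gW1[j][i] += d_hid * x[i]
        # Gradient descent update: w = w - alpha * dError/dw
        for j in range(hidden_size):
            W2[j] -= alpha * gW2[j]
            b1[j] -= alpha * gb1[j]
            for i in range(n_in):
                W1[j][i] -= alpha * gW1[j][i]
        b2 -= alpha * gb2
        losses.append(loss)
    return (W1, b1, W2, b2), losses


XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
params, losses = train(XOR)
print("first epoch loss:", losses[0], "last epoch loss:", losses[-1])
for x, target in XOR:                                # Step 7: inspect predictions
    print(x, target, forward(x, *params)[1])
```

How close the final predictions get to the targets depends on the seed, the learning rate, and the number of epochs, so those values are worth varying when experimenting.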

Advantages
• Noise Resilience: ANNs can tolerate noisy and incomplete data
with relatively little loss in performance. Even if there are some
errors in the training data, they can often still produce accurate results.
• Versatility: ANNs can be used for a wide range of problems, from
real-valued to discrete-valued functions. They are widely used in
image recognition, speech recognition and decision-making.
• Efficiency: Once trained, ANNs can evaluate functions very
quickly, making them suitable for real-time applications such as
self-driving cars or fraud detection.
• Parallel Processing: ANNs can handle large amounts of data
through distributed processing, similar to how the human brain
works.
Limitations
• Overfitting: ANNs can overfit the training data, especially when the
model is too complex or when there is insufficient data.
• Data Dependency: They require large datasets for training in order
to generalize well.
• Computational Expense: Training deep neural networks can be
computationally expensive and time-consuming, and therefore
requires substantial hardware resources.

Basic models of ANN


Neural network architectures define the structural design of deep
learning models, shaping how they process information, learn patterns
and make predictions. From simple feed-forward networks to
advanced architectures like CNNs, RNNs, Transformers and hybrid
models, each architecture is tailored to specific types of data and
tasks.
• Different architectures excel in vision, language, time-series and
generative tasks.
• The choice of architecture directly impacts performance, efficiency
and accuracy.
1. Single-Layer Feed-Forward Network
A single-layer feed-forward network connects input neurons directly to
output neurons through a single set of weights. It does not contain
hidden layers or feedback connections, and information flows only in
the forward direction. This architecture is suitable only for linearly
separable problems.
• Contains only one trainable weight layer.
• Information flows strictly in one direction.
• Computationally efficient and simple to implement.
Figure: Single-Layer Feed-Forward Network

Working

• Input features are provided to the network.
• Each input is multiplied by its corresponding weight.
• A bias term is added to the weighted sum.
• The result passes through an activation function.
• The activated value is the final output.

Use Cases

• Binary Classification: Problems such as yes/no or true/false
predictions.
• Linear Regression Tasks: Predicting values with linear
relationships.
• Threshold-Based Decision Systems: Simple rule-based
classifiers.
• Educational Models: Demonstrating basic neural learning
principles.
2. Multilayer Feed-Forward Network
A multilayer feed-forward network consists of an input layer, one or
more hidden layers and an output layer. The presence of hidden
layers with nonlinear activation functions enables learning of complex,
non-linear mappings. Data propagation occurs strictly from input to
output.
• Includes hidden layers with nonlinear activations.
• Learns hierarchical feature representations.
• Uses backpropagation for learning.
Figure: Multilayer Feed-Forward Network

Working

• Input data enters the input layer.
• Signals are forwarded to the first hidden layer.
• Each neuron computes a weighted sum and applies an activation.
• Outputs flow sequentially through all hidden layers.
• The output layer generates the final prediction.
• Prediction error is computed.
• Error is propagated backward to update the weights.

Use Cases

• Image Classification: Identifying objects in images.
• Medical Diagnosis: Disease prediction based on patient data.
• Fraud Detection: Identifying abnormal financial behavior.
• Pattern Recognition: Learning complex input-output mappings.
3. Single Node with Its Own Feedback
A single node with its own feedback is a simple recurrent structure
where a neuron’s output is fed back as an input in the next time step.
This feedback introduces a basic memory mechanism. The output
depends on both the current input and the previous output.
• Simplest form of recurrence.
• Introduces temporal dependency.
• Maintains a single internal state.
Figure: Single Node with Its Own Feedback

Working

• Input is combined with the previous output.
• The combined signal is weighted and summed.
• An activation function produces the current output.
• The output is stored as feedback.
• Feedback is used in the next time step’s computation.
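The working steps above can be sketched as a loop over time steps. The parameter names `w_in` and `w_fb`, the zero initial output, and the default tanh activation are assumptions for the example. With an identity activation and both weights set to 0.5, the node behaves like a simple smoothing filter.

```python
import math


def run_feedback_node(inputs, w_in, w_fb, bias=0.0, activation=math.tanh):
    """Single neuron whose previous output is fed back as an extra input."""
    previous_output = 0.0              # assumed initial state
    outputs = []
    for x in inputs:
        # Combine the current input with the stored feedback, then activate.
        net = w_in * x + w_fb * previous_output + bias
        previous_output = activation(net)
        outputs.append(previous_output)
    return outputs


# Identity activation: the node averages each input with its own past output.
smoothed = run_feedback_node([1.0, 1.0, 1.0], w_in=0.5, w_fb=0.5,
                             activation=lambda v: v)
print(smoothed)   # [0.5, 0.75, 0.875]

# Default tanh activation keeps every output strictly between -1 and 1.
bounded = run_feedback_node([3.0, -3.0, 0.5], w_in=1.0, w_fb=0.8)
print(bounded)
```

In the smoothing example the output approaches the constant input gradually, which shows the single internal state carrying information from one step to the next.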

Use Cases

• Dynamic System Modeling: Representing systems that evolve
over time.
• Signal Filtering: Smoothing fluctuating signals.
• Control Systems: Adaptive decision mechanisms.
• Introductory Temporal Models: Learning basic time-dependent
behavior.
4. Single-Layer Recurrent Network
A single-layer recurrent network contains one layer of neurons with
feedback connections. These connections allow the network to
maintain a hidden state across time steps. It is primarily used for
modeling sequential and time-dependent data.
• Feedback connections create temporal memory.
• Processes data step-by-step in time.
• Hidden state stores past information.
Figure: Single-Layer Recurrent Network

Working

• Input at the current time step is received.
• Previous hidden state is retrieved.
• Current input and past state are combined.
• Weighted sum and activation update the hidden state.
• Output is generated for the current time step.
• Hidden state is passed to the next time step.
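A forward pass through such a layer might look like the sketch below. The weight names (`W_x`, `W_h`, `W_y`), the tanh activation, the linear output, and the zero initial hidden state are assumptions for illustration, and training is not shown.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def rnn_step(x, h_prev, W_x, W_h, b):
    """Combine the current input with the previous hidden state, then apply tanh."""
    from_input = matvec(W_x, x)
    from_state = matvec(W_h, h_prev)
    return [math.tanh(a + c + bias)
            for a, c, bias in zip(from_input, from_state, b)]


def run_rnn(sequence, W_x, W_h, b, W_y):
    h = [0.0] * len(b)                    # assumed zero initial hidden state
    outputs = []
    for x in sequence:
        h = rnn_step(x, h, W_x, W_h, b)   # update the hidden state
        outputs.append(matvec(W_y, h))    # linear output for this time step
    return outputs, h


# Two hidden neurons: the first reads the input, the second reads only the
# first neuron's previous value, so the output lags the input by one step.
W_x = [[1.0], [0.0]]
W_h = [[0.0, 0.0], [1.0, 0.0]]
b = [0.0, 0.0]
W_y = [[0.0, 1.0]]
outputs, final_state = run_rnn([[1.0], [0.0]], W_x, W_h, b, W_y)
print(outputs)
```

The output at the second step is non-zero even though the second input is zero, because the hidden state still carries the first input.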
Use Cases

• Time-Series Prediction: Forecasting stock prices or weather.
• Sequential Pattern Learning: Detecting temporal trends.
• Speech Processing: Modeling speech signals.
• Sensor Data Analysis: Processing streaming data.
5. Multilayer Recurrent Network
A multilayer recurrent network consists of multiple recurrent layers
stacked together. Each layer processes temporal information while
passing its hidden state to the next layer. This structure enables
learning of more complex and longer-term temporal dependencies
than a single recurrent layer.
• Each layer maintains its own temporal state.
• Learns hierarchical sequence representations.
• Can model longer and more complex sequences than a single
recurrent layer.
Figure: Multilayer Recurrent Network

Working

• Input sequence enters the first recurrent layer.
• Temporal patterns are learned at the first layer.
• Hidden states are passed to the next recurrent layer.
• Higher layers capture abstract temporal features.
• Final recurrent layer produces the output.
• Errors are propagated backward through time.
• Weights are updated across all layers.
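The forward half of this procedure, stacking recurrent layers so that each layer's hidden state becomes the next layer's input, can be sketched as follows. Only the forward pass is shown; backpropagation through time is omitted. The layer representation as a `(W_x, W_h, b)` tuple and the tanh activation are assumptions for the example.

```python
import math


def layer_step(x, h_prev, layer):
    """One time step of one recurrent layer; layer is a (W_x, W_h, b) tuple."""
    W_x, W_h, b = layer
    return [
        math.tanh(
            sum(w * v for w, v in zip(wx_row, x))
            + sum(w * v for w, v in zip(wh_row, h_prev))
            + bias
        )
        for wx_row, wh_row, bias in zip(W_x, W_h, b)
    ]


def run_stacked_rnn(sequence, layers):
    # Each layer keeps its own hidden state, initialised to zeros (assumption).
    states = [[0.0] * len(layer[2]) for layer in layers]
    top_outputs = []
    for x in sequence:
        layer_input = x
        for k, layer in enumerate(layers):
            states[k] = layer_step(layer_input, states[k], layer)
            layer_input = states[k]        # hidden state feeds the next layer
        top_outputs.append(layer_input)    # output of the final recurrent layer
    return top_outputs, states


# Two one-neuron layers: the first has no memory, the second remembers
# its own previous state.
layer1 = ([[1.0]], [[0.0]], [0.0])
layer2 = ([[1.0]], [[1.0]], [0.0])
top_outputs, states = run_stacked_rnn([[0.5], [0.0]], [layer1, layer2])
print(top_outputs)
```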

Use Cases

• Natural Language Processing: Text generation and translation.
• Speech Recognition: Converting speech to text.
• Video Sequence Analysis: Understanding motion patterns.
• Financial Forecasting: Long-term market trend analysis.
Comparison of Neural Network Architectures
Let's compare the various types of architectures:
| Aspect | Single-Layer Feed-Forward | Multilayer Feed-Forward | Single Node with Feedback | Single-Layer Recurrent | Multilayer Recurrent |
|---|---|---|---|---|---|
| Presence of Hidden Layers | No hidden layers | One or more hidden layers | No hidden layers | One recurrent layer | Multiple recurrent layers |
| Feedback Connections | Absent | Absent | Self-feedback only | Present within layer | Present across layers |
| Memory Capacity | No memory | No memory | Very limited memory | Short-term memory | Short- and long-term memory |
| Training Complexity | Very low | Moderate | Very low | High | Very high |
| Parameter Count | Minimal | Moderate to high | Minimal | Moderate | Very high |
| Parallel Processing Ability | Fully parallel | Fully parallel | Limited | Limited | Very limited |

Important terminologies:
The ANN (Artificial Neural Network) is modeled on the BNN (Biological
Neural Network), as its primary goal is to imitate the human brain and
its functions. Just as the brain has neurons interlinked with each
other, the ANN has artificial neurons, known as nodes, that are linked
to each other across the layers of the network.

The ANN learns through various learning algorithms that are
described as supervised or unsupervised learning.
• In supervised learning algorithms, the target values are labeled. The
goal is to reduce the error between the desired output (target)
and the actual output. Here, a supervisor is present.
• In unsupervised learning algorithms, the target values are not
labeled and the network learns by itself by identifying patterns
through repeated trials and experiments.
ANN Terminology:
• Weights: Each neuron is linked to other neurons through
connection links that carry weights. A weight holds information
about the input signal. The output depends solely on the
weights and the input signal. The weights can be presented in a
matrix form known as the connection matrix.
• If there are “n” nodes, with each node having “m” weights, then it is
represented as:
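One common layout, assuming row $i$ holds the $m$ weights of node $i$ (the symbols $W$ and $w_{ij}$ are introduced here for illustration), is:

```latex
W =
\begin{bmatrix}
w_{11} & w_{12} & \cdots & w_{1m} \\
w_{21} & w_{22} & \cdots & w_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
w_{n1} & w_{n2} & \cdots & w_{nm}
\end{bmatrix}
```

With this convention, $w_{ij}$ is the weight on the $j$-th connection of node $i$, so the matrix has $n$ rows and $m$ columns.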

• Bias: The bias is a constant that is added to the weighted sum of the
inputs to compute the net input. It shifts the result to the
positive or negative side: a positive bias increases the net input,
while a negative bias decreases it.
Here, {1, x1, ..., xn} are the inputs, where the constant input 1 carries
the bias b. The output Y is computed from the function g(x), which
sums the weighted inputs and adds the bias:
$$ g(x) = \sum_{i=1}^{n} w_i x_i + b = w_1 x_1 + \dots + w_n x_n + b $$
The role of the activation is to produce the output from the
result of the summation function:
$$ Y = \begin{cases} 1, & \text{if } g(x) \ge 0 \\ 0, & \text{otherwise} \end{cases} $$

• Threshold: A threshold is a constant value that is compared
to the net input to get the output. The activation function is defined
in terms of the threshold value.
For example:
$$ Y = \begin{cases} 1, & \text{if net input} \ge \text{threshold} \\ 0, & \text{otherwise} \end{cases} $$

• Learning Rate: The learning rate is denoted α. It typically ranges
from 0 to 1 and controls how much the weights are adjusted at each
step while the ANN learns.
• Target value: Target values are the correct values of the output
variable and are also known simply as targets.
• Error: The inaccuracy of the predicted output values compared to
the target values.
Supervised Learning Algorithms:
• Delta Learning: It was introduced by Bernard Widrow and Marcian
Hoff and is also known as the Least Mean Square method. It reduces
the error over the entire learning and training process. To
minimize the error, it follows the gradient descent method, which
requires the activation function to be continuous and differentiable.
• Outstar Learning: It was first proposed by Grossberg in 1976. It
uses the concept that a neural network is arranged in layers, and
that the weights fanning out from a particular node should become
equal to the desired outputs of the neurons connected through
those weights.
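As an illustration of delta learning, the sketch below applies the Widrow-Hoff (Least Mean Square) update to a single linear unit. The linear activation, the function name, the learning rate, the epoch count, and the toy data generated from y = 2x + 1 are all assumptions chosen for the example.

```python
def delta_rule_train(samples, alpha=0.1, epochs=200):
    """Train one linear unit with the delta (LMS) rule.

    After each sample the weights move in proportion to the error times
    the input, which is gradient descent on the squared error.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            output = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = target - output
            weights = [w + alpha * error * xi for w, xi in zip(weights, x)]
            bias += alpha * error
    return weights, bias


# Toy data from the line y = 2x + 1 (hypothetical example data).
line_samples = [([0.0], 1.0), ([1.0], 3.0), ([2.0], 5.0)]
learned_weights, learned_bias = delta_rule_train(line_samples)
print(learned_weights, learned_bias)   # close to [2.0] and 1.0
```

Because the data lie exactly on a line, the error can be driven close to zero. With noisy data the rule settles near the least-squares fit instead.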
Unsupervised Learning Algorithms:
• Hebbian Learning: It was proposed by Hebb in 1949 to improve
the weights of nodes in a network. The change in weight is based
on the input, the output, and the learning rate. The transpose of the
output is needed for the weight adjustment.
• Competitive Learning: It is a winner-takes-all strategy. When an
input pattern is sent to the network, all the neurons in the
layer compete with each other to represent the input pattern. The
winner gets an output of 1 and all the others 0, and only the
winning neuron has its weights adjusted.
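Both rules can be sketched as single update steps. The Hebbian step below uses the basic form Δw_i = α · x_i · y. The competitive step picks the winner by largest net input and moves its weights toward the input, which is one common variant and an assumption here, as are the function names.

```python
def hebbian_update(weights, x, y, alpha):
    """Hebbian step: each weight grows by alpha * input * output."""
    return [w + alpha * xi * y for w, xi in zip(weights, x)]


def competitive_update(weight_rows, x, alpha):
    """Winner-takes-all step for a layer with one weight row per neuron.

    The neuron with the largest net input wins, outputs 1, and moves its
    weights toward the input; all other neurons output 0 and stay unchanged.
    """
    nets = [sum(w * xi for w, xi in zip(row, x)) for row in weight_rows]
    winner = nets.index(max(nets))
    outputs = [1 if k == winner else 0 for k in range(len(weight_rows))]
    new_rows = [list(row) for row in weight_rows]
    new_rows[winner] = [w + alpha * (xi - w)
                        for w, xi in zip(weight_rows[winner], x)]
    return new_rows, outputs


hebb_weights = hebbian_update([0.0, 0.0], x=[1.0, -1.0], y=1.0, alpha=0.5)
print(hebb_weights)   # [0.5, -0.5]

rows, outputs = competitive_update([[1.0, 0.0], [0.0, 1.0]],
                                   x=[0.2, 0.8], alpha=0.5)
print(outputs)        # [0, 1]
print(rows)
```

In the competitive example the second neuron wins because its weights are closer in direction to the input, and only that neuron's weights change.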
