Neural Networks & Genetic Algorithms Guide

This document covers neural networks and genetic algorithms, detailing their structures, training methods, and challenges. It explains key concepts such as perceptrons, multilayer networks, backpropagation, and advanced topics like CNNs and RNNs. Additionally, it introduces genetic algorithms and genetic programming, outlining their processes and evaluation metrics for model performance.

Uploaded by

charugeshm

NCVRT

MACHINE LEARNING - UNIT-2
UNIT-2 NEURAL NETWORKS AND GENETIC ALGORITHMS
▪Neural Network Representation – Problems – Perceptrons
– Multilayer Networks and Back Propagation Algorithms –
Advanced Topics – Genetic Algorithms – Hypothesis
Space Search – Genetic Programming – Models of
Evaluation and Learning.
1. NEURAL NETWORK REPRESENTATION
▪ A neural network is a computational model inspired by the structure and function of the human brain. It consists of
interconnected nodes (neurons) organized in layers.
▪ Key Components:
▪ Neurons (Nodes): The basic processing units of the network. Each neuron receives input from other neurons or external
sources, performs a computation, and produces an output.
▪ Inputs: Values received by the neuron
▪ Weights: Associated with each input connection, representing the strength or importance of that input.
▪ Bias: A learnable offset added to the weighted sum (equivalently, the weight on an extra input fixed at 1), allowing the
neuron to be activated even when all other inputs are zero
▪ Weighted Sum: The sum of the inputs multiplied by their corresponding weights, plus the bias.
▪ Activation Function: A function, usually non-linear, applied to the weighted sum to determine the neuron's output.
▪ Connections (Edges): Represent the flow of information between neurons. Each connection has an associated weight.
▪ Layers: Neurons are typically organized into layers: Input Layer, Hidden Layer and Output Layer
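
The components above can be tied together in a few lines. The following is a minimal sketch of a single neuron, assuming a sigmoid activation; the function names and the input, weight, and bias values are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)

# Hypothetical values: the weighted sum is 0.5*1.0 + (-0.25)*2.0 + 0.0 = 0.0,
# and sigmoid(0.0) = 0.5.
y = neuron_output(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(y)
```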
LAYERS
▪ Input Layer: Receives the raw input data. The number of neurons in this layer corresponds to
the number of input features
▪ Hidden Layers: One or more intermediate layers between the input and output layers. These
layers learn complex representations of the input data
▪ Output Layer: Produces the final output of the network. The number of neurons and their
activation functions depend on the task (e.g., one neuron with a sigmoid for binary
classification, multiple neurons with softmax for multi-class classification, linear activation for
regression)
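
As a rough illustration of the three output-layer choices mentioned above, the sketch below implements sigmoid, softmax, and a linear (identity) activation in plain Python. The function names and sample scores are assumptions made for this example.

```python
import math

def sigmoid(z):
    """Binary classification: maps a score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Multi-class classification: maps scores to probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def identity(z):
    """Regression: the linear activation returns the score unchanged."""
    return z

binary_prob = sigmoid(0.0)               # a score of 0 means "undecided"
class_probs = softmax([1.0, 1.0, 1.0])   # equal scores give equal probabilities
regression_value = identity(3.7)
```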
2. PROBLEMS
▪ Vanishing and Exploding Gradients: In deep networks, during backpropagation, gradients can
become extremely small (vanishing) or extremely large (exploding) as they are propagated through
many layers
▪ Difficulty in Training Deep Networks: Training networks with many hidden layers was historically
difficult because of these gradient issues and the lack of effective initialization techniques
▪ Overfitting: Neural networks with a large number of parameters can easily overfit the training data,
especially with limited data.
▪ Local Minima: The error surface of neural networks is often complex with many local minima.
Gradient-based optimization algorithms can get stuck in these local minima, preventing the network
from finding the global optimum.
▪ Computational Cost: Training large neural networks can be computationally expensive, requiring
significant time and resources.
▪ Lack of Interpretability: Deep neural networks are often considered "black boxes," making it
difficult to understand why they make specific predictions
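
The vanishing and exploding gradient problem can be seen with simple arithmetic. The sketch below makes a strong simplifying assumption: a chain of single sigmoid neurons, each evaluated at a pre-activation of 0 and sharing the same weight, so that backpropagation multiplies the gradient by the same factor (weight times sigmoid slope) at every layer. The function `gradient_scale` is hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient_scale(weight, num_layers, z=0.0):
    """Product of the per-layer factors weight * sigmoid'(z) along one path."""
    s = sigmoid(z)
    local_slope = s * (1.0 - s)  # sigmoid'(z); at z = 0 this is 0.25, its maximum
    scale = 1.0
    for _ in range(num_layers):
        scale *= weight * local_slope
    return scale

vanishing = gradient_scale(weight=1.0, num_layers=10)  # 0.25 ** 10, below 1e-6
exploding = gradient_scale(weight=8.0, num_layers=10)  # 2.0 ** 10 = 1024.0
print(vanishing, exploding)
```

Because the sigmoid slope never exceeds 0.25, small weights shrink the gradient geometrically with depth, while large weights can make it grow just as fast.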
3. PERCEPTRON
▪ The perceptron is the simplest type of neural network, a single-layer feedforward network with
a single output neuron
▪ Structure:
▪ Receives multiple input signals
▪ Each input signal is multiplied by a corresponding weight
▪ The weighted inputs are summed together
▪ A bias term is added to the sum
▪ The result is passed through an activation function (typically a step function or a sigmoid
function) to produce the output.
MATHEMATICAL REPRESENTATION
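▪ For a perceptron with inputs x1,x2,...,xn, weights w1,w2,...,wn, bias b, and activation function
σ, the output y is the activation applied to the weighted sum plus the bias:
$y = \sigma\left(\sum_{i=1}^{n} w_i x_i + b\right)$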

▪ Limitations of Perceptrons:
▪ Linear Separability: Perceptrons can only learn linearly separable patterns. They cannot solve
problems like XOR where the classes cannot be separated by a single straight line (or
hyperplane in higher dimensions)
▪ Single Layer: The single-layer architecture limits the complexity of the functions that can be
learned.
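
The linear separability limitation can be checked directly. The sketch below is an illustration only: it assumes a step activation and the classic perceptron learning rule (which the slides do not spell out), trains on AND (linearly separable) and on XOR (not separable), and compares training accuracy. All names and hyperparameters are hypothetical.

```python
def step(z):
    return 1 if z >= 0 else 0

def predict(weights, bias, x):
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_perceptron(data, epochs=20, lr=1):
    """Perceptron rule (assumed here): nudge weights by lr * error * input."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for x, target in data:
            error = target - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

def accuracy(weights, bias, data):
    correct = sum(predict(weights, bias, x) == t for x, t in data)
    return correct / len(data)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_acc = accuracy(*train_perceptron(AND), AND)
xor_acc = accuracy(*train_perceptron(XOR), XOR)
print(and_acc, xor_acc)
```

AND is learned perfectly, whereas no choice of weights and bias can classify all four XOR points, so the XOR accuracy stays below 1.0 however long training runs.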
4. MULTILAYER NETWORKS AND BACKPROPAGATION ALGORITHMS
▪ Multilayer Perceptrons (MLPs):
▪ To overcome the limitations of single-layer perceptrons, multilayer networks (also known as
multilayer perceptrons or feedforward neural networks) were developed.
▪ They consist of one or more hidden layers between the input and output layers.
▪ Back Propagation Algorithm:
▪ The back propagation algorithm is the most common method for training MLPs.
▪ It is a supervised learning algorithm that uses gradient descent to minimize the error between
the network's output and the target output.
ALGORITHM STEPS
▪ 1. Forward pass:
▪ Given an input example, the input is propagated through the network, layer by layer.
▪ At each neuron, the weighted sum of its inputs is calculated, and then the activation function is applied
to produce the neuron's output.
▪ This process continues until the output of the network is computed.
▪ 2. Backward Pass (Error Propagation):
▪ The error between the network's output and the target output is calculated using a loss function (e.g.,
mean squared error for regression, cross-entropy for classification)
▪ The gradient of the loss function with respect to the network's weights and biases is computed. This is
done using the chain rule of calculus, propagating the error backwards through the network, layer by
layer
▪ The contribution of each weight and bias to the overall error is determined
▪ 3. Weight and Bias Update:
▪ The weights and biases of the network are updated in the direction that reduces the error, using
the calculated gradients.
▪ 4. Iteration: Steps 1-3 are repeated for multiple epochs (passes through the entire training
dataset) until the training error stops improving or, ideally, the error on a validation set stops decreasing
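
The four steps can be sketched for a tiny network in plain Python. This is one possible minimal version, not a reference implementation: it assumes two inputs, one hidden layer of sigmoid units, a single sigmoid output, squared-error loss, and per-example (stochastic) updates. The layer size, learning rate, epoch count, and seed are arbitrary illustrative choices.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Step 1: propagate the input through the hidden layer to the output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)
    return hidden, output

def train(data, hidden_size=3, lr=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    W1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden_size)]
    b1 = [0.0] * hidden_size
    W2 = [rng.uniform(-1, 1) for _ in range(hidden_size)]
    b2 = 0.0
    losses = []
    for _ in range(epochs):  # Step 4: repeat for many epochs
        total = 0.0
        for x, target in data:
            hidden, output = forward(x, W1, b1, W2, b2)
            total += 0.5 * (output - target) ** 2
            # Step 2: chain rule, output layer first, then the hidden layer.
            delta_out = (output - target) * output * (1 - output)
            delta_hidden = [delta_out * W2[j] * hidden[j] * (1 - hidden[j])
                            for j in range(hidden_size)]
            # Step 3: move every weight and bias against its gradient.
            for j in range(hidden_size):
                W2[j] -= lr * delta_out * hidden[j]
                for i in range(2):
                    W1[j][i] -= lr * delta_hidden[j] * x[i]
                b1[j] -= lr * delta_hidden[j]
            b2 -= lr * delta_out
        losses.append(total / len(data))
    return losses

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
losses = train(XOR)
print(losses[0], losses[-1])  # the mean loss should fall over training
```

How low the loss ends up depends on the random initialization and the hyperparameters; the sketch only aims to show where each step of the algorithm sits in the loop.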
5. ADVANCED TOPICS - DEEP LEARNING
▪ Convolutional Neural Networks (CNNs): Designed for processing grid-like data such as
images. They use convolutional layers, pooling layers, and fully connected layers to learn
spatial hierarchies of features.
▪ Recurrent Neural Networks (RNNs) and their variants (LSTMs, GRUs): Designed for
processing sequential data. LSTMs (Long Short-Term Memory) and GRUs (Gated Recurrent
Units) use gating mechanisms to mitigate the vanishing gradient problem in standard RNNs.
▪ Transformers: A more recent architecture that relies on self-attention mechanisms to model
relationships between different parts of an input sequence. Highly effective for natural
language processing and increasingly used in other domains
ADVANCED TOPICS - IMPROVED TRAINING TECHNIQUES
▪ Better Initialization Strategies: Techniques like Xavier/Glorot initialization and He
initialization help to mitigate the vanishing/exploding gradient problem
▪ Activation Functions: ReLU and its variants (e.g., LeakyReLU, ELU) have become popular
due to their ability to alleviate the vanishing gradient problem and promote faster learning.
▪ Batch Normalization: A technique that normalizes the activations of intermediate layers,
improving training stability and allowing for higher learning rates
▪ Dropout: A regularization technique where randomly selected neurons are temporarily "dropped out"
(their outputs set to zero) during training, which helps reduce overfitting
▪ Gradient Clipping: A technique to prevent exploding gradients by limiting the magnitude of
the gradients during backpropagation
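
Several of these techniques are small enough to sketch directly. The code below shows ReLU and LeakyReLU, dropout, and gradient clipping in plain Python. Two choices are assumptions of this example rather than something the slides specify: dropout is written in its "inverted" form (surviving activations are rescaled so their expected value is unchanged), and clipping is done on the overall gradient norm.

```python
import math
import random

def relu(z):
    return max(0.0, z)

def leaky_relu(z, slope=0.01):
    """Like ReLU, but keeps a small slope for negative inputs."""
    return z if z > 0 else slope * z

def dropout(activations, drop_prob, rng):
    """Inverted dropout (assumed variant): zero some units, rescale the rest."""
    keep = 1.0 - drop_prob
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

def clip_by_norm(gradients, max_norm):
    """Scale the gradient vector down if its Euclidean norm exceeds max_norm."""
    norm = math.sqrt(sum(g * g for g in gradients))
    if norm <= max_norm:
        return list(gradients)
    return [g * max_norm / norm for g in gradients]

rng = random.Random(0)
dropped = dropout([1.0, 2.0, 3.0, 4.0], drop_prob=0.5, rng=rng)
clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)  # norm 5 is scaled down to 1
print(dropped, clipped)
```

Dropout is applied only during training; at prediction time the activations are used unchanged.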
6. GENETIC ALGORITHMS
▪ Genetic Algorithms (GAs) are a class of evolutionary algorithms inspired by the process of natural
selection.
▪ They are used for optimization and search problems
▪ Population: A set of candidate solutions (individuals or chromosomes) to the problem.
▪ Fitness Function: A function that evaluates the quality of each individual in the population. Higher
fitness values indicate better solutions.
▪ Selection: The process of choosing individuals from the current population to become parents for the
next generation, based on their fitness. Individuals with higher fitness are more likely to be selected.
▪ Crossover (Recombination): A genetic operator that combines the genetic information of two parent
individuals to create one or more offspring.
▪ Mutation: A genetic operator that introduces small random changes in the genes (parameters) of an
individual. This helps to maintain diversity in the population and explore new regions of the search
space
ALGORITHM
▪ 1. Initialization: Create an initial population of candidate solutions (randomly or using heuristics).
▪ 2. Evaluation: Evaluate the fitness of each individual in the population using the fitness function
▪ 3. Selection: Select parents from the current population based on their fitness
▪ 4. Crossover: Apply the crossover operator to the selected parents to create offspring
▪ 5. Mutation: Apply the mutation operator to the offspring
▪ 6. Replacement: Create a new generation by replacing some or all of the individuals in the current
population with the offspring
▪ 7. Termination: Repeat steps 2-6 until a termination condition is met (e.g., a satisfactory solution is
found, a maximum number of generations is reached, or no significant improvement is observed).
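
The loop above can be sketched on a toy problem. The example below is an illustration under several assumptions that are not in the slides: the task is "OneMax" (maximize the number of 1s in a bit string), selection is by tournament, crossover is single-point, mutation flips bits, and the best individual is copied unchanged into the next generation (elitism). All parameter values are arbitrary.

```python
import random

def fitness(individual):
    """OneMax (assumed toy problem): count the 1 bits."""
    return sum(individual)

def tournament(population, rng, k=3):
    """Selection: pick k individuals at random and keep the fittest."""
    return max(rng.sample(population, k), key=fitness)

def crossover(a, b, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:]

def mutate(individual, rng, rate=0.02):
    """Flip each bit independently with a small probability."""
    return [1 - g if rng.random() < rate else g for g in individual]

def run_ga(length=20, pop_size=30, generations=40, seed=1):
    rng = random.Random(seed)
    # 1. Initialization
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # 2. Evaluation
        best = max(population, key=fitness)
        history.append(fitness(best))
        if fitness(best) == length:  # 7. Termination: optimum found
            break
        # 3-5. Selection, crossover, mutation
        offspring = [best]  # elitism keeps the best individual unchanged
        while len(offspring) < pop_size:
            child = crossover(tournament(population, rng),
                              tournament(population, rng), rng)
            offspring.append(mutate(child, rng))
        # 6. Replacement
        population = offspring
    return history

history = run_ga()
print(history)  # best fitness per generation; never decreases thanks to elitism
```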
7. HYPOTHESIS SPACE SEARCH
▪ In the context of machine learning, a genetic algorithm can be used to search through the
hypothesis space.
▪ Each individual in the population represents a potential hypothesis (e.g., a set of model
parameters, a decision tree structure, a set of rules).
▪ The fitness function evaluates how well each hypothesis performs on the training data (or a
validation set).
▪ The genetic operators (selection, crossover, mutation) are used to explore and refine the
hypothesis space, aiming to find a hypothesis with high performance.
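
To make "an individual is a hypothesis" concrete, the sketch below treats each individual as a one-parameter threshold rule and scores it by training accuracy. The data set, the rule form, and the candidate thresholds are all hypothetical; a genetic algorithm would then apply selection, crossover, and mutation to such a population.

```python
# Hypothetical training set: (feature value, label).
TRAINING_DATA = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)]

def predict(threshold, x):
    """Hypothesis: predict class 1 when the feature reaches the threshold."""
    return 1 if x >= threshold else 0

def fitness(threshold, data=TRAINING_DATA):
    """Fraction of training examples the hypothesis classifies correctly."""
    return sum(predict(threshold, x) == y for x, y in data) / len(data)

population = [0.5, 2.5, 3.5]  # three candidate hypotheses
scores = {h: fitness(h) for h in population}
fittest = max(population, key=fitness)
print(scores, fittest)
```

Here the threshold 2.5 separates the two classes perfectly, so it receives the highest fitness and would be the most likely to be selected as a parent.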
8. GENETIC PROGRAMMING
▪ Genetic Programming (GP) is an extension of genetic algorithms where the individuals in the
population are computer programs rather than fixed-length strings of genes.
▪ The goal of GP is to automatically discover a program that solves a given task.

▪ Key Differences from Genetic Algorithms:

▪ Representation: Individuals are typically represented as tree structures (parse trees) that correspond to
computer programs or expressions.
▪ Genetic Operators: The crossover and mutation operators are adapted to work on these tree
structures. Common operators include:
▪ Subtree Crossover: Exchanging randomly selected subtrees between two parent programs.
▪ Subtree Mutation: Replacing a randomly selected subtree with a randomly generated new subtree.
▪ Point Mutation: Randomly changing a function or terminal at a specific node in the tree
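
The tree representation and the subtree operators can be sketched as follows. This is a minimal illustration, assuming programs are arithmetic expressions stored as nested tuples `(function, left, right)` with terminals `"x"` or numeric constants; the function set, terminal set, and helper names are hypothetical.

```python
import operator
import random

FUNCTIONS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
TERMINALS = ["x", 1.0, 2.0]

def evaluate(tree, x):
    """Run the program represented by the tree on the input x."""
    if tree == "x":
        return x
    if isinstance(tree, tuple):
        name, left, right = tree
        return FUNCTIONS[name](evaluate(left, x), evaluate(right, x))
    return tree  # numeric constant

def random_tree(rng, depth):
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    return (name, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        for i in (1, 2):
            yield from subtrees(tree[i], path + (i,))

def replace_at(tree, path, new_subtree):
    """Return a copy of tree with the node at path replaced."""
    if not path:
        return new_subtree
    parts = list(tree)
    parts[path[0]] = replace_at(tree[path[0]], path[1:], new_subtree)
    return tuple(parts)

def subtree_crossover(a, b, rng):
    """Exchange one randomly chosen subtree between the two parents."""
    path_a, sub_a = rng.choice(list(subtrees(a)))
    path_b, sub_b = rng.choice(list(subtrees(b)))
    return replace_at(a, path_a, sub_b), replace_at(b, path_b, sub_a)

def subtree_mutation(tree, rng, depth=2):
    """Replace one randomly chosen subtree with a freshly generated one."""
    path, _ = rng.choice(list(subtrees(tree)))
    return replace_at(tree, path, random_tree(rng, depth))

rng = random.Random(0)
parent_a = ("add", ("mul", "x", "x"), 1.0)   # x * x + 1
parent_b = ("sub", "x", 2.0)                 # x - 2
child_a, child_b = subtree_crossover(parent_a, parent_b, rng)
mutant = subtree_mutation(parent_a, rng)
print(evaluate(parent_a, 3.0))  # 3 * 3 + 1 = 10.0
```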
9. MODELS OF EVALUATION AND LEARNING
▪ Models of Evaluation:
▪ 1. Supervised Learning Evaluation (Classification):
▪ Accuracy: Percentage of correctly classified instances.
▪ Precision: Proportion of correctly predicted positive instances out of all instances predicted as positive
▪ Recall (Sensitivity): Proportion of correctly predicted positive instances out of all actual positive
instances
▪ F1-Score: Harmonic mean of precision and recall
▪ Confusion Matrix: A table showing the counts of true positives, true negatives, false positives, and
false negatives
▪ ROC Curve and AUC: Receiver Operating Characteristic curve and the Area Under the Curve, used
for evaluating binary classifiers at different thresholds
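
The classification metrics above all derive from the four confusion-matrix counts. The sketch below computes them for a small hypothetical set of binary labels; returning 0.0 when a denominator is zero is a convention assumed for this example.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels where 1 is the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def safe_div(numerator, denominator):
    return numerator / denominator if denominator else 0.0

def classification_metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }

# Hypothetical labels: 3 true positives, 3 true negatives,
# 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)  # every metric equals 0.75 for this data
```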
MODELS OF EVALUATION AND LEARNING
▪ 2. Supervised Learning Evaluation (Regression):
▪ Mean Squared Error (MSE): Average of the squared differences between the predicted and actual
values
▪ Root Mean Squared Error (RMSE): Square root of MSE.
▪ Mean Absolute Error (MAE): Average of the absolute differences between the predicted and actual
values.
▪ R-squared (Coefficient of Determination): Proportion of the variance in the dependent variable that
is predictable from the independent variables.
▪ 3. Unsupervised Learning Evaluation: Evaluation is often more subjective and depends on the
specific task
▪ Clustering: Silhouette score, Davies-Bouldin index, visual inspection
▪ Dimensionality Reduction: Reconstruction error, preservation of variance
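
The regression metrics listed above (MSE, RMSE, MAE, R-squared) can be computed in a few lines. The following sketch uses hypothetical target and prediction values, and assumes the targets are not all identical so that R-squared is defined.

```python
import math

def regression_metrics(y_true, y_pred):
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "r2": 1.0 - ss_res / ss_tot,
    }

# Hypothetical values: only the last prediction is off, by 2.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
reg = regression_metrics(y_true, y_pred)
print(reg)  # MSE 1.0, RMSE 1.0, MAE 0.5, R-squared about 0.2
```

The single large error dominates MSE and RMSE more than MAE, which is why MAE is often preferred when outliers should not drive the score.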
MODEL SELECTION AND GENERALIZATION ASSESSMENT
▪ Training Set: Used to train the model
▪ Validation Set: Used to tune hyperparameters and compare candidate models during development,
which helps detect overfitting to the training set
▪ Test Set: Used to provide an unbiased estimate of the final model's performance on unseen
data. This set should not be used during training or hyperparameter tuning
▪ Cross-Validation: Techniques like k-fold cross-validation are used to get a more robust
estimate of performance by splitting the data into multiple folds and training/testing the model
on different combinations of folds
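
The fold bookkeeping behind k-fold cross-validation can be sketched as below. The helper `k_fold_indices` is hypothetical, and for simplicity it assumes the data is already shuffled and splits it into contiguous folds; each fold serves once as the test set while the remaining folds form the training set.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(n_samples=10, k=3))
for train, test in folds:
    print("train:", train, "test:", test)
```

A model would be trained and scored once per fold, and the k scores averaged to give the cross-validated estimate.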