0% found this document useful (0 votes)

6 views16 pages

Understanding Graphs in Data Science

The document discusses the concepts of graphs and maps in Machine Learning and Data Science, explaining their structures, types, and applications. Graphs represent relationships between entities through nodes and edges, while maps store data in key-value pairs for efficient access and manipulation. Both are essential for tasks such as social network analysis, recommendation systems, and data preprocessing.

Uploaded by

sahil ameriya

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

6 views16 pages

Understanding Graphs in Data Science

Uploaded by

sahil ameriya

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

Unit: 02

Algorithms

Graphs
In Machine Learning and Data Science, a graph is a powerful data structure used to
represent relationships between different entities or objects.
A graph is made up of nodes (also called vertices) and edges (connections between nodes).
In simple terms —
A graph shows how things are connected to each other.

Real-Life Analogy
Imagine a social media network like Instagram:
 Each user is a node.
 Each friendship or following is an edge connecting two users.
If you have 200 followers, that means there are 200 edges connected to your node.
So, the Instagram network is nothing but a graph of users connected by relationships.
Graphs in data science help computers understand and analyze such relationships.

Basic Components of a Graph

1. Vertices (Nodes):
These represent the entities in a dataset.
Example: Users, cities, websites, or molecules.
2. Edges (Links):
These represent relationships or connections between nodes.
Example: Friendship between two people, road between two cities, or hyperlink between
two web pages.
3. Weight (Optional):
Sometimes edges have a weight that represents the strength or cost of the connection.
Example: Distance between cities, similarity score between users, etc.

Types of Graphs
Graphs can be categorized based on their structure and connections:
1. Undirected Graph
 The edges have no direction.
 If A is connected to B, then B is also connected to A.
Example: Friendship network — if you’re friends with someone, it’s mutual.
Mathematically:
If an edge (A, B) exists, then (B, A) also exists.

2. Directed Graph (Digraph)

 The edges have a specific direction.
 If A follows B, it doesn’t mean B follows A.
Example: Twitter — you may follow Elon Musk, but he might not follow you back.
Mathematically:
If edge (A → B) exists, (B → A) may not exist.

3. Weighted Graph
 Each edge has a weight (cost, distance, or similarity).
Example:
In Google Maps, cities (nodes) are connected by roads (edges) with distances as weights.

4. Unweighted Graph
 All edges are considered equal, with no specific weight.
Example:
Friendship network — each connection is just a relationship, not measured by strength.

5. Cyclic and Acyclic Graphs

 Cyclic Graph: Contains at least one loop or cycle (A → B → C → A).
 Acyclic Graph: Has no cycles.
Example:
A dependency chart in project management (where one task depends on another) forms a
Directed Acyclic Graph (DAG) — commonly used in ML pipelines.

Mathematical Representation of a Graph

A graph is generally represented as:
𝐺 = (𝑉, 𝐸)

where,
 𝑉= set of vertices (nodes)
 𝐸= set of edges (connections between nodes)
If the graph is weighted, each edge 𝑒 ∈ 𝐸has an associated weight 𝑤(𝑒).

Representation Methods in Computer Memory

Graphs are stored and processed in computers using these structures:
1. Adjacency Matrix
o A 2D matrix of size 𝑛 × 𝑛(where n = number of vertices).
o If there’s an edge between vertex i and j, then matrix[i][j] = 1 (or weight).
Example:
For 3 vertices (A, B, C):
A B C
A0 1 0
B1 0 1
C0 1 0
→ Means A connected to B, B connected to C.
2. Adjacency List
o Stores each vertex and its connected neighbors.
o More memory-efficient for large, sparse graphs.
Example:
A→ B
B → A, C
C→B

Role of Graphs in Machine Learning and Data Science

Graphs are not just data structures — they are fundamental in many ML applications where
relationships between entities are crucial.
1. Social Network Analysis
Used to model and analyze relationships among people or organizations.
Algorithms like PageRank (used by Google) or Community Detection are graph-based.
Example: Detecting groups of similar users or identifying influencers in a social network.

2. Recommendation Systems
Graphs help build recommendation models by connecting users to items.
Example:
 Netflix → Users and Movies form a graph.
 If users A and B like the same movie, the system may recommend another movie liked
by A to B.
This is achieved using Graph-based Collaborative Filtering.

3. Knowledge Graphs
Knowledge graphs store and link real-world facts in a structured form.
Used by Google, IBM Watson, and AI systems to understand relationships between entities.
Example:
For “Sachin Tendulkar”, the graph connects to Cricket, India, World Cup, Records, etc.

4. Graph Neural Networks (GNNs)

A modern advancement in Machine Learning where deep learning models are applied on
graph structures.
Used in:
 Social networks (predicting connections)
 Molecular chemistry (predicting molecular properties)
 Fraud detection in finance
Example:
Banks use GNNs to detect suspicious money transfers by analyzing patterns in transaction
graphs.
5. Data Science Pipelines
Graphs represent workflow dependencies in ML pipelines — such as data collection →
preprocessing → model training → testing.
Each process is a node, and edges represent data flow between steps.
Mathematical Concepts in Graph Algorithms
Several graph-based algorithms are used in ML preprocessing and optimization tasks:
Algorithm Purpose Mathematical Idea
Dijkstra’s Uses greedy approach; updates minimum distance
Find shortest path
Algorithm 𝑑(𝑢) = min⁡(𝑑(𝑢), 𝑑(𝑣) + 𝑤(𝑢, 𝑣))
Measure 𝑃𝑅(𝑖)
PageRank
importance of 𝑃𝑅(𝐴) = (1 − 𝑑) + 𝑑∑
Algorithm 𝐿(𝑖)
nodes
Breadth-First Level-wise
Uses queue-based exploration
Search (BFS) traversal
Depth-First
Deep exploration Uses recursion or stack
Search (DFS)
These algorithms support various ML tasks such as data clustering, feature selection, and
pattern discovery.

Real-World Applications in Machine Learning

Domain Use of Graphs
Social Media Finding friend suggestions, detecting communities
Finance Fraud detection by transaction linkage
Healthcare Analyzing protein–drug interactions
Search Engines Ranking websites (Google’s PageRank)
Recommender Systems Building user-item interaction models
Transportation Finding optimal delivery or route networks

Graphs are essential in Machine Learning and Data Science for representing and analyzing
relationships in complex datasets.
They help models understand connections, influence, and structure within data, enabling
intelligent recommendations, predictions, and discoveries.

Maps:
In Machine Learning and Data Science, handling large amounts of data efficiently is very
important.
We often need to store, retrieve, and update data based on a key — like a name, ID, or
category.
This is where a Map (also called a Dictionary or Hash Map) becomes extremely useful.
A map is a data structure that stores data in the form of key–value pairs.
In simple words —
A map helps us quickly find the value associated with a key, without searching through the
entire dataset.

Real-Life Analogy
Think of a map like a real-world dictionary:
 The word is the key.
 The meaning is the value.
When you want to find the meaning of a word, you don’t read the entire book — you directly
go to the word and get its meaning.
Similarly, in programming and data science, when you know the key, you can instantly find
its value using a map.

Structure of a Map
A map is made up of:
 Key: A unique identifier (like a name, ID, or feature label).
 Value: The data or information associated with that key.
For example:
student_scores = {
"Ankesh": 95,
"Ravi": 88,
"Neha": 92
}
Here,
 “Ankesh”, “Ravi”, “Neha” → Keys
 95, 88, 92 → Values
If we need Ankesh’s score, we can directly retrieve it using the key:
student_scores["Ankesh"] → 95

Mathematical Representation
A map can be represented as a set of ordered pairs:
𝑀 = {(𝑘1, 𝑣1), (𝑘2 , 𝑣2 ), … , (𝑘𝑛 , 𝑣𝑛 )}

where:
 𝑘𝑖 = key
 𝑣𝑖 = value
and each key 𝑘𝑖 is unique.

Operations on Maps
Operation Description Example
Insert Add a new key–value pair Add a new student’s marks
Search / Lookup Find value by key Find marks of “Ravi”
Update Modify an existing key’s value Change Neha’s marks from 92 → 94
Delete Remove a key–value pair Remove “Ravi” from the record
Traversal Access all keys or values Display all students and marks

Types of Maps
There are different kinds of map structures, each useful for specific tasks:
1. Hash Map
 Stores key–value pairs using a hashing function.
 Provides average O(1) time for search, insert, and delete operations.
Example:
Python’s dict or Java’s HashMap.
💡 Analogy: Like an index in a book — you directly jump to the page instead of searching
line by line.

2. Tree Map
 Stores key–value pairs in a sorted order (usually using Binary Search Trees).
 Operations take O(log n) time.
Example:
Java’s TreeMap, used when data must stay sorted by keys.
💡 Use Case in ML: Keeping sorted feature names or model parameters.

3. Linked Hash Map

 Maintains the insertion order of elements.
 Used when the sequence of data entry matters.
Example:
Used in caching algorithms where order of use is important (like LRU cache).

Importance of Maps in Machine Learning and Data Science

Maps are not just programming tools — they’re essential for managing large datasets
efficiently.
They allow ML algorithms to quickly access and manipulate data based on unique
identifiers (like feature names or sample IDs).
1. Feature Mapping
When training a model, features (inputs) often have names.
Maps store these as key–value pairs, linking the feature name to its numerical value.
Example:
features = {
"Age": 25,
"Income": 40000,
"Education": "Graduate"
}
Here, each key represents a feature, and its value represents the data.

2. Label Encoding and One-Hot Encoding

In data preprocessing, categorical variables are converted into numbers using a mapping.
Example:
Gender: {"Male": 0, "Female": 1}
This map allows ML algorithms to understand categorical data numerically.
Analogy:
Like giving every student a roll number so computers can identify them easily.

3. Model Parameters and Hyperparameters

During ML model training, parameters (like weights, biases, learning rates) are stored as
key–value pairs for easy access.
Example:
model_params = {
"learning_rate": 0.01,
"epochs": 50,
"optimizer": "adam"
}
This map allows algorithms to quickly retrieve configuration values during training.

4. Word Mapping in NLP

In Natural Language Processing (NLP), words are converted into numerical IDs using
maps.
Example:
word_index = {"cat": 1, "dog": 2, "apple": 3}
So, the sentence "cat eats apple" becomes [1, 4, 3] (using mapped values).
This mapping allows ML models to process text efficiently.

5. Graph and Network Data

In Graph-based ML, maps are used to store connections between nodes.
Example:
graph = {
"A": ["B", "C"],
"B": ["A", "D"],
"C": ["A"]
}
Here, the map links each node (key) with its connected nodes (values).

Mathematical Use in ML (Map Function)

In some ML and data processing frameworks (like Spark or MapReduce), the term “map”
refers to a function that applies an operation to each element of data.
Mathematically:
𝑦𝑖 = 𝑓(𝑥𝑖 )

where f is applied to every element of dataset X.

Example in Python:
data = [1, 2, 3, 4]
squared = list(map(lambda x: x**2, data))
# squared = [1, 4, 9, 16]
Analogy:
Imagine stamping each paper in a pile with a label — the map function applies the same
operation to each item.
This concept is also used in distributed ML systems to process large data in parallel.

Applications of Maps in Data Science & ML

Application Use
Data Preprocessing Encode, normalize, or label data
Feature Selection Store and retrieve feature names and values
Model Parameters Keep track of learning configurations
Word Embeddings Convert words into numerical IDs
Graph Networks Represent node–edge relationships
Distributed Computing (MapReduce) Process big data across multiple machines

Advantages of Using Maps

✅ Fast Lookup: Quickly access data using keys.
✅ Memory Efficient: Stores only necessary key–value pairs.
✅ Easy to Modify: Add or remove items dynamically.
✅ Organized Data: Keeps features and parameters neatly mapped.
✅ Supports Complex Structures: Can store even nested dictionaries (maps within maps).

A map is a key–value data structure that enables fast data access, efficient storage, and
easy data manipulation — all of which are essential in Machine Learning and Data
Science for managing datasets, features, parameters, and encoded variables.
It not only helps in programmatic operations but also plays a crucial role in data
preprocessing, NLP, and big data frameworks like MapReduce.

Map Searching:
In the world of Machine Learning and Data Science, map searching refers to the process
of finding the most efficient path, location, or relationship between data points that are
represented as nodes and edges in a map-like structure.
Here, a map doesn’t only mean a physical geographical map — it can represent any dataset
structured as connections or relationships, such as social networks, logistics routes, or
even recommendation systems.
So, map searching algorithms help us navigate through data efficiently — just like how
Google Maps helps us find the shortest or fastest route from one location to another.

Real-Life AnalogyThink about when you use Google Maps to go from your home to your
college.
You enter the start and destination points, and Google instantly shows you the best path.
Behind the scenes, it runs map searching algorithms like Dijkstra’s algorithm or A (A-
star) algorithm* to determine the shortest or fastest route based on real-time data like
distance, traffic, and speed.
In Machine Learning, similar logic is used for finding optimal relationships or paths
between data points — such as clustering locations, linking similar users, or planning
efficient delivery routes.

Why Map Searching is Important in Data Science

Map searching plays a crucial role in ML-related tasks like:
1. Recommendation Systems: Finding connections between users and products (like a
map of preferences).
2. Clustering and Graph Analysis: Identifying the shortest path or closest group of data
points.
3. Routing Problems: Used in logistics, supply chain optimization, and self-driving
vehicles.
4. Social Network Analysis: Finding the shortest link between two people (like degrees
of separation).
These applications show that map searching algorithms are not just about navigation —
they’re about optimizing decisions in complex data structures.

Common Map Searching Algorithms

1. Breadth-First Search (BFS)
 It explores all the neighboring nodes first before moving to the next level.
 Works well for unweighted graphs (where all edges have equal cost).
 Example: Finding the minimum number of stops between two metro stations.
Steps:
1. Start at the source node.
2. Visit all its neighbors.
3. Move to the next level of neighbors.
Time Complexity: O(V + E)
(where V = number of vertices, E = number of edges)

2. Depth-First Search (DFS)

 It explores as far as possible along a single branch before backtracking.
 Useful for path existence and connectivity checking.
Example:
Finding if there’s a path between two cities in a map, regardless of distance.
Steps:
1. Start at a node.
2. Go deep into one branch until you reach the end.
3. Backtrack and explore another path.
Time Complexity: O(V + E)
3. Dijkstra’s Algorithm
 Used for finding the shortest path between nodes in a weighted graph (like road
networks).
 It minimizes the total cost or distance.
Real-Life Example:
Google Maps uses Dijkstra’s algorithm to find the shortest or fastest driving route.
Mathematical Expression:
If we denote distance between nodes u and v as 𝑤(𝑢, 𝑣), Dijkstra’s algorithm minimizes:
Distance(𝑣) = min⁡[Distance(𝑢) + 𝑤(𝑢, 𝑣)]

Time Complexity: O(V²) or O(E log V) using priority queue.

4. A (A-Star) Algorithm*
 A smarter version of Dijkstra’s algorithm.
 Uses both the actual cost so far (g) and the estimated future cost (h).
 Formula for evaluation:
𝑓(𝑛) = 𝑔(𝑛) + ℎ(𝑛)

Where:
 𝑔(𝑛): Cost from start to node n
 ℎ(𝑛): Heuristic (estimated cost to goal)
Example: Used in AI games and navigation systems to plan paths quickly.

Map Searching in Machine Learning

In ML, map searching is not limited to geographical maps — it’s also used to:
 Search for optimal hyperparameters (using search grids or optimization maps).
 Navigate feature spaces to find the best combinations of data attributes.
 Optimize graph-based models like Graph Neural Networks (GNNs) for link
prediction and node classification.
Essentially, the “map” here represents the search space that the ML algorithm needs to
explore to find the most efficient or accurate model configuration.

Mathematical Concept
A map can be represented as a graph 𝐺(𝑉, 𝐸):
 𝑉= set of vertices (locations or data points)
 𝐸= set of edges (connections between them)
Each edge 𝑒 ∈ 𝐸has a weight 𝑤(𝑒)representing cost, distance, or time.
The goal of a map searching algorithm is to find the minimum cost path from source 𝑠to
destination 𝑡:
min⁡∑𝑤(𝑒)

This minimization principle is at the heart of both data optimization and navigation systems.
Application of Algorithms – Stable Marriages, Dictionaries and Hashing,
Search Trees
In Machine Learning and Data Science, algorithms are not just theoretical concepts —
they are the core tools used to efficiently store, retrieve, and process data. The applications
of algorithms extend to tasks like data matching, searching, sorting, and optimization,
which are essential for training ML models, preprocessing datasets, and managing huge data
efficiently.
In this topic, we’ll explore three key applications of algorithms that are frequently
encountered in data-driven systems:
1. Stable Marriage Problem (Matching Algorithms)
2. Dictionaries and Hashing (Fast Data Lookup)
3. Search Trees (Structured Searching)
Each of these plays a crucial role in designing efficient data structures and ensuring that ML
pipelines handle data smoothly.

1. Stable Marriage Problem (Matching Algorithms)

The Stable Marriage Problem (SMP) is a matching algorithm that helps pair two sets of
entities based on preferences — in a way that no pair would prefer each other over their
current partners.
In ML and Data Science, this concept is applied to matching problems, such as
recommender systems, task assignments, and clustering.

Real-Life Analogy
Imagine you have 4 students and 4 project mentors.
 Each student ranks mentors according to preference.
 Each mentor ranks students too.
The goal is to find a stable pairing such that no student and mentor would rather be
matched with each other than their assigned pair.
This concept forms the base of the Gale–Shapley algorithm, which ensures a stable match.

Gale–Shapley Algorithm (Stable Matching Algorithm)

Steps:
1. Each unpaired student proposes to their most preferred mentor who hasn’t yet rejected
them.
2. Each mentor reviews proposals:
o If free, accepts the proposal.
o If already paired but prefers this new proposal, they accept the new one and
reject the old.
3. Repeat until everyone is paired stably.

Mathematical Representation
If 𝑆 = {𝑠1, 𝑠2, … , 𝑠𝑛 }are students and 𝑀 = {𝑚1, 𝑚2 , … , 𝑚𝑛 }are mentors,
the goal is to find a stable matching function 𝑓: 𝑆 → 𝑀such that
no pair (𝑠𝑖 , 𝑚𝑗 )prefers each other over 𝑓(𝑠𝑖 )and 𝑓 −1(𝑚𝑗 ).

Applications in Data Science

 Matching users to suitable products in recommender systems.
 Allocating tasks or computing resources in cloud systems.
 Matching job seekers to suitable job roles based on preferences.

2. Dictionaries and Hashing

Concept
A dictionary (also known as a map in programming) stores data in key-value pairs.
This structure allows quick access to data using keys rather than searching through the entire
dataset.
In Machine Learning, dictionaries are crucial for:
 Storing datasets efficiently (e.g., feature-value mappings).
 Caching model outputs for repeated queries.
 Counting word frequencies in NLP tasks.

Real-Life Analogy
Imagine a telephone directory — you don’t scan every page to find a name; you look it up
alphabetically.
Similarly, in a dictionary (hash map), you directly jump to the correct "bucket" using a key.

Hashing Technique
Hashing is the process of converting a key into a unique index (hash code) that points to its
value in a hash table.
Hash Function Formula:
ℎ(𝑘) = 𝑘mod 𝑚

Where:
 ℎ(𝑘): hash function output (index)
 𝑘: key
 𝑚: size of hash table
If multiple keys map to the same index (collision), it’s resolved using techniques like:
 Chaining (linked lists)
 Open Addressing (linear probing)

Applications in ML and Data Science

 Fast lookup of features during model training.
 Indexing large datasets for search and retrieval.
 Counting occurrences (e.g., term frequency in NLP).
 Storing model parameters efficiently (e.g., word embeddings).
3. Search Trees
A search tree is a hierarchical data structure used to store elements in a way that allows fast
searching, insertion, and deletion operations — which are vital for data-intensive ML
tasks.

🔸 Types of Search Trees

(a) Binary Search Tree (BST)
 Each node has at most two children.
 Left child < Parent < Right child.
 Searching time depends on the height of the tree.
Time Complexity:
 Best Case: 𝑂(log⁡𝑛)
 Worst Case: 𝑂(𝑛)

(b) Balanced Trees (AVL, Red-Black Tree)

 Automatically balance themselves to maintain efficiency.
 Guarantee 𝑂(log⁡𝑛)performance for insert, delete, and search.
Example: Databases often use Red-Black trees for indexing.

(c) B-Trees and B+ Trees

 Used in databases and file systems.
 Can store multiple keys in a single node, reducing disk access.

Real-Life Analogy
Think of a library bookshelf:
 Books are arranged in order by title or ID.
 You don’t check every book; you follow an ordered path to find it quickly — just like
a search tree reduces search time in large datasets.

Applications in ML and Data Science

 Decision Trees: Fundamental models used for classification and regression.
 Database Indexing: Storing and retrieving massive data efficiently.
 Search Optimization: Used in K-D trees for nearest neighbor searches in clustering
and pattern recognition.
 Feature Mapping: Organizing data points hierarchically for quick access.
Key Takeaways
Concept Purpose Example Application
Stable Marriage Matching entities based on User-product matching, resource
Algorithm preference allocation
Dictionaries & Fast data retrieval using key-
Feature lookup, caching
Hashing value pairs
Hierarchical data structure for Decision trees, clustering,
Search Trees
fast search indexing
The application of algorithms in Machine Learning and Data Science helps create systems
that are efficient, scalable, and intelligent.
Whether it’s matching users to products (stable matching), storing and searching
massive datasets (hashing and trees), or building predictive models (decision trees) —
algorithms form the foundation for making data-driven systems smarter and faster.

Dynamic Programming:
Dynamic Programming (DP) is an algorithmic technique used to solve complex problems
by breaking them down into smaller, overlapping subproblems and solving each subproblem
only once.
The results of these smaller problems are then stored (memoized) and reused, which saves
computation time and avoids redundant work.
In the context of Machine Learning and Data Science, dynamic programming plays a vital
role in tasks such as sequence prediction, time series modeling, optimization,
reinforcement learning, and probabilistic reasoning.

Real-Life Analogy
Imagine you are climbing a staircase with 10 steps.
You can take either 1 step or 2 steps at a time.
Now, to reach the top, you can either:
 Take one step from the 9th step, or
 Take two steps from the 8th step.
If you already know how many ways there are to reach the 8th and 9th steps, you don’t need
to calculate them again — you just add them up!
That’s exactly what dynamic programming does — it remembers past results and reuses
them to build up the final solution efficiently.

When to Use Dynamic Programming

A problem can be solved using Dynamic Programming if it satisfies two main properties:
1. Overlapping Subproblems:
The same smaller problems are solved multiple times.
→ DP avoids re-solving them by storing results.
2. Optimal Substructure:
The solution to a big problem depends on the solutions to smaller problems.
→ The optimal overall solution is built from optimal smaller ones.

Basic Idea
In DP, we usually follow these steps:
1. Define the problem in terms of subproblems.
2. Store (memoize) the results of subproblems.
3. Build the final solution using these stored results.
The process can be done in two main ways:
 Top-Down (Memoization): Solve the problem recursively and store results.
 Bottom-Up (Tabulation): Solve smaller subproblems first and build the solution
iteratively.

Mathematical Expression
If a problem’s solution can be expressed as:
𝑓(𝑛) = 𝑓(𝑛 − 1) + 𝑓(𝑛 − 2)

and we store results of 𝑓(𝑛 − 1)and 𝑓(𝑛 − 2)to reuse later,

then this becomes a Dynamic Programming solution (like in Fibonacci series).
Example – Fibonacci Series:
Without DP:
𝐹(𝑛) = 𝐹(𝑛 − 1) + 𝐹(𝑛 − 2)

Each call recomputes the same values multiple times → inefficient.

With DP:
 Compute 𝐹(0)and 𝐹(1).
 Store them and build up to 𝐹(𝑛).
This reduces time complexity from O(2ⁿ) to O(n).

Example Problem – Shortest Path in a Graph

One of the most famous DP-based algorithms is the Floyd–Warshall Algorithm, which
finds the shortest paths between all pairs of vertices in a weighted graph.
If we denote the shortest distance between vertex i and j as 𝐷𝑖𝑗 , then:
𝐷𝑖𝑗 (𝑘) = min⁡[𝐷𝑖𝑗 (𝑘 − 1), 𝐷𝑖𝑘 (𝑘 − 1) + 𝐷𝑘𝑗 (𝑘 − 1)]

Here, the algorithm uses DP to build solutions progressively by considering each vertex as an
intermediate point.

Applications in Machine Learning and Data Science

Dynamic Programming is deeply embedded in many data science and AI tasks, especially
where sequential decision-making and optimization are involved.
Application Area Example / Use Case
Speech recognition, handwriting recognition (using
Sequence Prediction
Hidden Markov Models)
Predicting next values efficiently using overlapping
Time Series Analysis
patterns
Value Iteration and Policy Iteration use DP to find optimal
Reinforcement Learning
actions
Natural Language Processing Viterbi Algorithm for Part-of-Speech tagging and word
(NLP) alignment
Resource allocation, route optimization, and scheduling
Optimization Problems
tasks
Real-World Example: Viterbi Algorithm (NLP)
In speech recognition or text decoding, we often need to find the most probable sequence
of hidden states (like phonemes or words) that could produce a given sequence of
observations (sounds or letters).
The Viterbi algorithm, based on DP, efficiently finds this optimal sequence without
exploring all possible combinations.

Comparison with Other Techniques

Technique Principle Example
Divide and Conquer Divide into independent subproblems Merge Sort, Quick Sort
Dynamic Divide into overlapping subproblems and Fibonacci, Shortest Path,
Programming reuse results Viterbi
Choose locally optimal choice at each Dijkstra’s, Huffman
Greedy Algorithms
step Coding

Advantages of Dynamic Programming

✅ Reduces time complexity by avoiding recomputation.
✅ Ensures optimal solutions for complex problems.
✅ Highly useful in optimization and probabilistic modeling.

Disadvantages
❌ Requires more memory to store intermediate results.
❌ Needs careful problem formulation to identify subproblems.
❌ Sometimes overkill for simple or non-overlapping problems.

Key Takeaways
 Dynamic Programming is an algorithm design strategy, not a specific algorithm.
 It works by breaking problems into smaller parts, solving once, and reusing
results.
 Plays a crucial role in optimization, sequential data, and probabilistic models in
ML.
 Used in famous algorithms like Viterbi, Bellman–Ford, Floyd–Warshall, and Value
Iteration in reinforcement learning.

Unit-5 Vector Database
No ratings yet
Unit-5 Vector Database
24 pages
Vector Databases and ANN Techniques
No ratings yet
Vector Databases and ANN Techniques
82 pages
Database Indexing and Searching Basics
No ratings yet
Database Indexing and Searching Basics
280 pages
Understanding Inverted Indexes in Search Engines
No ratings yet
Understanding Inverted Indexes in Search Engines
38 pages
Reinforcement Learning Overview
No ratings yet
Reinforcement Learning Overview
31 pages
Understanding Vector Databases
No ratings yet
Understanding Vector Databases
7 pages
Understanding Transformer Architecture
No ratings yet
Understanding Transformer Architecture
55 pages
Advanced Deep Learning Techniques
No ratings yet
Advanced Deep Learning Techniques
89 pages
Notes of Deep Learning Top Architectures
No ratings yet
Notes of Deep Learning Top Architectures
13 pages
Top 25 ML System Design Questions
No ratings yet
Top 25 ML System Design Questions
207 pages
Graph Algorithms For Data Science: With Examples in Neo4j 1st Edition Tomaž Bratanic Ebook Premium Student Version
100% (3)
Graph Algorithms For Data Science: With Examples in Neo4j 1st Edition Tomaž Bratanic Ebook Premium Student Version
60 pages
Reflection Pattern in Agent Workflows
No ratings yet
Reflection Pattern in Agent Workflows
14 pages
Clustering Techniques and Algorithms Explained
No ratings yet
Clustering Techniques and Algorithms Explained
61 pages
Agentic AI Workflows Explained
No ratings yet
Agentic AI Workflows Explained
6 pages
MOELoRA: Efficient Fine-tuning for Medical LLMs
No ratings yet
MOELoRA: Efficient Fine-tuning for Medical LLMs
11 pages
Understanding Bias-Variance Tradeoff
No ratings yet
Understanding Bias-Variance Tradeoff
11 pages
Open LLM Model Details
No ratings yet
Open LLM Model Details
146 pages
Indexing and Searching Techniques Overview
No ratings yet
Indexing and Searching Techniques Overview
93 pages
AI Techniques in Constraint Satisfaction
No ratings yet
AI Techniques in Constraint Satisfaction
53 pages
Real-Life Applications of Linked Lists
No ratings yet
Real-Life Applications of Linked Lists
40 pages
Regularization Techniques in Neural Networks
No ratings yet
Regularization Techniques in Neural Networks
9 pages
Understanding Word Embeddings and Vectorization
No ratings yet
Understanding Word Embeddings and Vectorization
9 pages
Comparing LoRA and QLoRA Techniques
No ratings yet
Comparing LoRA and QLoRA Techniques
5 pages
Advanced Deep Learning Techniques Overview
No ratings yet
Advanced Deep Learning Techniques Overview
10 pages
Machine Learning Diagrams Overview
No ratings yet
Machine Learning Diagrams Overview
32 pages
Vector Embeddings and Database Essentials
No ratings yet
Vector Embeddings and Database Essentials
28 pages
Tool Use in Agent Function Calling
No ratings yet
Tool Use in Agent Function Calling
21 pages
Enhancing Reasoning in Language Models
No ratings yet
Enhancing Reasoning in Language Models
9 pages
Understanding CSP in AI Techniques
No ratings yet
Understanding CSP in AI Techniques
4 pages
Python Course Lecture Slides
No ratings yet
Python Course Lecture Slides
53 pages
RNNs, Transformers, and GANs Explained
No ratings yet
RNNs, Transformers, and GANs Explained
24 pages
Multi-Agent LLM Memory Management Insights
No ratings yet
Multi-Agent LLM Memory Management Insights
4 pages
TabTransformer: Contextual Embeddings for Tabular Data
No ratings yet
TabTransformer: Contextual Embeddings for Tabular Data
17 pages
Super Tiny Language Models Explained
No ratings yet
Super Tiny Language Models Explained
11 pages
Deep Learning Course Overview and Structure
100% (1)
Deep Learning Course Overview and Structure
243 pages
LLM Interview Questions & Course Guide
No ratings yet
LLM Interview Questions & Course Guide
10 pages
Deep Learning Course Overview CM20315
No ratings yet
Deep Learning Course Overview CM20315
62 pages
LangGraph Agents Tutorial Guide
No ratings yet
LangGraph Agents Tutorial Guide
24 pages
AI Algorithms: Machine Learning Basics
No ratings yet
AI Algorithms: Machine Learning Basics
69 pages
A Survey of Evolution of Image Captioning PDF
No ratings yet
A Survey of Evolution of Image Captioning PDF
18 pages
Generative AI Mastery Roadmap 2025
No ratings yet
Generative AI Mastery Roadmap 2025
26 pages
LLMs in Personalized Recommendation Systems
No ratings yet
LLMs in Personalized Recommendation Systems
101 pages
Understanding Parameter-Efficient Fine-Tuning
No ratings yet
Understanding Parameter-Efficient Fine-Tuning
10 pages
Paper+10+ (2024 6 1) +Dominance+of+AI+and+Machine+Learning+Techniques+in+Hybrid+Movie+
No ratings yet
Paper+10+ (2024 6 1) +Dominance+of+AI+and+Machine+Learning+Techniques+in+Hybrid+Movie+
9 pages
Machine Learning Interview Q&A Guide
100% (1)
Machine Learning Interview Q&A Guide
17 pages
Domain-Specific LLM Solutions
No ratings yet
Domain-Specific LLM Solutions
37 pages
Enhancing EdgeAI with SLM Techniques
No ratings yet
Enhancing EdgeAI with SLM Techniques
45 pages
LSTM and GRU: Illustrated Guide
No ratings yet
LSTM and GRU: Illustrated Guide
15 pages
Data Mining Lab Manual with Python
No ratings yet
Data Mining Lab Manual with Python
35 pages
Logistic Regression Overview by Andrew Ng
No ratings yet
Logistic Regression Overview by Andrew Ng
31 pages
Understanding Deep Learning Basics
No ratings yet
Understanding Deep Learning Basics
32 pages
Data Visualization Techniques Explained
No ratings yet
Data Visualization Techniques Explained
21 pages
Building Variational Autoencoders
No ratings yet
Building Variational Autoencoders
29 pages
Multi-layer Perceptron and Backpropagation
No ratings yet
Multi-layer Perceptron and Backpropagation
54 pages
Graphs and Maps in Machine Learning
No ratings yet
Graphs and Maps in Machine Learning
10 pages
Graph-Based Data Science Techniques
No ratings yet
Graph-Based Data Science Techniques
37 pages
Machine Learning Techniques for Graphs
No ratings yet
Machine Learning Techniques for Graphs
32 pages
Introduction to Graphs in Machine Learning
No ratings yet
Introduction to Graphs in Machine Learning
5 pages
Introduction to Graphs for ML
No ratings yet
Introduction to Graphs for ML
13 pages
General Loader Scheme: Overview & Analysis
No ratings yet
General Loader Scheme: Overview & Analysis
10 pages
Overview of Matrix Types and Operations
No ratings yet
Overview of Matrix Types and Operations
56 pages
GEM Bid Status Overview
No ratings yet
GEM Bid Status Overview
4 pages
Group By Clause Practice Problems
No ratings yet
Group By Clause Practice Problems
2 pages
Eastman TRI-X Reversal Film 7278 Guide
No ratings yet
Eastman TRI-X Reversal Film 7278 Guide
8 pages
Semi Detaxhed Boq - 095031
No ratings yet
Semi Detaxhed Boq - 095031
19 pages
PWSA Transport Layer Overview
No ratings yet
PWSA Transport Layer Overview
2 pages
GameCenterBizApplication Startup Logs
No ratings yet
GameCenterBizApplication Startup Logs
15 pages
Node.js Developer Resume Sample
No ratings yet
Node.js Developer Resume Sample
2 pages
Resistivity Meter 88C
No ratings yet
Resistivity Meter 88C
1 page
Optimal Protein Diet Cost Analysis
No ratings yet
Optimal Protein Diet Cost Analysis
7 pages
Applied Mathematics Exam Paper 2025
No ratings yet
Applied Mathematics Exam Paper 2025
5 pages
TMS Migration Playbook for EOL Products
No ratings yet
TMS Migration Playbook for EOL Products
17 pages
3D Reflection Techniques in Graphics
No ratings yet
3D Reflection Techniques in Graphics
11 pages
Hyosung Pump Troubleshooting Guide
No ratings yet
Hyosung Pump Troubleshooting Guide
3 pages
Sierra Leone Electronic Transactions Act 2018
No ratings yet
Sierra Leone Electronic Transactions Act 2018
14 pages
Mapa América
No ratings yet
Mapa América
11 pages
Household Rainwater Energy System Design
No ratings yet
Household Rainwater Energy System Design
17 pages
Grade IX Computer Science Assessment Framework
No ratings yet
Grade IX Computer Science Assessment Framework
31 pages
Profile of Telure Mahendra Barku
No ratings yet
Profile of Telure Mahendra Barku
3 pages
Spatial Probabilistic Multi Criteria Dec
No ratings yet
Spatial Probabilistic Multi Criteria Dec
46 pages
Time Series Analysis Overview
No ratings yet
Time Series Analysis Overview
104 pages
Industrial Drives Course Syllabus
100% (1)
Industrial Drives Course Syllabus
17 pages
Reliance Retail Tax Invoice Details
No ratings yet
Reliance Retail Tax Invoice Details
3 pages
Adder and Subtractor Circuit Analysis
No ratings yet
Adder and Subtractor Circuit Analysis
13 pages
Artifact Analysis Worksheet Guide
No ratings yet
Artifact Analysis Worksheet Guide
1 page
HMK 310LC3A Excavator Parts Catalog
No ratings yet
HMK 310LC3A Excavator Parts Catalog
590 pages
ITSM - Self Service - User Manual - v1.1
No ratings yet
ITSM - Self Service - User Manual - v1.1
23 pages
Electric Submersible Pump Evaluation
No ratings yet
Electric Submersible Pump Evaluation
9 pages
Impact Performance of Foam-Filled Aluminum Tubes
No ratings yet
Impact Performance of Foam-Filled Aluminum Tubes
19 pages

Understanding Graphs in Data Science

Uploaded by

Understanding Graphs in Data Science

Uploaded by

Unit: 02

Basic Components of a Graph

2. Directed Graph (Digraph)

5. Cyclic and Acyclic Graphs

Mathematical Representation of a Graph

Representation Methods in Computer Memory

Role of Graphs in Machine Learning and Data Science

4. Graph Neural Networks (GNNs)

Real-World Applications in Machine Learning

3. Linked Hash Map

Importance of Maps in Machine Learning and Data Science

2. Label Encoding and One-Hot Encoding

3. Model Parameters and Hyperparameters

4. Word Mapping in NLP

5. Graph and Network Data

Mathematical Use in ML (Map Function)

where f is applied to every element of dataset X.

Applications of Maps in Data Science & ML

Advantages of Using Maps

Why Map Searching is Important in Data Science

Common Map Searching Algorithms

2. Depth-First Search (DFS)

Time Complexity: O(V²) or O(E log V) using priority queue.

Map Searching in Machine Learning

1. Stable Marriage Problem (Matching Algorithms)

Gale–Shapley Algorithm (Stable Matching Algorithm)

Applications in Data Science

2. Dictionaries and Hashing

Applications in ML and Data Science

🔸 Types of Search Trees

(b) Balanced Trees (AVL, Red-Black Tree)

(c) B-Trees and B+ Trees

Applications in ML and Data Science

When to Use Dynamic Programming

and we store results of 𝑓(𝑛 − 1)and 𝑓(𝑛 − 2)to reuse later,

Each call recomputes the same values multiple times → inefficient.

Example Problem – Shortest Path in a Graph

Applications in Machine Learning and Data Science

Comparison with Other Techniques

Advantages of Dynamic Programming

You might also like