0% found this document useful (0 votes)
6 views67 pages

Algorithm Notes 4

The document provides an overview of various sorting and searching algorithms, including Bubble Sort, Insertion Sort, Selection Sort, Merge Sort, Quick Sort, Linear Search, and Binary Search, detailing their algorithms and time complexities. It also covers advanced sorting techniques like Heap Sort, Counting Sort, and Radix Sort, as well as data structures like Binary Search Trees and Hashing for efficient searching. Merging algorithms are introduced as a fundamental operation in combining sorted sequences, with a naïve approach example provided for merging two sorted arrays.
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views67 pages

Algorithm Notes 4

The document provides an overview of various sorting and searching algorithms, including Bubble Sort, Insertion Sort, Selection Sort, Merge Sort, Quick Sort, Linear Search, and Binary Search, detailing their algorithms and time complexities. It also covers advanced sorting techniques like Heap Sort, Counting Sort, and Radix Sort, as well as data structures like Binary Search Trees and Hashing for efficient searching. Merging algorithms are introduced as a fundamental operation in combining sorted sequences, with a naïve approach example provided for merging two sorted arrays.
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Sorting and searching algorithms

Sorting and Searching Algorithms

Introduction:

Sorting and searching are fundamental operations in computer science and


play a crucial role in various applications. In this lecture, we will explore
different sorting and searching algorithms, their principles, and their
efficiency.

Part 1: Sorting Algorithms

1. Bubble Sort:

Bubble Sort is a simple comparison-based sorting algorithm that repeatedly


swaps adjacent elements if they are in the wrong order.

Algorithm:

1. Start with the first element and compare it with the next element.

2. If they are in the wrong order, swap them.

3. Continue this process for all adjacent pairs.

4. Repeat steps 1-3 for the remaining unsorted portion of the array until the
array is sorted.
Example:

Consider the array [5, 2, 8, 1, 3].

- Pass 1: [2, 5, 1, 3, 8]

- Pass 2: [2, 1, 3, 5, 8]

- Pass 3: [1, 2, 3, 5, 8]

Bubble Sort has a time complexity of O(n^2) in the worst case.

2. Insertion Sort:

Insertion Sort is a simple comparison-based sorting algorithm that builds


the final sorted array one element at a time.

Algorithm:

1. Start with the second element and compare it with the previous elements.

2. If an element is smaller, shift the larger elements one position to the right.

3. Insert the current element into the correct position.

4. Repeat steps 1-3 for the remaining unsorted portion of the array until the
array is sorted.

Example:

Consider the array [5, 2, 8, 1, 3].


- Pass 1: [2, 5, 8, 1, 3]

- Pass 2: [2, 5, 8, 1, 3]

- Pass 3: [1, 2, 5, 8, 3]

- Pass 4: [1, 2, 3, 5, 8]

Insertion Sort has a time complexity of O(n^2) in the worst case.

3. Selection Sort:

Selection Sort is a simple comparison-based sorting algorithm that divides


the array into sorted and unsorted portions. It repeatedly selects the smallest
element from the unsorted portion and swaps it with the first element of the
unsorted portion.

Algorithm:

1. Find the minimum element in the unsorted portion.

2. Swap the minimum element with the first element of the unsorted portion.

3. Expand the sorted portion by one element.

4. Repeat steps 1-3 for the remaining unsorted portion of the array until the
array is sorted.

Example:
Consider the array [5, 2, 8, 1, 3].

- Pass 1: [1, 2, 8, 5, 3]

- Pass 2: [1, 2, 8, 5, 3]

- Pass 3: [1, 2, 3, 5, 8]

Selection Sort has a time complexity of O(n^2) in the worst case.

4. Merge Sort:

Merge Sort is a divide-and-conquer sorting algorithm that divides the array


into smaller subarrays, sorts them, and then merges them back together.

Algorithm:

1. Divide the array into two halves.

2. Recursively apply Merge Sort to each half.

3. Merge the sorted halves back into a single sorted array.

Example:

Consider the array [5, 2, 8, 1, 3].

- Split: [5, 2] [8, 1, 3]

- Split: [5] [2] [8] [1, 3]

- Merge: [2, 5] [1, 3, 8]


- Merge: [1, 2, 3, 5, 8]

Merge Sort has a time complexity of O(n log n) in all cases.

5. Quick Sort:

Quick Sort is a divide-and-conquer sorting algorithm that selects a pivot


element, partitions the array around the pivot, and recursively applies Quick
Sort to the two resulting subarrays.

Algorithm:

1. Select a pivot element from the array.

2. Partition the array, so that elements smaller than the pivot are placed
before it, and elements larger than the pivot are placed after it.

3. Recursively apply Quick Sort to the subarrays before and after the pivot.

Example:

Consider the array [5, 2, 8, 1, 3].

- Pivot: 3

- Partition: [2, 1, 3], [5, 8]

- Recurse: [2, 1, 3] -> [1, 2, 3]

- Recurse: [5, 8]
Quick Sort has a time complexity of O(n log n) in the average and best cases.
However, it can degrade to O(n^2) in the worst case if the pivot selection is
poor.

Part 2: Searching Algorithms

1. Linear Search:

Linear Search is a simple searching algorithm that sequentially checks each


element of the array until a match is found or the entire array is traversed.

Algorithm:

1. Start from the first element of the array.

2. Compare the current element with the target element.

3. If a match is found, return the index.

4. If the entire array is traversed without a match, return -1.

Example:

Consider the array [5, 2, 8, 1, 3] and the target element 8.

- Pass 1: 5 != 8

- Pass 2: 2 != 8
- Pass 3: 8 == 8 (Match found at index 2)

Linear Search has a time complexity of O(n), where n is the size of the array.

2. Binary Search:

Binary Search is an efficient searching algorithm that works on sorted arrays.


It repeatedly divides the search space in half, eliminating half of the
remaining elements in each iteration.

Algorithm:

1. Compare the target element with the middle element of the array.

2. If they are equal, return the index.

3. If the target element is smaller, repeat the process on the left half of the
array.

4. If the target element is larger, repeat the process on the right half of the
array.

5. Continue dividing until a match is found or the search space is empty.

Example:

Consider the sorted array [1, 2, 3, 5, 8] and the target element 3.

- Pass 1: 3 < 5 (Search left half)


- Pass 2: 3 == 3 (Match found at index 2)

Binary Search has a time complexity of O(log n), where n is the size of the
array.

Sorting and searching algorithms are essential tools in computer science. By


understanding these algorithms, their principles, and their efficiency, you
can choose the most appropriate algorithm for a given problem or
application. It is important to consider the input size, expected data
distribution, and performance requirements when selecting an algorithm.

Part 1: Sorting Algorithms

6. Heap Sort:

Heap Sort is a comparison-based sorting algorithm that uses a binary heap


data structure. It involves building a heap from the array and repeatedly
extracting the maximum element from the heap and placing it at the end of
the sorted portion.

Algorithm:

1. Build a max-heap from the array.

2. Swap the root (maximum element) with the last element of the heap.
3. Reduce the heap size by one.

4. Heapify the remaining heap to maintain the max-heap property.

5. Repeat steps 2-4 until the heap is empty.

Example:

Consider the array [5, 2, 8, 1, 3].

- Pass 1: Swap 8 with 3 -> [3, 5, 2, 1, 8]

- Pass 2: Swap 5 with 1 -> [1, 3, 2, 5, 8]

- Pass 3: Swap 3 with 2 -> [1, 2, 3, 5, 8]

Heap Sort has a time complexity of O(n log n) in all cases.

7. Counting Sort:

Counting Sort is a non-comparison based sorting algorithm that works well


for a range of input values. It creates a count array to store the frequency of
each distinct element and uses this information to sort the array.

Algorithm:

1. Find the minimum and maximum values in the array.

2. Create a count array to store the frequencies of each element.

3. Compute the cumulative sum of the count array.


4. Place each element from the input array into its correct position using the
count array.

5. Copy the sorted elements back into the original array.

Example:

Consider the array [5, 2, 8, 1, 3] with values ranging from 1 to 8.

- Counting Array: [0, 1, 1, 1, 0, 1, 0, 0, 1]

- Cumulative Sum: [0, 1, 2, 3, 3, 4, 4, 4, 5]

- Sorted Array: [1, 2, 3, 5, 8]

Counting Sort has a time complexity of O(n + k), where n is the size of the
array and k is the range of input values.

8. Radix Sort:

Radix Sort is a non-comparison based sorting algorithm that sorts the


elements based on their digits or characters. It iteratively sorts the elements
from the least significant digit to the most significant digit using a stable
sorting algorithm such as Counting Sort.

Algorithm:

1. Determine the maximum number of digits or characters in the input


elements.
2. Starting from the least significant digit, use a stable sorting algorithm to
sort the elements based on that digit.

3. Repeat step 2 for each subsequent digit until all digits are considered.

Example:

Consider the array [170, 45, 75, 90, 802, 24, 2, 66] represented as integers.

- Sort by the ones digit: [170, 90, 802, 2, 24, 45, 75, 66]

- Sort by the tens digit: [802, 2, 24, 45, 66, 170, 75, 90]

- Sort by the hundreds digit: [2, 24, 45, 66, 75, 90, 170, 802]

Radix Sort has a time complexity of O(d * (n + k)), where d is the maximum
number of digits, n is the size of the array, and k is the range of input values.

Part 2: Searching Algorithms

3. Binary Search Tree (BST):

Binary Search Tree is a binary tree data structure that maintains the property
that the left subtree of a node contains values smaller than the node, and the
right subtree contains values larger than the node. It allows for efficient
searching and insertion.

Algorithm:
1. Start at the root node.

2. If the target value is equal to the current node’s value, return the node.

3. If the target value is less than the current node’s value, go to the left
subtree.

4. If the target value is greater than the current node’s value, go to the right
subtree.

5. Repeat steps 2-4 until a match is found or a leaf node is reached.

Example:

Consider a BST with the following elements: 5, 2, 8, 1, 3.

- Search for value 3:

- Start at root (5), go left (2), go right (3) -> Match found.

Binary Search Tree has an average time complexity of O(log n) for searching,
insertion, and deletion operations. However, in the worst case, it can have a
time complexity of O(n) if the tree is unbalanced.

4. Hashing:

Hashing is a technique that maps data elements to a fixed-size array called a


hash table using a hash function. It provides constant-time average-case
searching by computing the hash value of the key and using it as an index to
store or retrieve the corresponding element.
Algorithm:

1. Compute the hash value of the key using a hash function.

2. Use the hash value as an index to store or retrieve the element in the hash
table.

Example:

Consider a hash table with the following elements: [5, 2, 8, 1, 3Apologies for
the abrupt interruption. Let’s continue with the explanation of sorting and
searching algorithms:

5. Hashing:

Hashing is a technique that maps data elements to a fixed-size array called a


hash table using a hash function. It provides constant-time average-case
searching by computing the hash value of the key and using it as an index to
store or retrieve the corresponding element.

Algorithm:

1. Compute the hash value of the key using a hash function.

2. Use the hash value as an index to store or retrieve the element in the hash
table.
Example:

Consider a hash table with the following elements: [5, 2, 8, 1, 3].

- Hash Function: key mod table_size (e.g., key % 5).

- Hash Table: [ , 1, 2, , 5, , , , 8].

- To retrieve the value 8, compute its hash (8 % 5 = 3) and access the element
at index 3.

Hashing provides constant-time average-case search, insertion, and deletion


operations, but in the worst case, it can have a time complexity of O(n) if
there are many collisions.

6. Binary Search:

Binary Search is an efficient search algorithm that works on sorted arrays. It


repeatedly divides the search space in half by comparing the target value with
the middle element, discarding the half that cannot contain the target.

Algorithm:

1. Set low and high pointers to the start and end of the search space,
respectively.

2. While the low pointer is less than or equal to the high pointer:

a. Compute the middle index as (low + high) / 2.

b. If the middle element is equal to the target, return its index.


c. If the target is less than the middle element, set the high pointer to
middle – 1.

d. If the target is greater than the middle element, set the low pointer to
middle + 1.

3. If the target is not found, return a “not found” indicator.

Example:

Consider the sorted array [1, 2, 3, 5, 8].

- Search for value 3:

- Initial low = 0, high = 4.

- Middle = (0 + 4) / 2 = 2.

- Target (3) < Middle (3) -> Set high = 2 – 1 = 1.

- New low = 0, new high = 1.

- Middle = (0 + 1) / 2 = 0.

- Target (3) > Middle (1) -> Set low = 0 + 1 = 1.

- New low = 1, new high = 1.

- Middle = (1 + 1) / 2 = 1.

- Target (3) = Middle (2) -> Match found.

Binary Search has a time complexity of O(log n) as it reduces the search space
by half in each iteration.
These are just a few examples of sorting and searching algorithms. There are
many more algorithms with different characteristics and use cases.

Merging Algorithms

Introduction:

Merging is a fundamental operation in computer science and is commonly


used in various algorithms and applications. It involves combining two or
more sorted sequences or arrays into a single sorted sequence or array.
Merging is a crucial step in many sorting algorithms as well as in applications
such as database operations, merge sort, and external sorting. In this lecture,
we will explore different merging algorithms and their properties, along with
examples.

I. Naïve Approach:

The simplest approach to merging two sorted sequences or arrays is the naïve
approach. It involves comparing the elements from both sequences and
selecting the smallest element to add to the merged sequence. This process
continues until all elements are merged.
Example:

Consider two sorted sequences: A = [2, 4, 6, 8] and B = [1, 3, 5, 7]. We want


to merge these sequences into a single sorted sequence.

Algorithm:

1. Initialize an empty merged sequence: Merged = [].

2. Compare the first elements from both sequences: A[0] = 2 and B[0] = 1.

3. Select the smaller element (1) and append it to the merged sequence:
Merged = [1].

4. Move the pointer of the selected sequence (B) to the next element.

5. Repeat steps 2-4 until all elements from both sequences are merged.

Following the algorithm, we compare 2 and 3, select 2, compare 3 and 4,


select 3, compare 4 and 5, select 4, compare 5 and 6, select 5, compare 6 and
7, select 6, compare 7 and 8, select 7, and finally, append 8 to the merged
sequence.

The merged sequence will be Merged = [1, 2, 3, 4, 5, 6, 7, 8].

Time Complexity:
The time complexity of the naïve merging algorithm is O(n), where n is the
total number of elements in both sequences. This is because each element is
compared and added to the merged sequence once.

II. Two-Pointers Approach:

The two-pointers approach is an optimized version of the naïve approach


that uses two pointers to track the positions in the sequences being merged.
This approach avoids unnecessary comparisons and achieves better
performance.

Example:

Consider the same sorted sequences as in the previous example: A = [2, 4, 6,


8] and B = [1, 3, 5, 7].

Algorithm:

1. Initialize two pointers, one for each sequence, pointing to the first element:
I = 0, j = 0.

2. Initialize an empty merged sequence: Merged = [].

3. Compare the elements at the current positions of the pointers: A[i] = 2 and
B[j] = 1.

4. Select the smaller element (1) and append it to the merged sequence:
Merged = [1].

5. Move the pointer of the selected sequence (B) to the next element: j = 1.
6. Repeat steps 3-5 until one of the sequences is fully merged.

Following the algorithm, we compare 2 and 3, select 2, compare 3 and 4,


select 3, compare 4 and 5, select 4, compare 5 and 6, select 5, compare 6 and
7, select 6, compare 7 and 8, select 7, and finally, append 8 to the merged
sequence.

The merged sequence will be Merged = [1, 2, 3, 4, 5, 6, 7, 8].

Time Complexity:

The time complexity of the two-pointers merging algorithm is O(n), where n


is the total number of elements in both sequences. Similar to the naïve
approach, each element is compared and added to the merged sequence
once.

III. Merge Sort:

Merge sort is a divide-and-conquer sorting algorithm that utilizes the


merging operation. It recursively divides the input sequence into smaller
sub-sequences, sorts them individually, and then merges them to obtain the
final sorted sequence.

Example:
Consider an unsorted sequence: [7, 2, 5, 1, 8, 3, 6, 4]. We want to sort this
sequence using merge sort.

Algorithm:

1. Divide the input sequence into two halves: [7, 2, 5, 1] and [8, 3, 6, 4].

2. Recursively apply merge sort to each half.

For the first half:

- Divide: [7, 2, 5, 1] -> [7, 2] and [5, 1]

- Recursively apply merge sort: [7, 2] -> [2, 7] and [5, 1] -> [1, 5]

- Merge the sorted halves: [2, 7] and [1, 5] -> [1, 2, 5, 7]

For the second half:

- Divide: [8, 3, 6, 4] -> [8, 3] and [6, 4]

- Recursively apply merge sort: [8, 3] -> [3, 8] and [6, 4] -> [4, 6]

- Merge the sorted halves: [3, 8] and [4, 6] -> [3, 4, 6, 8]

3. Merge the two sorted halves using the merging algorithm.

- Merge [1, 2, 5, 7] and [3, 4, 6, 8] -> [1, 2, 3, 4, 5, 6, 7, 8]

The final sorted sequence is [1, 2, 3, 4, 5, 6, 7, 8].


Time Complexity:

The time complexity of merge sort is O(n log n), where n is the number of
elements in the input sequence. This is because the input sequence is divided
into halves recursively, and the merge operation takes linear time.

IV. External Sorting:

External sorting is a technique used when the data to be sorted does not fit
entirely in the main memory. It involves dividing the data into smaller blocks
that can fit in memory, sorting each block, and then merging the sorted
blocks using a merging algorithm.

Example:

Consider a large dataset that cannot fit in memory entirely. We want to sort
this dataset using external sorting.

Algorithm:

1. Divide the data into smaller blocks that fit in memory.

- Suppose the data is divided into four blocks: [7, 2, 5, 1], [8, 3, 6, 4], [9, 12,
10, 11], [15, 13, 16, 14].

2. Sort each block using an in-memory sorting algorithm (e.g., quicksort).


- Sort each block individually: [1, 2, 5, 7], [3, 4, 6, 8], [9, 10, 11, 12], [13, 14,
15, 16].

3. Merge the sorted blocks using a merging algorithm (e.g., k-way merge).

- Merge [1, 2, 5, 7], [3, 4, 6, 8], [9, 10, 11, 12], [13, 14, 15, 16] using k-way
merge.

4. Repeat steps 2-3 until all blocks are merged into a single sorted sequence.

Time Complexity:

The time complexity of external sorting depends on the size of the data and
the number of blocks. It is typically dominated by the merging step, which
has a time complexity of O(n log k), where n is the total number of elements
and k is the number of blocks.

More on the same

1. Stability:

Merging algorithms can be categorized as stable or unstable based on their


behavior when encountering equal elements. A stable merging algorithm
preserves the relative order of equal elements in the input sequences, while
an unstable merging algorithm may change their order during the merge.

For example, let’s consider two input sequences: A = [2, 4, 4, 6] and B = [1,
3, 4, 7]. If we merge these sequences using a stable merging algorithm, the
result will be [1, 2, 3, 4, 4, 4, 6, 7], where the order of the equal elements is
maintained. On the other hand, an unstable merging algorithm may produce
a different result, such as [1, 2, 3, 4, 4, 6, 4, 7], where the order of the two
equal 4s is changed.

2. In-Place Merging:

In some scenarios, it is desirable to merge two sorted sequences or arrays in-


place, without using additional memory. In-place merging algorithms
achieve this by rearranging the elements within the original sequences.

One popular in-place merging algorithm is the “In-Place Merge” algorithm,


also known as the “Huang’s algorithm.” It works by repeatedly finding the
maximum element from one sequence and inserting it into the correct
position in the other sequence. This process continues until one of the
sequences is fully merged.

In-place merging algorithms can be more challenging to implement and


typically have higher time complexity than merging algorithms that use
additional memory.

3. K-Way Merge:

The merging algorithms discussed so far merge two sequences at a time.


However, there are scenarios where we need to merge more than two
sequences efficiently. K-way merge algorithms handle this situation by
merging k sorted sequences into a single sorted sequence.

One common approach for k-way merging is to use a min-heap or a priority


queue. Initially, the first element from each sequence is inserted into the
heap. Then, at each step, the smallest element is removed from the heap and
added to the merged sequence. The next element from the same sequence as
the removed element is inserted into the heap. This process continues until
all sequences are fully merged.

K-way merge algorithms are commonly used in external sorting, where large
datasets are divided into multiple blocks, and each block is sorted
individually. The sorted blocks are then merged using k-way merge to obtain
the final sorted result.

4. Parallel Merging:

With the increasing availability of multi-core processors and parallel


computing, parallel merging algorithms have gained importance. These
algorithms take advantage of parallelism to merge sequences more
efficiently.

Parallel merging algorithms can be designed to merge multiple elements in


parallel, either by dividing the input sequences into smaller chunks or by
using parallel comparison and selection techniques. These algorithms aim to
achieve better performance by utilizing the available computing resources.

However, designing efficient parallel merging algorithms can be complex


and requires careful consideration of load balancing, synchronization, and
communication among parallel processes.

5. Applications of Merging:

Merging algorithms have numerous applications beyond sorting. Some


notable applications include:

- Database Operations: Merging is used in database operations


such as merging two sorted result sets from different tables or
merging the results of multiple queries.

- External Sorting: Merging algorithms play a crucial role in


external sorting, where data is too large to fit in memory entirely
and needs to be sorted using disk-based operations.

- Union-Find Data Structure: Merging algorithms are used in the


union operation of the union-find data structure, which
maintains a partition of a set into disjoint subsets.
- Merge-Policy in Version Control Systems: Version control
systems like Git use merging algorithms to combine changes
from different branches or versions of source code files.

Merging algorithms are fundamental to many areas of computer science and


have various properties, optimizations, and applications. Understanding
these algorithms provides a solid foundation for efficient algorithm design,
sorting, and working with sorted data in different contexts.

Tree and Graph Traversals

1. Introduction:

- Trees and graphs are fundamental data structures used in many


algorithms.

- Traversing a tree or a graph involves visiting all its nodes or vertices in a


systematic way.
- Traversals are essential for various applications, such as searching,
pathfinding, and network analysis.

2. Tree Traversals:

2.1 Depth-First Traversals:

- Depth-First Traversals explore a tree or graph by going as deep as


possible before backtracking.

- There are three common depth-first traversal techniques:

a. Pre-order Traversal:

- Visit the current node before its children.

- Recursive algorithm: Visit the root, perform pre-order traversal on


the left subtree, perform pre-order traversal on the right subtree.

- Example:

/ \

2 3

/\ /\

4 56 7

Pre-order traversal: 1, 2, 4, 5, 3, 6, 7

b. In-order Traversal:
- Visit the current node between its left and right children.

- Recursive algorithm: Perform in-order traversal on the left subtree,


visit the root, perform in-order traversal on the right subtree.

- Example:

/ \

2 3

/\ /\

4 56 7

In-order traversal: 4, 2, 5, 1, 6, 3, 7

c. Post-order Traversal:

- Visit the current node after its children.

- Recursive algorithm: Perform post-order traversal on the left


subtree, perform post-order traversal on the right subtree, visit the root.

- Example:

/ \

2 3

/\ /\

4 56 7

Post-order traversal: 4, 5, 2, 6, 7, 3, 1
2.2 Breadth-First Traversal:

- Breadth-First Traversal explores a tree or graph level by level, visiting


all nodes at each level before moving to the next level.

- It uses a queue data structure to maintain the order of visiting nodes.

- Algorithm: Enqueue the root, then repeat the following steps until the
queue is empty: Dequeue a node, visit the dequeued node, enqueue its
children.

- Example:

/ \

2 3

/\ /\

4 56 7

Breadth-first traversal: 1, 2, 3, 4, 5, 6, 7

3. Graph Traversals:

3.1 Depth-First Search (DFS):

- DFS explores a graph by visiting a node and recursively exploring all its
unvisited neighbors.

- It uses a stack or recursion to maintain the order of visiting nodes.

- Algorithm: Visit a node, mark it as visited, recursively visit all its


unvisited neighbors.
- Example:

1---2---3---4

5---6---7

DFS traversal starting from node 1: 1, 2, 3, 4, 7, 6, 5

3.2 Breadth-First Search (BFS):

- BFS explores a graph level by level, visiting all nodes at each level before
moving to the next level.

- It uses a queue data structure to maintain the order of visiting nodes.

- Algorithm: Enqueue the start node, then repeat the following steps until
the queue is empty: Dequeue a node, visit the dequeued node, enqueue its
unvisited neighbors.

- Example:

1---2---3---4

5---6---7

BFS traversal starting from node 1: 1, 2, 5, 3, 6, 4, 7

4. Applications and Analysis:

- Traversals are used in various algorithms and applications:

- Searching for a specific node or value in a tree or graph.

- Pathfinding algorithms like Dijkstra’s algorithm and A* search.


- Connectivity analysis, component labeling, and cycle detection.

- Topological sorting of directed acyclic graphs.

- Network analysis and traversal-based algorithms like breadth-first


search-based page ranking.

- Time complexity of tree and graph traversals depends on the number of


nodes/vertices and edges.

-Continued:

- For trees, the time complexity of all traversal techniques is O(n), where n
is the number of nodes.

- For graphs, the time complexity of traversals depends on the data


structure used to represent the graph (e.g., adjacency matrix or adjacency
list) and the number of edges.

- In the worst case, the time complexity of graph traversals can be O(|V| +
|E|), where |V| is the number of vertices and |E| is the number of edges.

- The space complexity of tree and graph traversals is O(h), where h is the
height of the tree or the maximum depth of any path in the graph.

5. Conclusion:

- Tree and graph traversals are essential techniques in algorithm design


and analysis.

- Depth-first and breadth-first traversals offer different perspectives on


exploring the structure of a tree or graph.
- Understanding these traversal techniques is crucial for solving various
algorithmic problems efficiently.

Shortest Path Algorithms

1. Introduction:

- Shortest path algorithms are used to find the shortest path between two
vertices in a graph.

- They are widely used in various applications, such as navigation systems,


network routing, and resource optimization.

- Two popular shortest path algorithms are Dijkstra’s algorithm and the
Bellman-Ford algorithm.

2. Dijkstra’s Algorithm:

- Dijkstra’s algorithm finds the shortest path from a source vertex to all
other vertices in a weighted graph.

- It works for both directed and undirected graphs with non-negative edge
weights.

- Algorithm Steps:
a. Initialize a distance table to store the shortest distance from the source
vertex to each vertex.

b. Set the distance of the source vertex to 0 and all other distances to
infinity.

c. Select the vertex with the minimum distance from the distance table and
mark it as visited.

d. Update the distances of its neighboring vertices if a shorter path is


found.

e. Repeat steps c and d until all vertices are visited.

- Example:

A—4—B—2—D

/ /

2 1 6

/ /

C—3—E—5

Applying Dijkstra’s algorithm from source vertex A:

- Initialize distance table: A(0), B(inf), C(inf), D(inf), E(inf)

- Visit A and update distances: B(4), C(2)

- Visit C and update distances: E(5)

- Visit B and update distances: D(6)


- Visit E and update distances: D(6)

- Final distance table: A(0), B(4), C(2), D(6), E(5)

- Shortest path from A to D: A -> B -> D (total distance = 6)

3. Bellman-Ford Algorithm:

- The Bellman-Ford algorithm finds the shortest path from a source vertex
to all other vertices in a weighted graph.

- It works for both directed and undirected graphs and can handle graphs
with negative edge weights.

- Algorithm Steps:

a. Initialize a distance array to store the shortest distance from the source
vertex to each vertex.

b. Set the distance of the source vertex to 0 and all other distances to
infinity.

c. Relax all edges repeatedly to find shorter paths until no further


improvements can be made.

d. Detect negative cycles if the distances can still be reduced after the
relaxation phase.

- Example:

A—4—B—2—D

/ /

-1 1 6
/ /

C—3—E—5

Applying the Bellman-Ford algorithm from source vertex A:

- Initialize distance array: A(0), B(inf), C(inf), D(inf), E(inf)

- Relax edges repeatedly:

- Relax (A, B): B(-1)

- Relax (A, C): C(-1)

- Relax (B, D): D(1)

- Relax (C, E): E(2)

- Final distance array: A(0), B(-1), C(-1), D(1), E(2)

- Shortest path from A to D: A -> B -> D (total distance = 1)

4. Time and Space Complexity:

- Dijkstra’s algorithm has a time complexity of O((V + E) log V) using a


min-heap data structure for efficient selection of the next vertex.

- The Bellman-Ford algorithm has a time complexity of O(V * E), where V


is the number of vertices and E is the number of edges.

- Both algorithms have a space complexity of O(V) to store the distance


table or array.
5. Optimizations and Variations:

- Various optimizations can be applied to improve the efficiency of shortest


path algorithms, such as using Fibonacci heaps for Dijkstra’s algorithm or
the SPFA algorithm for certain cases of the Bellman-Ford algorithm.

- Variations of shortest path algorithms include the A* algorithm, which


uses heuristics to guide the search towards the target vertex, and the Floyd-
Warshall algorithm, which finds the shortest path between all pairs of
vertices in a graph.

6. Conclusion:

- Shortest path algorithms are essential tools for finding the most efficient
routes or paths in graphs.

- Dijkstra’s algorithm and the Bellman-Ford algorithm are widely used for
different graph types and edge weights.

- Understanding these algorithms and their variations will enable you to


solve various optimization problems effectively.

Additional thoughts

1. Dijkstra’s Algorithm:

- Example Continued:

Let’s consider a modified version of the previous example where the graph
has additional edges and weights:
A—4—B—2—D

/\ /\

2 1 2 3

/ \ / \

C—3—E—5 F—2—G

Applying Dijkstra’s algorithm from source vertex A:

- Initialize distance table: A(0), B(inf), C(inf), D(inf), E(inf), F(inf), G(inf)

- Visit A and update distances: B(4), C(2)

- Visit C and update distances: E(5)

- Visit B and update distances: D(6), F(5)

- Visit F and update distances: G(7)

- Visit E and update distances: D(6)

- Final distance table: A(0), B(4), C(2), D(6), E(5), F(5), G(7)

- Shortest path from A to G: A -> B -> F -> G (total distance = 11)

2. Bellman-Ford Algorithm:

- Example Continued:

Let’s continue with the modified example graph for the Bellman-Ford
algorithm:

A—4—B—2—D
/\ /\

-1 1 2 3

/ \ / \

C—3—E—5 F—2—G

Applying the Bellman-Ford algorithm from source vertex A:

- Initialize distance array: A(0), B(inf), C(inf), D(inf), E(inf), F(inf), G(inf)

- Relax edges repeatedly:

- Relax (A, B): B(-1)

- Relax (A, C): C(-1)

- Relax (B, D): D(1)

- Relax (C, E): E(2)

- Relax (B, F): F(1)

- Relax (E, D): D(0)

- Relax (F, G): G(3)

- Final distance array: A(0), B(-1), C(-1), D(0), E(2), F(1), G(3)

- Shortest path from A to G: A -> B -> F -> G (total distance = 3)

3. A* Algorithm:

- The A* algorithm combines Dijkstra’s algorithm with heuristics to guide


the search towards the target vertex efficiently.
- Example:

Consider a grid-based graph where each cell represents a location and has
a movement cost associated with it:

- The start cell is S, the target cell is T, and the empty cells are represented
by dots.

- The movement costs from one cell to an adjacent cell are as follows:

- Horizontal or vertical movement: 1

- Diagonal movement: sqrt(2) (approx. 1.41)

The A* algorithm uses the Euclidean distance as a heuristic to estimate


the remaining distance to the target cell.

By considering both the actual cost from the start cell and the estimated
cost to the target cell, the A* algorithm explores the most promising paths
first.

4. Floyd-Warshall Algorithm:

- The Floyd-Warshall algorithm finds the shortest path between all pairs of
vertices in a graph.

- Example:

Consider the following weighted graph:

A—3—B—5—C

/ | |
2 4 1

/ | |

D—6—E—2—F

Applying the Floyd-Warshall algorithm:

- Initialize the distance matrix with the weights of the edges and infinity
for unreachable pairs.

- Iterate through all vertices as intermediate nodes and update the


distance matrix if a shorter path is found.

- After the algorithm completes, the distance matrix will contain the
shortest path between all pairs of vertices.

Minimum Spanning Tree Algorithms

1. Introduction:

- Minimum Spanning Tree (MST) algorithms are used to find the


minimum-weight connected subgraph that spans all the vertices in a given
graph.

- MSTs are widely used in various applications, such as network design,


clustering, and approximation algorithms.
- Two popular MST algorithms are Kruskal’s algorithm and Prim’s
algorithm.

2. Kruskal’s Algorithm:

- Kruskal’s algorithm constructs an MST by repeatedly adding the edges


with the smallest weight that do not form cycles.

- Algorithm Steps:

a. Create a disjoint set for each vertex in the graph.

b. Sort the edges of the graph in non-decreasing order of their weights.

c. Iterate through the sorted edges and add each edge to the MST if it does
not form a cycle.

d. Repeat step c until the MST contains (V-1) edges, where V is the number
of vertices.

- Example:

4 3 2 6

A----B----C----D

2 1 4 5

E----F----G----H

3 2
Applying Kruskal’s algorithm to find the MST:

- Sort the edges in non-decreasing order of their weights: (B, C)(1), (F,
G)(2), (A, E)(2), (H, G)(2), (A, B)(3), (C, D)(4), (A, F)(4), (D, G)(5)

- Initialize an empty MST.

- Add edges in the sorted order, ensuring they do not form cycles:

- Add (B, C)(1) to the MST.

- Add (F, G)(2) to the MST.

- Add (A, E)(2) to the MST.

- Add (A, B)(3) to the MST.

- Add (C, D)(4) to the MST.

- Add (A, F)(4) to the MST.

- Add (D, G)(5) to the MST.

- Final MST: (B, C)(1), (F, G)(2), (A, E)(2), (A, B)(3), (C, D)(4), (A, F)(4),
(D, G)(5)

- Total weight of the MST: 21

3. Prim’s Algorithm:

- Prim’s algorithm grows an MST incrementally from a starting vertex by


adding the nearest vertex with the minimum weight edge.

- Algorithm Steps:

a. Initialize an empty MST and a set to track visited vertices.


b. Start from an arbitrary vertex and mark it as visited.

c. Repeat until all vertices are visited:

- Find the minimum weight edge that connects a visited vertex to an


unvisited vertex.

- Add the edge and the unvisited vertex to the MST and mark the vertex
as visited.

- Example:

4 3 2 6

A----B----C----D

2 1 4 5

E----F----G----H

3 2

Applying Prim’s algorithm to find the MST (starting from vertex A):

- Initialize an empty MST and mark vertex A as visited.

- Repeat until all vertices are visited:

- Find the minimum weight edge that connects a visited vertex to an


unvisited vertex:

- The minimum weight edge is (A, B)(3).

- Add the edge (A, B)(3) and vertex B to the MST, mark B as visited.
- Find the minimum weight edge that connects a visited vertex to an
unvisited vertex:

- The minimum weight edge is (B, F)(1).

- Add the edge (B, F)(1) and vertex F to the MST, mark F as visited.

- Find the minimum weight edge that connects a visited vertex to an


unvisited vertex:

- The minimum weight edge is (F, G)(2).

- Add the edge (F, G)(2) and vertex G to the MST, mark G as visited.

- Find the minimum weight edge that connects a visited vertex to an


unvisited vertex:

- The minimum weight edge is (G, H)(2).

- Add the edge (G, H)(2) and vertex H to the MST, mark H as visited.

- Find the minimum weight edge that connects a visited vertex to an


unvisitedvertex:

- There are no more unvisited vertices, so the algorithm terminates.

- Final MST: (A, B)(3), (B, F)(1), (F, G)(2), (G, H)(2)

- Total weight of the MST: 8

4. Comparison of Kruskal’s and Prim’s Algorithms:

- Both Kruskal’s and Prim’s algorithms find the MST, but they differ in their
approaches.
- Kruskal’s algorithm is a greedy algorithm that focuses on selecting edges
based on their weights, ensuring there are no cycles.

- Prim’s algorithm is also a greedy algorithm that grows the MST


incrementally from a starting vertex, focusing on adding the nearest vertex
with the minimum weight edge.

- Time Complexity: Kruskal’s algorithm has a time complexity of O(E log


E) or O(E log V) (if using a suitable data structure). Prim’s algorithm has a
time complexity of O(E log V) (if using a suitable data structure).

- Kruskal’s algorithm is suitable for sparse graphs, while Prim’s algorithm


performs well on dense graphs.

5. Conclusion:

- Minimum Spanning Tree algorithms, such as Kruskal’s and Prim’s


algorithms, are fundamental in finding the minimum-weight connected
subgraph that spans all vertices in a graph.

- These algorithms have various applications, including network design,


clustering, and approximation algorithms.

- Understanding the differences between Kruskal’s and Prim’s algorithms


and their implementations is crucial for solving MST problems efficiently.
Additional details and examples of minimum spanning tree
(MST) algorithms:

1. Kruskal’s Algorithm:

- Kruskal’s algorithm is a greedy algorithm that finds the MST by


considering edges in ascending order of their weights.

- Example Continued:

Let’s consider a weighted graph with the following edges and weights:

(A, B)(4), (A, C)(2), (B, C)(1), (B, D)(2), (C, E)(3), (D, E)(6), (D, F)(5),
(E, F)(4)

Applying Kruskal’s algorithm to find the MST:

- Sort the edges in non-decreasing order of their weights: (B, C)(1), (A,
C)(2), (B, D)(2), (A, B)(4), (C, E)(3), (E, F)(4), (D, F)(5), (D, E)(6)

- Initialize an empty MST.

- Add edges in the sorted order, ensuring they do not form cycles:

- Add (B, C)(1) to the MST.

- Add (A, C)(2) to the MST.

- Add (B, D)(2) to the MST.

- Add (C, E)(3) to the MST.

- Add (E, F)(4) to the MST.


- Final MST: (B, C)(1), (A, C)(2), (B, D)(2), (C, E)(3), (E, F)(4)

- Total weight of the MST: 12

2. Prim’s Algorithm:

- Prim’s algorithm is a greedy algorithm that grows the MST incrementally


from a starting vertex by adding the nearest vertex with the minimum weight
edge.

- Example Continued:

Let’s continue with the same weighted graph as before.

Applying Prim’s algorithm to find the MST (starting from vertex A):

- Initialize an empty MST and mark vertex A as visited.

- Repeat until all vertices are visited:

- Find the minimum weight edge that connects a visited vertex to an


unvisited vertex:

- The minimum weight edge is (A, C)(2).

- Add the edge (A, C)(2) and vertex C to the MST, mark C as visited.

- Find the minimum weight edge that connects a visited vertex to an


unvisited vertex:

- The minimum weight edge is (B, C)(1).

- Add the edge (B, C)(1) and vertex B to the MST, mark B as visited.
- Find the minimum weight edge that connects a visited vertex to an
unvisited vertex:

- The minimum weight edge is (B, D)(2).

- Add the edge (B, D)(2) and vertex D to the MST, mark D as visited.

- Find the minimum weight edge that connects a visited vertex to an


unvisited vertex:

- The minimum weight edge is (D, F)(5).

- Add the edge (D, F)(5) and vertex F to the MST, mark F as visited.

- Find the minimum weight edge that connects a visited vertex to an


unvisited vertex:

- The minimum weight edge is (E, F)(4).

- Add the edge (E, F)(4) and vertex E to the MST, mark E as visited.

- Final MST: (A, C)(2), (B, C)(1), (B, D)(2), (D, F)(5), (E, F)(4)

- Total weight of the MST: 14

3. Comparison of Kruskal’s and Prim’s Algorithms:

- Both Kruskal’s and Prim’s algorithms aim to find the MST, but they differ
in their approaches.

- Kruskal’s algorithm is a bottom-up approach that considers edges in


ascending order of their weights and ensures no cycles are formed.
- Prim’s algorithm is a top-down approach that starts from a vertex and
grows the MST incrementally by adding the nearest vertex with the
minimum weight edge.

- Kruskal’s algorithm is suitable for sparse graphs, while Prim’s algorithm


performs well on dense graphs.

- It’s important to note that the choice of algorithm depends on the specific
problem and graph characteristics.

4. Real-World Examples:

- Minimum spanning tree algorithms have practical applications in various


fields, such as:

- Network Design: Finding the most cost-effective way to connect cities or


nodes in a telecommunications network.

- Electric Grid Design: Determining the optimal placement of power lines


to connect different regions while minimizing costs.

- Transportation Planning: Designing efficient routes for delivery trucks


or deciding on the placement of highways to connect cities.

- Cluster Analysis: Identifying clusters or communities within a dataset by


finding the most interconnected subgraph.

- Approximation Algorithms: MST algorithms are used as a subroutine in


approximation algorithms for solving NP-hard problems.
- For example, let’s consider a network design problem where you
want to connect a set of cities with a minimum cost. Each city is
represented by a vertex, and the edges represent the connection
between cities with associated costs.

Applying Kruskal’s algorithm or Prim’s algorithm can help find the


minimum-cost network:

- Given the cities and their connections with associated costs:

- City A is connected to City B with a cost of 4.

- City A is connected to City C with a cost of 2.

- City B is connected to City C with a cost of 1.

- City B is connected to City D with a cost of 2.

- City C is connected to City E with a cost of 3.

- City D is connected to City E with a cost of 6.

- City D is connected to City F with a cost of 5.

- City E is connected to City F with a cost of 4.

By applying Kruskal’s algorithm or Prim’s algorithm to this problem, you


can find the minimum-cost network that connects all the cities efficiently.

Order Statistics
Introduction:

Order statistics is a fundamental concept in computer science and


algorithms. It refers to the study of finding the k-th smallest or largest
element in a set of elements. The order statistics problem has various
applications, such as finding the median, finding the k-th largest element,
and solving problems related to ranking and selection. In this lecture, we will
discuss several algorithms and techniques for solving order statistics
problems, along with relevant examples.

1. Selection Algorithm:

- The selection algorithm is a well-known algorithm for finding the k-th


smallest element in an unsorted array.

- Example:

Let’s consider an array: [9, 2, 5, 1, 7, 4, 8, 6, 3]

If we want to find the 3rd smallest element, k = 3.

Applying the selection algorithm:

- Choose a pivot element (e.g., the first element).

- Partition the array around the pivot, such that elements smaller than the
pivot are on the left and larger elements are on the right.

- If the pivot’s index is equal to k-1, return the pivot element.

- If the pivot’s index is greater than k-1, recursively search the left
partition.
- If the pivot’s index is less than k-1, recursively search the right partition.

- Repeat the steps until the k-th smallest element is found.

- In our example, the 3rd smallest element is 3.

2. Quickselect Algorithm:

- Quickselect is a variation of the quicksort algorithm that efficiently finds


the k-th smallest element.

- Example:

Let’s continue with the previous array: [9, 2, 5, 1, 7, 4, 8, 6, 3]

If we want to find the 3rd smallest element, k = 3.

Applying the Quickselect algorithm:

- Choose a pivot element.

- Partition the array around the pivot, similar to the selection algorithm.

- If the pivot’s index is equal to k-1, return the pivot element.

- If the pivot’s index is greater than k-1, recursively search the left
partition.

- If the pivot’s index is less than k-1, recursively search the right partition.

- Repeat the steps until the k-th smallest element is found.

- In our example, the 3rd smallest element is 3.

3. Balanced Binary Search Trees:


- Balanced binary search trees, such as AVL trees and red-black trees, can
be used to solve order statistics problems efficiently.

- Example:

Let’s assume we have an AVL tree with the following elements: [2, 4, 6, 7,
8, 9]

If we want to find the 3rd smallest element, k = 3.

Applying the AVL tree algorithm:

- Starting from the root, compare the k value with the size of the left
subtree.

- If k is less than the size of the left subtree, recursively search the left
subtree.

- If k is equal to the size of the left subtree plus one, return the current
node’s value.

- If k is greater than the size of the left subtree plus one, recursively search
the right subtree by adjusting k.

- In our example, the 3rd smallest element is 6.

4. Augmented Data Structures:

- Augmented data structures, such as order statistic trees, provide


additional information about the order statistics of a set.

- Example:
Let’s consider an order statistic tree with the following elements: [2, 4, 6,
7, 8, 9]

If we want to find the rank (position) of the element 7, the rank is 4.

Applying the order statistic tree algorithm:

- Each node in the tree stores the size of its subtree (number of elements).

- Starting from the root, compare the target element with the current
node’s value.

- If the target element is smaller, recursively search the left subtree.

- If the target element is equal to the current node’s value, return the size
of the left subtree plus one.

- If the target element is greater, recursively search the right subtree and
adjust the rank by subtracting the size of the left subtree plus one.

- In our example, the rank of the element 7 is 4.

5. Heap-based Algorithms:

- Heaps, such as binary heaps and Fibonacci heaps, can be utilized to solve
order statistics problems efficiently.

- Example:

Let’s assume we have a binary min-heap with the following elements: [1,
2, 3, 4, 5, 6, 7, 8, 9]

If we want to find the 3rdsmallest element, k = 3.

Applying the binary min-heap algorithm:


- Extract the minimum element (root) k-1 times.

- After k-1 extractions, the k-th smallest element will be the new minimum
element.

- In our example, the 3rd smallest element is 3.

Conclusion:

Order statistics play a crucial role in various algorithms and applications. In


this lecture, we discussed several algorithms and techniques for solving order
statistics problems, including the selection algorithm, Quickselect algorithm,
balanced binary search trees, augmented data structures, and heap-based
algorithms. These algorithms provide efficient ways to find the k-th smallest
or largest element in a set of elements, allowing us to solve problems related
to ranking, selection, and finding medians. Understanding and
implementing these algorithms will enhance your problem-solving skills in
the field of algorithms and data structures.

Additional information

1. Selection Algorithm:

- Example:

Consider an array of integers: [9, 2, 5, 1, 7, 4, 8, 6, 3]

We want to find the 3rd smallest element, k = 3.

Applying the selection algorithm:


- Choose the first element, 9, as the pivot.

- Partition the array around the pivot: [2, 5, 1, 7, 4, 8, 6, 3, 9]

- The pivot’s index is 8, which is not equal to k-1 (2).

- Since the pivot’s index (8) is greater than k-1 (2), we recursively search
the left partition: [2, 5, 1, 7, 4, 8, 6, 3]

- Choose the first element of the left partition, 2, as the new pivot.

- Partition the left partition: [1, 2, 5, 7, 4, 8, 6, 3]

- The new pivot’s index is 1, which is equal to k-1 (2).

- Therefore, we have found the 3rd smallest element, which is 5.

2. Quickselect Algorithm:

- Example:

Consider the same array of integers: [9, 2, 5, 1, 7, 4, 8, 6, 3]

We want to find the 3rd smallest element, k = 3.

Applying the Quickselect algorithm:

- Choose a pivot, let’s say the first element, 9.

- Partition the array around the pivot: [2, 5, 1, 7, 4, 8, 6, 3, 9]

- The pivot’s index is 8, which is not equal to k-1 (2).

- Since the pivot’s index (8) is greater than k-1 (2), we recursively search
the left partition: [2, 5, 1, 7, 4, 8, 6, 3]

- Choose a new pivot, let’s say the first element of the left partition, 2.
- Partition the left partition: [1, 2, 5, 7, 4, 8, 6, 3]

- The new pivot’s index is 1, which is equal to k-1 (2).

- Therefore, we have found the 3rd smallest element, which is 5.

3. Balanced Binary Search Trees:

- Example:

Consider an AVL tree with the following elements: [2, 4, 6, 7, 8, 9]

We want to find the 3rd smallest element, k = 3.

Applying the AVL tree algorithm:

- Starting from the root (6), compare k (3) with the size of the left subtree
(2).

- Since k is greater than the size of the left subtree, we recursively search
the right subtree.

- In the right subtree, compare k (3) with the adjusted rank (k –


left_subtree_size – 1 = 3 – 2 – 1 = 0) of the node 7.

- Since k is equal to the adjusted rank of the node 7, we return its value.

- Therefore, the 3rd smallest element is 7.

4. Augmented Data Structures:

- Example:
Consider an order statistic tree with the following elements: [2, 4, 6, 7, 8,
9]

We want to find the rank (position) of the element 7.

Applying the order statistic tree algorithm:

- Starting from the root (6), compare the target element (7) with the
current node’s value.

- Since the target element is greater than the current node’s value, we
recursively search the right subtree.

- In the right subtree, we adjust the rank by subtracting the size of the left
subtree plus one (3 + 1 = 4).

- Continuing the search, we reach the node 7, and since the target element
is equal to the current node’s value, we return the size of the left subtree plus
one, which is 3 + 1 = 4.

- Therefore, the rank of the element 7 is 4.

5. Heap-based Algorithms:

- Example:

Consider a binary min-heap with the following elements: [1, 2, 3, 4, 5, 6,


7, 8, 9]

We want to find the 3rd smallest element, k = 3.

Applying the binary min-heap algorithm:

- Extract the minimum element (root) k-1 (3-1 = 2) times:


- First extraction: The minimum element is 1. The heap becomes [2, 4, 3,
8, 5, 6, 7, 9].

- Second extraction: The minimum element is 2. The heap becomes [3,


4, 6, 8, 5, 9, 7].

- After k-1 extractions, the k-th smallest element will be the new minimum
element, which is 3.

- Therefore, the 3rd smallest element is 3.

Conclusion:

Order statistics algorithms provide efficient ways to find the k-th smallest or
largest element in a set of elements. We explored various algorithms,
including the selection algorithm, Quickselect algorithm, balanced binary
search trees, augmented data structures, and heap-based algorithms. Each
algorithm has its own advantages and can be applied to different scenarios.
By understanding and implementing these algorithms, you can successfully
solve order statistics problems and related applications in the field of
algorithms and data structures.

String Matching Algorithms

Introduction:

String matching algorithms are used to find occurrences or patterns within


a given text or string. These algorithms play a crucial role in various
applications, such as data mining, bioinformatics, text processing, and more.
In this lecture, we will explore several common string matching algorithms,
including Brute Force, Knuth-Morris-Pratt (KMP), Boyer-Moore, and
Rabin-Karp. We will discuss their principles, complexities, and provide
detailed examples to illustrate their usage.

1. Brute Force Algorithm:

- Principle: The simplest string matching algorithm, which compares the


pattern with each possible substring of the text.

- Complexity: O((n – m + 1) * m), where n is the length of the text and m is


the length of the pattern.

- Example:

Text: “ABABACABABC”

Pattern: “ABABC”

Steps:

- Start comparing the pattern with each substring of the text:

- Compare “ABABC” with “ABABA” (mismatch).

- Compare “ABABC” with “BABAC” (mismatch).

- Compare “ABABC” with “ABACA” (mismatch).

- Compare “ABABC” with “BACAB” (mismatch).

- Compare “ABABC” with “ACABA” (match found).


2. Knuth-Morris-Pratt (KMP) Algorithm:

- Principle: Utilizes the concept of a prefix-suffix table (also known as the


“failure function”) to avoid unnecessary comparisons.

- Complexity: O(n + m), where n is the length of the text and m is the length
of the pattern.

- Example:

Text: “ABABACABABC”

Pattern: “ABABC”

Steps:

- Construct the prefix-suffix table: [0, 0, 1, 0, 0].

- Start comparing the pattern with the text using the prefix-suffix table:

- Compare “ABABC” with “ABABA” (mismatch, shift by 2 according to the


prefix-suffix table).

- Compare “ABABC” with “ABABAC” (mismatch, shift by 1).

- Compare “ABABC” with “ABABACA” (mismatch, shift by 3).

- Compare “ABABC” with “ABABACAB” (mismatch, shift by 1).

- Compare “ABABC” with “ABABACABA” (match found).

3. Boyer-Moore Algorithm:

- Principle: Employs two heuristics, the “bad character rule” and the “good
suffix rule,” to skip comparisons based on previously matched characters.
- Complexity: O(n + m), where n is the length of the text and m is the length
of the pattern (in the worst case).

- Example:

Text: “ABABACABABC”

Pattern: “ABABC”

Steps:

- Preprocess the pattern to generate bad character and good suffix tables.

- Start comparing the pattern with the text using the tables:

- Compare “ABABC” with “ABABA” (mismatch, shift by 2 using the bad


character rule).

- Compare “ABABC” with “ABABAC” (mismatch, shift by 3 using the bad


character rule).

- Compare “ABABC” with “ABABACA” (mismatch, shift by 1 using the good


suffix rule).

- Compare “ABABC” with “ABABACAB” (mismatch, shift by 2 using the


bad character rule).

- Compare “ABABC” with “ABABACABA” (match found).

4. Rabin-Karp Algorithm:

- Principle: Utilizes a rolling hash function to compare the hash values of the
pattern and each substring of the text.
- Complexity: O((n – m + 1) * m), where n is the length of the text and m is
the length of the pattern (in the worst case).

- Example:

Text: “ABABACABABC”

Pattern: “ABABC”

Steps:

- Compute the hash value of the pattern and the first substring of the text.

- Start comparing the pattern with each substring of the text based on their
hash values:

- Compare “ABABC” with “ABABA” (mismatch).

- Compare “ABABC” with “BABAC” (mismatch).

- Compare “ABABC” with “ABACA” (mismatch).

- Compare “ABABC” with “BACAB” (mismatch).

- Compare “ABABC” with “ACABA” (match found).

Conclusion:

Remember that choosing the right algorithm is crucial for achieving efficient
string matching, and it’s always recommended to analyze the characteristics
of your data and understand the strengths and limitations of each algorithm
before making a decision.
additional solved examples for each string matching algorithm
mentioned:

1. Brute Force Algorithm:

Text: “ABABACABABC”

Pattern: “ABABC”

Steps:

- Start comparing the pattern with each substring of the text:

- Compare “ABABC” with “ABABA” (mismatch).

- Compare “ABABC” with “BABAC” (mismatch).

- Compare “ABABC” with “ABACA” (mismatch).

- Compare “ABABC” with “BACAB” (mismatch).

- Compare “ABABC” with “ACABA” (match found).

2. Knuth-Morris-Pratt (KMP) Algorithm:

Text: “ABABACABABC”

Pattern: “ABABC”

Steps:

- Construct the prefix-suffix table: [0, 0, 1, 0, 0].

- Start comparing the pattern with the text using the prefix-suffix table:
- Compare “ABABC” with “ABABA” (mismatch, shift by 2 according to the
prefix-suffix table).

- Compare “ABABC” with “ABABAC” (mismatch, shift by 1).

- Compare “ABABC” with “ABABACA” (mismatch, shift by 3).

- Compare “ABABC” with “ABABACAB” (mismatch, shift by 1).

- Compare “ABABC” with “ABABACABA” (match found).

3. Boyer-Moore Algorithm:

Text: “ABABACABABC”

Pattern: “ABABC”

Steps:

- Preprocess the pattern to generate bad character and good suffix tables.

- Start comparing the pattern with the text using the tables:

- Compare “ABABC” with “ABABA” (mismatch, shift by 2 using the bad


character rule).

- Compare “ABABC” with “ABABAC” (mismatch, shift by 3 using the bad


character rule).

- Compare “ABABC” with “ABABACA” (mismatch, shift by 1 using the good


suffix rule).

- Compare “ABABC” with “ABABACAB” (mismatch, shift by 2 using the bad


character rule).

- Compare “ABABC” with “ABABACABA” (match found).


4. Rabin-Karp Algorithm:

Text: “ABABACABABC”

Pattern: “ABABC”

Steps:

- Compute the hash value of the pattern and the first substring of the text.

- Start comparing the pattern with each substring of the text based on their
hash values:

- Compare “ABABC” with “ABABA” (mismatch).

- Compare “ABABC” with “BABAC” (mismatch).

- Compare “ABABC” with “ABACA” (mismatch).

- Compare “ABABC” with “BACAB” (mismatch).

- Compare “ABABC” with “ACABA” (match found).

These examples showcase the step-by-step process of each string matching


algorithm. They demonstrate how the algorithms handle different scenarios,
such as finding matches and handling mismatches using various techniques
like shifting, table lookups, and hash comparisons.

By analyzing these examples, you can observe the efficiency and effectiveness
of each algorithm in different situations. This understanding can help you
choose the most suitable algorithm for your specific string matching
requirements.

You might also like