Sorting in Data Structures (DSA)
What is Sorting?
Sorting is the process of arranging elements in a specific order.
Types of Order
Ascending → smallest to largest
Descending → largest to smallest
Why Sorting is Important?
Makes searching faster (Binary Search)
Improves data organization
Used in ranking, reports, databases
Required in many algorithms
Classification of Sorting Algorithms
Basic
1. Bubble Sort
2. Selection Sort
3. Insertion Sort
Advanced
4. Merge Sort
5. Quick Sort
6. Heap Sort
Bubble Sort is the simplest sorting algorithm that works by repeatedly swapping the adjacent
elements if they are in the wrong order. This algorithm is not efficient for large data sets as its
average and worst-case time complexity are quite high.
Sorts the array using multiple passes. After the first pass, the maximum goes to end (its
correct position). Same way, after second pass, the second largest goes to second last
position and so on.
In every pass, process only those that have already not moved to correct position. After k
passes, the largest k must have been moved to the last k positions.
In a pass, we consider remaining elements and compare all adjacent and swap if larger
element is before a smaller element. If we keep doing this, we get the largest (among the
remaining elements) at its correct position.
# Optimized Python program for implementation of Bubble Sort
def bubbleSort(arr):
n = len(arr)
# Traverse through all array elements
for i in range(n):
swapped = False
# Last i elements are already in place
for j in range(0, n-i-1):
# Traverse the array from 0 to n-i-1
# Swap if the element found is greater
# than the next element
if arr[j] > arr[j+1]:
arr[j], arr[j+1] = arr[j+1], arr[j]
swapped = True
if (swapped == False):
break
# Driver code to test above
if __name__ == "__main__":
arr = [64, 34, 25, 12, 22, 11, 90]
bubbleSort(arr)
print("Sorted array:")
for i in range(len(arr)):
print("%d" % arr[i], end=" ")
Output: Sorted array:
11 12 22 25 34 64 90
Complexity Analysis of Bubble Sort:
Time Complexity: O(n2)
Auxiliary Space: O(1)
Advantages of Bubble Sort:
Bubble sort is easy to understand and implement.
It does not require any additional memory space.
It is a stable sorting algorithm, meaning that elements with the same key value maintain
their relative order in the sorted output.
Disadvantages of Bubble Sort:
Bubble sort has a time complexity of O(n2) which makes it very slow for large data sets.
Bubble sort has almost no or limited real world applications. It is mostly used in academics to
teach different ways of sorting.
Selection Sort is a comparison-based sorting algorithm. It sorts by repeatedly selecting the smallest
(or largest) element from the unsorted portion and swapping it with the first unsorted element.
1. Find the smallest element and swap it with the first element. This way we get the smallest
element at its correct position.
2. Then find the smallest among remaining elements (or second smallest) and swap it with the
second element.
3. We keep doing this until we get all elements moved to correct position.
# Python program for implementation of Selection
# Sort
def selection_sort(arr):
n = len(arr)
for i in range(n - 1):
# Assume the current position holds
# the minimum element
min_idx = i
# Iterate through the unsorted portion
# to find the actual minimum
for j in range(i + 1, n):
if arr[j] < arr[min_idx]:
# Update min_idx if a smaller element is found
min_idx = j
# Move minimum element to its
# correct position
arr[i], arr[min_idx] = arr[min_idx], arr[i]
def print_array(arr):
for val in arr:
print(val, end=" ")
print()
if __name__ == "__main__":
arr = [64, 25, 12, 22, 11]
print("Original array: ", end="")
print_array(arr)
selection_sort(arr)
print("Sorted array: ", end="")
print_array(arr)
Output
Original vector: 64 25 12 22 11
Sorted vector: 11 12 22 25 64
Complexity Analysis of Selection Sort
Time Complexity: O(n2) ,as there are two nested loops:
One loop to select an element of Array one by one = O(n)
Another loop to compare that element with every other Array element = O(n)
Therefore overall complexity = O(n) * O(n) = O(n*n) = O(n2)
Auxiliary Space: O(1) as the only extra memory used is for temporary variables.
Advantages of Selection Sort
Easy to understand and implement, making it ideal for teaching basic sorting concepts.
Requires only a constant O(1) extra memory space.
It requires less number of swaps (or memory writes) compared to many other standard
algorithms. Only cycle sort beats it in terms of memory writes. Therefore it can be simple
algorithm choice when memory writes are costly.
Disadvantages of the Selection Sort
Selection sort has a time complexity of O(n^2) makes it slower compared to algorithms
like Quick Sort or Merge Sort.
Does not maintain the relative order of equal elements which means it is not stable.
Applications of Selection Sort
Perfect for teaching fundamental sorting mechanisms and algorithm design.
Suitable for small lists where the overhead of more complex algorithms isn't justified and
memory writing is costly as it requires less memory writes compared to other standard
sorting algorithms.
Heap Sort algorithm is based on Selection Sort.
Insertion sort is a simple sorting algorithm that works by iteratively inserting each element of an
unsorted list into its correct position in a sorted portion of the list. It is like sorting playing cards in
your hands. You split the cards into two groups: the sorted cards and the unsorted cards. Then, you
pick a card from the unsorted group and put it in the right place in the sorted group.
Start with the second element as the first element is assumed to be sorted.
Compare the second element with the first if the second is smaller then swap them.
Move to the third element, compare it with the first two, and put it in its correct position
Repeat until the entire array is sorted.
# Python program for implementation of Insertion Sort
# Function to sort array using insertion sort
def insertionSort(arr):
for i in range(1, len(arr)):
key = arr[i]
j=i-1
# Move elements of arr[0..i-1], that are
# greater than key, to one position ahead
# of their current position
while j >= 0 and key < arr[j]:
arr[j + 1] = arr[j]
j -= 1
arr[j + 1] = key
# A utility function to print array of size n
def printArray(arr):
for i in range(len(arr)):
print(arr[i], end=" ")
print()
# Driver method
if __name__ == "__main__":
arr = [12, 11, 13, 5, 6]
insertionSort(arr)
printArray(arr)
Output
5 6 11 12 13
arr = {23, 1, 10, 5, 2}
Initial:
Current element is 23
The first element in the array is assumed to be sorted.
The sorted part until 0th index is : [23]
First Pass:
Compare 1 with 23 (current element with the sorted part).
Since 1 is smaller, insert 1 before 23 .
The sorted part until 1st index is: [1, 23]
Second Pass:
Compare 10 with 1 and 23 (current element with the sorted part).
Since 10 is greater than 1 and smaller than 23 , insert 10 between 1 and 23 .
The sorted part until 2nd index is: [1, 10, 23]
Third Pass:
Compare 5 with 1 , 10 , and 23 (current element with the sorted part).
Since 5 is greater than 1 and smaller than 10 , insert 5 between 1 and 10
The sorted part until 3rd index is : [1, 5, 10, 23]
Fourth Pass:
Compare 2 with 1, 5, 10 , and 23 (current element with the sorted part).
Since 2 is greater than 1 and smaller than 5 insert 2 between 1 and 5 .
The sorted part until 4th index is: [1, 2, 5, 10, 23]
Final Array:
The sorted array is: [1, 2, 5, 10, 23]
Complexity Analysis of Insertion Sort
Time Complexity
Best case: O(n), If the list is already sorted, where n is the number of elements in the list.
Average case: O(n2), If the list is randomly ordered
Worst case: O(n2), If the list is in reverse order
Space Complexity
Auxiliary Space: O(1), Insertion sort requires O(1) additional space, making it a space-
efficient sorting algorithm.
Advantages and Disadvantages of Insertion Sort
Advantages
Simple and easy to implement.
Stable sorting algorithm.
Efficient for small lists and nearly sorted lists.
Space-efficient as it is an in-place algorithm.
Adoptive. the number of inversions is directly proportional to number of swaps. For example,
no swapping happens for a sorted array and it takes O(n) time only.
Disadvantages
Inefficient for large lists.
Not as efficient as other sorting algorithms (e.g., merge sort, quick sort) for most cases.
Applications of Insertion Sort
Insertion sort is commonly used in situations where:
The list is small or nearly sorted.
Simplicity and stability are important.
Used as a subroutine in Bucket Sort
Can be useful when array is already almost sorted
Merged sort
Here's a step-by-step explanation of how merge sort works:
1. Divide: Divide the list or array recursively into two halves until it can no
more be divided.
2. Conquer: Each subarray is sorted individually using the merge sort
algorithm.
3. Merge: The sorted subarrays are merged back together in sorted order.
The process continues until all elements from both subarrays have
been merged.
Let's look at the working of above example:
Divide:
[38, 27, 43, 10] is divided into [38, 27] and [43, 10] .
[38, 27] is divided into [38] and [27] .
[43, 10] is divided into [43] and [10] .
Conquer:
[38] is already sorted.
[27] is already sorted.
[43] is already sorted.
[10] is already sorted.
Merge:
Merge [38] and [27] to get [27, 38] .
Merge [43] and [10] to get [10,43] .
Merge [27, 38] and [10,43] to get the final sorted list [10, 27, 38, 43]
Therefore, the sorted list is [10, 27, 38, 43] .
def merge(arr, left, mid, right):
n1 = mid - left + 1
n2 = right - mid
# Create temp arrays
L = [0] * n1
R = [0] * n2
# Copy data to temp arrays L[] and R[]
for i in range(n1):
L[i] = arr[left + i]
for j in range(n2):
R[j] = arr[mid + 1 + j]
i=0
j=0
k = left
# Merge the temp arrays back
# into arr[left..right]
while i < n1 and j < n2:
if L[i] <= R[j]:
arr[k] = L[i]
i += 1
else:
arr[k] = R[j]
j += 1
k += 1
# Copy the remaining elements of L[],
# if there are any
while i < n1:
arr[k] = L[i]
i += 1
k += 1
# Copy the remaining elements of R[],
# if there are any
while j < n2:
arr[k] = R[j]
j += 1
k += 1
def mergeSort(arr, left, right):
if left < right:
mid = (left + right) // 2
mergeSort(arr, left, mid)
mergeSort(arr, mid + 1, right)
merge(arr, left, mid, right)
# Driver code
if __name__ == "__main__":
arr = [38, 27, 43, 10]
mergeSort(arr, 0, len(arr) - 1)
for i in arr:
print(i, end=" ")
print()
Applications of Merge Sort
1. Sorting Large Datasets
Merge Sort has a guaranteed time complexity of O(n log n).
Its performance does not degrade for large inputs.
Suitable for applications where predictable performance is required.
2. External Sorting
Used when data is too large to fit into main memory (RAM).
Data is:
o Divided into smaller chunks
o Each chunk is sorted separately
o Sorted chunks are merged together
Commonly used in:
o Database systems
o File systems
o Large log file processing
3. Solving Advanced Algorithmic Problems
Merge Sort is used to solve problems that require counting during sorting, such
as:
Inversion Count
Count of Smaller Elements on the Right
Surpasser Count
These problems are efficiently solved by modifying the merge step.
4. Use in Programming Language Libraries
Merge Sort and its variations are widely used in standard libraries.
TimSort, a hybrid of Merge Sort and Insertion Sort, is used in:
o Python
o Java (Android)
o Swift
Reason for Preference
Merge Sort is stable, meaning it preserves the order of equal elements.
Stability is crucial when sorting objects or records.
Java Example
[Link]() → Uses Quick Sort (for primitive data types)
[Link]() → Uses Merge Sort (for objects)
5. Preferred Algorithm for Sorting Linked Lists
Linked lists do not support random access.
Merge Sort:
o Works efficiently with sequential access
o Does not require shifting of elements
Hence, Merge Sort is the best choice for sorting linked lists.
6. Parallel Processing
Subarrays can be sorted independently.
Merging can be done in parallel.
Suitable for:
o Multi-core systems
o Distributed computing environments
7. Set Operations Using Merge Function
The merge operation can be reused to efficiently solve:
Union of two sorted arrays
Intersection of two sorted arrays
Difference of two sorted arrays
Advantages of Merge Sort
1. Stability
Maintains the relative order of equal elements.
Important when sorting structured data such as records or objects.
2. Guaranteed Worst-Case Performance
Time complexity is O(n log n) in all cases.
Unlike Quick Sort, performance does not degrade.
3. Simple and Structured Approach
Based on divide-and-conquer technique.
Easy to understand and analyze.
4. Naturally Parallel
Independent sorting of subarrays makes it suitable for parallel execution.
Disadvantages of Merge Sort
1. High Space Complexity
Requires additional memory for temporary arrays.
Space complexity is O(n).
2. Not an In-Place Algorithm
Does not sort within the original array.
Extra memory is required to store merged results.
3. Slower Than Quick Sort in Practice
Merge Sort is less cache-friendly.
Quick Sort often performs faster because it sorts in-place.
Merge Sort vs Quick Sort (Comparison)
Feature Merge Sort Quick Sort
Worst-case time complexity O(n log n) O(n²)
Stability Yes No
Extra memory Required Not required
Suitable for linked lists Yes No
Cache friendliness Low High
QuickSort is a sorting algorithm based on the Divide and Conquer that picks an
element as a pivot and partitions the given array around the picked pivot by
placing the pivot in its correct position in the sorted array. .
There are mainly three steps in the algorithm:
1. Choose a Pivot: Select an element from the array as the pivot. The
choice of pivot can vary (e.g., first element, last element, random
element, or median).
2. Partition the Array: Re arrange the array around the pivot. After
partitioning, all elements smaller than the pivot will be on its left, and all
elements greater than the pivot will be on its right.
3. Recursively Call: Recursively apply the same process to the two
partitioned sub-arrays.
4. Base Case: The recursion stops when there is only one element left in
the sub-array, as a single element is already sorted.
# partition function
def partition(arr, low, high):
# choose the pivot
pivot = arr[high]
# index of smaller element and indicates
# the right position of pivot found so far
i = low - 1
# traverse arr[low..high] and move all smaller
# elements to the left side. Elements from low to
# i are smaller after every iteration
for j in range(low, high):
if arr[j] < pivot:
i += 1
swap(arr, i, j)
# move pivot after smaller elements and
# return its position
swap(arr, i + 1, high)
return i + 1
# swap function
def swap(arr, i, j):
arr[i], arr[j] = arr[j], arr[i]
# the QuickSort function implementation
def quickSort(arr, low, high):
if low < high:
# pi is the partition return index of pivot
pi = partition(arr, low, high)
# recursion calls for smaller elements
# and greater or equals elements
quickSort(arr, low, pi - 1)
quickSort(arr, pi + 1, high)
if __name__ == "__main__":
arr = [10, 7, 8, 9, 1, 5]
n = len(arr)
quickSort(arr, 0, n - 1)
for val in arr:
print(val, end=" ")
Advantages of Quick Sort
It is a divide-and-conquer algorithm that makes it easier to solve
problems.
It is efficient on large data sets.
It has a low overhead, as it only requires a small amount of memory to
function.
It is Cache Friendly as we work on the same array to sort and do not copy
data to any auxiliary array.
Fastest general-purpose algorithm for large data when stability is not
required.
Disadvantages of Quick Sort
It has a worst-case time complexity of O(n2), which occurs when the
pivot is chosen poorly.
It is not a good choice for small data sets.
It is not a stable sort, meaning that if two elements have the same key,
their relative order will not be preserved in the sorted output in case of
quick sort, because here we are swapping elements according to the
pivot's position (without considering their original positions).
Applications of Quick Sort
Sorting large datasets efficiently in memory.
Used in library sort functions (like C++ std::sort and Java [Link] for
primitives).
Arranging records in databases for faster searching.
Preprocessing step in algorithms requiring sorted input (e.g., binary
search, two-pointer techniques).
Finding the kth smallest/largest element using Quickselect (a variant of
quicksort).
Sorting arrays of objects based on multiple keys (custom comparators).
Data compression algorithms (like Huffman coding preprocessing).
Graphics and computational geometry (e.g., convex hull algorithms).