INTRODUCTION TO SORTING
Sorting is nothing but storage of data in sorted order, it can be in ascending or descending
order. The term Sorting comes into picture with the term Searching. There are so many things in
our real life that we need to search, like a particular record in database, roll numbers in merit
list, a particular telephone number, any particular page in a book etc.
Sorting arranges data in a sequence which makes searching easier. Every record which is
going to be sorted will contain one key. Based on the key the record will be sorted. For example,
suppose we have a record of students, every such record will have the following data:
Roll No.
Name
Age
Class
Here Student roll no. can be taken as key for sorting the records in ascending or descending
order. Now suppose we have to search a Student with roll no. 15, we don't need to search the
complete record we will simply search between the Students with roll no. 10 to 20.
Sorting Efficiency
There are many techniques for sorting. Implementation of particular sorting technique depends
upon situation. Sorting techniques mainly depends on two parameters. First parameter is the
execution time of program, which means time taken for execution of program. Second is the
space, which means space taken by the program.
Types of Sorting Techniques
There are many types of Sorting techniques, differentiated by their efficiency and space
requirements. Following are some sorting techniques which we will be covering in next sections.
1. Bubble Sort
2. Insertion Sort
3. Selection Sort
4. Quick Sort
5. Merge Sort
6. Heap Sort
Internal sort
An internal sort is any data sorting process that takes place entirely within the main
memory of a computer. This is possible whenever the data to be sorted is small enough to all be
held in the main memory. For sorting larger datasets, it may be necessary to hold only a chunk
of data in memory at a time, since it won’t all fit. The rest of the data is normally held on some
larger, but slower medium, like a hard-disk. Any reading or writing of data to and from this
slower media can slow the sortation process considerably.
Some common internal sorting algorithms include:
1. Bubble Sort
2. Insertion Sort
3. Quick Sort
4. Heap Sort
5. Radix Sort
6. Selection sort
External sorting
External sorting is a class of sorting algorithms that can handle massive amounts
of data. External sorting is required when the data being sorted do not fit into the main
memory of a computing device (usually RAM) and instead they must reside in the
slower external memory (usually a hard drive). External sorting typically uses a hybrid sort-
merge strategy. In the sorting phase, chunks of data small enough to fit in main memory are
read, sorted, and written out to a temporary file. In the merge phase, the sorted subfiles are
combined into a single larger file.
Time Complexities of all Sorting Algorithms
Following is a quick revision sheet that you may refer at last minute
Algorithm Time Complexity
Best Average Worst
Selection Sort Ω(n^2) θ(n^2) O(n^2)
Bubble Sort Ω(n) θ(n^2) O(n^2)
Insertion Sort Ω(n) θ(n^2) O(n^2)
Heap Sort Ω(n log(n)) θ(n log(n)) O(n log(n))
Quick Sort Ω(n log(n)) θ(n log(n)) O(n^2)
Algorithm Time Complexity
Best Average Worst
Merge Sort Ω(n log(n)) θ(n log(n)) O(n log(n))
Bucket Sort Ω(n+k) θ(n+k) O(n^2)
Radix Sort Ω(nk) θ(nk) O(nk)
THE BUBBLE SORT
The bubble sort makes multiple passes through a list. It compares adjacent items and
exchanges those that are out of order. Each pass through the list places the next largest value
in its proper place. In essence, each item “bubbles” up to the location where it belongs.
Figure shows the first pass of a bubble sort. The shaded items are being compared to see if
they are out of order. If there are n items in the list, then there are n−1n−1 pairs of items that
need to be compared on the first pass. It is important to note that once the largest value in the
list is part of a pair, it will continually be moved along until the pass is complete.
At the start of the second pass, the largest value is now in place. There are n−1n−1 items left
to sort, meaning that there will be n−2n−2 pairs. Since each pass places the next largest value
in place, the total number of passes necessary will be n−1n−1. After completing
the n−1n−1 passes, the smallest item must be in the correct position with no further processing
required.
Program- Bubble Sort
#include <stdio.h>
int main()
{
int array[100], n, c, d, swap;
printf("Enter number of elements\n");
scanf("%d", &n);
printf("Enter %d integers\n", n);
for (c = 0; c < n; c++)
scanf("%d", &array[c]);
for (c = 0 ; c < ( n - 1 ); c++)
{
for (d = 0 ; d < n - c - 1; d++)
{
if (array[d] > array[d+1]) /* For decreasing order use < */
{
swap = array[d];
array[d] = array[d+1];
array[d+1] = swap;
}
}
}
printf("Sorted list in ascending order:\n");
for ( c = 0 ; c < n ; c++ )
printf("%d\n", array[c]);
return 0;
}
THE SELECTION SORT
The selection sort improves on the bubble sort by making only one exchange for every pass
through the list. In order to do this, a selection sort looks for the largest value as it makes a pass
and, after completing the pass, places it in the proper location.
As with a bubble sort, after the first pass, the largest item is in the correct place. After the
second pass, the next largest is in place. This process continues and requires n−1n−1 passes
to sort n items, since the final item must be in place after the (n−1)(n−1) st pass.
Figure 3 shows the entire sorting process. On each pass, the largest remaining item is selected
and then placed in its proper location. The first pass places 93, the second pass places 77, the
third places 55, and so on.
Program
#include <stdio.h>
int main()
{
int array[100], n, c, d, position, swap;
printf("Enter number of elements\n");
scanf("%d", &n);
printf("Enter %d integers\n", n);
for ( c = 0 ; c < n ; c++ )
scanf("%d", &array[c]);
for ( c = 0 ; c < ( n - 1 ) ; c++ )
{
position = c;
for ( d = c + 1 ; d < n ; d++ )
{
if ( array[position] > array[d] )
position = d;
}
if ( position != c )
{
swap = array[c];
array[c] = array[position];
array[position] = swap;
}
}
printf("Sorted list in ascending order:\n");
for ( c = 0 ; c < n ; c++ )
printf("%d\n", array[c]);
return 0;
}
THE INSERTION SORT
The insertion sort, although still O(n2) , works in a slightly different way. It always maintains a
sorted sublist in the lower positions of the list. Each new item is then “inserted” back into the
previous sublist such that the sorted sublist is one item larger. Figure 4 shows the insertion
sorting process. The shaded items represent the ordered sublists as the algorithm makes each
pass.
PROGRAM
#include <stdio.h>
int main()
{
int n, array[1000], c, d, t;
printf("Enter number of elements\n");
scanf("%d", &n);
printf("Enter %d integers\n", n);
for (c = 0; c < n; c++) {
scanf("%d", &array[c]);
}
for (c = 1 ; c <= n - 1; c++) {
d = c;
while ( d > 0 && array[d] < array[d-1]) {
t = array[d];
array[d] = array[d-1];
array[d-1] = t;
d--;
}
}
printf("Sorted list in ascending order:\n");
for (c = 0; c <= n - 1; c++) {
printf("%d\n", array[c]);
}
return 0;
}
THE QUICK SORT
The quick sort uses divide and conquer to gain the same advantages as the merge sort, while
not using additional storage. As a trade-off, however, it is possible that the list may not be
divided in half. When this happens, we will see that performance is diminished.
A quick sort first selects a value, which is called the pivot value. Although there are many
different ways to choose the pivot value, we will simply use the first item in the list. The role of
the pivot value is to assist with splitting the list. The actual position where the pivot value
belongs in the final sorted list, commonly called the split point, will be used to divide the list for
subsequent calls to the quick sort.
Figure 12 shows that 54 will serve as our first pivot value. Since we have looked at this example
a few times already, we know that 54 will eventually end up in the position currently holding 31.
The partitionprocess will happen next. It will find the split point and at the same time move
other items to the appropriate side of the list, either less than or greater than the pivot value.
Partitioning begins by locating two position markers—let’s call them leftmark and rightmark
—at the beginning and end of the remaining items in the list (positions 1 and 8 in Figure 13).
The goal of the partition process is to move items that are on the wrong side with respect to the
pivot value while also converging on the split point. Figure 13 shows this process as we locate
the position of 54.
We begin by incrementing leftmark until we locate a value that is greater than the pivot value.
We then decrement rightmark until we find a value that is less than the pivot value. At this
point we have discovered two items that are out of place with respect to the eventual split point.
For our example, this occurs at 93 and 20. Now we can exchange these two items and then
repeat the process again.
At the point where rightmark becomes less than leftmark, we stop. The position
of rightmark is now the split point. The pivot value can be exchanged with the contents of the
split point and the pivot value is now in place (Figure 14). In addition, all the items to the left of
the split point are less than the pivot value, and all the items to the right of the split point are
greater than the pivot value. The list can now be divided at the split point and the quick sort can
be invoked recursively on the two halves.
Program
#include <stdio.h>
void quick_sort(int[],int,int);
int partition(int[],int,int);
int main()
{
int a[50],n,i;
printf("How many elements?");
scanf("%d",&n);
printf("\nEnter array elements:");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
quick_sort(a,0,n-1);
printf("\nArray after sorting:");
for(i=0;i<n;i++)
printf("%d ",a[i]);
return 0;
}
void quick_sort(int a[],int l,int u)
{
int j;
if(l<u)
{
j=partition(a,l,u);
quick_sort(a,l,j-1);
quick_sort(a,j+1,u);
}
}
int partition(int a[],int l,int u)
{
int v,i,j,temp;
v=a[l];
i=l;
j=u+1;
do
{
do
i++;
while(a[i]<v&&i<=u);
do
j--;
while(v<a[j]);
if(i<j)
{
temp=a[i];
a[i]=a[j];
a[j]=temp;
}
}while(i<j);
a[l]=a[j];
a[j]=v;
return(j);
}
THE MERGE SORT
We now turn our attention to using a divide and conquer strategy as a way to improve the
performance of sorting algorithms. The first algorithm we will study is the merge sort. Merge
sort is a recursive algorithm that continually splits a list in half. If the list is empty or has one
item, it is sorted by definition (the base case). If the list has more than one item, we split the list
and recursively invoke a merge sort on both halves. Once the two halves are sorted,
the fundamental operation, called a merge, is performed. Merging is the process of taking two
smaller sorted lists and combining them together into a single, sorted, new list. Figure 10 shows
our familiar example list as it is being split by mergeSort. Figure 11shows the simple lists,
now sorted, as they are merged back together.
Program
#include<stdio.h>
void mergesort(int a[],int i,int j);
void merge(int a[],int i1,int j1,int i2,int j2);
int main()
{
int a[30],n,i;
printf("Enter no of elements:");
scanf("%d",&n);
printf("Enter array elements:");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
mergesort(a,0,n-1);
printf("\nSorted array is :");
for(i=0;i<n;i++)
printf("%d ",a[i]);
return 0;
}
void mergesort(int a[],int i,int j)
{
int mid;
if(i<j)
{
mid=(i+j)/2;
mergesort(a,i,mid);
mergesort(a,mid+1,j);
merge(a,i,mid,mid+1,j);
}
}
void merge(int a[],int i1,int j1,int i2,int j2)
{
int temp[50];
int i,j,k;
i=i1;
j=i2;
k=0;
while(i<=j1 && j<=j2)
{
if(a[i]<a[j])
temp[k++]=a[i++];
else
temp[k++]=a[j++];
}
while(i<=j1)
temp[k++]=a[i++];
while(j<=j2)
temp[k++]=a[j++];
for(i=i1,j=0;i<=j2;i++,j++)
a[i]=temp[j];
THE SHELL SORT
The shell sort, sometimes called the “diminishing increment sort,” improves on the insertion
sort by breaking the original list into a number of smaller sublists, each of which is sorted using
an insertion sort. The unique way that these sublists are chosen is the key to the shell sort.
Instead of breaking the list into sublists of contiguous items, the shell sort uses an increment i,
sometimes called the gap, to create a sublist by choosing all items that are i items apart.
This can be seen in Figure 6. This list has nine items. If we use an increment of three, there are
three sublists, each of which can be sorted by an insertion sort. After completing these sorts, we
get the list shown in Figure 7. Although this list is not completely sorted, something very
interesting has happened. By sorting the sublists, we have moved the items closer to where
they actually belong.
Figure 8 shows a final insertion sort using an increment of one; in other words, a standard
insertion sort. Note that by performing the earlier sublist sorts, we have now reduced the total
number of shifting operations necessary to put the list in its final order. For this case, we need
only four more shifts to complete the process.
Program
1.
#include <stdio.h>
2. void shellsort(int arr[], int num)
3. {
4. int i, j, k, tmp;
5. for (i = num / 2; i > 0; i = i / 2)
6. {
7. for (j = i; j < num; j++)
8. {
9. for(k = j - i; k >= 0; k = k - i)
10. {
11. if (arr[k+i] >= arr[k])
12. break;
13. else
14. {
15. tmp = arr[k];
16. arr[k] = arr[k+i];
17. arr[k+i] = tmp;
18. }
19. }
20. }
21. }
22. }
23. int main()
24. {
25. int arr[30];
26. int k, num;
27.
28. printf("Enter total no. of elements : ");
29. scanf("%d", &num);
30.
31. printf("\nEnter %d numbers: ", num);
32.
33. for (k = 0 ; k < num; k++)
34. {
35. scanf("%d", &arr[k]);
36. }
37.
38. shellsort(arr, num);
39.
40. printf("\n Sorted array is: ");
41.
42. for (k = 0; k < num; k++)
43. printf("%d ", arr[k]);
44. return 0;
45. }
SEARCHING
In computer science, a search algorithm is an algorithm that retrieves information stored
within some data structure. Data structures can include linked lists, arrays,search trees, hash
tables, or various other storage methods. The appropriatesearch algorithm often depends on
the data structure being searched.
5.3. THE SEQUENTIAL SEARCH
When data items are stored in a collection such as a list, we say that they have a linear or
sequential relationship. Each data item is stored in a position relative to the others. In Python
lists, these relative positions are the index values of the individual items. Since these index
values are ordered, it is possible for us to visit them in sequence. This process gives rise to our
first searching technique, the sequential search.
Figure 1 shows how this search works. Starting at the first item in the list, we simply move from
item to item, following the underlying sequential ordering until we either find what we are looking
for or run out of items. If we run out of items, we have discovered that the item we were
searching for was not present.
Program
#include <stdio.h>
int main()
{
int array[100], search, c, n;
printf("Enter the number of elements in array\n");
scanf("%d",&n);
printf("Enter %d integer(s)\n", n);
for (c = 0; c < n; c++)
scanf("%d", &array[c]);
printf("Enter the number to search\n");
scanf("%d", &search);
for (c = 0; c < n; c++)
{
if (array[c] == search) /* if required element found */
{
printf("%d is present at location %d.\n", search, c+1);
break;
}
}
if (c == n)
printf("%d is not present in array.\n", search);
return 0;
}
BINARY SEARCH
C Program to implement Binary search. Binary search technique is simple
searching technique which can be applied if the items to be compared are either in
ascending order or descending order. The general idea used in binary search is
similar to the way we search for the telephone number of a person in the telephone
directory. Binary search is the divide and conquer strategy.
#include<stdio.h>
int main() {
int n, a[30], item, i, j, mid, top, bottom;
printf("Enter how many elements you want:\n");
scanf("%d", &n);
printf("Enter the %d elements in ascending order\n", n);
for (i = 0; i < n; i++) {
scanf("%d", &a[i]);
}
printf("\nEnter the item to search\n");
scanf("%d", &item);
bottom = 1;
top = n;
do {
mid = (bottom + top) / 2;
if (item < a[mid])
top = mid - 1;
else if (item > a[mid])
bottom = mid + 1;
} while (item != a[mid] && bottom <= top);
if (item == a[mid]) {
printf("Binary search successfull!!\n");
printf("\n %d found in position: %d\n", item, mid + 1);
} else {
printf("\n Search failed\n %d not found\n", item);
}
return 0;
}
Here's an example of a binary search for 11 in the given list,
that keeps track of index of the left-most element (i.e., the
"lowest index" to consider), the index of the right-most
element (i.e., the "highest
index" to consider), and the index of the element in the
"middle" of these two positions:
The search method in the above example should then return
a value of 4, the index of the found key value.
Binary search algorithm Efficiency
Visualization of the binary search algorithm where 4 is the target value.
Class Search algorithm
Best-case performance O(1)
Average performance O(log n)
Worst-case space complexity O(1)
HASHING
The best technique to search in efficient time in any type of situation is
hashing with minimum wastage of memory.
The idea is simple in hashing is to use hash function to map keys in the
hash table.
HASH TABLE
All the large collection of data are stored in a hash table. The size of the
hash table is usually fixed and it is always bigger than the number of
elements we are going to store.
HASH FUNCTION
The element that is to be retrieved from the hash table is known as a key. A
hash function helps in connecting a key to the index of the hash table. A
hash function should be perfect in storing the elements in the exact index in
the hash table and tell us where to retrieve the searched data from the
table.
Types of hashing
1. Static Hashing
2. Dynamic Hashing
Static Hashing is another form of the hashing problem which allows
users to perform lookups on a finalized dictionary set (all objects in
the dictionary are final and not changing).
dynamic hashing. (data structure) Definition: A hash table
that grows to handle more items. The associated hash
function must change as the table grows.
Rehashing
When the load becomes greater than the allowable load factor, a rehash
operation is required.
HASHING FUNCTIONS
1. Modular Arithmetic
2. Truncation
3. Folding
4. Mid-Square method