Linear Sorting
Sorting in O(N) time
Dr. AMIT KUMAR @JUET
How Fast Can We Sort?
• Selection Sort, Bubble Sort, Insertion Sort: O(n2)
• Heap Sort, Merge sort: O(nlgn)
• Quicksort: O(nlgn) - average
• What is common to all these algorithms?
– Make comparisons between input elements
ai < aj, ai ≤ aj, ai = aj, ai ≥ aj, or ai > aj
How Fast Can We Sort?
Can we do better than O( n log n )?
– No.
– It can be proven that any comparison-based sorting
algorithm will need to carry out at least O( n log n )
operations
Dr. AMIT KUMAR @JUET
Can we do better?
• Linear sorting algorithms
– Bucket sort
– Radix Sort
– Counting Sort
• Make certain assumptions about the data
• Linear sorts are NOT “comparison sorts”
Restrictions on the problem
• Suppose the values in the list to be sorted can
repeat but the values have a limit (e.g., values
are digits from 0 to 9)
• Sorting, in this case, appears easier.
• Is it possible to come up with an algorithm
better than O( n log n )?
– Yes
– Strategy will not involve comparisons
Dr. AMIT KUMAR @JUET
Bucket Sort
• Assumption:
– the input is generated by a random process that distributes elements
uniformly over [0, 1]
• Idea:
– Divide [(0, 1) into k equal-sized buckets (k=Θ(n))]
– Distribute the n input values into the buckets
– Sort each bucket (e.g., using quicksort)
– Go through the buckets in order, listing elements in each one
• Input: A[1 . . n], where 0 ≤ A[i] < 1 for all i
• Output: elements A[i] sorted
Dr. AMIT KUMAR @JUET
Example - Bucket Sort
Dr. AMIT KUMAR @JUET
Example - Bucket Sort
A 1 .78 B 0 /
2 .17 1 .17 .12 /
3 .39 2 .26 .21 .23 /
4 .26 3 .39 /
5 .72 4 / Distribute
6 5 /
.94
Into buckets
7 .21 6 .68 /
8 .12 7 .78 .72 /
9 .23 8 /
10 .68 9 .94 /
Dr. AMIT KUMAR @JUET
Example - Bucket Sort
0 /
1 .12 .17 /
2 .21 .23 .26 /
3 .39 /
Sort within each
4 /
bucket
5 /
6 .68 /
7 .72 .78 /
8 /
9 .94 / Dr. AMIT KUMAR @JUET
Example - Bucket Sort
.12 .17 .21 .23 .26 .39 .68 .72 .78 .94 /
0 /
1 .12 .17 /
2 .21 .23 .26 /
3 .39 /
4 /
5 / Concatenate the lists from
6 .68 / 0 to n – 1 together, in order
7 .72 .78 /
8 /
9 .94 / Dr. AMIT KUMAR @JUET
Bucket sort algorithm
Algorithm BucketSort( S )
( values in S are between 0 and m-1 )
for j 0 to m-1 do // initialize m buckets
b[j] 0
for i 0 to n-1 do // place elements in their
b[S[i]] b[S[i]] + 1 // appropriate buckets
i0
for j 0 to m-1 do // place elements in buckets
for r 1 to b[j] do // back in S
S[i] j
ii+1
Dr. AMIT KUMAR @JUET
Time complexity
• Bucket initialization: O( m )
• From array to buckets: O( n )
• From buckets to array: O( n )
– Even though this stage is a nested loop, notice that all we
do is dequeue from each bucket until they are all empty –>
n dequeue operations in all
• Since m will likely be small compared to n, Bucket
sort is O( n )
– Strictly speaking, time complexity is O ( n + m )
Dr. AMIT KUMAR @JUET
Analysis of Bucket Sort
Alg.: BUCKET-SORT(A, n)
for i ← 1 to n
O(n)
do insert A[i] into list B[nA[i]]
for i ← 0 to k - 1
k O(n/k log(n/k))
do sort list B[i] with quicksort sort
=O(nlog(n/k)
concatenate lists B[0], B[1], . . . , B[n -1]
together in order O(k)
return the concatenated lists
O(n) (if k=Θ(n))
Dr. AMIT KUMAR @JUET
Sorting integers
• Can we perform bucket sort on any array of (non-
negative) integers?
– Yes, but note that the number of buckets will depend on
the maximum integer value
• If you are sorting 1000 integers and the maximum
value is 999999, you will need 1 million buckets!
– Time complexity is not really O( n ) because m is much >
than n. Actual time complexity is O( m )
• Can we do better?
Dr. AMIT KUMAR @JUET
Radix Sort
• Represents keys as d-digit numbers in some base-k
key = x1x2...xd where 0≤xi≤k-1
• Example: key=15
key10 = 15, d=2, k=10 where 0≤xi≤9
key2 = 1111, d=4, k=2 where 0≤xi≤1
Dr. AMIT KUMAR @JUET
Radix Sort
• Assumptions
d=O(1) and k =O(n)
• Sorting looks at one column at a time
– For a d digit number, sort the least significant
digit first
– Continue sorting on the next least significant
digit, until all digits have been sorted
– Requires only d passes through the list
Dr. AMIT KUMAR @JUET
Radix Sort
Alg.: RADIX-SORT(A, d)
for i ← 1 to d
do use a stable sort to sort array A on digit i
(stable sort: preserves order of identical elements)
Analysis of Radix Sort
• Given n numbers of d digits each, where each digit may take
up to k possible values, RADIX-SORT correctly sorts the
numbers in O(d(n+k))
– One pass of sorting per digit takes O(n+k) assuming that
we use counting sort
– There are d passes (for each digit)
Dr. AMIT KUMAR @JUET
Analysis of Radix Sort
• Given n numbers of d digits each, where each digit may take
up to k possible values, RADIX-SORT correctly sorts the
numbers in O(d(n+k))
– Assuming d=O(1) and k=O(n), running time is O(n)
Dr. AMIT KUMAR @JUET
Radix Sort as a Bucket Sort
Example: first pass
12 58 37 64 52 36 99 63 18 9 20 88 47
58
12 37 18 9
20 52 63 64 36 47 88 99
20 12 52 63 64 36 37 47 58 18 88 9 99
Dr. AMIT KUMAR @JUET
Example: Second pass
20 12 52 63 64 36 37 47 58 18 88 9 99
12 36 52 63
9 18 20 37 47 58 64 88 99
9 12 18 20 36 37 47 52 58 63 64 88 99
Dr. AMIT KUMAR @JUET
Example: 1st and 2nd passes
12 58 37 64 52 36 99 63 18 9 20 88 47
sort by rightmost digit
20 12 52 63 64 36 37 47 58 18 88 9 99
sort by leftmost digit
9 12 18 20 36 37 47 52 58 63 64 88 99
Dr. AMIT KUMAR @JUET
Radix Sort as a Bucket Sort
Dr. AMIT KUMAR @JUET
Radix sort and stability
• Radix sort works as long as the bucket sort stages are
stable sorts
• Stable sort: in case of ties, relative order of elements
are preserved in the resulting array
– Suppose there are two elements whose first digit is the
same; for example, 52 & 58
– If 52 occurs before 58 in the array prior to the sorting
stage, 52 should occur before 58 in the resulting array
• This way, the work carried out in the previous bucket
sort stages is preserved
Dr. AMIT KUMAR @JUET
About Radix sort
• Note that only 10 buckets are needed
regardless of number of stages since the
buckets are reused at each stage
• Radix sort can apply to words
– Set a limit to the number of letters in a word
– Use 27 buckets (or more, depending on the
letters/characters allowed), one for each letter
plus a “blank” character
– The word-length limit is exactly the number of
bucket sort stages needed
Dr. AMIT KUMAR @JUET
Counting Sort
• Assumptions:
– n integers which are in the range [0 ... r]
– r is in the order of n, that is, r=O(n)
• Idea:
– For each element x, find the number of elements x
– Place x into its correct position in the output array
output array
Dr. AMIT KUMAR @JUET
Step 1
(i.e., frequencies)
(r=6)
Dr. AMIT KUMAR @JUET
Step 2
C (frequencies) Cnew (cumulative sums)
Dr. AMIT KUMAR @JUET
Algorithm
• Start from the last element of A
• Place A[i] at its correct place in the output array
• Decrease C[A[i]] by one
1 2 3 4 5 6 7 8
A 2 5 3 0 2 3 0 3
0 1 2 3 4 5
Cnew 2 2 4 7 7 8
Dr. AMIT KUMAR @JUET
Example
1 2 3 4 5 6 7 8 0 1 2 3 4 5
A 2 5 3 0 2 3 0 3 Cnew 2 2 4 7 7 8
1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
B 3 B 0 3
0 1 2 3 4 5
0 1 2 3 4 5
Cnew 2 2 4 6 7 8 Cnew 1 2 4 6 7 8
1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
B 0 3 3 B 0 2 3 3
0 1 2 3 4 5
0 1 2 3 4 5
Cnew 1 2 4 5 7 8 Cnew 1 2 3 5 7 8
Dr. AMIT KUMAR @JUET
Example (cont.)
1 2 3 4 5 6 7 8
A 2 5 3 0 2 3 0 3
1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
B 0 0 2 3 3 B 0 0 2 3 3 3 5
0 1 2 3 4 5 0 1 2 3 4 5
C 0 2 3 5 7 8 C 0 2 3 4 7 7
1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
B 0 0 2 3 3 3 B 0 0 2 2 3 3 3 5
0 1 2 3 4 5
C 0 2 3 4 7 8
Dr. AMIT KUMAR @JUET
COUNTING-SORT ALGO
1 j n
A
Alg.: COUNTING-SORT(A, B, n, k) 0 k
1. for i ← 0 to r C
2. do C[ i ] ← 0 1 n
B
3. for j ← 1 to n
4. do C[A[ j ]] ← C[A[ j ]] + 1
5. C[i] contains the number of elements equal to i
6. for i ← 1 to r
7. do C[ i ] ← C[ i ] + C[i -1]
8. C[i] contains the number of elements ≤ i
9. for j ← n downto 1
10. do B[C[A[ j ]]] ← A[ j ]
11. C[A[ j ]] ← C[A[ j ]] - 1
Analysis of Counting Sort
Alg.: COUNTING-SORT(A, B, n, k)
1. for i ← 0 to r
O(r)
2. do C[ i ] ← 0
3. for j ← 1 to n
O(n)
4. do C[A[ j ]] ← C[A[ j ]] + 1
5. C[i] contains the number of elements equal to i
6. for i ← 1 to r
O(r)
7. do C[ i ] ← C[ i ] + C[i -1]
8. C[i] contains the number of elements ≤ i
9. for j ← n downto 1
10. do B[C[A[ j ]]] ← A[ j ] O(n)
11. C[A[ j ]] ← C[A[ j ]] - 1
Overall time: O(n + r)
Analysis of Counting Sort
• Overall time: O(n + r)
• In practice we use COUNTING sort when r =
O(n)
running time is O(n)
Dr. AMIT KUMAR @JUET
Dr. AMIT KUMAR @JUET