Part II Sorting and Order Statistics II.1
Introduction
This part gives several algorithms that solve
the sorting problem:
Input: A sequence of n numbers:
Output: A permutation (reordering) of the
input sequence such
that a_1' <= a_2' <= ... <= a_n'.
The structure of the data
In the real world, each number, called the
"key" is part of a record that contains other
"satellite data". Often the satellite data is
carried around with the key when they are
permuted, but if the satellite data is large,
usually only a pointer to it is carried with
the key (or just the pointers are permuted).
In this course, we are only interested in the
sorting process, not the implementation
details, which are usually straightforward.
Why sorting? II.2
Sorting is considered to be fundamental since
- Sometimes it is required by the application,
for example sorting checks for a statement.
- Some algorithms use sorting as a subroutine.
- Due to the large variety of sort algorithms,
they have provided examples of many methods
of algorithm design over the years, and so
have historical interest also.
- There is a theoretical lower bound, which we
can match, and we can usually prove good
upper bounds for sorting algorithms. The
lower bound can often be used to establish
lower bounds on algorithms that use sorting.
- Many software engineering issues arise when
implementing sorting algorithms. Obtaining
the best performance often depends more on
choosing best algorithm based on the
situation rather than on "tweaking" the code.
Sorting algorithms II.3
In Chapter 2, we saw insertion sort, which
has Theta(n^2) worst (and average) case run
time, and merge sort, which does not sort "in
place" (recall that an algorithm sorts in
place if only a constant number of elements
are ever stored outside the input array).
In Chapter 6, we introduce heapsort, which
sorts in place in O(n lg n) time. It uses an
important data structure called a heap, which
can be used to implement a priority queue.
In Chapter 7, we study quicksort, which also
sorts in place, but has Theta(n^2) worst case
run time, but Theta(n lg n) average case run
time, and a smaller constant than heapsort.
The sorting algorithms above are "comparison
sorts": they determine the sorted order by
comparing elements. Chapter 8 shows that
such sorts have a lower bound of Omega(n lg n)
for the worst case run time.
Chapter 8 also presents three algorithms for
sorting in linear time -- if their input data
satisfies special conditions: counting sort,
radix sort, and bucket sort.
Algorithm Worst Case Average Case II.4
Running Time Running Time
------------ --------------
Insertion-sort Theta(n^2) Theta(n^2)
Merge sort Theta(n lg n) Theta(n lg n)
Heapsort O(n lg n) O(n lg n) ??
Quicksort Theta(n^2) Theta(n lg n)
Counting sort Theta(k + n) Theta(k + n)
Radix sort Theta(d(k+n)) Theta(d(k+n))
Bucket sort Theta(n^2) Theta(n)
The i-th order statistic of n numbers is the
i-th smallest number in that set. One way to
obtain the i-th order statistic is to sort the
numbers in O(n lg n) time and extract the i-th
one. Chapter 9 presents two algorithms: one
that finds the i-th smallest element in O(n^2)
in the worst case, but Theta(n) on average,
and a more complicated algorithm that runs in
O(n) worst case time.
Background
Average case analyses often require some
mathematical sophistication and involve
probabilistic methods (which are reviewed in
Appendix C). Such analysis is also required
for randomized algorithms.
Chapter 6 Heapsort 6.1.1
Running Time Memory
------------ --------------
2
Insertion-sort O(n ) sorts in place
Merge-sort O(n*lg(n)) requires
auxiliary array
Heap-sort O(n*lg(n)) sorts in place
In addition to sorting, a heap can be used to
implement priority queues and is useful in
other data structures.
The term "heap" was first used with this
meaning, not allocatable storage which is
entirely different.
6.1 Heaps
Q: What is a (binary) heap?
A: A "nearly full" binary tree (only the
bottom level may not be complete)
16
/ \
/ \
14 10
/ \ / \
/ \ / \
8 7 9 3
/ \ /
/ \ /
2 4 1
6.1.2
A heap has the property that, for every node
other than a leaf, its value is greater than
or equal to that of its children.
A heap can be implemented as an array, A :
+--+--+--+--+--+--+--+--+--+--+
A |16|14|10| 8| 7| 9| 3| 2| 4| 1|
+--+--+--+--+--+--+--+--+--+--+
1 2 3 4 5 6 7 8 9 10
where nodes are described by array indexes i.
It has two attributes: A.length, the size of
the array, and A.heap-size, the number of
elements that are actually part of the heap
so A.heap-size <= A.length
The root is A[1], and given an index i of a
node, the parent, left, and right child can
easily by found by the following functions:
PARENT(i)
return floor( i/2 )
LEFT(i)
return 2i
RIGHT(i)
return 2i + 1
Note: these can be implemented by bit shifts.
Example: if i = 4 = 0100 6.1.3
then LEFT(i) => 8 = 1000 (shift left one bit)
RIGHT(i) => 9 = 1001 (shift left one bit
and "bit-or" 1)
PARENT(i) => 2 = 0010 (shift right one bit)
There are two kinds of heaps, max-heaps and
min-heaps, which satisfy a heap property:
The max-heap property:
A[parent(i)] >= A[i], all i except root
so that the largest element is in the root.
Similarly we have the min-heap property:
A[parent(i)] <= A[i], all i except root
with the smallest element in the root.
The height of a node is the number of edges
on the longest path downward to a leaf; the
height of the heap is the height of its root.
Since a heap of n elements is based on a
complete binary tree, its height, h, is
Theta(lg n) (Exercise 6.1-2). We shall see
that the basic heap operations run in time at
most proportional to h, and thus O(lg n) time.
A useful utility routine is: 6.1.4
MAX-HEAPIFY, which runs in O(lg n) time, is a
routine that maintains the heap property.
The five basic operations on a max-heap are:
BUILD-MAX-HEAP, which runs in Theta(n) time,
produces a max-heap from an unordered array.
MAX-HEAP-INSERT which runs in O(lg n) time and
inserts an element into a max-heap.
HEAP-EXTRACT-MAX which runs in O(lg n) time
and removes and returns the maximum element.
HEAP-MAXIMUM which runs in Theta(1) time and
just returns the maximum element.
HEAP-INCREASE-KEY which runs in O(lg n) time
and increases the key value of a node.
The last four allow the heap to be used as a
priority queue. Also, a heap can be used to
sort an array by using the procedure:
HEAPSORT, which runs in O(n lg(n)) time.
6.2 Maintaining the heap property 6.2.1
MAX-HEAPIFY(A,i) assumes the binary trees
at LEFT(i) and RIGHT(i) are max-heaps, but
that A[i] may be smaller than its children.
MAX-HEAPIFY lets the value at A[i] "float
down" the heap so that the subtree rooted at i
becomes a max-heap.
MAX-HEAPIFY(A,i)
1 l = LEFT(i)
2 r = RIGHT(i)
3 if l <= heap-size[A] and A[l] > A[i]
4 largest = l
5 else largest = i
6 if r <= heap-size[A] and A[r] > A[largest]
7 largest = r
8 if largest != i
9 exchange A[i] <-> A[largest]
10 MAX-HEAPIFY(A,largest)
At each step, we determine "largest", the
index of the biggest of A[i], A[LEFT(i)], and
A[RIGHT(i)]. If A[i] is biggest, MAX-HEAPIFY
terminates. Otherwise one of the children is
biggest and we swap it with A[i], which causes
the heap rooted at i to satisfy the max-heap
property. But the heap rooted at "largest"
may not satisfy the max-heap property, so we
recursively call MAX-HEAPIFY on it.
Note that MAX-HEAPIFY is tail-recursive, so
it could be implemented as a while loop.
Figure 6.2 shows how MAX-HEAPIFY works. 6.2.2
(a) The value 4 at index 2 is out of place.
16
/ \
/ \
i = 2 (4) 10
/ \ / \
/ \ / \
14 7 9 3
/ \ /
/ \ /
2 8 1
We fix it with a call to MAX-HEAPIFY(A,2),
which determines that 14 is the biggest of 4,
14, and 7, and so swaps 4 and 14, giving:
(b) The value 4 at index 4 is out of place.
16
/ \
/ \
14 10
/ \ / \
/ \ / \
i = 4 (4) 7 9 3
/ \ /
/ \ /
2 8 1
We fix it by calling MAX-HEAPIFY(A,4), 6.2.3
which determines that 8 is the biggest of 4,
2, and 8, and so swaps 4 and 8, giving:
(c) The value 4 at index 9 is all right.
16
/ \
/ \
14 10
/ \ / \
/ \ / \
8 7 9 3
/ \ /
/ \ /
2 (4) 1
i = 9
The conditions for all the if-statements are
FALSE, so MAX-HEAPIFY terminates. Here is the
trace:
MAX-HEAPIFY(A,2) |> swap A[2] <-> A[4]
MAX-HEAPIFY(A,4) |> swap A[4] <-> A[9]
MAX-HEAPIFY(A,9) |> largest <- 9
return
return |> tail recursion optimization
return |> can eliminate this
The running time of MAX-HEAPIFY on a subtree
of size n rooted at node i is Theta(1) to fix
up the relationships among A[i], A[LEFT(i)],
and A[RIGHT(i)], plus the time to run the
recursive call on one of the subtrees.
6.2.4
The size of a subtree is at most 2n/3, which
occurs when the last row is exactly half full.
So the worst case running time T(n) satisfies
the recurrence:
T(n) <= T(2n/3) + Theta(1) (*)
Which by case 2 of the "Master Theorem 4.1",
has the solution: T(n) = O(lg n). Or we may
get the same result by noticing that
MAX-HEAPIFY spends Theta(1) time at each level
and it does so at at most O(h) levels, and
h = Theta(lg n).
Or use the substition method and prove by
induction that (*) has solution T(n) = O(lg n)
i.e. that T(n) <= clg(n), for some c > 0
(definition of big-O)
Induction Hypothesis (ind. hyp.):
T(k) <= clg(k), k < n
T(n) <= T(2n/3) + a given by (*)
<= clg(2n/3) + a by ind. hyp.
= clg(2/3) + clg(n) + a by log property
= clg(n) + clg(2/3) + a
<= clg(n) + c(-1/2) + a lg(2/3) = -.58
= clg(n) - (c/2 - a)
So,
T(n) <= clg(n) if we pick c > 2a.
6.3 Building a heap 6.3.1
We can use BUILD-MAX-HEAP to convert any
A[1..n], where n = A.length into a heap. We
note that the entries in A[floor(n/2)+1..n]
are all leaves (by Exercise 6.1-7) and so are
also 1-element heaps. Then we go through the
remaining nodes applying MAX-HEAPIFY to each.
BUILD-MAX-HEAP(A)
1 A.heap-size = A.length
2 for i = floor(A.length/2) downto 1
3 MAX-HEAPIFY(A,i)
Figure 6.3 shows an example:
+--+--+--+--+--+--+--+--+--+--+
A | 4| 1| 3| 2|16| 9|10|14| 8| 7|
+--+--+--+--+--+--+--+--+--+--+
1 2 3 4 5 6 7 8 9 10
(a) 4
/ \
/ \
/ \
1 3
/ \ / \
/ \ / \
/ \ / \
2 i=5 (16) 9 10
/ \ /
/ \ /
/ \ /
14 8 7
(b) 4 6.3.2
/ \
/ \
/ \
1 3
/ \ / \
/ \ / \
/ \ / \
i=4 (2) 16 9 10
/ \ /
/ \ /
/ \ /
14 8 7
(c) 4
/ \
/ \
/ \
1 i=3 (3)
/ \ / \
/ \ / \
/ \ / \
14 16 9 10
/ \ /
/ \ /
/ \ /
2 8 7
(d) 4 6.3.3
/ \
/ \
/ \
i=2 (1) 10
/ \ / \
/ \ / \
/ \ / \
14 16 9 3
/ \ /
/ \ /
/ \ /
2 8 7
(e) i=1 (4)
/ \
/ \
/ \
16 10
/ \ / \
/ \ / \
/ \ / \
14 7 9 3
/ \ /
/ \ /
/ \ /
2 8 1
(f) finished. 16 6.3.4
/ \
/ \
/ \
14 10
/ \ / \
/ \ / \
/ \ / \
8 7 9 3
/ \ /
/ \ /
/ \ /
2 4 1
To prove BUILD-MAX-HEAP works correctly, we
use the following loop invariant:
At the start of each iteration of lines 2-3,
each node i+1, i+2, ..., n is the root of a
max-heap.
Intialization: Prior to the first iteration,
i = floor(n/2), and each node floor(n/2)+1,
floor(n/2)+2, ..., n is a leaf and thus a
root of a trivial max-heap.
Maintenance: Note that each child of i has a
higher index than i, so are both max-heaps.
This is the condition MAX-HEAPIFY(A,i) needs
to make node i a max-heap root, and it
preserves the property that nodes i+1, i+2,
..., n are roots of max-heaps. Decrementing
i re-establishes the loop invariant.
6.3.5
Termination: At termination, i = 0, so by the
loop invariant, each node 1, 2,..., n is the
root of a max-heap, node 1 in particular.
Q: What is BUILD-MAX-HEAP's running time?
One answer: sum up the cost of build-heap at
each NODE of the heap:
Total cost = number of calls x cost of
to MAX-HEAPIFY MAX-HEAPIFY
<= n/2 x O(lg n)
= O(nlg(n))
This is correct, but not asymptotically tight.
Better: sum up the cost at each _height_, h,
noting heap height is floor(lg n), and there
h+1
are at most ceiling(n/2 ) nodes at height
h and the cost of MAX-HEAPIFY is O(h). So:
Total cost <=
lg(n)
Sum ( number of nodes x MAX-HEAPIFY's )
h = 0 at height h cost at height h
6.3.6
lg(n) n
= Sum ( ceiling(----) x O(h) )
h = 0 h+1
2
lg(n) 1 inf. 1
<= Sum ( nO(h) --- ) <= O(n Sum ( h --- ) )
h = 0 2^h h = 0 2^h
1/2 inf. k x
= O(n -------- ) by Sum (kx ) = ------
2 k=0 2
(1-1/2) (1-x)
Formula A.8 Appendix A
= O(2n) = O(n)
So, BUILD-MAX-HEAP builds a max-heap from an
unordered array in _linear_ time.
Similarly, we can use BUILD-MIN-HEAP to build
a min-heap in linear time. BUILD-MIN-HEAP is
the same as BUILD-MAX-HEAP, except MAX-HEAPIFY
is replaced by MIN-HEAPIFY (Exercise 6.2-2).
6.4 The heapsort algorithm 6.4.1
Heapsort starts by building a max-heap on the
array A[1..n], where n = A.length, by using
BUILD-MAX-HEAP. Since A[1] is the largest, it
can be put in the correct position by swapping
it with A[n]. Then we "discard" node n from
the heap by decrementing A.heap-size. Then
we use MAX-HEAPIFY(A,1) to re-establish the
heap property for A[1..n-1]. We then repeat
the process on the subarray A[1..n-1], ending
with the two largest elements in A[n-1] and
A[n], and a max-heap in A[1..n-2]. We keep
doing this until we decrement the heap-size to
1, at which point we are done.
This can be shown to be correct, using a loop
invariant (Exercise 6.4-2).
HEAPSORT(A)
1 BUILD-MAX-HEAP(A) // cost O(n)
2 for i = A.length downto 2 // n-1 times
3 exchange A[1] <-> A[i] // O(1)
4 A.heap-size = A.heap-size - 1 // O(1)
5 MAX-HEAPIFY(A,1) // O(lg n)
Total cost = O(n) + (n-1)( 2*O(1) + O(lg n) )
<= O(n) + n*O(lg n)
= O(n*lg(n))
Figure 6.4 The operation of HEAPSORT. 6.4.2
The array after it has been heapified by the
call to BUILD-MAX-HEAP:
+--+--+--+--+--+--+--+--+--+--+
A |16|14|10| 8| 7| 9| 3| 2| 4| 1|
+--+--+--+--+--+--+--+--+--+--+
1 2 3 4 5 6 7 8 9 10
(a) 16
original heap / \
/ \
/ \
14 10
/ \ / \
/ \ / \
/ \ / \
8 7 9 3
/ \ /
/ \ /
/ \ /
2 4 1
(b) 14
after line 5 / \
when i = 10 / \
/ \
8 10
/ \ / \
/ \ / \
/ \ / \
4 7 9 3
/ \
/ \
/ \
2 1 (16) i = 10
(c) after line 5 10 6.4.3
when i = 9 / \
/ \
/ \
8 9
/ \ / \
/ \ / \
/ \ / \
4 7 1 3
/
/
/
2 (14) (16)
i = 9
(d) after line 5 9
when i = 8 / \
/ \
/ \
8 3
/ \ / \
/ \ / \
/ \ / \
4 7 1 2
(10) (14) (16)
i = 8
(e) after line 5 8 6.4.4
when i = 7 / \
/ \
/ \
7 3
/ \ /
/ \ /
/ \ /
4 2 1 (9)
i = 7
(10) (14) (16)
(f) after line 5 7
when i = 6 / \
/ \
/ \
4 3
/ \
/ \
/ \
1 2 (8) (9)
i = 6
(10) (14) (16)
(g) 4
after line 5 / \
when i = 5 / \
/ \
2 3
/
/
/ i = 5
1 (7) (8) (9)
(10) (14) (16)
(h) after line 5 3 6.4.5
when i = 4 / \
/ \
/ \
2 1
i = 4 (4) (7) (8) (9)
(10) (14) (16)
(i) after line 5 2
when i = 3 /
/
/
1 (3) i = 3
(4) (7) (8) (9)
(10) (14) (16)
(j) after line 5 1
when i = 2
i=2 (2) (3)
(4) (7) (8) (9)
(10) (14) (16)
(k) final array:
+--+--+--+--+--+--+--+--+--+--+
A | 1| 2| 3| 4| 7| 8| 9|10|14|16|
+--+--+--+--+--+--+--+--+--+--+
1 2 3 4 5 6 7 8 9 10
6.5 Priority queues 6.5.1
Binary heaps make efficient priority queues.
There are two kinds of priority queues:
max-priority queues and min-priority queues,
which are based on max-heaps and min-heaps
respectively.
A priority queue is a data structure for
maintaining a set S of elements, each with an
associated key value. A max-priority queue
supports the following operations:
INSERT(S,x) inserts the element x into the set
S, which we can write: S <- S union {x}
MAXIMUM(S) returns the element in S with the
largest key.
EXTRACT-MAX(S) returns the element in S with
the largest key and removes it.
INCREASE-KEY(S,x,k) increases the value of x's
key to k (assumed to be >= x's current key).
One application: job scheduling on a shared
computer. When a job is interrupted or
finished, the highest priority job is obtained
from those pending using EXTRACT-MAX. New
jobs are added or suspended jobs are put back
by using INSERT.
A min-priority queue supports the 6.5.2
operations INSERT, MINIMUM, EXTRACT-MIN, and
DECREASE-KEY. They can be used in an event-
driven simulator, where an element is an event
and the key value is the time it occurred.
See Exercise 6.5-3.
We can use a heap to implement a priority
queue. In an application, it is often
necessary to know which application object
corresponds to a priority queue element, and
vice-versa. So we store a handle (a pointer
or index, depending on the application) to the
application object in each heap element. And
similarly, the application object has a handle
(e.g. array index) back to the heap, which
needs to be updated when a heap operation
changes it. We ignore these details below.
The procedure HEAP-MAXIMUM implements MAXIMUM
in Theta(1) time:
HEAP-MAXIMUM(A)
1 return A[1]
HEAP-EXTRACT-MAX implements EXTRACT-MAX:
HEAP-EXTRACT-MAX(A)
1 if A.heap-size < 1
2 error "heap underflow"
3 max = A[1]
4 A[1] = A[A.heap-size]
5 A.heap-size <- A.heap-size - 1
6 MAX-HEAPIFY(A,1)
7 return max
6.5.3
HEAP-EXTRACT-MAX runs in O(lg n) time since
there are just a constant number of operations
on top of MAX-HEAPIFY which uses O(lg n) time.
HEAP-INCREASE-KEY(A,i,key) first updates the
key of A[i] to "key". This may violate the
max-heap property, so we push A[i] up the heap
to its proper place by exchanging it with its
parent -- similar to the insertion loop of
INSERTION_SORT.
HEAP-INCREASE-KEY(A,i,key)
1 if key < A[i]
2 error "new key < current key"
3 A[i] = key
4 while i > 1 and A[PARENT(i)] < A[i]
5 exchange A[i] <-> A[PARENT(i)]
6 i = PARENT(i)
HEAP-INCREASE-KEY runs in O(lg n) time since
the path traced from node i to the root has
length O(lg n) for an n-element heap.
It can be proved to be correct by using a
loop invariant (Exercise 6.5-5).
Figure 6.5 shows an example of its operation:
(a) 16 6.5.4
original heap / \
i = 9 / \
/ \
14 10
/ \ / \
/ \ / \
/ \ / \
8 7 9 3
/ \ /
/ \ /
/ \ /
2 (4) 1
(b) 16
heap with the / \
key increased / \
/ \
14 10
/ \ / \
/ \ / \
/ \ / \
8 7 9 3
/ \ /
/ \ /
/ \ /
2 (15) 1
i = 9
(c) 16 6.5.5
after one / \
iteration / \
of while / \
loop 14 10
/ \ / \
/ \ / \
/ \ / \
i = 4 (15) 7 9 3
/ \ /
/ \ /
/ \ /
2 8 1
(d) 16
after a second / \
iteration / \
of while / \
loop i = 2 (15) 10
/ \ / \
/ \ / \
/ \ / \
14 7 9 3
/ \ /
/ \ /
/ \ /
2 8 1
At this point the loop terminates since
A[PARENT(i)] >= A[i].
6.5.6
MAX-HEAP-INSERT(A,key) implements the INSERT
operation. It expands the heap by one element
and sets its key value to -infinity. Then it
makes a call to HEAP-INCREASE-KEY to set the
key value to "key" and restore the max-heap
property.
MAX-HEAP-INSERT(A,key)
1 A.heap-size = A.heap-size + 1
2 A[A.heap-size] = -infinity
3 HEAP-INCREASE-KEY(A, A.heap-size, key)
The running time of MAX-HEAP-INSERT is
O(lg n) since that is the running time of
HEAP-INCREASE-KEY and only a constant number
of other operations are performed.
Note: One could implement a stack or an
ordinary FIFO queue with priority queues (see
Exercise 6.5-7).
Variations on max-heaps:
1. A min-priority queue based on a min-heap.
2. One could have a d-ary heap instead of a
binary heap (Problem 6-2).