Chapter 2 Getting Started
2.1 Insertion sort
 Solves the sorting problem:
Input: A sequence of n numbers <a_1, a_2, ..., a_n>,
also called keys
Output: A permutation <a'_1, a'_2, ..., a'_n> of the input sequence
such that: a'_1 <= a'_2 <= ... <= a'_n
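The input/output specification above can be turned into a small checker. The Python sketch below is only an illustration; the function name is_sorted_permutation is hypothetical and does not come from the text. It tests the two conditions of the sorting problem: the candidate is a reordering of the input, and it is in nondecreasing order.

```python
from collections import Counter

def is_sorted_permutation(original, candidate):
    """Return True if candidate is a reordering of original in nondecreasing order."""
    # Same multiset of keys: nothing added, dropped, or changed.
    same_elements = Counter(original) == Counter(candidate)
    # Each key is <= its successor.
    nondecreasing = all(candidate[k] <= candidate[k + 1]
                        for k in range(len(candidate) - 1))
    return same_elements and nondecreasing
```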
Insertion sort:
 Efficient for small n
 How many people sort cards
 We give pseudocode for InsertionSort,
a procedure that takes as a parameter an
array A[1..n] of length n ( = length[A] ).
 It sorts "in place": the elements are
moved around within A, with at most a
constant number of them stored outside of
A at any one time.
 The array A contains the sorted output
sequence when InsertionSort is finished
InsertionSort
1 for j <- 2 to length[A]
2     do key <- A[j]
3        > Insert A[j] into the sorted sequence
         > A[1..j-1].
4        i <- j - 1
5        while ( i > 0 ) and ( A[i] > key )
6            do A[i+1] <- A[i]
7               i <- i - 1
8        A[i+1] <- key
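For readers who want to run the procedure, here is a minimal Python sketch of the same algorithm. It is a translation under one stated assumption: Python lists are 0-based, so the outer loop starts at index 1 and the inner test becomes i >= 0 instead of i > 0. The name insertion_sort is chosen here for illustration.

```python
def insertion_sort(a):
    """Sort list a in place, mirroring the pseudocode with 0-based indices."""
    for j in range(1, len(a)):          # pseudocode: j = 2 .. length[A]
        key = a[j]
        i = j - 1
        # Shift elements of the sorted prefix a[0..j-1] that exceed key.
        while i >= 0 and a[i] > key:    # pseudocode tests i > 0 (1-based)
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key                  # drop key into the gap
    return a
```

Returning the list is a convenience only; the sort happens in place, so the caller's list is modified.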
Loop invariants and the correctness of
insertion sort
 A loop invariant for the outer loop of
insertion sort:
At the start of each iteration of the for
loop of lines 1-8, the subarray A[1..j-1]
consists of the elements originally in
A[1..j-1], but in sorted order.
 We can show insertion sort is correct if
we show 3 things about the loop invariant:
Initialization: It is true prior to the
first iteration of the loop.
Maintenance: If it is true before an
iteration of the loop, it remains true
before the next iteration of the loop.
Termination: When the loop terminates, the
invariant gives us a useful property that
helps show that the algorithm is correct.
These properties hold for insertion sort:
Initialization: Prior to the first
iteration, j = 2, so A[1..j-1] is just
A[1..1] = A[1], which is sorted.
Maintenance: Informally, the code in lines
2-7 moves A[j-1], A[j-2], A[j-3], etc.
one position to the right until the
proper position for A[j] is found, at
which point it is inserted. Note: a
more formal argument would also require
a loop invariant for the inner loop.
Termination: The outer loop terminates when
j > n, i.e. when j = n+1. Substituting
n+1 for j in the loop invariant says that
A[1..n+1-1] = A[1..n] consists of the
elements originally in A[1..n], but in
sorted order. But A[1..n] is the entire
array, which is therefore sorted, so the
algorithm is correct.
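The three-part argument above can also be checked mechanically on concrete inputs. The following Python sketch asserts the outer-loop invariant at the start of every iteration and the termination property at the end. It is a testing aid, not a proof: the function name insertion_sort_checked is hypothetical, indices are 0-based, and Python's built-in sorted() serves as an independent oracle.

```python
def insertion_sort_checked(a):
    """Insertion sort that asserts the outer-loop invariant (0-based indices)."""
    original = list(a)                  # snapshot, used only for checking
    for j in range(1, len(a)):
        # Invariant: a[0..j-1] holds the elements originally there, sorted.
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    # Termination: the invariant with j = len(a) covers the whole list.
    assert a == sorted(original)
    return a
```

The snapshot costs extra memory proportional to n, so this checked variant is meant for experiments rather than as an in-place sort.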
Pseudocode conventions
1. Indentation indicates block structure
2. The looping constructs, while, for, and
repeat (do-while of C/Java), and the
conditional if-then-else have similar
meanings to those in Pascal (and C/Java),
except that the loop counter is defined to
be one more than the loop bound at the
termination of a for loop.
3. Comments are preceded by >
4. Multiple assignments i <- j <- e assign
both variables i and j the value of the
expression e.
5. Variables (such as i, j, and key) are
local to the given procedure. Global
variables are not used without explicit
indication.
6. Array elements are accessed by specifying
the array name followed by the index in
square brackets, e.g. A[i]. An ellipsis is
used to indicate a subarray: A[2..j]
indicates the subarray consisting of
A[2], A[3], ..., A[j]
Pseudocode conventions (continued)
7. Compound data are organized into objects,
which are composed of data or fields.
A particular field is accessed by its name
followed by the name of its object in square
brackets: e.g. length[A] for the attribute
"length" of the array A.
A variable representing an array or object
is treated as a pointer to the data
representing the array or object. For all
fields f of an object x, setting y <- x sets
f[y] to f[x]. If we now set f[x] <- 3 then
f[y] = 3 also, since x and y point to the
same object after the assignment y <- x.
If a pointer points to no object at all,
we give it the special value NIL.
8. Parameters are passed by VALUE (copied) to
a procedure, so if it changes the parameter,
the calling procedure sees no change. Note
that pointers to objects are copied, but the
object's fields are not, so if x is a
parameter, assigning x <- y in the procedure
is not visible, but f[x] <- 3 is visible.
9. The operators "and" and "or" are
"short-circuiting". So in "x and y", x is
evaluated first; if x is FALSE, y is not
evaluated and the value of "x and y" is
FALSE.
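Conventions 7 to 9 happen to be close to how Python itself behaves, so they can be tried out directly. The sketch below is illustrative only: the class Node and the functions rebind and mutate are hypothetical names, and Python's attribute syntax x.f stands in for the pseudocode's f[x].

```python
class Node:
    """Hypothetical object with one field f, used only for illustration."""
    def __init__(self, f):
        self.f = f

def rebind(x):
    x = Node(99)        # rebinds the local name only; caller sees no change

def mutate(x):
    x.f = 3             # changes a field of the shared object; caller sees it

x = Node(1)
y = x                   # y and x now refer to the same object
x.f = 3
alias_sees_change = (y.f == 3)

p = Node(1)
rebind(p)
after_rebind = p.f      # still 1
mutate(p)
after_mutate = p.f      # now 3

items = []
# Short-circuit: items[0] is never evaluated because the left operand is False,
# so no IndexError is raised on the empty list.
safe = len(items) > 0 and items[0] > 0
```

The same short-circuit behavior is what makes line 5 of InsertionSort safe: A[i] is not examined once i has dropped out of range.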
2.2 Analyzing algorithms
 Analyzing an algorithm means predicting
the resources (time, memory) it requires.
 We will assume a simplified model of a
computer, a random-access machine (RAM),
which is sequential.
The text will perform many analyses such as
the following.
Analysis of insertion sort
 The time taken by an algorithm depends on
the input, in particular the "input size",
which depends on the problem. In the case
of many algorithms, it is the number of
items in the input, e.g. the size of an
array.
 The running time of an algorithm on a
particular input is the number of primitive
operations or "steps" executed. We assume
that each line of pseudocode takes a
constant amount of time to execute (which
may be different for different lines).
Let t_j = the number of times the "while"
test is made during the jth
iteration of "for"
t_j = 1 in best case (array already sorted)
t_j = j in worst case (reverse sorted)
Recall summation rules:
Sum_{j=2}^{n} j = n(n+1)/2 - 1
Sum_{j=2}^{n} (j - 1) = n(n-1)/2
Analysis of insertion sort

line  cost  times executed           best case              worst case
----  ----  -----------------------  ---------------------  -------------------
1     c_1   n                        n                      n
2     c_2   n-1                      n-1                    n-1
3     (comment)
4     c_4   n-1                      n-1                    n-1
5     c_5   Sum_{j=2}^{n} t_j        Sum_{j=2}^{n} 1 = n-1  Sum_{j=2}^{n} j
6     c_6   Sum_{j=2}^{n} (t_j - 1)  Sum_{j=2}^{n} 0 = 0    Sum_{j=2}^{n} (j-1)
7     c_7   Sum_{j=2}^{n} (t_j - 1)  Sum_{j=2}^{n} 0 = 0    Sum_{j=2}^{n} (j-1)
8     c_8   n-1                      n-1                    n-1
Total running time T(n) = sum of the products:
T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 Sum_{j=2}^{n} t_j
       + c_6 Sum_{j=2}^{n} (t_j - 1) + c_7 Sum_{j=2}^{n} (t_j - 1)
       + c_8 (n-1)
Best case: T(n) = c_1 n + (c_2 + c_4 + c_5 + c_8)(n-1)
Worst case: T(n) = c_1 n + c_2 (n-1) + c_4 (n-1)
       + c_5 (n(n+1)/2 - 1) + (c_6 + c_7) n(n-1)/2 + c_8 (n-1)
     = (c_5/2 + c_6/2 + c_7/2) n^2
       + (c_1 + c_2 + c_4 + c_5/2 - c_6/2 - c_7/2 + c_8) n
       - (c_2 + c_4 + c_5 + c_8)
     = an^2 + bn + c
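The best-case and worst-case sums for line 5 can be confirmed by counting how often the while test actually runs. The Python sketch below is an experiment under stated assumptions: count_while_tests is a hypothetical helper, indices are 0-based, and the loop is restructured as while True plus break so that every evaluation of the condition, including the final failing one, is counted.

```python
def count_while_tests(a):
    """Return the total number of while-loop tests (sum of t_j) made by insertion sort."""
    a = list(a)                     # work on a copy
    tests = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while True:
            tests += 1              # one evaluation of the while condition
            if not (i >= 0 and a[i] > key):
                break
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return tests

n = 10
best = count_while_tests(range(1, n + 1))       # already sorted
worst = count_while_tests(range(n, 0, -1))      # reverse sorted
```

For n = 10 the already sorted input gives n - 1 = 9 tests, and the reverse sorted input gives n(n+1)/2 - 1 = 54 tests, matching the two summations.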
The worst case is often used as the measure of an algorithm because:
1. It is an upper bound (guarantee)
2. It can occur often
Example: searching for a nonexistent item
3. It is often not much worse than the
"average" case
Example: in the average case of insertion
sort, t_j = j/2, which is still quadratic.
Order of growth
In measuring the cost of an algorithm, we
make one more simplifying assumption: we
ignore slower growing terms and also drop
the (positive) coefficient of the fastest
growing term. We call the resulting term
the rate of growth or order of growth.
For example, gathering terms in the
worst-case analysis of insertion sort gives a
running time of:
    an^2 + bn + c
As n grows, the terms bn + c and the
factor a become relatively insignificant.
So we make a simplification, saying that
the algorithm has a worst-case running time
of
    Theta(n^2) ("Theta of n-squared")
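A quick numeric experiment shows why the lower-order terms can be ignored. In the Python sketch below the constants a = 2, b = 100 and c = 1000 are made up for illustration (they are not measured costs), and worst_case_cost is a hypothetical helper. The list shares records what fraction of the total is contributed by the leading term alone.

```python
def worst_case_cost(n, a=2.0, b=100.0, c=1000.0):
    """Evaluate a*n^2 + b*n + c for made-up constants a, b, c."""
    return a * n * n + b * n + c

# Fraction of the total contributed by the leading term a*n^2.
shares = [2.0 * n * n / worst_case_cost(n) for n in (10, 1000, 100000)]
```

With these constants the leading term accounts for under 10% of the total at n = 10 but over 99.9% at n = 100000, even though b and c were chosen much larger than a.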
2.3 Designing algorithms
The approach taken by insertion sort is
"incremental", that is, it makes a sorted
array of 1 element,
2 elements,
3 elements,
...
n elements
2.3.1 The divide-and-conquer approach
Many useful algorithms are recursive: to solve
a problem, they call themselves recursively to
solve closely related subproblems. They
follow the "divide-and-conquer" approach: they
break the problem into smaller pieces, solve
the pieces, and then combine those solutions
to get a solution to the original problem.
The divide-and-conquer method involves three
steps at each level of the recursion:
Divide: the problem into subproblems.
Conquer: the subproblems by solving them
recursively, or directly if small enough.
Combine: the solutions of the subproblems into
the solution to the original problem.
Example: merge sort:
Divide: Divide the n-element sequence into two
subsequences of n/2 elements each.
Conquer: Sort the two subsequences recursively
using merge sort.
Combine: Merge the two sorted subsequences to
produce the sorted answer.
The base case of the recursion occurs when
the sequence is of length 0 or 1, in which
case it is sorted.
The key operation is Merge(A,p,q,r) where A
is an array, and p <= q < r are array indices.
Merge() assumes that the subarrays A[p..q] and
A[q+1..r] are already sorted. It merges them
to form a single sorted subarray that replaces
A[p..r].
The Merge() operation below takes Theta(n)
time, where n = r - p + 1 is the number of
elements being merged.
A simplification to avoid checking for empty
arrays: put a "sentinel" element, "infinity"
(abbreviated "inf"), which is larger than any
actual element, at the end of the two
subarrays to be merged.
Merge(A,p,q,r)
1  n1 <- q - p + 1
2  n2 <- r - q
3  create arrays L[1..n1+1] and R[1..n2+1]
4  for i <- 1 to n1
5      do L[i] <- A[p+i-1]
6  for j <- 1 to n2
7      do R[j] <- A[q+j]
8  L[n1+1] <- inf
9  R[n2+1] <- inf
10 i <- 1
11 j <- 1
12 for k <- p to r
13     do if L[i] <= R[j]
14           then A[k] <- L[i]
15                i <- i+1
16           else A[k] <- R[j]
17                j <- j+1
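A runnable counterpart of Merge might look like the Python sketch below. It makes several assumptions that are not in the pseudocode: indices are 0-based with p, q, r inclusive, list slicing replaces the two copy loops, and math.inf plays the role of the sentinel, which only works for numeric keys that are all smaller than infinity.

```python
import math

def merge(a, p, q, r):
    """Merge sorted a[p..q] and a[q+1..r] (inclusive, 0-based) in place."""
    left = a[p:q + 1] + [math.inf]      # sentinel at the end of each run
    right = a[q + 1:r + 1] + [math.inf]
    i = j = 0
    for k in range(p, r + 1):
        # The sentinels guarantee neither index runs off its list.
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1
```

Taking from the left run on ties (the <= test) keeps equal keys in their original relative order.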
Loop invariant for loop of lines 12-17:
At the start of each iteration of the for
loop, A[p..k-1] contains the k-p smallest
elements of L and R in sorted order. Also,
L[i] and R[j] are the smallest elements of
their arrays that haven't been copied into A.
Proof of correctness of the for loop
Initialization: Initially, k = p, so the
subarray A[p..k-1] is empty and contains the
k - p = 0 smallest elements of L and R. And
since i = j = 1, L[i] and R[j] are the
smallest elements of L and R not copied to A.
Maintenance: Two cases: If L[i] <= R[j], then
L[i] is the smallest element not copied to A
and after line 14 copies L[i] to A[k], the
subarray A[p..k] will contain the k-p+1
smallest elements. Incrementing k (line 12)
and i (line 15) reestablishes the invariant.
Similarly, if L[i] > R[j] the invariant is
also maintained by lines 16 and 17.
Termination: At termination, k = r+1, so
A[p..k-1] = A[p..r] contains the k-p = r-p+1
smallest elements of L and R in sorted order
and since L and R together have n1 + n2 + 2 =
r - p + 3 elements, all but the two sentinels
have been copied back to A.
Note that Merge() runs in Theta(n) time since
lines 1-3 and 8-11 take constant time, the for
loops of lines 4-7 take Theta(n1 + n2) =
Theta(n) time, and there are n iterations of
the for loop of lines 12-17 each of which
takes a constant time.
MergeSort(A,p,r)
1 if p < r                      > if p >= r, the array is sorted
2     then q <- floor((p+r)/2)  > floor(x) =
                                > greatest int <= x
3          MergeSort(A,p,q)
4          MergeSort(A,q+1,r)
5          Merge(A,p,q,r)       > merge A[p..q]
                                > and A[q+1..r]
To sort an entire sequence:
MergeSort(A,1,n), where n = length[A]
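The recursive procedure translates almost line for line into Python. The sketch below is one possible rendering, not a definitive implementation: it uses 0-based inclusive indices, so the top-level call is merge_sort(a, 0, len(a) - 1), integer division // stands in for the floor, and a compact sentinel-based merge helper (using math.inf, so numeric keys are assumed) is included so the block runs on its own.

```python
import math

def merge(a, p, q, r):
    """Merge sorted a[p..q] and a[q+1..r] in place, using inf sentinels."""
    left = a[p:q + 1] + [math.inf]
    right = a[q + 1:r + 1] + [math.inf]
    i = j = 0
    for k in range(p, r + 1):
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1

def merge_sort(a, p, r):
    """Sort a[p..r] (inclusive, 0-based) in place by divide and conquer."""
    if p < r:                       # zero or one element is already sorted
        q = (p + r) // 2            # divide
        merge_sort(a, p, q)         # conquer the left half
        merge_sort(a, q + 1, r)     # conquer the right half
        merge(a, p, q, r)           # combine

data = [5, 2, 4, 6, 1, 3, 2, 6]
merge_sort(data, 0, len(data) - 1)
```

After the call, data holds the eight keys in nondecreasing order: [1, 2, 2, 3, 4, 5, 6, 6].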
Example:        <1 2 2 3 4 5 6 6>                 final
                        ^
                        |
                      merge
                    /       \
          <2 4 5 6>           <1 2 3 6>
              ^                   ^
              |                   |
            merge               merge
           /     \             /     \
       <2 5>     <4 6>     <1 3>     <2 6>
         ^         ^         ^         ^
         |         |         |         |
       merge     merge     merge     merge
       /   \     /   \     /   \     /   \
     <5>   <2> <4>   <6> <1>   <3> <2>   <6>      initial
2.3.2 Analyzing divide-and-conquer algorithms
The running time of a recursive algorithm can
be described by a recurrence equation or
recurrence, which describes the total running
time, T(n), in terms of the running time of
the algorithm with smaller inputs.
Let n = size of the data
a = number of subproblems
n/b = size of subproblem (often, a = b)
General recurrence equation (based on the
three steps of divide-and-conquer):
T(n) = Theta(1)                 if n <= n0 (a small constant)
     = aT(n/b) + D(n) + C(n)    otherwise
where aT(n/b) is the cost of conquer, D(n) is the
cost of divide, and C(n) is the cost of combine.
Analysis of merge sort
Recurrence equation for Mergesort:
T(n) = Theta(1) if n = 1
= 2T(n/2) + Theta(1) + Theta(n) if n > 1
We can assume:
T(n) = c if n = 1
= 2T(n/2) + cn if n > 1
So the recursion tree for Mergesort (assuming
n is a power of 2) is:
                                              Cost
                                              ----
              T(n)
               cn                              cn
             /    \
            /  +   \                            +
           /        \
      T(n/2)        T(n/2)
       cn/2          cn/2                 2cn/2 = cn
       /  \          /  \
      / +  \        / +  \                      +
     /      \      /      \
 T(n/4)  T(n/4) T(n/4)  T(n/4)
  cn/4    cn/4   cn/4    cn/4             4cn/4 = cn
    .       .      .       .                    .
    .       .      .       .                    .
  c    ...                ...    c           nc = cn
Height of tree = lg(n) + 1, where lg is the
logarithm with base 2
Total cost = cn * (lg(n) + 1)
= Theta(n*lg(n))
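The closed form can be checked against the recurrence directly. The Python sketch below evaluates T(n) = 2T(n/2) + cn with T(1) = c for powers of 2 and compares the result with cn(lg(n) + 1). The function name t and the default c = 1 are choices made here for illustration.

```python
def t(n, c=1):
    """Evaluate T(n) = 2*T(n/2) + c*n with T(1) = c, for n a power of 2."""
    if n == 1:
        return c
    return 2 * t(n // 2, c) + c * n

# For n = 2^k, lg(n) = k, so the closed form is c * n * (k + 1).
matches = all(t(2 ** k) == 2 ** k * (k + 1) for k in range(11))
```

For example, t(8) evaluates to 32, which equals 8 * (lg(8) + 1) = 8 * 4.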
Recall that insertion sort is Theta(n^2) in the
worst case. Now n*lg(n) < n^2 for all n > 0, but
because of the overhead of recursion, merge sort
beats insertion sort only for n > 30 or so.