Assignment 5 -- due Friday, October 14
CS 4521 Fall Semester, 2005
18 Points
Topics: Quicksort, randomized quicksort, and their run times
The assignment consists of two parts.
In the first part, you are asked to implement quicksort,
randomized quicksort, and heapsort,
and compare their running times with merge sort.
In the second part, you are asked to prove facts about the run time
of quicksort.
Part 1: Comparing the run times of sorting routines (10 points)
Implement the following three sorting routines: heapsort (page 136),
quicksort (page 146), and randomized-quicksort (page 154).
After you have coded all three of the sort routines, load the array A
with the integers 1, 2, ..., n in order, i.e., A[i] = i.
This should produce worst-case performance for quicksort; the run times
of the other two sorts should be independent of the order of the input.
Make timing runs for each of the sorts for the largest value of n
that you used for merge sort in Assignment 2,
and record the times for each of the four sorts:
merge sort (from Assignment 2) and the three new ones.
Compare the results -- which is fastest? Which is slowest?
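One possible way to set up the timing runs is sketched below in C. It is only
a sketch under assumptions: the names sort_fn, load_sorted, is_sorted,
time_sort, and library_sort are hypothetical, the array is stored 0-based
(so a[i] = i + 1), and library_sort merely wraps the C library qsort() as a
stand-in until your own routines are plugged in.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical common signature: sort a[0..n-1] in place. */
typedef void (*sort_fn)(int *a, size_t n);

/* Load the integers 1, 2, ..., n in order (0-based storage). */
static void load_sorted(int *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = (int)(i + 1);
}

/* Return 1 if a[0..n-1] is in nondecreasing order, 0 otherwise. */
static int is_sorted(const int *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (a[i - 1] > a[i])
            return 0;
    return 1;
}

/* Time one run of `sort` on the ordered input.
   Returns processor time in seconds, or -1.0 if allocation fails
   or the output is not sorted. */
static double time_sort(sort_fn sort, size_t n)
{
    assert(n > 0);
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return -1.0;
    load_sorted(a, n);
    clock_t start = clock();
    sort(a, n);
    clock_t end = clock();
    int ok = is_sorted(a, n);
    free(a);
    return ok ? (double)(end - start) / CLOCKS_PER_SEC : -1.0;
}

/* Stand-in sort for demonstration only: the C library qsort(). */
static int cmp_int(const void *x, const void *y)
{
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

static void library_sort(int *a, size_t n)
{
    qsort(a, n, sizeof *a, cmp_int);
}
```

A call such as time_sort(library_sort, n) returns the time for one run; each
of your four sorts would be wrapped in a function with the sort_fn signature
and timed the same way. Note that clock() measures processor time, so very
short runs may report 0.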
Discussion: for randomized-quicksort, you can use the C library routine
rand() to obtain a pseudo-random number (an int) in the range 0 to RAND_MAX.
For example, RANDOM(p,r) could simply return
p + rand() % (r - p + 1)
or, better, replace Line 1 of RANDOMIZED-PARTITION() with
i <- p + rand() % (r - p + 1).
Note: rand() and RAND_MAX are declared in <stdlib.h>.
What To Hand In:
The results of your timing runs and the
code for each of your new sort routines.
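The replacement for Line 1 of RANDOMIZED-PARTITION() described in the
Discussion above might look as follows in C. This is a sketch, not the
required solution: it assumes 0-based inclusive indices p..r, and the helper
names swap_int, partition, randomized_partition, and randomized_quicksort are
chosen here for illustration.

```c
#include <assert.h>
#include <stdlib.h>

static void swap_int(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

/* Partition a[p..r] around the pivot a[r]; return the pivot's final index. */
static int partition(int *a, int p, int r)
{
    int x = a[r];
    int i = p - 1;
    for (int j = p; j < r; j++) {
        if (a[j] <= x) {
            i++;
            swap_int(&a[i], &a[j]);
        }
    }
    swap_int(&a[i + 1], &a[r]);
    return i + 1;
}

/* Pick a pivot index uniformly-ish in p..r, move it to a[r], then partition. */
static int randomized_partition(int *a, int p, int r)
{
    assert(p <= r);
    int i = p + rand() % (r - p + 1);
    swap_int(&a[r], &a[i]);
    return partition(a, p, r);
}

/* Sort a[p..r] in place. */
static void randomized_quicksort(int *a, int p, int r)
{
    if (p < r) {
        int q = randomized_partition(a, p, r);
        randomized_quicksort(a, p, q - 1);
        randomized_quicksort(a, q + 1, r);
    }
}
```

To sort an n-element array, call randomized_quicksort(a, 0, n - 1). Two
caveats: rand() % k is slightly biased toward small values, and the C
standard only guarantees RAND_MAX >= 32767, so on a subarray longer than
RAND_MAX + 1 this expression can never pick a pivot from its far end. Calling
srand() once at program start gives a different pivot sequence on each run.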
Part 2: Proving facts about quicksort's run time (8 points)
Do the following Exercises from the text, which ask you to prove
facts about quicksort's run time.
- (4 points) Exercise 7.4-2, page 159.
Similarly to Section 7.4.1, we want to solve the recurrence:
T(n) = min ( T(q) + T(n - q - 1) ) + Θ(n)
where the minimum is taken over q in [0,n-1].
The problem suggests that T(n) ≥ cn lg n, which can be shown to be
true by the substitution method for a suitable value of c.
Using the Fact: f(q) = q lg q + (a-q) lg (a-q) attains its
minimum at q = a/2 (and substituting n-1 for a), we have
T(n) ≥ c(n-1) lg((n-1)/2) + Θ(n)
= cn lg(n-1) - c( lower order terms in n ) + Θ(n)
≥ cn lg(n/2) - c( lower order terms in n ) + Θ(n) (for what n?)
= cn lg n - c( new lower order terms in n ) + Θ(n)
≥ cn lg n
for a suitable value of c (how should it be chosen?).
You can assume the Θ(n) term can
be replaced by c_1 n for some c_1 > 0.
Don't forget to prove the Fact.
- (4 points) Exercise 7.4-4, page 159.
Actually, prove that the average case run time of quicksort is
Ω(n lg n).
Start with the second equation in (7.4) on page 158:
E[X] = Σ_{i=1}^{n-1} Σ_{k=1}^{n-i} 2/(k+1)
and make the substitutions i' = n-i and then k' = k+1 to obtain:
E[X] = 2 Σ_{i'=1}^{n-1} Σ_{k'=2}^{i'+1} 1/k'
Then make one last substitution m = i'+1 to obtain:
E[X] = 2 Σ_{m=2}^{n} Σ_{k'=2}^{m} 1/k'
> 2 Σ_{m=2}^{n} ∫_{2}^{m+1} 1/x dx (see page 1067)
= 2 Σ_{m=2}^{n} [ln(m+1) - ln 2] (*)
Now ln(m+1) - ln 2 > ln m - ln 2 ≥ (1/2) ln m for m ≥ 4
(you need to show this).
So (*) is greater than
2[ln(2+1) - ln 2] + 2[ln(3+1) - ln 2] + 2 Σ_{m=4}^{n} (1/2) ln m
= 2 ln 3 - 2 ln 2 + 2 ln 4 - 2 ln 2 + Σ_{m=4}^{n} ln m
= 2 ln 3 + Σ_{m=4}^{n} ln m
(since ln 4 = ln 2^2 = 2 ln 2)
> ln 3 + ln 2 + ln 1 + Σ_{m=4}^{n} ln m
(since ln 3 > ln 2, and ln 1 = 0)
= Σ_{m=1}^{n} ln m
= (1/lg e) Σ_{m=1}^{n} lg m
(by a property of logarithms - the last equation on page 55)
≥ (1/lg e) (1/4) n lg n for n ≥ 4
by the proof that lg(n!) = Ω(n lg n) in Assignment 3.
Thus, letting c = 1/(4 lg e) and n_0 = 4,
we have E[X] = Ω(n lg n), by the definition of Ω.
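Before writing up either proof, it can help to check the formulas numerically.
The C sketch below is only a sanity check, not a proof, and rests on stated
assumptions: for Exercise 7.4-2 the Θ(n) term is taken to be exactly n (that
is, c_1 = 1) with T(0) = 0, and for Exercise 7.4-4 the two functions evaluate
the double sum before and after the substitutions i' = n-i, k' = k+1, and
m = i'+1. All function names are hypothetical.

```c
#include <assert.h>

/* Exercise 7.4-2: fill t[0..n] with T(0) = 0 and
   T(m) = min over q in [0, m-1] of (T(q) + T(m-q-1)) + m. */
static void best_case_table(long *t, int n)
{
    assert(t != 0 && n >= 0);
    t[0] = 0;
    for (int m = 1; m <= n; m++) {
        long best = t[0] + t[m - 1];
        for (int q = 1; q <= m - 1; q++) {
            long cand = t[q] + t[m - q - 1];
            if (cand < best)
                best = cand;
        }
        t[m] = best + m;
    }
}

/* Value obtained from the most balanced split q = floor((m-1)/2),
   mirroring the Fact that the minimum is attained at q = a/2. */
static long balanced_split(const long *t, int m)
{
    int q = (m - 1) / 2;
    return t[q] + t[m - q - 1] + m;
}

/* Exercise 7.4-4: sum_{i=1}^{n-1} sum_{k=1}^{n-i} 2/(k+1). */
static double expected_original(int n)
{
    assert(n >= 1);
    double s = 0.0;
    for (int i = 1; i <= n - 1; i++)
        for (int k = 1; k <= n - i; k++)
            s += 2.0 / (k + 1);
    return s;
}

/* The same quantity after the substitutions:
   2 sum_{m=2}^{n} sum_{k'=2}^{m} 1/k'. */
static double expected_reindexed(int n)
{
    assert(n >= 1);
    double s = 0.0;
    for (int m = 2; m <= n; m++)
        for (int kp = 2; kp <= m; kp++)
            s += 1.0 / kp;
    return 2.0 * s;
}
```

After best_case_table(t, n), comparing t[m] with balanced_split(t, m) for each
m shows whether the minimum is reached at the balanced split, and comparing
expected_original(n) with expected_reindexed(n) (up to floating-point
rounding) checks that the change of variables was carried out correctly.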
Page URL: http://www.d.umn.edu/~ddunham/cs4521f05/assignments/a5/assignment.html
Page Author: Doug Dunham
Last Modified: Tuesday, 04-Oct-2005 17:41:39 CDT
Comments to: ddunham@d.umn.edu