Part I: Foundations
Outline:
Chapter 1: Gives an overview of algorithms and
their role in computing. The term algorithm
is defined and a case is made that
algorithms are a kind of technology.
Chapter 2: Gives examples of algorithms and
a pseudocode used to express them. These
are examples of the incremental and divide-
and-conquer techniques for algorithm design.
Chapter 3: Defines "asymptotic notation",
which is used to measure running times and
memory requirements of algorithms.
Chapter 4: Shows how to solve recurrences,
which typically arise in the running-time
analysis of recursive algorithms, and of
divide-and-conquer algorithms in particular.
Chapter 5: Treats probabilistic analysis and
randomized algorithms. Probabilistic
analysis is useful for analyzing the
average-case performance of algorithms. An
algorithm is said to be randomized if its
behavior is determined partly by values
produced by a random-number generator -- for
example, to guarantee good expected
performance on every input.
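As a concrete illustration of a randomized algorithm (a minimal sketch of my own, not from the text), here is a quicksort whose pivot is chosen by a random-number generator, so that no single input can consistently force the worst case:

```python
import random

def randomized_quicksort(a):
    # Returns a sorted copy of a. Because the pivot is chosen at random,
    # the expected O(n lg n) running time holds for every input, rather
    # than only on average over inputs.
    if len(a) <= 1:
        return list(a)
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return randomized_quicksort(less) + equal + randomized_quicksort(greater)
```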
Chapter 1: The Role of Algorithms in Computing
1.1 Algorithms
Informal definition: an algorithm is any well-
defined (computational) procedure that takes
a value or set of values as input and
produces a value or set of values as output.
Thus an algorithm is a sequence of
(computational) steps that transform the
input into the output.
Also we can view an algorithm as a tool for
solving a computational problem, specified in
terms of its input/output relationship.
Example: the sorting problem:
Input: a sequence of n numbers
<a_1, a_2, ..., a_n>.
Output: a permutation (reordering)
<a_1', a_2', ..., a_n'> of the input
sequence such that a_1' <= a_2' <= ... <= a_n'.
A particular input is called an instance of
a computational problem. For example, the
input sequence <31,41,59,26,41,58> is an
instance of the sorting problem.
Definition: we say an algorithm is correct if
it halts with correct output for every input
instance.
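These definitions can be sketched in code (function names are mine): an insertion sort in the style of Chapter 2, plus a checker for the correctness condition -- the output must be a permutation of the input in nondecreasing order:

```python
from collections import Counter

def insertion_sort(a):
    # Sorts the list a in place using the incremental technique.
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift larger elements one slot right to make room for key.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a

def is_correct_output(inp, out):
    # The output must be a permutation of the input (same multiset)
    # arranged in nondecreasing order.
    return (Counter(inp) == Counter(out)
            and all(out[k] <= out[k + 1] for k in range(len(out) - 1)))
```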
What kinds of problems are solved by
algorithms?
Examples of "real-world" problems solved by
algorithms:
- The Human Genome Project: identifying all
100,000 genes in human DNA, determining the
sequences of the 3 billion chemical base
pairs that make up the DNA, storing this
information in databases, and developing
tools for analysis of this data.
- On the Internet: routing data and quickly
locating web pages with a search engine.
- Electronic commerce: enabling the secure
exchange of critical information such as
credit card numbers and passwords.
- In manufacturing and business: allocating
resources in the most beneficial way. This
can often be solved by linear programming.
Examples of problems solved in this book:
- Given a road map marked with distances
between cities, find the shortest route from
one city to another -- solved using graph
algorithms in Chapter 24.
- Solving integer equations ax = b (mod n),
which is used in cryptography (Chapter 31).
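A sketch of how such equations are solved (assuming the standard extended-Euclid approach; the function names are mine): ax = b (mod n) has a solution exactly when gcd(a, n) divides b, and then there are gcd(a, n) solutions modulo n.

```python
def extended_gcd(a, b):
    # Returns (g, x, y) with a*x + b*y == g == gcd(a, b).
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def modular_linear_solutions(a, b, n):
    # All x in [0, n) with a*x = b (mod n). Solutions exist exactly
    # when g = gcd(a, n) divides b, and then there are g of them,
    # spaced n/g apart.
    g, x0, _ = extended_gcd(a, n)
    if b % g != 0:
        return []
    x = (x0 * (b // g)) % n
    return [(x + i * (n // g)) % n for i in range(g)]
```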
- Given a sequence of n
matrices of different (compatible) sizes,
find their product A_1xA_2x...xA_n. Since
matrix multiplication is associative, this
can be done in many ways. Dynamic
programming (Chapter 15) shows how to
parenthesize the product so as to minimize
the total number of scalar multiplications.
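The dynamic-programming idea can be sketched as follows (a minimal version that returns only the minimum cost, not the parenthesization itself):

```python
def matrix_chain_order(dims):
    # Matrix A_i has size dims[i-1] x dims[i], for i = 1..n.
    # m[i][j] = fewest scalar multiplications needed to compute A_i..A_j.
    n = len(dims) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):            # length of the subchain
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = min(m[i][k] + m[k + 1][j]
                          + dims[i - 1] * dims[k] * dims[j]
                          for k in range(i, j))   # try every split point k
    return m[1][n]
```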
- Given n points in the plane, find their
convex hull: the smallest convex polygon
containing all the points (Chapter 33).
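One standard way to compute a convex hull is Andrew's monotone-chain variant (a sketch with my own names; not necessarily the algorithm Chapter 33 presents), which runs in O(n lg n) time dominated by the sort:

```python
def cross(o, a, b):
    # Cross product of vectors o->a and o->b; positive for a left turn.
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    # Builds the lower and upper hulls over the points sorted by x, then y.
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()       # drop points that would make a non-left turn
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Each list's last point is the other list's first; drop the duplicates.
    return lower[:-1] + upper[:-1]
```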
Two characteristics of interesting algorithms:
1. There are many candidate solutions, most
of which are not what we want, so finding
a truly good one can be a challenge.
2. There are practical, i.e. "real-world"
applications.
Data structures
Definition: A data structure is a way to store
and organize data so as to facilitate access
and modifications. No one data structure is
best for all situations, so it is important
to know strengths and weaknesses of a
selection of data structures.
Technique
Although many algorithms have been published,
you may encounter a problem for which you
can't find a solution. In that case, you must
design and analyze an algorithm yourself --
this book presents such design and analysis
techniques.
Hard problems
We are interested in efficient algorithms,
where efficiency concerns how long an
algorithm takes to compute its result. Some
problems have no known efficient
solution. A subset of these problems are the
NP-complete ones (Chapter 34).
Why are NP-complete problems interesting?
1) No one has proved that an efficient
algorithm cannot exist.
2) The set of NP-complete problems has the
property that if there is an efficient
algorithm for any one of them, then
efficient algorithms exist for all of them.
3) Several NP-complete problems are similar to
problems with efficient solutions - a small
change in the problem can result in a big
change in the efficiency of its solution.
It is useful to know about NP-complete
problems since they often show up in real-life
applications. Probably the best known example
is the "traveling-salesman problem". Rather
than attempting to find an efficient exact
solution for such a problem (which is
unlikely to exist), it is better to compute
an approximate solution efficiently. Chapter
35 discusses such "approximation algorithms".
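For flavor, here is a simple greedy heuristic for the traveling-salesman problem (nearest neighbor; this is only an illustration of my own and carries no approximation guarantee, unlike the algorithms of Chapter 35):

```python
import math

def nearest_neighbor_tour(points):
    # Greedy heuristic: start at city 0 and repeatedly visit the nearest
    # unvisited city. Fast, but the tour can be far from optimal.
    unvisited = set(range(1, len(points)))
    tour = [0]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour
```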
Also, it is useful to know about problems
that have been proved to have no algorithmic
solution at all. Two examples are the
halting problem (proved unsolvable by Alan
Turing) and the word problem for groups
(proved unsolvable by Petr Sergeevich
Novikov in 1955).
1.2 Algorithms as a technology
Even if computers were infinitely fast and
memory were free, it would still be important
to study algorithms, since we would still
want to know that they terminate - and with
the correct answer. In that case you would
probably use whichever correct algorithm is
easiest to implement.
Of course computers are not infinitely fast,
nor is memory free, so it is important that
algorithms be efficient in time and space.
Efficiency
As an example of comparative efficiency, we
consider insertion sort and merge sort - both
used to sort n items. Insertion sort takes
time roughly equal to c1 x n^2, where c1 is a
constant (independent of n). Merge sort takes
time roughly equal to c2 x n x lg(n), where
lg(n) = log_2(n) and c2 is a constant (which
is usually larger than c1).
As a concrete example, we assume insertion
sort has been programmed in machine language
by a good programmer and runs on computer A,
which executes a billion instructions per
second. And merge sort has been programmed by
a so-so programmer in some high-level language
and runs on computer B, which can execute ten
million instructions per second.
Taking this into account, assume that
insertion sort takes 2n^2 instructions to sort
n numbers, and merge sort takes 50n x lg(n).
To sort one million numbers, computer A with
insertion sort takes
2 x (10^6)^2 instructions
------------------------- = 2000 seconds
10^9 instructions/second
and computer B with merge sort takes:
50 x 10^6 x lg(10^6) instructions
--------------------------------- = about 99.66 seconds
10^7 instructions/second
To sort 10 million numbers, insertion sort
takes about 2.3 days, and merge sort takes
about 19.4 minutes -- about 170 times faster.
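The arithmetic above can be checked directly (a sketch; the instruction counts 2n^2 and 50 n lg n are the text's assumptions):

```python
import math

n = 10**7  # ten million numbers

# Computer A: 10^9 instructions/second running 2n^2-instruction insertion sort.
insertion_seconds = 2 * n**2 / 10**9

# Computer B: 10^7 instructions/second running 50 n lg n-instruction merge sort.
merge_seconds = 50 * n * math.log2(n) / 10**7

# insertion_seconds is about 2.3 days; merge_seconds is about 19.4 minutes.
```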
Algorithms and other technologies
The above example shows that algorithms are
just as important as other technologies when
evaluating system performance.
How important are algorithms when compared to
other advanced computer technologies such as:
1) hardware with high clock rates, pipelining,
and superscalar architectures,
2) easy-to-use GUIs,
3) object-oriented systems, and
4) local-area and wide-area networking?
Both GUIs (Graphical User Interfaces) and,
especially, hardware are designed using
algorithms. Object-oriented software is
created with compilers, interpreters, and
assemblers, all of which make heavy use of
algorithms. Algorithms are also used heavily
both in designing networks and in routing
data through them.
Also, as hardware becomes faster, it becomes
possible to solve problems that were too big
before. But these solutions can only be found
by using the most efficient algorithms.
According to the authors of the text:
"A solid base of algorithmic knowledge and
technique is one characteristic that
separates the truly skilled programmers from
the novices."