CS 8995 - Corpus Based Natural Language Processing - Spring 2001
Instructor: Dr. Ted Pedersen
Office: 309 Heller Hall
Office Hours: MWF 3-4 pm
Natural Language Processing is concerned with developing techniques
that allow us to analyze, understand, and generate human language with
computers. Corpus Based Natural Language Processing is based on the
premise that we can use existing sources of online text to achieve these
goals. This course will provide students with a theoretical and practical
understanding of the techniques used to develop empirical approaches to
syntactic analysis, semantic understanding, and discourse processing.
Specific topics to be discussed include word sense disambiguation and
machine translation. Practical work in the course will involve the design
and implementation of natural language processing tools.
Foundations of Statistical Natural Language Processing by
Christopher Manning and Hinrich Schutze. MIT Press.
There is a supporting Web Site with quite a bit of information.
There is a copy of the text on 2-hour reserve in the library. You will
still need to have your own copy of the text; however, the reserve copy
might prove useful if you forget your book.
As of 12/14/00 the lowest online price I have seen for the text is
$48 at bn.com. This compares to a list price of $60, which is
approximately the price at the bookstore.
Reading assignments will be given in the lecture and posted
We will do our programming assignments in Perl. While we will discuss
Perl from time to time in the lecture, there will be a fair bit of
self-study required. As such you are strongly advised to have at least one
of the following at your disposal:
Learning Perl by Randal Schwartz and Tom Christiansen. O'Reilly
Publishers. You can get this book from
or most any bookstore. This takes a tutorial approach and is
especially good if you have limited C or Unix experience.
Programming Perl by Larry Wall, Tom Christiansen and Randal Schwartz.
You can get this book from
or most any bookstore. This book is more like a reference
manual, although it is still very readable. This is a good choice
if you have extensive C and Unix backgrounds.
This class is only open to currently enrolled CS graduate students.
- Programming Assignments : 40%
- Final Project : 20%
- Exam 1 : 20% (Date: Monday, Feb 26)
- Exam 2 : 20% (Date: Monday, April 23)
Programming assignments are to be completed in Perl.
Programming Assignments must be submitted on time. Late work is not
accepted and will result in a score of zero for that assignment.
We will use an automatic turnin procedure; to submit your work you will
need to log into hh33812. Further details can be found
You are expected to write your own code. If you turn in code that is not
your own (e.g., code taken from a book or online archive, code written
by a colleague or classmate, etc.), I reserve the right to immediately
dismiss you from the class.
- 93 - 100 = A, 90 - 92 = A-
- 87 - 89 = B+, 83 - 86 = B, 80 - 82 = B-
- 77 - 79 = C+, 73 - 76 = C, 70 - 72 = C-
- 67 - 69 = D+, 63 - 66 = D, 60 - 62 = D-
- 0 - 59 = F
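As an illustration only, the weights and the scale above can be combined as in the following short sketch. It is written in Python purely for readability. The function names and the sample scores are hypothetical, and the handling of fractional scores is an assumption rather than course policy: the scale does not say how a score that falls between bands (such as 92.5) is rounded, so the sketch simply treats the lower end of each band as a threshold.

```python
# Component weights taken from the grading breakdown above.
WEIGHTS = {
    "programming": 0.40,
    "project": 0.20,
    "exam1": 0.20,
    "exam2": 0.20,
}

# Lower bound of each band in the scale above, highest first.
SCALE = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
    (0, "F"),
]


def weighted_score(scores):
    """Combine component percentages (0-100) into one course percentage."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(score):
    """Map a course percentage to a letter grade.

    Assumption: no rounding is applied, so 92.9 falls in the A- band.
    """
    for lower, letter in SCALE:
        if score >= lower:
            return letter
    return "F"


# Hypothetical component scores, for illustration only.
example = {"programming": 90, "project": 80, "exam1": 85, "exam2": 95}
total = weighted_score(example)
print(round(total, 1), letter_grade(total))
```

Because programming assignments carry 40% of the total, a zero on a late assignment affects the final percentage twice as heavily as the same number of points lost on either exam.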
All exams are closed-note, closed-book.
You must take exams at the scheduled time and place. Exams will
not be given early. Make-up exams will only be offered in the
event of documented personal emergencies.
Last update: 01/11/2001