CS 5761 - Introduction to Natural Language Processing - Spring 2002

Instructor: Dr. Ted Pedersen
Office: 309 Heller Hall
Office Hours: Tue, Thu 3:30 - 5:00 pm
Email: tpederse@d.umn.edu

TA: Siddharth Patwardhan ("Sid")
Office Hours (HH314): Wed 3:00-5:00 pm, Fri 3:00-4:00 pm
Email: patw0006@d.umn.edu

Class Web Page: http://www.d.umn.edu/~tpederse/Courses/CS5761/class.html

Course Objectives:

Explore techniques for creating computer programs that analyze, generate, and understand natural human language. Topics include syntactic analysis, semantic interpretation, and discourse processing. Applications selected from speech recognition, conversational agents, machine translation, and language generation. Substantial programming project required.

Required Text:

Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition by Daniel Jurafsky and James H. Martin. Prentice-Hall. The text has a supporting web site with quite a bit of information.

There is a copy of the text on 2-hour reserve in the library. This is not meant to replace your own copy, but may prove useful if you find yourself in the library without your text.

Reading assignments will be given in the lecture and posted here. Our focus will be on written language, and we will not consider the many very interesting issues involved in processing spoken language. We may cover portions of Chapters 1, 2, 3, 6, 8, 9, 10, 12, 13, 14, 15, 16, 17, 19, and 21.

Suggested Perl Texts:

We will do our programming assignments in Perl. While we will discuss Perl from time to time in the lecture, a fair bit of self-study will be required. As such, you are strongly advised to have at least one of the following at your disposal:

Learning Perl by Randal Schwartz and Tom Phoenix. O'Reilly Publishers. AKA The Llama Book. You can get this book from amazon.com or any other bookstore. You can also get an electronic copy of this book from the UMD library! This book takes a tutorial approach and is especially good if you have limited C or Unix experience.

Programming Perl by Larry Wall, Tom Christiansen and Jon Orwant. O'Reilly Publishers. AKA The Camel Book. You can get this book from amazon.com or any other bookstore. It is also on 2-hour reserve at the UMD library. This book is more like a reference manual, although it is still very readable. This is a good choice if you have extensive C and Unix backgrounds.

Mastering Regular Expressions: Powerful Techniques for Perl and Other Tools by Jeffrey E. Friedl. O'Reilly Publishers. AKA The Owl Book. You can get this book from amazon.com or any other bookstore. You can also get an electronic copy of this book from the UMD library! This book is a very in-depth treatment of regular expressions in Perl and is really invaluable. It is more advanced than either Learning Perl or Programming Perl.


Prerequisites:

You must have already taken and passed CS 2511 (Software Development) and Math 3355 (Discrete Mathematics). If you have not already taken and passed both of these classes, you must drop this class.

Grading Basis: Programming Assignments and Project:

Programming assignments and your project are to be completed in Perl. There will be 4 to 6 programming assignments and one project. The assignments and project together account for 25% of your grade. The exact grading breakdown between the assignments and the project is yet to be determined. Each programming assignment is worth 10 points.

All programming assignments and your project will be demonstrated during designated lab sessions. You should also submit an electronic copy of your source code to the TA prior to the designated demo session. (His email address is patw0006@d.umn.edu.) There is no other way to submit your programming assignments or project. Failure to submit AND demo on time will result in a zero.

Any code you submit should be commented. I must be able to understand what your code does simply by reading the comments. This understanding should extend down to the details of your code. So do not simply describe the input and output; also include comments that describe your particular algorithm and coding techniques. Failure to comment to this degree will result in a zero.

All assignments and the project are to be done individually. You are required to write your own code. Unless otherwise specified, you must turn in only code that you personally wrote. The only possible exception is if I tell you to use a module that is available in a book or online archive, and I will clearly indicate when this is permissible. Violations of this policy will result in severe grading penalties and/or failure in the class.


Quizzes:

There will be unannounced quizzes in the lecture that may concern any topics discussed up to and including the previous lecture session. They may also include any readings assigned up to and including the previous week. There will also be scheduled quizzes during the lab that will require you to write and demo a short program during the lab period. There will be approximately 8-10 quizzes, and your lowest score will be dropped. Quizzes may only be made up in the case of documented personal emergencies. Each quiz is worth 10 points.


Exams:

All exams are closed-note, closed-book.

You must take exams at the scheduled time and place. Exams will not be given early. Make-up exams will only be offered in the event of documented personal emergencies.

Grading Scale: Grades will be curved only if the overall median grade at the end of the semester is less than 75. In this case grades will be curved so that the median class grade is 75.

Lecture Notes:

After a few years of experimentation, I have concluded that posting lecture notes online isn't especially helpful, and can in some cases act as a deterrent to attending lecture and/or taking careful notes in class. Since attendance and note taking are important life skills in general, and will help you in this class in particular, I will not be posting or distributing my lecture notes.

However, if you have a temporary or permanent disability that affects your ability to take notes, please let me know at the start of the semester and I will make alternate arrangements with you.

Equal Access:

If you have any disability (either permanent or temporary) that might affect your ability to perform in this class, please inform me at the start of the semester. I may adapt methods, materials, or testing so that you can participate equitably. To learn about the services that UMD provides to students with disabilities, contact the Access Center, 138 Kplz, phone 8217, or visit their web page.

By: Ted Pedersen - tpederse@d.umn.edu
Last update: 01/20/2002