CS 5761 - Introduction to Natural Language Processing - Spring 2004

Instructor: Dr. Ted Pedersen
Office: 309 Heller Hall
Office Hours: Mon, Wed 3:00 - 4:30 pm
Email: tpederse@d.umn.edu

TA: Bridget McInnes
Office Hours (314 Heller Hall): Wed 9 - 10 am, Thu 12 - 1 pm, Fri 8 - 10 am
Email: bthomson@d.umn.edu

Please consider using email for smaller or fairly precise questions. We can usually respond quite quickly and at odd hours via email. Send any email questions to both tpederse and bthomson.

Class Web Page: http://www.d.umn.edu/~tpederse/Courses/CS5761-SPR04/class.html

Course Objectives:

Explore techniques for creating computer programs that analyze, generate, and understand natural human language. Topics include syntactic analysis, semantic interpretation, and discourse processing. Applications selected from speech recognition, conversational agents, machine translation, and language generation. Substantial programming project required.

Required Text:

Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition by Daniel Jurafsky and James H. Martin. Prentice-Hall. There is a supporting Web Site with quite a bit of information.

There is a copy of the text on 2-hour reserve in the library. This is not meant to replace your own copy, but may prove useful if you find yourself in the library without your text.

Reading assignments will be given in the lecture and posted here. Our focus will be on written language, and we will not consider the many very interesting issues involved in processing spoken language. We will also augment readings from the text with outside material that I will provide.

Suggested Perl Texts:

We will do our programming assignments in Perl. While we will discuss Perl from time to time in the lab and lecture, there will be a fair bit of self-study required. As such you are strongly advised to have at least one of the following at your disposal:

Learning Perl by Randal Schwartz and Tom Phoenix. O'Reilly Publishers. AKA The Llama Book. You can get this book from amazon.com or any other bookstore. You can also get an electronic copy of this book from the UMD library via Safari! This book takes a tutorial approach and is especially good if you have limited C or Unix experience.

Programming Perl by Larry Wall, Tom Christiansen and Jon Orwant. O'Reilly Publishers. AKA The Camel Book. You can get this book from amazon.com or any other bookstore. You can also get an electronic copy of this book from the UMD library via Safari! This book is more like a reference manual, although it is still very readable. This is a good choice if you have some C and Unix background.

Mastering Regular Expressions: Powerful Techniques for Perl and Other Tools by Jeffrey E. Friedl. O'Reilly Publishers. AKA The Owl Book. You can get this book from amazon.com or any other bookstore. You can also get an electronic copy of this book from the UMD library via Netlib! This is an in-depth treatment of regular expressions in Perl and is invaluable. It is more advanced than either Learning Perl or Programming Perl.


Prerequisites:

You must have already taken and passed CS 2511 (Software Development) and Math 3355 (Discrete Mathematics). If you have not already taken and passed both of these classes, you must drop this class.

Grading Basis: Programming Assignments and Project:

Programming assignments and your project are to be completed in Perl. There will be 4 to 6 programming assignments and one project. Each programming assignment is worth 10 points.

All programming assignments and your project will be demonstrated during designated lab sessions. You should also submit an electronic copy of your source code to the webdrop prior to the designated demo session. If you aren't familiar with webdrop, here are the instructions. There is no other way to submit your programming assignments or project. Failure to submit AND demo on time will result in a zero.

Please comment your code. I must be able to understand what your code does simply by reading the comments. This understanding should extend down to the details of your code. So do not simply describe the input and output; also include comments that describe your particular algorithm and coding techniques. I reserve the right to deduct some or all of the points from an assignment if you don't comment your code to this degree.

Unless otherwise indicated, all assignments and the project are to be done individually. You are required to write your own code. Unless otherwise specified, you must only turn in code that you personally wrote. The only possible exception to this is if I tell you to use a module that is available in a book or online archive. However, I will clearly indicate when this is permissible. If you submit code that is not your own, I reserve the right to give you a zero on that assignment.


Quizzes:

There will be unannounced quizzes in the lecture that may concern any topics discussed up to and including the previous lecture session. They may also include any readings assigned up to and including the previous week. There will be scheduled quizzes during the lab that will require you to write and demo a short program during the lab period. There will be approximately 8-10 quizzes, and your lowest score will be dropped. Quizzes may only be made up in the case of documented personal emergencies. Each quiz is worth 10 points.


Exams:

All exams are closed-note, closed-book.

You must take exams at the scheduled time and place. Exams will not be given early. Make-up exams will only be offered in the event of documented personal emergencies.

Grading Scale: Grades will be curved only if the overall median grade at the end of the semester is less than 75. In this case grades will be curved so that the median class grade is 75.

Equal Access:

If you have any disability (either permanent or temporary) that might affect your ability to perform in this class, please inform me at the start of the semester. I may adapt methods, materials, or testing so that you can participate equitably. To learn about the services that UMD provides to students with disabilities, contact the Access Center, 138 Kplz, phone 8217 or visit their web page.

By: Ted Pedersen - tpederse@d.umn.edu