CS 8761 - Natural Language Processing - Fall 2004

Instructor: Dr. Ted Pedersen
Office: 309 Heller Hall
Office Hours: Mon & Wed 4:45pm-6:00pm, Thu 1:00pm-2:00pm
Email: tpederse@umn.edu

Course Web Page: http://www.d.umn.edu/~tpederse/Courses/CS8761-FALL04/class.html

Course Mailing List: http://groups.yahoo.com/group/duluth-cs8761-fall2004/

Please make sure you are signed up for this mailing list, and that you check it regularly. There are online archives available, and you are also free to post your own questions and comments.

Course Objectives:

Natural Language Processing seeks to analyze, generate, and understand human language via computational techniques. This course focuses on empirical approaches to lexical and syntactic analysis, semantic interpretation, and discourse processing. Specific applications include part-of-speech tagging, machine translation, and authorship attribution.

Required Text:

Foundations of Statistical Natural Language Processing by Christopher Manning and Hinrich Schutze. MIT Press. There is a supporting Web Site with quite a bit of information.

Required reading assignments from Manning and Schutze will be given in the lecture and also posted here.

There is a copy of the Manning and Schutze text on 2-hour reserve in the UMD library.

Other Required Reading:

The following document (see html or pdf) describes what plagiarism is, why it's a bad thing, and how you can easily avoid it. You are required to read and understand this document before submitting any assignment or project work.

Supporting Texts (optional):

There are two textbooks on 2-hour reserve in the UMD library that may prove useful. The first is Speech and Language Processing by Daniel Jurafsky and James Martin, Prentice-Hall. The second is Statistical Language Learning by Eugene Charniak, MIT Press.

The Charniak book focuses on empirical methods, while the Jurafsky and Martin book is more general in nature and includes some discussion of speech processing. Both are excellent complements to our Manning and Schutze text.

Suggested Perl Texts:

We will do our programming assignments and projects in Perl on computers running the Solaris Unix or Linux operating systems. While we will discuss Perl from time to time in the lecture, there will be a fair bit of self-study required. As such, you are strongly advised to have at least one of the following at your disposal:

Learning Perl (3rd Edition) by Randal Schwartz and Tom Phoenix. O'Reilly Publishers. You can get this book from amazon.com or most any bookstore. This takes a tutorial approach and is especially good if you have limited C or Unix experience.

Programming Perl (3rd Edition) by Larry Wall, Tom Christiansen, and Jon Orwant. O'Reilly Publishers. You can get this book from amazon.com or most any bookstore. This book is more like a reference manual than Learning Perl, although it is still very readable. This is a good choice if you have extensive C and Unix backgrounds.

Mastering Regular Expressions: Powerful Techniques for Perl and Other Tools (2nd Edition) by Jeffrey E. Friedl. O'Reilly Publishers. You can get this book from amazon.com or most any bookstore. This is an in-depth treatment of regular expressions in Perl and is the best available reference on this topic.

You are strongly encouraged to view the books available via the online Safari service from O'Reilly Publishers. You will find the complete text of many Perl and Linux books available there, including all three of the Perl books mentioned above!

Prerequisites:

This class is only open to currently enrolled CS graduate students.

Grading Basis:

Grading Scale:

Programming Assignments:

Programming assignments are to be completed in Perl and must be submitted on time. Late work is not accepted and will result in a score of zero for that assignment. You must use the web drop link on the class web page to turn in assignments.

Each programming assignment is worth 10 points. There will be 3-5 programming assignments.

All programming assignments are individual. You are required to write your own code. If you turn in code that is not your own (e.g., taken from a book or online archive, written by a friend, etc.) the best you can hope for is a zero on that assignment. I reserve the right to deal more harshly with such cases if I deem it necessary.

Final Project:

You will be assigned to a team and given a challenging problem in natural language processing to tackle. You must deliver a software solution and a final report that includes a discussion of your team's solution, an evaluation of its effectiveness, and a survey of related work. All teams will work on the same problem and we will have a comparative evaluation to see how well each team fares relative to the others.

You are expected to collaborate and work closely with your teammates. You may not collaborate with anyone outside of your team for any reason. All members of a team will receive the same grade. Your team must produce original work. Any type of plagiarism, whether it is deliberate or accidental, will be dealt with harshly.

Exams:

All exams are closed-note, closed-book. You must take exams at the scheduled time and place. Exams will not be given early. Make-up exams will only be offered in the event of documented personal emergencies. The final exam must be taken at the date and time determined by the official university schedule (Saturday, December 18).

Quizzes:

All quizzes are closed-note, closed-book. They may cover any topic discussed in the lecture or included in the assigned readings. They will not be announced ahead of time. If you are not in the lecture at the time the quiz is given, you will receive a 0. Your lowest quiz score will be dropped. We will have at least 8 but no more than 12 quizzes. Each quiz will be worth 10 points.

Equal Access:

If you have any disability (either permanent or temporary) that might affect your ability to perform in this class, please inform me at the start of the semester or as soon as you learn of such a condition. I may adapt methods, materials, or testing so that you can participate equitably. To learn about the services that UMD provides to students with disabilities, contact the Access Center, 138 Kplz, phone 8217 or visit their web page.

By: Ted Pedersen - tpederse@umn.edu