CS 5761 - Introduction to Natural Language Processing

Project Proposal due by noon Tuesday March 26 via email to tpederse and patw0006. Please submit plain text, with no attachments.


The objective of this proposal is to find a topic for your project and do some background research on it.


You will complete an individual project that involves producing both a Perl implementation and a written report dealing with some interesting problem in Natural Language Processing. You may select your own topic. My only requirements are that it be in text processing and have been at least mentioned (if only in passing) in our textbook.

Please make sure you do not choose an overly broad topic. 'Part of Speech Tagging using a Trigram model and Good-Turing Smoothing' is an example of an interesting and well-focused project. 'A Program to Translate English Text into German' is potentially interesting but will take you several decades to properly complete. Your project should be focused, but it should not be trivial. Implementing a minimum-edit distance program, as we did for one of our programming assignments, is focused but fairly trivial relative to the amount of time we are allowing for project completion. Please check with me if you aren't sure whether your topic is suitable.

By Tuesday March 26 you should have produced a project proposal. It should include the following:

This does not need to be a lengthy document, but it should be well written and carefully thought out. It will provide a road map for your project, so the more you put into it, the more smoothly your project will go. If I have significant concerns about your topic or some aspect of your proposal, I will let you know by Monday April 1 and may ask you to choose a new topic or provide additional details.

Policies (from syllabus)

All programming assignments and your project will be demonstrated during designated lab sessions. You should also submit an electronic copy of your source code to the TA prior to the designated demo session. (His email address is patw0006@d.umn.edu.) There is no other way to submit your programming assignments or project. Failure to submit AND demo on time will result in a zero.

Any code you submit should be commented. I must be able to understand what your code does simply by reading the comments. This understanding should extend down to the details of your code. Do not simply describe the input and output; also include comments that describe your particular algorithm and coding techniques. Failure to comment to this degree will result in a zero.

All assignments and the project are to be done individually. You are required to write your own code. Unless otherwise specified, you must turn in only code that you personally wrote. The only possible exception is if I tell you to use a module that is available in a book or online archive, and I will clearly indicate when this is permissible. Violations of this policy will result in severe grading penalties and/or failure in the class.