CS 5761 - Introduction to Natural Language Processing 
 Project Proposal due by noon Tuesday March 26 via email to tpederse  
and patw0006. Please submit plain text, with no attachments. 
 Objectives 
To find a topic for your project, and do some background research on it. 
 Specification 
You will complete an individual project that involves producing both a 
Perl implementation and a written report dealing with some interesting 
problem in Natural Language Processing. You may select your own topic.
My only requirements are that it involve text processing and that it be
at least mentioned (if only in passing) in our textbook.
 
Please make sure you do not choose an overly broad topic. 'Part of Speech
Tagging using a Trigram model and Good-Turing Smoothing' is an example of
an interesting and well-focused project. 'A Program to Translate English
Text into German' is potentially interesting but will take you several
decades to properly complete. Make your project focused, but not trivial.
Implementing a minimum-edit distance program like we did for one of our
programming assignments is focused but fairly trivial relative to the
amount of time we are allowing for project completion. Please check with
me if you aren't sure whether your topic is suitable.
 
By Tuesday March 26 you should have produced a project proposal. It should 
include the following:
-  Problem Description (1-2 paragraphs) : What is the problem you are 
attempting to solve in your project? Describe it in general terms, and 
explain why it is important. You should provide at least two references to 
papers that discuss the same problem, and you should read these papers in 
preparation for writing the proposal. Make sure you properly quote, 
footnote, etc. any information you get from these papers (Don't pull an  
Ambrose, as some now say...)
-  Overview of Solution/Approach (1-2 paragraphs) : What is the general
approach you plan on taking? Will your approach follow one of the
references you mention in the description, or will you invent your
own? (It is ok to re-implement an existing solution, or propose your
own.) The description of your solution is of course tentative: as you
begin to work on the problem a bit more, you may realize that another
approach is better.
-  Evaluation Plan (1-2 paragraphs) : How will you show that your 
solution does something? Will you need to find or create "gold standard"  
data to use as a point of comparison? If so, where will you get that, or
how will you create it? For example, suppose you are working on a
Context Sensitive Spelling Correction program based on n-grams (a good 
focused topic). How will you measure how many misspelled words your 
program finds, and how will you know that it proposes a proper correction? 
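
As a rough illustration of the kind of measurement an evaluation plan
might describe, here is a minimal sketch (in Python, purely for
illustration; your project itself is to be written in Perl). It assumes
you have gold standard data that records the position and correct
spelling of every misspelled word, and that your program reports the
positions it flagged along with its suggested corrections. The function
name, the data layout, and the toy data are all assumptions made for
this example, not a required format.

```python
def evaluate(gold, proposed):
    """Compare a spelling corrector's output against gold standard data.

    gold:     {token_position: correct_word} for every truly misspelled token
    proposed: {token_position: suggested_word} for every token the program flagged
    Returns (detection precision, detection recall, correction accuracy).
    """
    # Positions that were flagged AND are misspelled according to the gold data.
    flagged_correctly = set(gold) & set(proposed)

    # Of everything the program flagged, how much was really an error?
    precision = len(flagged_correctly) / len(proposed) if proposed else 0.0
    # Of all real errors, how many did the program find?
    recall = len(flagged_correctly) / len(gold) if gold else 0.0

    # Of the real errors it found, how many did it fix with the right word?
    fixed = sum(1 for pos in flagged_correctly if proposed[pos] == gold[pos])
    accuracy = fixed / len(flagged_correctly) if flagged_correctly else 0.0

    return precision, recall, accuracy


# Toy data (hypothetical): four real errors, four flagged tokens.
gold = {3: "their", 9: "piece", 14: "weather", 20: "than"}
proposed = {3: "their", 9: "peace", 17: "to", 20: "than"}

precision, recall, accuracy = evaluate(gold, proposed)
print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
```

Reporting detection and correction separately, as above, keeps the two
questions apart: how many misspelled words the program finds, and
whether the correction it proposes is the proper one.
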
This does not need to be a lengthy document, but it should be well written
and carefully thought-out. It will provide a road map for your project, so
the more you put into this, the more smoothly your project will go. If I
have significant concerns about your topic or some aspect of your proposal,
I will let you know by Monday April 1 and possibly request that you choose
a new topic or provide additional details.
 
 
 Policies (from syllabus) 
 
All programming assignments and your project will be demonstrated during
designated lab sessions. You should also submit an electronic copy of
your source code to the TA prior to the designated demo session. (His
email address is patw0006@d.umn.edu.) There is no other way to submit
your programming assignments or project. Failure to submit AND demo on
time will result in a zero. 
Any code you submit should be commented. I must be able to understand
what your code does simply by reading the comments. This understanding
should extend down to the details of your code. So do not simply
describe the input and output; also include comments that describe
your particular algorithm and coding techniques. Failure to comment
to this degree will result in a zero. 
All assignments and the project are to be done individually. You are
required to write your own code. Unless otherwise specified, you must
turn in only code that you personally wrote. The only possible exception
to this is if I tell you to use a module that is available in a book
or online archive. However, I will clearly indicate when this is
permissible. Violations of this policy will result in severe grading
penalties and/or failure in the class.