CS 8995 - Corpus Based Natural Language Processing - Spring 2001
Instructor Office Hours
Required Readings (week by week)
Program Samples (example code from lecture)
Final Project Teams (stage 1) (curious about the team names?)
Final Project Teams (stage 2)
Stage 1 Sentence Alignment, Gold Standard Data due Friday March 23, 4 pm;
the rest is due Monday March 26, 4 pm.
As of Mar 20 the evaluation requirements
have been expanded! Please make sure to include the new
GOLD STANDARD DATA posted (3/23)
EVALUATION PROGRAMS posted (3/26)
EVALUATION RESULTS posted
Stage 2 More Sentence Alignment, due
Monday April 16
Consider using some of the bitext data now available from Assignment 4 for
testing purposes. See below.
GOLD STANDARD DATA updated (4/16)
EVALUATION PROGRAMS posted (4/16)
Stage 3 Building a Translation
Dictionary with the EM algorithm (Optional extra credit) due Thursday May
All programming assignments should be turned in using
turnin on machine hh33812.
Unless specified otherwise, assignments are to be completed individually.
Here is a reminder about that policy.
Assignment 1 Mutual Information, due Wed
Jan 31, 4 pm
Solution Key for this
text with N=10.
Assignment 2 pointwise Mutual Information,
due Mon Feb 12, 4 pm
An analogy of sorts that may provide
a little guidance. A few more thoughts.
Preliminary info about the write-up.
Even more info about the write-up.
Assignment 3 N-gram models, due Mon Feb 26,
Solution Key using Witten-Bell
Assignment 4 parallel corpus collection,
due Wed Mar 07, 4 pm. A reminder
about our objectives.
Here's the bitext that you created!
(posted 4/3/01). A note on how to
use it for stage 2.
Further details on Assignment 4
grading, as well as an EXTRA CREDIT OPPORTUNITY.
Sources of Text:
Lecture meets MW 4-5:40 pm in HH 302.
Last update: 1/21/2000