CS 8761 Natural Language Processing - Fall 2004 - Automated Essay Grading

Final Project - Final version, due Wed, Dec 22, before noon

This may be revised in response to your questions. Last update Thu Dec 9, 4pm

Objectives

To design, implement and deploy an essay grading system. Your final version should incorporate all of my previous comments on your alpha and beta versions, plus what is described below.

This system should be web-based and include support for all of the following four components of your system:

You will submit your system so that I can install and use it, and you will also have a web site up and running that can be used. Please make sure that your installation documentation is complete and includes instructions on how to install your web-based interface. I should be able to install and run your web-based interface with minimal effort (5-10 minutes at most) by following your instructions. If I am unable to do this, your project will not receive full credit.

Specification

Your system should have a simple web interface that provides the user with a prompt (i.e., a question) to answer. This question should be randomly (and automatically) selected from a file of possible prompts that you provide. The user should enter their essay and then submit it for grading. The system should respond relatively quickly and be interactive. Make sure that your system has both a user response and a more detailed trace or diagnostic response. You should develop a scoring mechanism for your essays that is based on information collected from the four components. In addition, make sure you keep a log of activity on your system, which should include the user's identity (via IP address), the date and time, the essay written, and the system feedback. Each essay should result in a separate log file. This log file should contain enough information to allow us to see exactly what each component is doing and how it is making its decisions.
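As an illustration of the prompt selection and per-essay logging described above, here is a minimal sketch. It is written in Python purely for illustration (a system that follows the Perl install described below would do the equivalent in Perl). The one-prompt-per-line file layout, the function names pick_prompt and write_essay_log, the log file naming scheme, and the section labels are all assumptions, not requirements of this assignment.

```python
import random
from datetime import datetime
from pathlib import Path


def pick_prompt(prompt_file, rng=random):
    """Return one prompt chosen at random from a file.

    Assumption: the file holds one prompt per line; blank lines are ignored.
    """
    lines = Path(prompt_file).read_text(encoding="utf-8").splitlines()
    prompts = [line.strip() for line in lines if line.strip()]
    if not prompts:
        raise ValueError(f"no prompts found in {prompt_file}")
    return rng.choice(prompts)


def write_essay_log(log_dir, ip_address, prompt, essay, feedback, trace, now=None):
    """Write a separate log file for one essay and return its path."""
    now = now or datetime.now()
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical naming scheme: timestamp plus IP address, one file per essay.
    stamp = now.strftime("%Y%m%d-%H%M%S-%f")
    safe_ip = ip_address.replace(":", "_")  # keep IPv6 addresses file-name safe
    path = log_dir / f"essay-{stamp}-{safe_ip}.log"

    sections = [
        ("IP", ip_address),
        ("DATE", now.strftime("%Y-%m-%d")),
        ("TIME", now.strftime("%H:%M:%S")),
        ("PROMPT", prompt),
        ("ESSAY", essay),
        ("FEEDBACK", feedback),  # the short response shown to the user
        ("TRACE", trace),        # the detailed per-component diagnostics
    ]
    # Mode "x" raises an error instead of overwriting an existing log file.
    with path.open("x", encoding="utf-8") as handle:
        for label, value in sections:
            handle.write(f"== {label} ==\n{value}\n\n")
    return path
```

A web handler would call pick_prompt when it renders the form and write_essay_log once grading finishes, passing the client address supplied by the web server.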

Thus, to summarize, your beta version should also include the following system-level features:

Your system should be submitted "pre-trained", so that I can begin using it immediately. In other words, if you do any calculations based on large corpora, you should have already done those computations and should provide me with a data file that includes the results. Of course there may be some dynamic computation required, and that is fine. For example, you may look up information on the Web based on the user's input. However, if you use an LSA-style approach, please create the co-occurrence matrix ahead of time and include it in your distribution.
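The "pre-trained" requirement amounts to separating an offline build step from a cheap load step at run time. The following sketch, in Python for illustration only, shows that split for simple windowed word co-occurrence counts. It is not an implementation of LSA itself; the window size, the JSON file format, and the function names are assumptions chosen for the example.

```python
import json
from collections import Counter
from pathlib import Path


def build_cooccurrence(documents, window=2):
    """Offline step: count word pairs that occur within `window` tokens."""
    counts = Counter()
    for doc in documents:
        tokens = doc.lower().split()
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    # Store each pair in sorted order so (a, b) == (b, a).
                    counts[tuple(sorted((word, other)))] += 1
    return counts


def save_counts(counts, path):
    """Offline step: write the counts to a data file shipped with the system."""
    # JSON keys must be strings, so each word pair is joined with a tab.
    data = {f"{a}\t{b}": n for (a, b), n in counts.items()}
    Path(path).write_text(json.dumps(data, sort_keys=True), encoding="utf-8")


def load_counts(path):
    """Run-time step: read the precomputed counts without touching the corpus."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return Counter({tuple(key.split("\t")): n for key, n in data.items()})
```

Only load_counts would run when an essay is graded; build_cooccurrence and save_counts would run once, before the distribution is packaged.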

It is up to your team to decide how you will approach each of these problems. You may certainly use ideas from the published literature, just make sure to acknowledge this in your documentation (described below) by providing citations and references.

Improvements upon Beta Version

Based on my review of your beta systems, I am asking that each team address the following issues in their final systems.

Installation

The installation of your system should be simple, and should follow the standard "4 step Perl" install. These steps consist of the following:
perl Makefile.PL PREFIX=/home/cs/tpederse etc.
make
make test
make install
Note that if you rely on existing tools (like the Brill Part of Speech Tagger, WordNet, etc.) you can simply provide me with detailed instructions about where I can find those tools and how to install them. Do not assume that I know how to install them, and do not assume that I already have them available. Also, do not simply refer me to the instructions in each package; please provide a concise set of instructions that covers any and all tools that might need to be available.

Note that the PREFIX variable indicates that I will not have superuser (root) access, and that I will install into my own personal directory. It is fine if there are other directives required for Makefile.PL; simply indicate what they are in your INSTALL file.

Your distribution should include a plain text file named INSTALL that provides detailed instructions on how to install your system. This file should be in your top level directory. You should assume that the user of your system only has this documentation available, and is not an expert in system administration. Thus, if any other tools need to be installed, paths need to be set, Makefile.PL variables need to be set, etc., please provide detailed instructions about how to do that. Assume that your user does not have superuser (root) access! Please revise these instructions based on actual tests in which you install the complete system for each other, to make sure that a new user can successfully install your system.

Documentation

Your distribution should include a file named README.pod, written in POD (the format read by perldoc), that describes your overall system. This should be in your top level directory. You should assume that the user has no specialized knowledge of automated essay grading. Thus, your README.pod should begin with a general introduction to the problem of automated essay grading. This should include a brief description of the history and related work in this area, and it should be written in the style and form of a related work section in a thesis or thesis proposal. Please work on making your writing more formal, and pay particular attention to the introductory remarks that motivate why automated essay scoring is both feasible and useful. In addition, continue to develop and expand your historical review. Do not limit this discussion to systems that have been mentioned in class or in the readings already assigned; try to identify new systems and background papers. Make sure you provide references to each of the systems that you discuss, so that your reader will know where to go to obtain additional information about a system if they so desire.

You should then introduce each of the four problems described above. In other words, do not assume that the user will know what gibberish detection or relevance measurement is. Provide examples of text that would "trigger" your system to provide guidance to the essay writer. You should do this for all four systems. Make sure that you extend your discussion to include additional examples that illustrate the specific issues that you encounter when you address these problems. Provide examples of problem cases (where it's not clear what you should do) as well as more obvious cases. Also, you should start to discuss the interaction between the problems represented by these components. For example, does gibberish have any effect on fact checking, or does relevance have an impact on gibberish detection? In other words, while you have separate components for each problem, discuss in general terms how they might interact with each other.

Then, you should describe the specific approach that you are taking for your four solutions, and also describe your overall plan for each of the components. You should clearly indicate what possible approaches you are considering for each component, and also describe who is going to do what. For all of the modules, make sure you discuss the evaluation of the module from alpha to beta to final versions, and what you have learned and observed about each along the way.

You should address the issue of how you will evaluate your system. Make sure to follow the guidelines above, and feel free to add onto that. The important thing is to demonstrate that your individual components are working, and that your overall scoring mechanism is reasonable.

Finally, compare and contrast your proposed approach with existing techniques, and clearly credit any publications, systems, etc. that might have given you ideas.

To summarize, your README.pod should consist of the following sections:

Submission Guidelines

You should package your system as a single compressed tar file named TEAMNAME-1.00.tar.gz. This should include all of the code needed to run your system, and the README.pod and INSTALL files as described. If you are using CPAN modules in your system, make sure those dependencies are described in your Makefile.PL, and I will obtain those modules via CPAN. The same is true of other outside packages. If they are not available via CPAN (such as the Brill Tagger or WordNet), provide detailed instructions as to where I can find them and how to install them. You do not need to include them in your distribution.

Your system should unpack into a directory named TEAMNAME-1.00. TEAMNAME should be your assigned team name, written in all capital letters. If there are embedded spaces in your name (e.g., BOCA JUNIORS), you should replace each space with a hyphen (-), giving, for example, BOCA-JUNIORS-1.00.
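The naming rules above are easy to get wrong by hand, so a team may want to check them before submitting. The following is a minimal sketch in Python, for illustration only. The function names are hypothetical, and collapsing a run of several spaces into a single hyphen is an assumption that goes slightly beyond the rule as stated.

```python
import re
import tarfile


def distribution_name(team_name, version="1.00"):
    """Upper-case the team name and replace embedded spaces with '-'."""
    # Assumption: a run of several whitespace characters becomes one hyphen.
    cleaned = re.sub(r"\s+", "-", team_name.strip().upper())
    return f"{cleaned}-{version}"


def archive_name(team_name, version="1.00"):
    """Name of the compressed tar file for the given team."""
    return distribution_name(team_name, version) + ".tar.gz"


def unpacks_into(archive_path, expected_dir):
    """Return True if every archive member lives under expected_dir."""
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
    return bool(names) and all(
        name == expected_dir or name.startswith(expected_dir + "/")
        for name in names
    )
```

For the example team above, distribution_name("Boca Juniors") gives BOCA-JUNIORS-1.00, and unpacks_into can then confirm that the finished archive really unpacks into that single directory.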

If you are using SourceForge, upload your system and I will download it from there. Please note that SourceForge provides a time stamp when you upload. In your README, please prominently provide a URL where I can run your system. Please note that I will also install it on my own system.

Finally, make sure that you specifically address the issue of "terms of use". In the end, we will be posting your code on the Web, and you should specifically address the issue of how the code can be used and distributed. One option is a GNU copyleft license such as the GNU General Public License (GPL). You are free to investigate other licensing terms, but you must specifically address this issue and follow the accepted standards for including licensing information in your code and documentation. Also make sure to clearly identify the authors of individual programs, and provide contact information. If one person does all the work for a particular component, then their name should be the only one on that particular component. In cases where the effort is shared, all contributing team members should be credited.

Policies

This is a team assignment. You are strongly advised to divide up the work of the project into tasks that can be carried out in parallel by various team members. All team members should be acknowledged in the comments, etc., and all teammates will receive the same grade. Do not work with other teams, and do not discuss your approach with other teams. Each team should operate independently of all the other teams. Make your own decisions as a team and do not be influenced by the decisions of other teams if you happen to hear of them accidentally.

by: Ted Pedersen - tpederse@umn.edu