RIVER PLATE ESSAY GRADING SYSTEM [REGS] VERSION 1.0

http://68.191.146.185:8081/~tarun/cgi-bin/REGS-website/index.htm

DESIGNED AND DEVELOPED BY

  1. Tarun Kapoor, University of Minnesota Duluth
  2. Poorva Potnis, University of Minnesota Duluth
  3. Lalit Nookala, University of Minnesota Duluth
  4. Anoop Reddy, University of Minnesota Duluth

Developed as a class project during the Natural Language Processing course at the University of Minnesota Duluth.

MOTIVATION

Essay writing has long been part of high school curricula, and most competitive exams, such as the Graduate Record Examinations (GRE), the Graduate Management Admission Test (GMAT) and the Test of English as a Foreign Language (TOEFL), include an essay section. Grading these essays demands a lot of time and energy from the human grader, and introduces factors that vary from grader to grader, such as the grader's ability, opinions on certain subjects, and psychological state. Moreover, it is not always possible for the grader to provide detailed feedback to the essay writer, and a writer who wants to practice cannot always have a human grader grade every practice essay. These problems in human essay grading motivated researchers in Natural Language Processing (NLP) to develop software tools that grade essays automatically. Advantages of computer grading include consistency, speed, and grading that is not biased by the grader's fatigue and emotions.

HISTORY AND RELATED WORK

Research in the field of Automated Essay Scoring started in 1966, when Ellis Page developed Project Essay Grader (PEG)[1]. PEG graded an essay based on surface textual features such as the number of words, the number of punctuation marks and the length of the essay. PEG was trained on pre-graded essays and used statistical techniques such as multiple regression to learn how to combine the features of an essay into a goodness score. However, this system drew a lot of criticism because it was easy for a user to fool it by writing long essays that were not necessarily good.

Another early tool was Writer's Workbench (WWB)[2], developed in the early 1980s. Though not an essay grading tool, it provided the user with feedback on spelling errors and other readability issues, promising to help its users improve their writing skills. Similar functionality has since been incorporated into Microsoft Word, predominantly as a spell checker and, to an extent, as a grammar checker.

The main drawback of the above-mentioned systems was that they did not extract features that directly evaluate writing quality: they could not evaluate an essay for its organization, relevance of content and vocabulary. Recent research in Natural Language Processing aims at extracting features that help evaluate exactly these aspects.

A significant breakthrough in automated essay grading was the development of Computer Analysis of Essay Content[3] by Dr. Jill Burstein and her co-workers, which analysed essay content based on a computer-generated summary. Yet another important step was the Intelligent Essay Assessor[4], developed by a team led by Dr. Thomas K. Landauer. It is based on the technique of Latent Semantic Analysis (LSA)[5], which involves vector calculations over the sentences of the essays: the student's essay is compared with pre-graded essays to determine how good it is. There are also a couple of systems that use Artificial Intelligence to score essays, one being ``RocketScore'' by Adam Robinson and the other ``IntelliMetric'' by Vantage Learning.

Yet another system that uses linguistic features of an essay is e-rater[6], developed by ETS and operational since 1999, where it is used to grade GMAT essays. It takes into account grammatical and syntax errors, richness and complexity of vocabulary, topic content and the organization of the essay, along with features such as word length, essay length, repetitious words and verbosity.

ETS has successfully employed this system to grade GMAT essays. This system also provides feedback to the writer.

OBJECTIVE

The River Plate Automated Grading System (REGS) will grade an essay based on the following features:

  1. Relevance to the topic
  2. Grammar
  3. Presence and correctness of facts

REGS will grade essays written in the standard 5-paragraph style, with paragraphs for the main idea, supporting ideas and a conclusion. The essays will be prompt-specific, that is, the prompt will be provided to the writer and he should present his views on the topic of the prompt. If required, he can include facts that support his arguments.

Our proposed solution for REGS involves designing four components to assess the above aspects of an essay. These are explained briefly below.

GIBBERISH DETECTOR

The Gibberish Detector is also known as a Word Salad Detector. ``Word salad'' refers to a sequence of words forming a sentence that neither follows any grammatical rules nor has a meaning attached to it; such an incoherent sentence is also called ``gibberish''. More formally, a sentence is categorized as gibberish if it is not decipherable, i.e. the meaning of the sentence cannot be determined. A sentence entered by the student can be gibberish in two cases: a) the test taker is not a native speaker of the language and has a poor sense of its grammar, or b) the writer has simply strung together prompt-specific words in order to get a high score. The Gibberish Detector should be able to identify gibberish sentences in the student's essay.

Examples:

1. Uniform good students help good management school.

2. Students school no uniform equality management responsible uniform costing high.

3. Uniforms help students in equality and unbiased on schools.

Sentences 1 and 2 are gibberish; one cannot make any sense of these conglomerations of words. Sentence 3 is ungrammatical but can still be understood. While implementing the Gibberish Detector we have to be careful with borderline cases like sentence 3: sentences that make some sense but are still ungrammatical. REGS aims at clearly distinguishing between gibberish and merely ungrammatical sentences.

RELEVANCE DETECTOR

This module will determine if the sentences in the student's essay are relevant to the prompt of the essay. Sometimes while writing, a student might go off-track from the topic or the student might misunderstand the topic. This module should be able to identify such sentences.

The present solution uses LSA[5] to determine the semantic similarity between each paragraph and the essay prompt. If the relevance measure determined by LSA is lower than a certain threshold, the paragraph is considered irrelevant. Individual sentences from the essay are not used in the comparison because a sentence is too small a unit: in itself it may not seem relevant, yet it may be relevant when considered as part of the paragraph in which it appears.

Examples:

For the following prompt:

PROMPT: ``People in positions of power are most effective when they exercise caution and restraint in the use of that power.'' (GRE issue topic: http://www.gre.org/issuetop.html)

1. With great power comes great responsibilities. People in positions of power have to act carefully while making decisions. It can be likely that many life depends on their decisions.

2. Power is the rate at which the work is done and it can be calculated easily if the work done is known which has to be catiously calculated. It can have a great effect on others.

In the example above, paragraph 1 is clearly formed and very relevant to the prompt, while paragraph 2 is irrelevant.

FACT IDENTIFIER

In an essay response, each sentence can be classified as an opinion or a fact. While an opinion is essentially the perspective of the student, a fact is used by the student to support his views and arguments. Consider the following examples:

Examples:

1. I think that economies of the country are the actual index of a country's future. George Bush believes in increasing taxes to promote economic growth. I feel that this is wrong. USA is trying to tarnish China's image in the world market.

2. Economy is the greatest concern of a country. The Great Depression of 1930 forced the World War. Gambling is an addiction. Natural resources are important. Scott Peterson was convicted of murder. Lake Superior is the largest lake.

Here sentences in 1 are opinions and those in 2 are facts. The Fact Identifier module should be able to identify sentences in 2 as facts.

FACT CHECKER

Once a statement is determined to be a fact, we have to check its accuracy: we should be sure that the student has not given an incorrect fact to support his arguments. Before this, however, we would like to identify which facts are worth checking for accuracy.

Examples:

1. India won the war against Pakistan in 1971.

2. Abraham Lincoln died in 2003.

3. Tarun Kapoor went to a restaurant yesterday.

In the above examples, all three are facts. 1 is a correct fact, 2 is not, and 3 is a fact that we do not want to verify.

IMPLEMENTATION OF BASELINE APPROACHES

Following is a detailed description of the implementation of each of the above modules.

RELEVANCE DETECTOR

OVERVIEW

Relevance is determined in REGS by measuring the semantic similarity between the components of the essay and the prompt: the greater the similarity, the more relevant the essay. Latent Semantic Analysis (LSA)[5] is used to measure this semantic similarity.

In an LSA-based approach, the system first learns from a training corpus which English words mean the same, or in more technical terms, are semantically similar. It then uses this knowledge to identify pieces of text that are similar in meaning, even though they may not use the same words.

In this approach, text from a training corpus is analysed to create a word-by-context co-occurrence matrix, where a context can be a sentence, a paragraph or a document. For every word in the text, this matrix gives a vector over the contexts the word appears in. A mathematical technique called Singular Value Decomposition (SVD) is then applied to this matrix. This operation transforms the context vectors in such a way that the vectors of semantically similar words become similar. Similarity between two words is then computed using these context vectors.
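In matrix terms (the notation here is mine, not from the original report), if A is the word-by-context matrix, SVD factors it, and LSA keeps only the k largest singular values:

```latex
A = U \Sigma V^{\top}, \qquad A \;\approx\; A_k = U_k \Sigma_k V_k^{\top}
```

Here U_k holds the first k left singular vectors and \Sigma_k the k largest singular values; the rows of U_k \Sigma_k serve as the k-dimensional word vectors compared later.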

The Relevance Detector computes the similarity between every paragraph of an essay and the prompt, and assigns each paragraph a score based on it. These scores are then used to indicate what percentage of the essay is irrelevant to the given prompt. Following is a detailed discussion of the approach.

DETAILED APPROACH

Training:

The Relevance Detector is trained on a corpus of 750 GRE and TOEFL essays. These are prompt-specific, 5-paragraph essays, each with a main idea, supporting ideas and a conclusion.

The following steps were taken to train the Relevance Detector.

Preprocessing of the training corpus:

1. Removal of stop words: Stop words are common English words, for example ``the'', ``a'', ``what'', ``when'', ``if'', ``because''. These words do not carry any useful information for the purpose of this task.

2. Stemming: Words in the corpus are stemmed. Stemming is the process in which a word is reduced to its root form. For example, ``approaches'', ``approached'' and ``approaching'' have the root form ``approach''. Often, different forms of the same word are used in similar context, hence we use only the root form.

3. Removal of words that occur only once: Many words in the corpus occur only once. Clearly, they will not be semantically similar to any other word in the corpus. Worse, the vectors of these one-time words would all look similar, wrongly identifying them as semantically similar to each other.
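The three preprocessing steps above can be sketched as follows. This is a minimal illustration: the stop-word list and the suffix-stripping stemmer are simplified stand-ins for whatever REGS actually used (e.g. a full stop list and a Porter-style stemmer).

```python
import re

# Illustrative stop-word list; the real list would be much larger.
STOP_WORDS = {"the", "a", "an", "what", "when", "if", "because", "is", "of"}

def naive_stem(word):
    # Crude suffix stripping for illustration only.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def preprocess(paragraphs):
    """Tokenize, drop stop words, stem, and drop words occurring only once."""
    tokenized = [
        [naive_stem(w) for w in re.findall(r"[a-z]+", p.lower())
         if w not in STOP_WORDS]
        for p in paragraphs
    ]
    counts = {}
    for para in tokenized:
        for w in para:
            counts[w] = counts.get(w, 0) + 1
    # Remove hapax legomena (words occurring exactly once in the corpus).
    return [[w for w in para if counts[w] > 1] for para in tokenized]
```
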

The training corpus initially contained about 13000 distinct words; preprocessing reduced this to about 5000. The corpus contained about 3500 paragraphs.

This preprocessed training corpus was then used to create a word-by-context co-occurrence matrix, with a paragraph as the context. Each cell in the matrix contained the value log2(number of occurrences of the word in the context + 1).
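The log-weighted matrix construction might look like this. It is a minimal sketch using a dictionary of dense rows; the real system would store the 5000 x 3500 matrix in a sparse format.

```python
import math

def cooccurrence_matrix(paragraphs):
    """Build the word-by-paragraph matrix with cell values
    log2(count_in_paragraph + 1), as a dict {word: [value per paragraph]}.
    `paragraphs` is a list of token lists (already preprocessed)."""
    vocab = sorted({w for para in paragraphs for w in para})
    matrix = {w: [0.0] * len(paragraphs) for w in vocab}
    for j, para in enumerate(paragraphs):
        for w in para:
            matrix[w][j] += 1                      # raw counts first
    for w in vocab:
        matrix[w] = [math.log2(c + 1) for c in matrix[w]]
    return matrix
```

Note that an absent word gets log2(0 + 1) = 0, so empty cells stay zero.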

Applying SVD:

The las2 program from SVDPACK[8] was used to apply SVD to the co-occurrence matrix. The following steps were taken:

1. las2 from SVDPACK[8] requires the co-occurrence matrix to be in the Harwell-Boeing (HB) sparse format. The module mat2harbo.pl from SenseClusters[9] was used to convert the matrix to HB format.

2. This HB-formatted matrix was provided to the las2 program from SVDPACK[8] along with a parameter file, lap2, which specified the number of dimensions that the reduced SVD matrix should have.

3. The output of las2 was then given to another module, svdpackout.pl from SenseClusters[9], which reconstructs the matrix from the singular values and singular vectors created by las2.

After various experiments, a reduced dimensionality of 25 was identified as the one that gave the best results. The matrix was reduced from dimensions of 5000 x 3500 to 5000 x 25. This reduced matrix was then used as the training data for REGS.

Detecting Irrelevance:

The similarity between the prompt and every paragraph of the essay is measured to determine which parts of the essay are irrelevant. First, a context vector is created for the paragraph and for the prompt: the paragraph vector is the average of the word vectors of the paragraph's constituent words, and the prompt vector is likewise the average of the word vectors of the prompt's constituent words. The cosine similarity measure is then used to compute the similarity between the paragraph vector and the prompt vector, and this similarity value is the paragraph's score. The lower the value, the more irrelevant the paragraph.
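Assuming the reduced word vectors are available as a dictionary, the paragraph/prompt comparison can be sketched as follows; words missing from the training vocabulary are simply skipped, which is one plausible choice, not necessarily the one REGS made.

```python
import math

def average_vector(words, word_vectors):
    """Average the SVD-reduced vectors of the words present in the model."""
    vecs = [word_vectors[w] for w in words if w in word_vectors]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def paragraph_relevance(paragraph_words, prompt_words, word_vectors):
    """Cosine similarity between the averaged paragraph and prompt vectors."""
    p = average_vector(paragraph_words, word_vectors)
    q = average_vector(prompt_words, word_vectors)
    if p is None or q is None:
        return 0.0
    return cosine(p, q)
```

A paragraph whose score falls below the chosen threshold is then counted towards the irrelevance percentage.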

Scoring Irrelevance:

The irrelevance score of the essay is indicated as the percentage of the essay that is irrelevant.

KNOWN ISSUES

1. The training corpus used is small.

2. The method might fail if the prompt is too short.

FUTURE IMPROVEMENTS

Experiments will be carried out to identify how the current approach can be improved:

1. The method of computing the paragraph and prompt vectors could give more weight to the important and distinguishing words in the paragraph.

2. Repetitive use of a word in a relevant paragraph should not make the paragraph dissimilar to a prompt that contains only one or no occurrence of that word.

3. The training corpus can be augmented with a thesaurus: synonyms of words occurring in the corpus can be added to the co-occurrence matrix so that the vectors of synonyms look similar. This simply facilitates the learning process by providing extra knowledge.

4. A bigger training corpus can be used.

GIBBERISH DETECTOR

OVERVIEW

The Gibberish Detector uses a syntactic parser to determine if a sentence is gibberish. The syntactic parser used is Link Parser[10] developed at the Carnegie Mellon University. (http://www.link.cs.cmu.edu/link/)

The Link Parser[10] is based on the concept of Link Grammar. Given a sentence, the Link Parser[10] creates 'links' between the words in the sentence. The 'link' between two words specifies the relationship between them. For example, in the sentence, ``The monkey has gone behind the house'', there is a ``subject'' relationship between ``monkey'' and ``has'', a ``past participle'' relation between ``has'' and ``gone'', a determiner relation between ``the'' and ``house'', a preposition relation between ``gone'' and ``behind'' etc. A 'linkage' is a parse of a sentence.

For example, the following will be one possible linkage for the sentence ``The monkey has gone behind the house''. This is the output of Link Parser after parsing the above sentence.

++++Time 0.00 seconds (72.23 total) Found 1 linkage (1 with no P.P. violations) Unique linkage. cost vector = (UNUSED=0 DIS=0 AND=0 LEN=9)

    +-------------------------Xp-------------------------+
    +------Wd-----+                     +-----Js----+    |
    |      +--Ds--+--Ss--+--PP--+--MVp--+     +--Ds-+    |
    |      |      |      |      |       |     |     |    |
LEFT-WALL the monkey.n has.v gone.v behind.p the house.n .

Constituent tree:

(S (NP The monkey) (VP has (VP gone (PP behind (NP the house)))) .)

(http://www.link.cs.cmu.edu/link/)

The Link Parser follows a cost system based on a) the number of words left unused while forming the links, b) the length of the links and c) the conjunctions used in the sentence.

The Link Parser provides APIs for extracting cost information while parsing a sentence. Of the various factors that affect the parsing cost, the number of unused words is the most important: it is the total number of words that the parser could not use in forming linkages within the sentence. The parsing process also returns the number of linkages, i.e. the total number of linkages found for the sentence.

DETAILED APPROACH

The Gibberish Detector reads the essay sentence by sentence, checking every sentence for gibberish. Each sentence is parsed with the Link Parser[10].

Each sentence parsed by the Link Parser is assigned a parsing cost, and the lowest-cost linkage is returned.

The Link Parser uses its dictionaries and knowledge base of grammar rules to parse a sentence. Before parsing can start, the dictionaries are created using the API dictionary_create(``data/4.0.dict'', ``data/4.0.knowledge'', NULL, ``data/4.0.affix''), which takes the paths to the dictionary, knowledge base and affix files provided with the Link Parser.

The sentence is then converted to a Link Parser understandable sentence using the API sentence_create(line, dict).

It is then parsed using the API sentence_parse(sent,opts). This returns the number of linkages found for the sentence.

If the number of complete linkages is greater than 0, the sentence is grammatically correct and is not gibberish. If not, it is either gibberish or merely ungrammatical. To decide which, the parsing options are slightly relaxed using the API parse_options_set_min_null_counts(opts, num); the minimum null count is set to 1.

The sentence is then parsed again, which returns the number of linkages found. The number of null words is obtained using the API sentence_null_count, and the length of the sentence using sentence_length; this length is 2 more than the actual number of words in the sentence. The ratio of the number of linkages found to the number of words in the sentence is the link ratio. The ratio of the number of null words to the number of words in the sentence is the null ratio.

After experimentation it was found that a high null (unused-word) ratio means the sentence is gibberish. Similarly, it was observed that a high link ratio indicates that the sentence is ungrammatical.

The sentence is given one of the following scores based on the link ratio and the null ratio.

Score 6: A grammatically correct sentence.

Example: The knowledge from experience can improve and advance the world and our society

Example: For example, mould and tools design for plastics industry, the university course only taught me very simple cases, most knowledge are obtained from various different and complicated cases in my career.

Score 5: A sentence that is very long, with a few grammatical errors, but makes semantic sense.

Example: Our society and world are developed through continuous practices, those knowledge, never found in books, such as internet, e-business etc. are all developed through new practices

Example: Experience first can prove if the knowledge from books are true or false

Score 4: A sentence that is grammatically incorrect, but is understandable easily.

Example: We can learn a lot through primary school, secondary school until university

Example: Failures where people stumbling learning to raise and stepping stone of success

Score 3: A sentence that is grammatically incorrect and is difficult to understand.

Example: Failure often seems embarrassing and people hide it but learn lot

Score 2: A sentence that is grammatically incorrect and tends towards gibberish.

Example: Success failure no think, Johnson failure new jobs agencies help

Score 1: A sentence that has no structure, but can still be understood with some effort.

Example: Chinese, Japanese music not I in Singapore.

Score 0: Gibberish

Example: Breathing photosynthtic brings green.

Scoring mechanism:

To decide whether a sentence gets a score between 0 and 5, the following mechanism is used.

The link ratio and null ratio are calculated.

Score 5: If the Link Ratio is greater than 1.0 and the Null Ratio is less than or equal to 0.1, the sentence gets a score of 5. A sentence also gets a score of 5 if the Link Ratio is between 0.125 and 0.25 and the Null Ratio is less than 0.08, or if the Link Ratio is less than 0.125 and the Null Ratio is less than 0.1.

Score 4: If the Link Ratio is more than 1.0 and the Null Ratio is between 0.1 and 0.3, a score of 4 is assigned. If the Link Ratio is between 0.25 and 1.0 and the Null Ratio is less than 0.05, a score of 4 is also assigned.

Score 3: If the Link Ratio is more than 1.0 and the Null Ratio is more than 0.3, the sentence gets a score of 3. If the Link Ratio is between 0.125 and 0.25 and the Null Ratio is between 0.08 and 0.1, a score of 3 is assigned.

Score 2: If the Link Ratio is less than 0.125 and the Null Ratio is greater than 0.1, the sentence is assigned a score of 2.

Score 1: If the Link Ratio is between 0.25 and 1.0 and the Null Ratio is between 0.05 and 0.1, a score of 1 is assigned. Similarly, a score of 1 is assigned when the Link Ratio is between 0.125 and 0.25 and the Null Ratio is between 0.1 and 0.13.

Score 0: If the Null Ratio is more than 0.1 and the Link Ratio is between 0.25 and 1.0, the sentence is gibberish. A score of 0 is also assigned if the Null Ratio is more than 0.13 and the Link Ratio is between 0.125 and 0.25.
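The 0-5 rules above can be collected into a single function. The published ranges overlap at a few boundaries, so this sketch applies them first-match-wins from the best score down; a sentence with a complete linkage would already have received a 6 before reaching this step.

```python
def gibberish_score(link_ratio, null_ratio):
    """Score a sentence from 0 (gibberish) to 5 using the link ratio and
    the null (unused-word) ratio, per the stated thresholds."""
    lr, nr = link_ratio, null_ratio
    if lr > 1.0:
        if nr <= 0.1:
            return 5
        if nr <= 0.3:
            return 4
        return 3
    if 0.25 <= lr <= 1.0:
        if nr < 0.05:
            return 4
        if nr <= 0.1:
            return 1
        return 0              # gibberish
    if 0.125 <= lr < 0.25:
        if nr < 0.08:
            return 5
        if nr <= 0.1:
            return 3
        if nr <= 0.13:
            return 1
        return 0              # gibberish
    # link ratio below 0.125
    if nr < 0.1:
        return 5
    return 2
```
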

KNOWN ISSUES

The Gibberish Detector does not deal well with sentences that contain many punctuation marks, because the Link Parser[10] treats these punctuation marks as unused words in the sentence. Test runs indicate that in a regular essay there is about a 0.2 probability of such incorrect scoring.

The Gibberish Detector also penalizes heavily for missing determiners. For example, the sentence ``From reading words in text book such as toy, car, train etc., people have the concept and ideas'' is currently marked as a sentence whose structure cannot be determined.

FUTURE IMPROVEMENTS

The Gibberish Detector is not a foolproof system and fails for certain identified cases, such as sentences that are grammatical except for determiners missing where they should have been. Examples: 1. Hobby favorite activity. 2. Wright brother invented airplane. The module can be made more robust by handling these cases.

FACT IDENTIFIER

OVERVIEW

The approach for the Fact Identifier is based on the idea that each sentence is either a fact or an opinion. The approach therefore first determines which sentences are opinions; the remaining sentences are identified as facts.

Identifying opinions involves looking for various features of a sentence that can make it an opinion. The next section explains these features in detail.

DETAILED APPROACH

The features of a sentence which can make it an opinion are discussed below:

1. It contains a modal verb such as ``would'', ``could'', ``can'', ``will'' or ``should''.

Example:

a. It will rain tomorrow.

b. George Bush should have made better decisions.

2. It contains an Adverb.

Example:

a. The war is quickly going to end.

b. George Bush will immediately improve the tax policies.

Adverbs are words that are used by the writer/speaker to emphasize an opinion. The adverbs in the above sentences like ``quickly'', ``immediately'' do not say anything definite about the subject/object. They involve an opinion of the writer/speaker.

3. It contains an Adjective that is a positive or a negative sentiment.

Adjectives can be classified into two categories.

a. The ones which carry a positive or a negative sentiment. Examples are ``perfect'', ``best'', ``beautiful'', ``ugly''.

b. The ones which do not carry a positive or a negative sentiment. Examples are ``populous'', ``largest'', ``widest''.

Adjectives which carry a positive or a negative sentiment involve an opinion of the speaker/writer. The Fact Identifier tries to identify such adjectives in the sentence. For this purpose, the adjectives training data (appendix [2]) is used. It was obtained from the Sentiment Classification work by Agarwal et al.[11]

Examples:

Ferrari makes the best sports cars.

San Francisco is the most beautiful place that I have seen.

4. It contains a verb that describes a private state of the writer/speaker as described by Wiebe J.[12].

These words describe a thought, belief, like, dislike and judgements of the writer/speaker. The examples of such verbs are ``think'', ``believe'', ``like'', ``dislike'', ``hate'', ``decision'', ``feel'' etc.

Fact Identifier tries to identify such verbs in the sentence. For this purpose, the verbs training data (appendix [1]) is used. The verbs in the sentence are considered in their stemmed form.

Example:

a. He likes to eat in a restaurant.

The above sentence is an opinion as it contains the private state verb ``like'' (the stemmed form of ``likes'').

5. It contains a verb whose synonym describes the private state of the writer/speaker.

For the verbs not in the training data, the synonyms were found using Wordnet's QueryData package. If any synonym of the verb is defined as a private state verb, then the sentence is identified as an opinion.

Example:

a. He would gradually turn into an expert.

The above sentence is an opinion as it contains a verb ``turn'' whose synonym ``become'' is a sentiment verb.

6. It contains a Verb whose antonym describes the private state of the writer/speaker.

For the verbs not in the training data, the antonyms were found using Wordnet's QueryData package. If any antonym of the verb is defined as a private state verb, then the sentence is identified as an opinion.

Example:

a. I hate to drive in snow.

The above sentence is an opinion as it contains a verb ``hate'' whose antonym ``like'' is a sentiment verb.

7. It contains certain special words like ``conclude'': essays usually end with ``In conclusion ...''. Other such words are ``short'' (for ``In short, ...''), ``example'' (for ``For example ...''), ``instance'' (for ``For instance ...'') and question words like ``How'', ``Why'' etc. For this purpose, training data (appendix [3]) is used.

A sentence that has any of the above features is marked as an opinion by the Fact Identifier. Any other sentence is marked as a fact.
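A minimal sketch of the feature check follows. The tiny word lists are illustrative only; the real lists come from the appendices and from the WordNet synonym/antonym lookups, which are omitted here, and adverbs are approximated crudely by an ``-ly'' suffix test.

```python
# Tiny illustrative lexicons; REGS loads the real ones from the appendices.
MODALS = {"would", "could", "can", "will", "should"}
SENTIMENT_ADJECTIVES = {"perfect", "best", "beautiful", "ugly"}
PRIVATE_STATE_VERBS = {"think", "believe", "like", "dislike", "hate", "feel"}
SPECIAL_WORDS = {"conclude", "conclusion", "short", "example", "instance",
                 "how", "why"}
OPINION_LEXICON = (MODALS | SENTIMENT_ADJECTIVES | PRIVATE_STATE_VERBS
                   | SPECIAL_WORDS)

def strip_plural(word):
    # Strip a trailing "s" so "likes" matches "like"; a stand-in for the
    # real stemmer used by REGS.
    return word[:-1] if word.endswith("s") and len(word) > 3 else word

def is_opinion(sentence):
    """Return True if any word triggers one of the opinion features;
    sentences with no trigger are treated as facts."""
    words = [w.strip(".,").lower() for w in sentence.split()]
    for w in words:
        if w in OPINION_LEXICON or strip_plural(w) in OPINION_LEXICON:
            return True
        if w.endswith("ly"):          # crude adverb test
            return True
    return False
```
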

KNOWN ISSUES

None

FUTURE IMPROVEMENTS

More research and testing can be done on finding words which can identify a sentence as an opinion.

FACT CHECKER

OVERVIEW

The approach is based on querying Google for Fact Checking. The details have been specified below.

DETAILED APPROACH

The Fact Checker uses the Google Web APIs and Brill Tagger for verifying facts.

It takes as inputs the set of sentences which have been identified as facts by the Fact Identifier. It then Brill Tags each of the sentences. Then for each sentence,the following processing is done:

1. All the proper nouns in the sentence are identified. Proper nouns are words assigned an NNP tag by the Brill Tagger. If two or more proper nouns occur consecutively, they are identified as a single name. If the proper nouns are comma-separated, they are identified as separate proper nouns.

Example:

a. Proper nouns in the sentence ``Abraham Lincoln died in 1865'' is ``Abraham Lincoln''

b. Proper nouns in the sentence ``Kennedy, Lincoln were presidents '' are ``Kennedy'' and ``Lincoln''

Google is queried for all the proper nouns obtained. The number of hits returned by Google must be above a certain threshold for a proper noun to be considered famous: the threshold is 300,000 hits for one-word proper nouns and 20,000 hits for proper nouns of two or more words. This simply determines that the proper nouns involve well-known entities and that the sentence is worth checking on Google. If the hits returned are below the threshold, or if the sentence does not contain any proper nouns, then the sentence is not verified for its correctness.

So for a sentence like ``Canberra is the capital of France'', Google is queried for ``Canberra'' and for ``France''. Since both words return more hits than the threshold, this fact is worth verifying. But a sentence like ``I met Abraham Lincoln's brother Mr. X Y at a restaurant last Thanksgiving'' would not be queried on Google, because not all the entities involved in the sentence are world-renowned.

2. If the sentence passes the above filtering, a query is generated from it: the proper nouns, nouns, verbs and prepositions of the sentence are kept, and each of the remaining words (usually stop words) is replaced by a *. So a sentence like ``Canberra is the capital of France'' will be queried as ``Canberra * * capital * France''.

3. The query generated in the above step is searched for on Google; if hits are found, the fact is determined to be correct. The results of Google are assumed to be true.
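Steps 1 and 2 can be sketched as follows. Here hit_counts stands in for the Google Web API, the function names are my own, and, since the worked example in step 2 keeps only the nouns (``is'' and ``of'' become *), this sketch keeps noun tags only; the Penn Treebank tag set is an assumption.

```python
# Thresholds from the text: 300,000 hits for a one-word proper noun,
# 20,000 for a multi-word one.
ONE_WORD_THRESHOLD = 300_000
MULTI_WORD_THRESHOLD = 20_000

def worth_checking(proper_nouns, hit_counts):
    """Step 1: a fact is worth verifying only if it mentions at least one
    proper noun and every proper noun is famous enough by hit count."""
    if not proper_nouns:
        return False
    for name in proper_nouns:
        threshold = (ONE_WORD_THRESHOLD if len(name.split()) == 1
                     else MULTI_WORD_THRESHOLD)
        if hit_counts.get(name, 0) < threshold:
            return False
    return True

# Tags kept in the wildcard query; every other word becomes a '*'.
KEEP_TAGS = {"NNP", "NNPS", "NN", "NNS"}

def wildcard_query(tagged_sentence):
    """Step 2: tagged_sentence is a list of (word, tag) pairs, as produced
    by a POS tagger such as the Brill Tagger."""
    return " ".join(word if tag in KEEP_TAGS else "*"
                    for word, tag in tagged_sentence)
```

Step 3 would then submit the query string to Google and accept the fact if any hits come back.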

KNOWN ISSUES

The query used is not flexible. Consider the sentence ``Abraham Lincoln was president of America''. The query generated for this sentence is ``Abraham Lincoln * president * America'', and the fact will be tagged as incorrect if Google does not return any hits for it.

FUTURE IMPROVEMENTS

The query should be made more flexible. In the above case, ``Abraham Lincoln * president * America'' might not return any hits; queries like ``Abraham Lincoln * * president * America'' or ``Abraham Lincoln * * president * * America'' should then be tried before tagging the sentence as an incorrect fact.

RESPONSIBILITIES

Relevance - Poorva Potnis

Gibberish - Poorva Potnis and Lalit Nookala

Fact Identifier - Tarun Kapoor and Anoop Reddy

Fact Checker - Tarun Kapoor and Anoop Reddy

Web Interface - Tarun Kapoor and Lalit Nookala

Installation - Anoop Reddy

EVALUATION

REGS and its components will be tested as described below. The system will be tested as a whole using essays of score 6, 4, 2 and 0. Each component will then be tested using essays that exercise its capability specifically.

Following is a detailed listing of the test cases used to test REGS and its components, the expected results and observed results.

SYSTEM TESTING

REGS is tested using actual essays written on the GMAT and TOEFL tests. The appendix provides the essays used for testing REGS. Following are the results of the tests.

TEST 1: A good essay (appendix [4])

Score: 6.00

SCORE BREAKUP (percentage of)

  Irrelevance          0.00%
  Gibberish            0.00%
  Grammatical Errors   0.00%
  Incorrect Facts      0.00%

FACTS:

The consistency in scores assigned to different essays on the same topic is higher that of the humans

TEST 2: A bad essay (appendix [7])

Score: 0.00

SCORE BREAKUP (percentage of)

  Irrelevance         40.00%
  Gibberish           33.33%
  Grammatical Errors  16.67%
  Incorrect Facts      0.00%

IRRELEVANT SENTENCES:

i had this computers once that was so stupid it was stupider than my dog and tats pretty stupid. My dog was so dump that he once ate his own droppings dogs are suptid.

Score:0.639100743760205

Evan stupider than dogs are fish they just swim around in the bowl and don't do anything sometimes they don't even swim much and just sit there i hate fish except for eating them.

Score:0.553771185842163

GIBBERISH SENTENCES:

the shoulnt be allowed to grade peoples essays

My dog wood grade esays better than a computer even thou he once hate my homework

UNGRAMMATICAL SENTENCES:

Teacher did not believe me when I told her that Ace had eated my homework so teachers are stupid too

Score:4

TEST 3: An essay to test Relevance Detector and Gibberish Detector. (appendix [6])

Score: 2.00

SCORE BREAKUP (percentage of each error type):

Irrelevance 14.29%

Gibberish 8.33%

Grammatical Errors 19.44%

Incorrect Facts 0.00%

IRRELEVANT SENTENCES:

First of all, automated essay grading systems have been used to grade student essays on GMAT and TOEFL tests. There is evidence that the scores assigned by an automated system corelates very well with that assigned by a human grader. The consistency in scores assigned to different essays on the same topic is higher that of the humans.

Score:0.341767022223075

GIBBERISH SENTENCES:

By their stiudies innovated technology and business model

Failures no good times some success need

Also America world war 2 failures success value

UNGRAMMATICAL SENTENCES:

Failure often seems embarrassing and people hide it but learn lot

Score:3

Failures where people stumbling learning to raise and stepping stone of success

Score:4

While working in research field, the beginner faces some challenges

Score:1

Thus, only having theoretical knowledge would not lead to become adept in field of inquiry

Score:1

Success failure no think, Johnson failure new jobs agencies help

Score:2

The examples of history and business demonstrate that failure can be the best catalyst of success, but only if people have the courage to face it head on

Score:4

Then the constitution was drafted that created a more powerful central government

Score:2

FACTS:

A month later, Johnson created Johnson Staffing to correct this weakness in the job placement sector

Learning from the weaknesses of the Articles of Confederation, the founding fathers were able to create the Constitution

He was laid off

Today Johnson Staffing is the largest job placement agency in South Carolina

Learning the lessons taught by failure is a sure route to success

It is in the process of expanding into a national corporation

Google has succeeded by studying the failures of other companies

Most of the times discoveries are the outcome of empirical results

He started a recruiting firm that rose out of the ashes of Johnsons personal experience of being laid off

Annapolis Convention was convened in 1786

TEST 4: An essay to test Fact Identifier and Fact Checker. (appendix [5])

Score: 4.00

SCORE BREAKUP (percentage of each error type):

Irrelevance 0.00%

Gibberish 3.03%

Grammatical Errors 15.15%

Incorrect Facts 0.00%

GIBBERISH SENTENCES:

At start they were also neophytes

UNGRAMMATICAL SENTENCES:

Thus, I totally disagree with speaker's statement that the beginner is more successful in inventing major discoveries than the expert

Score:4

First they studied the principle behind glider and prepared their own glider attached with engine, which finally converted into airplane

Score:2

But it is undeniable that knowledge of an expert archeologist is much advanced than a neophyte in that field

Score:1

It was the experience that taught them how to improve their result and finally, led him to get great discoveries

Score:4

While working in research field, the beginner faces some successes and failure

Score:1

FACTS:

Graham Bell invented Telephone in 1876

Galileo died in 1642 at an age of 78

He designed a pendulum clock around 1640

Wright brothers invented the airplane

His first invention was done at a young age

In such fields one has to discover by searching fossils, artifacts by digging in some archeological sites

Most of the times discoveries are the outcome of empirical results

COMPONENT TESTING

Each component of REGS is tested using an essay that specifically tests its capabilities. Following are the results of the tests.

Relevance Detector

TEST 1: An irrelevant paragraph that comes in the flow of the essay

Test sample:

Prompt: Automated essay scoring is unfair to students, since there are many different ways for a student to express ideas intelligently and coherently. A computer program can not be expected to anticipate all of these possibilities, and will therefore grade students more harshly than they deserve. Discuss whether you agree or disagree (partially or totally) with the view expressed providing reasons and examples.

Essay: 1. Computers are stupid. the shoulnt be allowed to grade peoples essays.

2. i had this computers once that was so stupid it was stupider than my dog and tats pretty stupid. My dog was so dump that he once ate his own droppings dogs are suptid.

3. My dog wood grade esays better than a computer even thou he once hate my homework. Teacher did not believe me when I told her that Ace had eated my homework so teachers are stupid too.

4. Evan stupider than dogs are fish they just swim around in the bowl and don't do anything sometimes they don't even swim much and just sit there i hate fish except for eating them.

5. I hate computers almost as much as i hate fish. fish should not grade esays and computers shouldnt grade essaiys.

Expected behaviour: Mark paragraphs 2 and 4 as irrelevant.

Test Results:

Irrelevant Paragraphs:

2. i had this computers once that was so stupid it was stupider than my dog and tats pretty stupid. My dog was so dump that he once ate his own droppings dogs are suptid.

Score:0.639100743760205

4. Evan stupider than dogs are fish they just swim around in the bowl and don't do anything sometimes they don't even swim much and just sit there i hate fish except for eating them.

Score:0.553771185842163

TEST 2: A completely irrelevant paragraph that breaks the flow of the essay

TEST 3: A relevant paragraph that may not have any words in common with the prompt.

TEST 4: A relevant paragraph that has some words in common with the prompt.

Test sample:

Prompt: Many successful adults recall a time in their life when they were considered a failure at one pursuit or another. Some of these people feel strongly that their previous failures taught them valuable lessons and led to their later successes. Others maintain that they went on to achieve success for entirely different reasons. In your opinion, can failure lead to success? Or is failure simply its own experience?

Essay: 1. Learning the lessons taught by failure is a sure route to success. America can be seen as a success that emerged from failure. Learning from the weaknesses of the Articles of Confederation, the founding fathers were able to create the Constitution. Google is another example of a success that arose from learning from failure. In this case Google learned from the failures of its competitors. Another example that shows how success can arise from failure is the story of Rod Johnson. He started a recruiting firm that rose out of the ashes of Johnsons personal experience of being laid off.

2. America, the first great democracy of the modern world, achieved success by studying and learning from earlier failures. Annapolis Convention was convened in 1786. Then the constitution was drafted that created a more powerful central government. The government also maintained the integrity of the states. Also America world war 2 failures success value.

3. Google has suffered few setbacks in the late 1990s. Google has succeeded by studying the failures of other companies. By their stiudies innovated technology and business model. Google identified and solved the problem of assessing the quality of search results by using the number of links pointing to a page as an indicator of the number of people who find the page valuable. Googles search results became far more accurate and reliable than those from other companies.

4. The example of Rod Johnsons success as an entrepreneur in the recruiting field also shows how effective learning from mistakes and failure can be. He was laid off. Johnson realized that his failure to find a new job resulted primarily from the inefficiency of the local job placement agencies. Success failure no think, Johnson failure new jobs agencies help. A month later, Johnson created Johnson Staffing to correct this weakness in the job placement sector. Today Johnson Staffing is the largest job placement agency in South Carolina. It is in the process of expanding into a national corporation.

5. First of all, automated essay grading systems have been used to grade student essays on GMAT and TOEFL tests. There is evidence that the scores assigned by an automated system corelates very well with that assigned by a human grader. The consistency in scores assigned to different essays on the same topic is higher that of the humans.

6. First, we will consider what is the difference between an expert and the novice. For becoming expert everyone has to pass through the phase of a beginner. While working in research field, the beginner faces some challenges. Gradually, beginner turns to an expert in that field. The tyro will make new discoveries later. Most of the times discoveries are the outcome of empirical results. Thus, only having theoretical knowledge would not lead to become adept in field of inquiry.

7. Failures no good times some success need. Failures where people stumbling learning to raise and stepping stone of success. Failure often seems embarrassing and people hide it but learn lot. But as in the above examples, learning from failures makes you achieve success. The examples of history and business demonstrate that failure can be the best catalyst of success, but only if people have the courage to face it head on.

Expected behaviour: Paragraph 5 should be marked as irrelevant (Test 2), paragraph 6 as relevant (Test 3), and paragraph 1 as relevant (Test 4).

Test Result:

Irrelevant paragraphs:

First of all, automated essay grading systems have been used to grade student essays on GMAT and TOEFL tests. There is evidence that the scores assigned by an automated system corelates very well with that assigned by a human grader. The consistency in scores assigned to different essays on the same topic is higher that of the humans.

Score:0.341767022223075

Gibberish and Grammar Detector

TEST 1: Grammatically correct sentences.

Test Sample:

a. The knowledge from experience can improve and advance the world and our society

b. For example, mould and tools design for plastics industry, the university course only taught me very simple cases, most knowledge are obtained from various different and complicated cases in my career.

Test Results:

Both sentences got a score of 6.

TEST 2: Grammatically incorrect sentences

Test Samples:

a. Our society and world are developed through continuous practices, those knowledge, never found in books, such as internet, e-business etc. are all developed through new practices

b. Experience first can prove if the knowledge from books are true or false

Test Results:

Both sentences got a score of 5.

c. We can learn a lot through primary school, secondary school until university

d. Failures where people stumbling learning to raise and stepping stone of success

Test Results:

Both sentences got a score of 4.

e. Failure often seems embarrassing and people hide it but learn lot

Test Results:

Sentence got a score of 3.

f. Success failure no think, Johnson failure new jobs agencies help

Test Results:

Sentence got a score of 2.

g. Thus, only having theoretical knowledge would not lead to become adept in field of inquiry

Sentence got a score of 1.

TEST 3: Gibberish

Test Samples:

a. Failures no good times some success need

b. Also America world war 2 failures success value

Test Results:

Both sentences got a score of 0.

Fact Identifier

TEST 1: Tests for identification of opinion

Test 1.1: Modal present in a sentence.

Test Sample:

a. He will be the leader of America.

b. I should go to school now.

Expected behaviour: Sentence should be marked as opinion.

Test Results:

a. He will be the leader of America.

12-21-2004 21:6:23 FACT_ID Sentence being checked is --He will be the leader of America

12-21-2004 21:6:23 FACT_ID Brill Tagged form -- He/PRP will/MD be/VB the/DT leader/NN of/IN America/NNP

12-21-2004 21:6:23 FACT_ID Sentence had a modal

b. I should go to school now.

12-21-2004 21:6:23 FACT_ID Sentence being checked is --I should go to school now

12-21-2004 21:6:23 FACT_ID Brill Tagged form -- I/PRP should/MD go/VB to/TO school/VB now/RB

12-21-2004 21:6:23 FACT_ID Sentence had a modal

Test 1.2 Adverb present in a sentence.

Test Sample:

a. It is too early to say anything.

b. The car went by slowly.

Expected behaviour: Sentence should be marked as opinion.

Test Results:

a. It is too early to say anything.

12-21-2004 21:6:23 FACT_ID Sentence being checked is --It is too early to say anything

12-21-2004 21:6:23 FACT_ID Brill Tagged form -- It/PRP is/VBZ too/RB early/JJ to/TO say/VB anything/NN

12-21-2004 21:6:23 FACT_ID Sentence had an adverb

b. The car went by slowly.

12-21-2004 21:6:23 FACT_ID Sentence being checked is --The car went by slowly

12-21-2004 21:6:23 FACT_ID Brill Tagged form -- The/DT car/NN went/VBN by/IN slowly/RB

12-21-2004 21:6:23 FACT_ID Sentence had an adverb

Test 1.3 Sentiment indicating verbs present in sentence.

Test Sample:

a. I think these are our natural resources and our treasure.

Expected behaviour: Sentence should be marked as opinion.

Test Results:

a. I think these are our natural resources and our treasure.

12-21-2004 21:6:23 FACT_ID Sentence being checked is --I think these are our natural resources and our treasure

12-21-2004 21:6:23 FACT_ID Brill Tagged form -- I/PRP think/VBP these/DT are/VBP our/PRP$ natural/JJ resources/NNS and/CC our/PRP$ treasure/NN

12-21-2004 21:6:23 FACT_ID Opinion. VFA Testing. Verb --- think ---

Test 1.4 Words like 'What', 'Why', 'Where', 'When' present in a sentence.

Test Sample:

a. What was the name of the town we visited.

Expected behaviour: Sentence should be marked as opinion.

Test Results:

a. What was the name of the town we visited.

12-21-2004 21:24:36 FACT_ID Sentence being checked is --What was the name of the town we visited

12-21-2004 21:24:36 FACT_ID Brill Tagged form -- What/WP was/VBD the/DT name/NN of/IN the/DT town/NN we/PRP visited/VBD

12-21-2004 21:24:36 FACT_ID Opinion . Word --What

Test 1.5 A verb not present in the REGS verbs data.

Test Sample:

a. Gathering experiences from past experiments, gradually, beginner turns to an expert in that field.

Expected behavior: A synonym of the verb is found and searched for in the verbs data to see whether it is a sentiment verb. If the synonym is a sentiment verb, the original verb is also treated as one and the sentence should be marked as an opinion. If the synonym is not present in the verbs data, an antonym of the verb is searched for instead; if the antonym is a sentiment verb, the original verb is also treated as one and the sentence should be marked as an opinion.

Test Results:

a. Gathering experiences from past experiments, gradually, beginner turns to an expert in that field.

12-21-2004 21:6:23 FACT_ID Sentence being checked is --Gathering experiences from past experiments, gradually, beginner turns to an expert in that field

12-21-2004 21:6:23 FACT_ID Brill Tagged form -- Gathering/NN experiences/NNS from/IN past/JJ experiments,/NN gradually,/NN beginner/NN turns/VBZ to/TO an/DT expert/NN in/IN that/DT field/NN

12-21-2004 21:6:23 FACT_ID Opinion. Synonym Testing. Verb -- turns . Synonym -- become

Test 1.6 A sentence that does not satisfy any of the cases stated in 1.1 to 1.5 above.

Test Sample:

a. His first invention was done at a young age.

b. In such fields one has to discover by searching fossils, artifacts by digging in some archeological sites.

c. Most of the times discoveries are the outcome of empirical results.

d. The consistency in scores assigned to different essays on the same topic is higher that of the humans.

e. Johnson created Johnson Staffing to correct this weakness in the job placement sector.

f. Learning from the weaknesses of the Articles of Confederation, the founding fathers were able to create the Constitution.

g. He was laid off.

i. Johnson Staffing is the largest job placement agency in South Carolina.

j. Learning the lessons taught by failure is a sure route to success.

k. Google has succeeded by studying the failures of other companies.

Expected behavior: Sentence should be marked as a fact.

Test Results:

a. His first invention was done at a young age.

12-21-2004 21:6:23 FACT_ID Sentence being checked is --His first invention was done at a young age

12-21-2004 21:6:23 FACT_ID Brill Tagged form -- His/PRP$ first/JJ invention/NN was/VBD done/VBN at/IN a/DT young/JJ age/NN

12-21-2004 21:6:23 FACT_ID Fact

c. Most of the times discoveries are the outcome of empirical results.

12-21-2004 21:6:23 FACT_ID Sentence being checked is --Most of the times discoveries are the outcome of empirical results

12-21-2004 21:6:23 FACT_ID Brill Tagged form -- Most/JJS of/IN the/DT times/NNS discoveries/NNS are/VBP the/DT outcome/NN of/IN empirical/JJ results/NNS

12-21-2004 21:6:23 FACT_ID Fact

i. Johnson Staffing is the largest job placement agency in South Carolina.

12-21-2004 21:6:23 FACT_ID Sentence being checked is --Johnson Staffing is the largest job placement agency in South Carolina

12-21-2004 21:6:23 FACT_ID Brill Tagged form -- Johnson/NNP Staffing/NNP is/VBZ the/DT largest/JJS job/NN placement/NN agency/NN in/IN South/NNP Carolina/NNP

12-21-2004 21:6:23 FACT_ID Fact
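The cascade exercised by Tests 1.1 through 1.6 can be sketched as below. The dictionaries are toy stand-ins for the verbs data and the WordNet synonym/antonym lookups, and `classify` is an illustrative name rather than the actual REGS routine; tags follow the Penn Treebank set produced by the Brill tagger.

```python
# Toy stand-ins for the REGS resources (illustrative, not the real data)
SENTIMENT_VERBS = {"think", "believe", "strive", "become"}
SYNONYMS = {"turns": "become"}        # WordNet-style synonym lookup
ANTONYMS = {}                         # WordNet-style antonym lookup
WH_WORDS = {"what", "why", "where", "when"}

def classify(tagged):
    """Classify a Brill-tagged sentence (a list of (word, tag) pairs)
    as an opinion or a fact, following Tests 1.1 to 1.6."""
    for word, tag in tagged:
        w = word.lower()
        if tag == "MD":                          # 1.1: modal
            return "opinion"
        if tag in ("RB", "RBR", "RBS"):          # 1.2: adverb
            return "opinion"
        if w in WH_WORDS:                        # 1.4: wh-word
            return "opinion"
        if tag.startswith("VB"):
            if w in SENTIMENT_VERBS:             # 1.3: sentiment verb
                return "opinion"
            if (SYNONYMS.get(w) in SENTIMENT_VERBS
                    or ANTONYMS.get(w) in SENTIMENT_VERBS):  # 1.5: fallback
                return "opinion"
    return "fact"                                # 1.6: no case fired

print(classify([("He", "PRP"), ("will", "MD"), ("be", "VB")]))   # opinion
print(classify([("He", "PRP"), ("was", "VBD"),
                ("laid", "VBN"), ("off", "RP")]))                # fact
```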

Fact Checker

The Fact Checker will verify the facts identified by the Fact Identifier. It does not verify every fact, but only those that it finds worth verifying.

TEST 1: A fact that does not have any Proper Nouns.

Test Sample:

a. In such fields one has to discover by searching fossils, artifacts by digging in some archeological sites.

b. He designed a pendulum clock around 1640

Expected behavior: Fact not worth verifying.

Test Results:

a. In such fields one has to discover by searching fossils, artifacts by digging in some archeological sites.

12-21-2004 20:27:39 FACT_CH Sentence being checked is -- In such fields one has to discover by searching fossils, artifacts by digging in some archeological sites

12-21-2004 20:27:39 FACT_CH Brill Tagged form -- In/IN such/JJ fields/NNS one/CD has/VBZ to/TO discover/VB by/IN searching/VBG fossils,/NN artifacts/NNS by/IN digging/VBG in/IN some/DT archeological/JJ sites/NNS

12-21-2004 20:27:39 FACT_CH Checking if the fact is worth checking at google

12-21-2004 20:27:39 FACT_CH The sentence does not contain any proper nouns

12-21-2004 20:27:39 FACT_CH This fact is not checkable

b. He designed a pendulum clock around 1640

12-21-2004 20:27:39 FACT_CH Sentence being checked is -- He designed a pendulum clock around 1640

12-21-2004 20:27:39 FACT_CH Brill Tagged form -- He/PRP designed/VBD a/DT pendulum/NN clock/NN around/IN 1640/CD

12-21-2004 20:27:39 FACT_CH Checking if the fact is worth checking at google

12-21-2004 20:27:39 FACT_CH The sentence does not contain any proper nouns

12-21-2004 20:27:39 FACT_CH This fact is not checkable

TEST 2: A fact that has a Proper Noun that is not a famous person or thing.

a. Poorva Potnis studies computers

Expected behavior: Fact not worth verifying.

Test Results:

Poorva Potnis studies computers

12-21-2004 23:14:11 FACT_CH Sentence being checked is -- Poorva Potnis studies computers

12-21-2004 23:14:11 FACT_CH Brill Tagged form -- Poorva/NNP Potnis/NNP studies/NNS computers/NNS

12-21-2004 23:14:11 FACT_CH Checking if the fact is worth checking at google

12-21-2004 23:14:11 FACT_CH Word being ckecked: ``Poorva Potnis '' . No. of Hits: 17

12-21-2004 23:14:11 FACT_CH The number of hits is less than the threshold

12-21-2004 23:14:11 FACT_CH This fact is not checkable

TEST 3: A fact that has a famous Proper Noun.

Test Sample:

a. Galileo died in 1642 at an age of 78

b. Graham Bell invented Telephone in 1876

c. New York is the largest city in America

d. Abraham Lincoln, John Kennedy were Presidents of America.

Expected behavior: Fact is worth verifying.

Test Results:

a. Galileo died in 1642 at an age of 78

12-21-2004 20:27:39 FACT_CH Sentence being checked is -- Galileo died in 1642 at an age of 78

12-21-2004 20:27:39 FACT_CH Brill Tagged form -- Galileo/NNP died/VBD in/IN 1642/CD at/IN an/DT age/NN of/IN 78/CD

12-21-2004 20:27:39 FACT_CH Checking if the fact is worth checking at google

12-21-2004 20:27:39 FACT_CH Word being ckecked: ``Galileo '' . No. of Hits: 1050000

12-21-2004 20:27:39 FACT_CH This fact is checkable

12-21-2004 20:27:39 FACT_CH The generated query is ``Galileo died * 1642 * * age * 78 '' . No of Hits: 2

12-21-2004 20:27:39 FACT_CH This is a fact

b. Graham Bell invented Telephone in 1876

12-21-2004 20:27:39 FACT_CH Sentence being checked is -- Graham Bell invented Telephone in 1876

12-21-2004 20:27:39 FACT_CH Brill Tagged form -- Graham/NNP Bell/NNP invented/VBD Telephone/NNP in/IN 1876/CD

12-21-2004 20:27:39 FACT_CH Checking if the fact is worth checking at google

12-21-2004 20:27:39 FACT_CH Word being ckecked: ``Graham Bell '' . No. of Hits: 479000

12-21-2004 20:27:39 FACT_CH Word being ckecked: ``Telephone '' . No. of Hits: 24100000

12-21-2004 20:27:39 FACT_CH This fact is checkable

12-21-2004 20:27:39 FACT_CH The generated query is ``Graham Bell invented Telephone * 1876 '' . No of Hits: 6

12-21-2004 20:27:39 FACT_CH This is a fact
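The checkability decision exercised in Tests 1 to 3 can be sketched as follows. The hit threshold is illustrative (the logs show 17 hits failing and 479,000 passing, but the actual cutoff is not stated above), the hit-count function is injected so the sketch stays offline, and requiring every proper noun to clear the threshold is an assumption based on the logs.

```python
HIT_THRESHOLD = 100000   # illustrative; the actual REGS threshold is not stated

def proper_noun_phrases(tagged):
    """Group consecutive NNP tokens into proper-noun phrases."""
    phrases, current = [], []
    for word, tag in tagged:
        if tag == "NNP":
            current.append(word)
        else:
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases

def is_checkable(tagged, hit_count):
    """A fact is worth verifying only if it contains a proper noun and
    each proper noun is famous enough, i.e. has enough Google hits.
    hit_count maps a phrase to its hit count (injected for testing)."""
    phrases = proper_noun_phrases(tagged)
    if not phrases:
        return False     # no proper nouns: not checkable (Test 1)
    return all(hit_count(p) >= HIT_THRESHOLD for p in phrases)

hits = {"Galileo": 1050000, "Poorva Potnis": 17}
lookup = lambda p: hits.get(p, 0)
print(is_checkable([("Galileo", "NNP"), ("died", "VBD")], lookup))   # True
print(is_checkable([("Poorva", "NNP"), ("Potnis", "NNP"),
                    ("studies", "NNS")], lookup))                    # False
```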

COMPONENT SCORING

Relevance Detector: Each paragraph is assigned a score between 0 and 1. The higher the score, the more relevant the paragraph. Paragraphs with a score of less than 0.67 are considered irrelevant.

The irrelevance score for the essay is computed as the percentage of irrelevant paragraphs.

Gibberish Detector: Each sentence is assigned a score between 0 and 6. The higher the score, the more decipherable the sentence. A gibberish sentence gets a score of 0, while a correct sentence gets a score of 6. An ungrammatical sentence gets a score between 1 and 5.

The gibberish score for the essay is computed as the percentage of gibberish sentences in the essay. The ungrammatical score for the essay is computed as the percentage of ungrammatical sentences in the essay.

Fact Identifier and Checker: The number of incorrect facts is computed. The fact score for the essay is the percentage of incorrect facts out of the total number of facts in the essay.

The above four scores are used to compute the overall score for the essay.
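These component scores reduce to simple percentages over the per-paragraph and per-sentence scores; a minimal sketch, with illustrative inputs chosen to mirror the score breakup of Test 2 above:

```python
def component_scores(paragraph_scores, sentence_scores, facts_total, facts_incorrect):
    """Reduce the component outputs to the four essay-level percentages.

    paragraph_scores -- relevance score in [0, 1] per paragraph
    sentence_scores  -- gibberish/grammar score in 0..6 per sentence
    facts_total, facts_incorrect -- counts from the Fact Checker
    """
    pct = lambda part, whole: round(100.0 * part / whole, 2) if whole else 0.0
    return {
        "Irrelevance": pct(sum(s < 0.67 for s in paragraph_scores),
                           len(paragraph_scores)),
        "Gibberish": pct(sum(s == 0 for s in sentence_scores),
                         len(sentence_scores)),
        "Grammatical Errors": pct(sum(1 <= s <= 5 for s in sentence_scores),
                                  len(sentence_scores)),
        "Incorrect Facts": pct(facts_incorrect, facts_total),
    }

# 2 of 5 paragraphs irrelevant, 2 of 6 sentences gibberish, 1 of 6
# ungrammatical, no incorrect facts: the breakup reported for Test 2
print(component_scores([0.9, 0.64, 0.8, 0.55, 0.7], [6, 0, 4, 0, 6, 6], 5, 0))
```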

OVERALL SCORING

OVERVIEW

The score of the essay response is on a 0-6 scale. After scoring, any fractional score is rounded off to the nearest integer.

DETAILED IMPLEMENTATION

The system begins by assuming a total score of 600. The final score is then computed from the following parameters:

1. Number of paragraphs in the essay

2. Number of sentences in the essay

3. Percentage of irrelevant paragraphs

4. Percentage of gibberish sentences

5. Percentage of ungrammatical sentences

6. Percentage of incorrect facts

ESSAY STRUCTURE:

Paragraphs = 5, Sentences = 25-35: Points Cut = 0

Paragraphs = 5, Sentences = 20-25: Points Cut = 25

Paragraphs = 4, Sentences = 15-20: Points Cut = 50

Paragraphs = 3: Points Cut = 200

Paragraphs = 2: Points Cut = 400

Paragraphs = 1: Points Cut = 500

IRRELEVANCE MEASURE

Irrelevant Paragraphs = 1: Points Cut = 100

Irrelevant Paragraphs = 2: Points Cut = 200

Irrelevant Paragraphs = 3: Points Cut = 300

Irrelevant Paragraphs = 4: Points Cut = 400

Irrelevant Paragraphs = 5 or more: Points Cut = 450

GIBBERISH MEASURE

Gibberish Percentage = less than 15%: Points Cut = 75

Gibberish Percentage = 15% to 25%: Points Cut = 125

Gibberish Percentage = more than 25%: Points Cut = 175

UNGRAMMATICAL MEASURE

Ungrammatical Percentage = less than 10%: Points Cut = 100

Ungrammatical Percentage = 10% to 25%: Points Cut = 150

Ungrammatical Percentage = more than 25%: Points Cut = 175

INCORRECT FACTS

Incorrect Facts Percentage = less than 30%: Points Cut = 55

Incorrect Facts Percentage = 30% to 60%: Points Cut = 75

Incorrect Facts Percentage = more than 60%: Points Cut = 125

FINAL SCORE

The final score is obtained by subtracting the total points cut under the above criteria from the initial 600 points, dividing the result by 100, and rounding off to the nearest integer.
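A sketch of this scoring procedure, under two stated assumptions: a percentage-based cut applies only when the percentage is nonzero (otherwise the clean essay of Test 1 could not score 6.00), and paragraph/sentence combinations not listed in the structure table fall through to the worst cut.

```python
def overall_score(paragraphs, sentences, irrelevant_paragraphs,
                  gibberish_pct, ungrammatical_pct, incorrect_facts_pct):
    """Apply the points cuts from the tables above to a starting total of 600."""
    score = 600

    # ESSAY STRUCTURE (combinations not listed are treated as the worst case)
    if paragraphs >= 5 and 25 <= sentences <= 35:
        pass
    elif paragraphs >= 5 and 20 <= sentences < 25:
        score -= 25
    elif paragraphs == 4 and 15 <= sentences <= 20:
        score -= 50
    elif paragraphs == 3:
        score -= 200
    elif paragraphs == 2:
        score -= 400
    else:
        score -= 500

    # IRRELEVANCE MEASURE: 100 points per irrelevant paragraph, capped at 450
    if irrelevant_paragraphs >= 5:
        score -= 450
    elif irrelevant_paragraphs >= 1:
        score -= 100 * irrelevant_paragraphs

    # Percentage-based measures; a cut applies only when the percentage is nonzero
    if gibberish_pct > 0:
        score -= 75 if gibberish_pct < 15 else 125 if gibberish_pct <= 25 else 175
    if ungrammatical_pct > 0:
        score -= 100 if ungrammatical_pct < 10 else 150 if ungrammatical_pct <= 25 else 175
    if incorrect_facts_pct > 0:
        score -= 55 if incorrect_facts_pct < 30 else 75 if incorrect_facts_pct <= 60 else 125

    # Divide by 100 and round off to the nearest integer, clamping at 0
    return max(int(score / 100 + 0.5), 0)

print(overall_score(5, 30, 0, 0.0, 0.0, 0.0))  # a clean five-paragraph essay scores 6
```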

RELATION TO PREVIOUS WORK

1. Intelligent Essay Assessor[4] uses an LSA[5] based approach to grade essays.

2. E-rater[6] detects gibberish sentences, which its authors refer to as Word Salad Detection.

3. E-rater[6] does not identify or verify facts; its authors consider incorrect facts not to be a measure of a student's writing quality.

Following are the ideas/tools used in REGS from previous research:

1. The Gibberish Detector uses the Link Parser[10] to identify gibberish.

2. The Relevance Detector uses LSA[5] to determine relevance between paragraphs and the prompt.

3. In the training for Relevance Detector, SVDPACK[8] was used for SVD. Utility modules from SenseClusters[9] were used to format the input and output to SVDPACK[8].

4. Fact Identifier uses the concept of private states in a sentence, as described by Wilson and Wiebe [12].

5. Fact Identifier also uses the concept of positive/negative sentiment-describing adjectives, as used by Agarwal et al.[11] in sentiment classification.

REFERENCES

[1] Page, E., Shermis, M. D., Lavoie, M. J., Marsiglio, C. C., Kock, M. M., Fogel, M. (1995). Computer Grading of Essays. Address for APA Annual Meeting. Available at ``http://134.68.49.185/pegdemo''

[2] MacDonald, M. et al. (1982). The Writer's Workbench: Computer Aids for Text Analysis. IEEE Trans. Comm., 30, 1, 105-110.

[3] Burstein, J., Kukich, K., Wolff, S., Chi, L., & Chodorow, M. (1998). ``Computer Analysis of Essays,'' Proc. NCME Symp. Automated Scoring, 1998.

[4] Laham, D., & Foltz, P. W. (2000). The intelligent essay assessor. In T.K. Landauer (Ed.), IEEE Intelligent Systems, 2000

[5] Landauer, T. K., Foltz, P. W., & Laham, D. (1998). Introduction to Latent Semantic Analysis. Discourse Processes, 25, 259.

[6] Attali, Y., & Burstein, J. (2004). Automated essay scoring with e-rater V.2.0. To be presented at the Annual Meeting of the International Association for Educational Assessment, Philadelphia, PA.

[7] Marti, A. The debate on automated essay grading. Available at ``http://www.knowledge-technologies.com/presskit/KAT_IEEEdebate.pdf''

[8] Berry, M. W., Liang, M. (1992). Large Scale Singular Value Computations. International Journal of Supercomputer Applications. 6:1, 13-49. Available at `` http://www.netlib.org/svdpack''

[9] Purandare, A., Pedersen, T. (2004). Documentation. SenseClusters. Available at ``http://senseclusters.sourceforge.net''

[10] Temperley, D., Sleator, D., Lafferty, J. (2004). Documentation of Link Grammar. Available at ``http://www.link.cs.cmu.edu/link/''

[11] Agarwal, N., Aurangabadkar, K., Bhoite, D., Li, Y., McInnes, B. (2002). README. Available at ``http://www.d.umn.edu/~tpederse/Courses/CS8761-FALL02/Assign/project/minas.tar.gz''

[12] Wilson, T., Wiebe, J. (2003). Annotating opinions in the world press. In proceedings of the 4th SIGdial Workshop on Discourse and Dialogue(SIGdial-03).

[13] Wiemer-Hastings, P., Harter, D., Graesser, A., C., & the Tutoring Research Group (1998). The Foundations and Architecture of Autotutor. Lecture Notes in Computer Science, 1452.

APPENDIX:

[1]

VERBS DATA

Filename: verbs.txt

It contains a list of all the verbs obtained from the UMD-Essay-Corpus [1]. Each verb is stored in stemmed form and is followed by ':+' if it is a sentiment verb and ':=' otherwise. There are a total of 1683 verbs, of which 444 are sentiment verbs.

QUELL:= TRAP:= STICK:= ABATE:= PERVADE:= EASE:= DENY:= STRIVE:+ SKIP:= WORK:= COMPOUND:= ESCAPE:= WEAKEN:+ SHORTEN:+ ESCHEW: MISS:= ENVISION:+ LENGTHEN:+ RELIEVE:= TUCK:= REELECT:= DOWNLOAD:= FUSE:= SURVIVE:= EXHIBIT:= OVERSEE:= AWAIT:= CLOSE:= TRADE:= INGEST:= ACQUAINT:= MANUFACTURE:= PREFER:+ IMMERSE:= PRETEND:+ HALT:= HARM:+ LEVEL:= CRAVE:+ PRESS:= RELOCATE:= STREAMLINE:= BROADEN:+ FAZE:+ APPEAL:= BOTTOM:+ BRING:= DISPEL:= OVERRULE:= CASH:= CHANCE:+ DIGITIZE:= FATIGUE:+ IMPART:= VENTURE:= PROPOSE:+ BLIND:= SHED:= EXERT:= FISH:= WELL:= FOSTER:= WADE:= ENQUIRE:+ DON:= DISAPPOINT:+ PROMOTE:= DO:= DARE:+ LURK:= DERIVE:= ARRANGE:= PROVOKE:+ REINVENT:= ROVE:= EXPLOIT:+ FULFIL:= PERSUADE:+ ACCESS:= DIAGNOSE:= HAVE:= ASSIST:= SHOP:= SAW:+ VOLUNTEER:= OUTLINE:= BOMBARD:= EXPECT:+ GENERATE:= ABOUND:= MODERNIZE:= INCLUDE:= HONOR:+ BLOW:= ORDAIN:= FRUCTIFY:= COUNTER:= BELIEVE:+ NUMBER:= UNDERSTAND:+ MOULD:= WEATHER:= HARDEN:+ PROSCRIBE:= IMBIBE:= RATIONALIZE:+ SELL:= SPOIL:= GRANT:= DECLINE:= PREFORM:= GRAZE:= BOG:= REMOLD:= PERMEATE:= ADORE:+ DISSUADE:+ COOL:= BLESS:+ MITIGATE:= INVENT:= DUMP:= DETAIL:= SCAR:= DEVISE:= BALANCE:= REMUNERATE:= TARGET:= EVIDENCE:= DEBUT:= COMMUTE:= REVERE:+ DELINEATE:= LAYER:= CROW:+ BOTHER:+ MUSTER:+ JUT:= ATTENUATE:= COMPROMISE:+ REMEDY:= PRY:= DISMISS:= SNATCH:+ CROSS:= REST:+ ORGANISE:= GO:= EXHAUST:= EXCEL:= LANGUISH:+ COLOR:= STIR:= RELATE:= ROB:= REAR:= POST:= MATCH:= BYPASS:= ASSEMBLE:= FREE:= PORTRAY:= RUIN:+ TICK:= CELEBRATE:+ BREAK:= ECONOMIZE:= WONDER:+ GET:= LOOP:= REGARD:+ UNLEASH:= RAP:= LIFT:= REPRIMAND:+ PROLONG:= PLAGUE:+ FIT:= SHIFT:= CONSOLE:+ SUPPLEMENT:= COPY:= BREATHE:+ FADE:= HONE:+ GENERALIZE:+ ACCELERATE:= AIR:= TRANSPORT:= DEDICATE:+ STREAM:= SYMBOLIZE:+ PARALLEL:= SINK:= MEDIATE:= EXPERIENCE:+ HARVEST:= ELUDE:= INVEST:= PERSONALIZE:+ SUSPEND:= PAVE:= TAKE:= INTERVIEW:= OVEREMPHASIZE:+ INCREASE:= ELECT:= TRY:+ LIONIZE:+ CATEGORIZE:+ OPINE:+ DAMPEN:= PERSONIFY:= DISILLUSION:+ FLOW:= SHRINK:+ DISRUPT:= KISS:+ LEAF:= 
REFUTE:= PROCEED:= HIKE:= IMPLEMENT:= RECONCILE:+ ACCOMMODATE:= OFFER:= EVALUATE:= SCHOOL:= OVERREACH:+ BENEFIT:+ CRAWL:= GUARD:= CONNOTE:= AVERAGE:= ADJUDICATE:+ LINK:= INTIMATE:+ FINANCE:= RESEARCH:= CONTEND:+ QUICKEN:= PURR:+ PURCHASE:= RESENT:+ INFECT:= INVESTIGATE:= RECITE:= DOUBT:+ DELIVER:= FEATURE:= CHAMPION:= STEM:= PARALYZE:+ DISTURB:+ PROGRESS:= APPREHEND:= SURPRISE:+ CONFIDE:+ CIRCUMVENT:+ SYNTHESIZE:= ENGAGE:= PERISH:= REPEAT:= SUSTAIN:= DUST:= CONFORM:= OWE:= DIVORCE:= CONFLICT:= COOPERATE:+ DROP:= REVEAL:+ STORE:= HUNT:= CRAFT:= BACK:= BLAME:+ FOSSILIZE:= SKID:= INGRAIN:= HEAT:= EMBODY:+ TRACK:= UTILIZE:= TREAT:+ CONVEY:+ WIELD:+ SHARE:= TAINT:+ DYE:= MEET:= RISK:= UPROOT:= PRACTICE:= UNRAVEL:+ IMITATE:+ SACK:= MOTIVATE:= FORCE:= STEAL:= DECIDE:+ POP:= UTTER:= ABUSE:+ SLOW:+ ADVANCE:+ LOVE:+ FAMILIARIZE:+ CRASH:= INTERRELATE:+ APPROXIMATE:+ AFFIRM:+ WEAR:= SOAR:= SEE:+ ORGANIZE:= INSTALL:= INSTRUCT:= PLEASE:+ GATHER:= WITHSTAND:+ WAR:= SHATTER:+ RING:= SNUGGLE:+ PACK:= MIME:+ RENOVATE:= CONSULT:+ YIELD:= CHAT:= FIGHT:= SUPPORT:+ INTERFERE:= ENRICH:+ IMPROVE:= VOW:+ HIRE:= ENHANCE:+ INNOVATE:= SIGN:= REVIEW:+ OVERCOOK:+ RANK:= TRIUMPH:= CROWD:= SYSTEMATIZE:= RESUME:= VIEW:+ ENACT:= MAN:= PEEL:= FILL:= SPY:= SCOLD:+ REDESIGN:= CONK:= CONSUME:= WED:= DECLARE:= INITIATE:= AWARD:= LOAD:= BEHOLD:+ QUANTIFY:= ROUGH:= MATTER:= REFINE:= EXPERIMENT:= INDULGE:= SCORN:+ CONSIST:= STRAY:= INHERIT:= AMOUNT:= RIDE:= ELIMINATE:= MAXIMIZE:+ FORM:= CONCLUDE:+ TREAD:= COMPLAIN:+ INTRODUCE:= CLING:= GAUGE:= POOL:= CONVICT:= COUPLE:= MULTIPLY:= MARRY:= CRIPPLE:= MOW:= PANDER:+ PERUSE:+ ACCOUNT:= PROTECT:= OBSCURE:+ UNDO:= SUM:= PERMIT:= HURL:= TROUBLE:+ NEGLECT:+ ERUPT:= IMPRESS:+ SUPERSEDE:+ MOUNT:= TRUST:+ WAD:= SUPPLY:= FIGURE:= TIME:= SUCCUMB:= ASSUME:+ ATTEMPT:+ LOOK:+ DISCLOSE:+ VANISH:= WALK:+ DRUM:= SIDE:= SEAT:+ VIE:+ SPEAK:+ SUMMARIZE:= LEGITIMIZE:= ABSORB:= PERCEIVE:+ THREATEN:+ ENDANGER:= ORIENT:+ ADDRESS:= TELEPHONE:+ DEPLETE:= STALL:= WITNESS:= BOOST:+ 
AWAKEN:+ STIMULATE:+ MISREPRESENT:+ DAMP:= FURTHER:= CIRCULATE:= LOOM:+ STUNT:= DESTABILIZE:+ SPUR:= SQUANDER:= CHOOSE:+ REGRET:+ SOUND:= CONFIRM:= ENTICE:+ ARM:= UTILISE:= DISAPPEAR:+ CONTROL:+ ENSLAVE:= TERM:= CROP:= PROFIT:= PAPER:= THRONG:= DREAM:+ WET:= APPRISE:+ COMBINE:= BLOCK:= FELT:+ MIX:= DISENFRANCHISE:= CATCH:= REGAIN:= PREDICT:+ TRANSFER:= PHONE:= APPOINT:= DISCRIMINATE:+ COMPENSATE:= CLEAR:= EMPHASIZE:+ SURFACE:= SUCCEED:= UNCOVER:= OVERTHROW:= ISOLATE:= MELT:= NEGOTIATE:= STRUGGLE:+ ANALYZE:+ RITUALIZE:+ BAFFLE:+ VILIFY:+ FEAR:+ PURSUE:+ FATHER:+ BLUR:= STAIN:= MARK:= EMPLOY:= SWING:= TRIGGER:= HIGHLIGHT:+ PUBLISH:= DISCOURAGE:+ MASTER:= SURROUND:= POPULATE:= JOIN:= NOTICE:= REFLECT:= ADVERTISE:= LEND:= COMPLY:= EDUCATE:= EXAGGERATE:+ CHARACTERIZE:+ AUGMENT:+ SUFFICE:+ DECEIVE:+ UNIFY:= APPRAISE:+ DUPLICATE:= GIVE:= ROUTE:= OVERSHADOW:+ TEACH:= CART:= PROD:= PRIDE:+ MISBEHAVE:+ EMBARK:= IMPLANT:= DEVOTE:+ DEAL:= REINFORCE:+ DEMARCATE:= DISPROVE:= COVER:= POSTPONE:= VERIFY:= ORIGINATE:= WRENCH:= REVERSE:= VIOLATE:= TRANSLATE:= REBEL:= BESTOW:+ DESIGNATE:= EXPOSE:= PROMPT:= CLAIM:+ PRESERVE:= UPDATE:= FRAME:= TRANSFORM:= FIRE:= PHILOSOPHIZE:+ FALTER:+ CLIMB:= JACK:= CERTIFY:+ INSULT:+ LURE:+ SING:+ TELL:+ DETERMINE:= INHIBIT:+ BUY:= SAVE:= HOOK:= PRECEDE:= CREEP:= ACCEDE:= CEMENT:= COOK:= WHIP:= PENALISE:= WHEEZE:+ PACE:= RENT:= STEP:= INSTIGATE:+ GRASP:+ MONOPOLIZE:+ EMULATE:+ REQUIRE:= GRADE:= IGNITE:= RECONSIDER:+ FOLLOW:= TILT:= KNIT:= EXPEL:= BITE:= BATH:= INDUCE:= AGGRAVATE:+ LESSEN:+ ADMIRE:+ DISLIKE:+ HEAD:= ANNOUNCE:= CONTRAST:+ SAIL:= SEX:= LOCATE:= CHALK:= GUARANTEE:= FLOURISH:= GEAR:= ALIEN:= BUCK:= FANCY:+ OVERESTIMATE:+ PRACTISE:= SEEK:= COPE:= PRAISE:+ HEIGHTEN:+ HAND:= AMELIORATE:= PROMULGATE:= EQUIP:= DEFEND:= ENLIVEN:+ CLOG:= POISON:= EXCEED:= CONNECT:= FELL:= SPICE:= PET:= MIGRATE:= PRESENT:= MODEL:= LINE:= STARVE:= PREOCCUPY:+ REVOLUTIONIZE:+ ENCOURAGE:+ MARVEL:+ OBLITERATE:= DESIRE:+ SUBVERT:= ANTAGONIZE:+ BOLSTER:+ NEED:+ 
CORRECT:= APPLY:= SPEED:= RESORT:+ TIE:= TRANSPIRE:= MOOT:= CHEAT:+ PRONOUNCE:= SATISFY:+ JAIL:= DEVIATE:= RUN:= ARRIVE:= MISINFORM:+ OUTRAGE:+ SALVAGE:= LEAVE:= POLLUTE:= SIGHT:+ INTERNALIZE:+ PULL:= BEAR:+ PRESCRIBE:+ PROGNOSTICATE:+ ENCUMBER:+ BEHAVE:+ ADAPT:+ CREATE:= QUALIFY:= FLING:= COMPOSE:= ENCASE:= ALLOW:= EXACERBATE:+ TUNE:= ISSUE:= LOITER:= CONCENTRATE:+ GAD:+ END:= FORESEE:+ HAUNT:+ BED:= CRINKLE:= FUSS:+ INHALE:= STAR:= RENEW:= PRECIPITATE:= DESERVE:+ ACCEPT:+ REHEAT:= REFER:+ FUND:= OBTAIN:= SUPPOSE:+ HOLIDAY:= SCOFF:+ RETROFIT:= VOICE:+ HAPPEN:= HAT:= MISUNDERSTAND:+ SCRATCH:= EXCITE:+ ENLIGHTEN:+ CHART:= OCCUR:= STAFF:= SERVICE:= RIGHT:= INJURE:= BRAND:= PROLIFERATE:= CONSTRAIN:= REALISE:+ RECOUNT:= OBLIGE:+ STRETCH:= MARKET:= BREED:= BORROW:= FACILITATE:= ADVOCATE:+ SWEEP:= ENTERTAIN:+ ADMIT:+ ACCUSE:+ PLOUGH:= REGIMENT:= ASCRIBE:+ TACKLE:+ DROWN:= KNOCK:= WHISPER:+ MEMORIZE:+ TORTURE:+ FORGET:+ PRICE:= CHASE:= EQUAL:= HAUL:= DOCUMENT:= DISPIRIT:+ REASON:+ TRACE:= ENGINEER:= ORBIT:= REIGN:= SCORE:= LIGHTEN:+ BETRAY:+ ACT:+ ENGULF:= DIG:= HOME:= POWER:= WORSEN:+ ESTABLISH:= RETURN:= SURPASS:= MINIFY:+ ATTRACT:+ DAWDLE:= ENERGIZE:+ LOBBY:= SHAKE:+ VACATION:= DIVIDE:= TITILLATE:+ ROAM:= PREDISPOSE:+ BATHE:= REWRITE:= SHIP:= ADVANTAGE:+ WIND:= HANG:= SPECIFY:= WASTE:= MAKE:= REPEAL:= REVILE:+ FULFILL:+ DANGLE:= SUBSTITUTE:= EXCLAIM:+ CHECK:= SCAN:= FALL:= TOIL:= PAINT:= APPRECIATE:+ SLEEP:+ ENSURE:+ RECONSTRUCT:= HYPOTHESIZE:+ DEMOTE:= THEORIZE:+ PLANE:= CONCEDE:= ROLL:= PROFESS:+ REVISE:= WHISTLE:= DETER:= TARNISH:+ BEAT:= DISCUSS:+ LINGER:+ COUGH:+ GAIN:= EMBALM:= DEFY:= CAPTURE:= DEIFY:+ AMEND:= CURB:= SURF:= THRILL:+ LABEL:= DRAG:= ENGENDER:= RAISE:= STREAK:= HIDE:= CRY:+ COUNT:= BOIL:= RELEASE:= SUIT:= BASTE:= REPLICATE:= VEX:+ CAPITALIZE:= INTERTWINE:= TIGHTEN:= THWART:+ INFLICT:+ MANIFEST:+ EXTRACT:= LITTER:= PROPEL:= WANT:+ STREW:= BORE:+ MISINTERPRET:+ ELEVATE:= JOUST:= LIGHT:= PARADE:= SPORT:= CLICK:= BELIE:+ CONCEAL:= FLOAT:= ATTRIBUTE:= 
CONTRADICT:+ ADJUST:= CULMINATE:= SHARPEN:= REDUCE:= MENTION:+ SPEND:= DUB:= COMPEL:= LAMENT:+ OVEREAT:+ SCREAM:+ COMMIT:= PUNCH:= EMAIL:+ RACE:= REFORM:= ATTEST:= DELAY:= OBEY:= SNEAK:+ RECORD:= CIRCUMNAVIGATE:= PURIFY:= RIP:= BATTLE:= DEFROST:= MOLD:= KICK:= SHUT:= UNVEIL:+ DEPLOY:= CONVINCE:+ ESCALATE:+ ACCUMULATE:= SPAWN:= SCALE:= RELY:+ PLOW:= REPUDIATE:+ CONFER:+ DISSEMINATE:= DISCONTINUE:= NIP:= ROCK:= ROAST:= TITLE:= DIVE:= MINE:= ASSOCIATE:= BET:+ CARRY:= SPRING:= ENTAIL:= GRADUATE:= VOTE:+ EMPOWER:= COAT:= MANDATE:= WITHDRAW:= AMUSE:+ MOVE:= OPPOSE:+ DILUTE:= INSPECT:+ DEVELOP:= WARN:+ SCARE:+ GLOW:= WIN:= RECOVER:= GUIDE:= MINIMISE:+ COMBAT:= ENDOW:= CREDIT:+ PRIORITISE:+ SECURE:= FASCINATE:+ SENSE:+ INCASE:= DIRECT:= DESTROY:= FACTOR:= PREVENT:= MAJOR:= EXCLUDE:= SPECULATE:+ SELECT:+ GOSSIP:+ STRIKE:= CONDONE:+ FORMULATE:+ DISCREDIT:+ DISPUTE:= PROCESS:= PATENT:= CAST:= ANSWER:= PARTAKE:= CONDEMN:+ POSITION:= CUT:= SOOTHE:+ STATE:+ SMOKE:= INDUSTRIALIZE:= FORTIFY:= RUSH:= BIAS:+ DISSENT:+ WEED:= TEAR:= CIRCLE:= SITUATE:= SPONSOR:= WEIGHTLIFT:= DREAD:+ KNOW:+ ADOPT:= INTERROGATE:= JUMP:= MANAGE:= DEEM:= CONCEIVE:= FILTER:= SKEW:+ FAVOR:+ OVERRATE:+ PROFILE:= TOLERATE:+ CONJURE:= WEIGH:= SUBSTANTIATE:= REASSURE:+ ARTICULATE:+ CURE:= FOCUS:= EVEN:= BUDGET:= TRIM:= ENVY:+ INCULCATE:= REMEMBER:+ SIGNAL:= MISUSE:= MANIPULATE:= DRAW:= LAST:= INCUR:= HOLD:= MATURE:= HANDLE:= RESEMBLE:+ PUBLICIZE:= UNDERPIN:= STACK:= PUSH:= RELEGATE:= SEAL:= INTERPRET:+ QUESTION:= SHORT:= SPILL:= COMPLICATE:= WAIT:= PICK:= DISTORT:+ CHALLENGE:= DEMONSTRATE:= INCLINE:+ PENALIZE:= SPAN:= BIND:= SERVE:= DEMAND:= OPEN:= LOWER:+ SUGGEST:+ THANK:+ LAND:= BELONG:= DISCIPLINE:+ STAY:= IMAGINE:+ HIT:= PHRASE:= REDEFINE:= BASE:= SEARCH:= PLACE:= LOCK:= SEND:= PREPARE:= NECESSITATE:= RANGE:= ACQUIRE:= KIDNAP:= APPEAR:+ ALLOT:= DELETE:= QUIT:= BROWSE:= COMPLETE:= PRESSURE:= USE:= DESIGN:= CENSOR:= SIGNIFY:= PASS:= RECREATE:= RAT:= SWELL:= CARVE:= CHEW:= INFRINGE:= AUTHORIZE:= DISCOMFIT:+ 
SATURATE:= REALIZE:= CRUSH:= SMASH:= BULLY:= RECOMMEND:+ FORGE:= SPARK:= STIFLE:= RISE:= OBJECT:= ROUT:= PREVAIL:= IMPORT:= OVERSTATE:+ UP:= TRANSFIX:= GRIND:= RAGE:+ REGULATE:= SENTENCE:= NOTE:= TEST:= PALE:= SUPPRESS:= COMPLEMENT:+ TOUT:= DISALLOW:= DATE:= HARNESS:= OVERWHELM:+ COMMENT:= CALCULATE:= EXPRESS:+ PAL:= PROGRAM:= DISAGREE:+ SUE:= ASK:= EMBARRASS:+ RAPE:= VARY:= INDICATE:= NEIGHBOR:= ACTIVATE:= RESTORE:= FORFEIT:= ALLEVIATE:= BECOME:+ FOOL:+ APOLOGIZE:+ GROW:= JERK:= ABDICATE:+ RAVAGE:= RESPOND:= DEBATE:= ERADICATE:= KILL:= DONATE:= OPERATE:= DISPOSE:= AID:= FLOOD:= CONQUER:= HEAR:+ ORDER:= EXAMINE:= DIE:= COMPILE:= STEER:= TAT:= WANE:= STAND:= POLL:= GUESS:+ MIRE:= RESTRAIN:= SANCTION:= COMPUTERIZE:= CONVERT:= FLY:= RETIRE:= REACT:= SAP:= OWN:= LET:= COLLAPSE:= PROTEST:= ENTER:= BOOM:= ABHOR:+ LAY:= INTERMARRY:= DETECT:= EXPLAIN:= BOUND:= SEPARATE:= EMBRACE:= LEAN:= FILE:= DESTRUCT:= RAZE:= MISLEAD:= MISTAKE:= WIRE:= ACHIEVE:= APPROACH:= NOURISH:= INFORM:= SPREAD:= SIMULATE:= STRUCTURE:= FRUSTRATE:+ ANNOY:+ IDENTIFY:= PROSECUTE:= BELLOW:= OUTSOURCE:= PROMISE:= REPRESENT:= SWIM:= PARSE:= EVOLVE:= INTERWEAVE:= UNDERESTIMATE:+ BE:= OPT:+ PAY:= POSE:= RETAKE:= IMPRISON:= RETAIN:= ALLOCATE:= MODIFY:= UNDERTAKE:= REMAIN:= REGISTER:= CONCUR:= BEWILDER:+ ANTICIPATE:+ DISTRACT:= DESCRIBE:+ SAFEGUARD:= ASSIMILATE:= MANICURE:= DISCARD:= ATTEND:= INTENSIFY:= DEDUCT:= UNDERGO:= CONFINE:= AFFLICT:= WREAK:= CONFUSE:+ RID:= ERASE:= RESIST:= EARN:= SUFFER:+ COMPREHEND:+ NARROW:= PERPETUATE:= EXPLORE:+ OSTRACIZE:= PLAY:= HESITATE:+ COERCE:= SILENCE:= DEPLORE:= RIVAL:= HATE:+ DIAL:= EMERGE:= LIKE:+ RESHAPE:= DEPART:= FETCH:= PREDICATE:= PUT:= RESIGN:= SAY:+ CLASSIFY:= ENJOY:+ SURVEY:= REWARD:= HOP:= DISOBEY:= SUSPECT:+ COVET:+ DECREASE:= PIONEER:= FEE:= COME:= REEXAMINE:= INCORPORATE:= FOUND:= CORRESPOND:= OUTWEIGH:= DISCHARGE:= VISIT:= INSTILL:= ACCORD:= TRANSMIT:= FARE:= ALTER:= DISPARAGE:+ INJECT:= RENDER:= MAINTAIN:= JOG:= PINPOINT:= CONSIDER:+ UPGRADE:= THINK:+ 
TRICK:= LAUD:+ COMPUTE:= NAME:= SMELL:+ CARE:= PIT:= REAP:= BRIDGE:= JUSTIFY:+ COD:= AIL:= AGE:= TRAIN:= OVERLOOK:= EXPEND:= RESOLVE:= DRENCH:= RESTRICT:= INFER:+ HURT:+ REACH:= ANALYSE:+ ELABORATE:= PASTE:= MINIMIZE:= START:= WIDEN:= OVERBURDEN:= VIBRATE:= WAKE:= ASSIGN:= RELAY:= FINISH:= WRITE:= CRACK:= PRECLUDE:= WISH:+ CLEAN:= STABILIZE:= SHUN:= SEEM:+ MAGNIFY:= LIVE:= FAIL:+ ACCRUE:= GIGGLE:= RESIDE:= DEBASE:+ AUTOMATE:= SHAPE:= KEEP:= PROHIBIT:= EFFECT:= PERSONALISE:= IMPAIR:+ UPSET:+ DAMAGE:= MEDITATE:= ASSESS:+ REJECT:= BURGEON:= REPLACE:= DISTRUST:+ TALK:+ INTERACT:= COLLECT:= RUBBISH:= WELCOME:= IMPEDE:= PARTICIPATE:= HASTEN:= UNDERLIE:= VALUE:= SLICE:= STRESS:+ RECRUIT:= BEG:= DRIVE:= SKIM:= HAMPER:= SUBJECT:= PUNISH:= FINE:= ENDORSE:= EXPAND:= ADMINISTER:= SPELL:= IMPLY:+ COUCH:= JAR:= KEEN:+ ROUND:= DEFINE:= COST:= EMIT:= COMMISSION:= ADVISE:+ LAUNCH:= CONVERSE:+ REPLY:= IGNORE:= REPORT:= INSPIRE:= BAN:= COINCIDE:= REFUSE:= DEFEAT:= ARGUE:+ FEED:= BURN:= CHANGE:= WARD:= POINT:= FEEL:+ SWARM:= LEARN:+ SLAM:= SUPERVISE:= CARESS:= DEPRIVE:= COMPARE:= ARISE:= TELEVISE:= CONTAIN:= DECIPHER:= ENCOUNTER:= MEASURE:= GOVERN:= SOLVE:= AFFECT:+ RULE:= OBSERVE:+ OUTLAW:= WASH:= SOLICIT:= AMAZE:+ WATCH:= SEEP:= AVOID:= TYPE:= FAVOUR:= CONCERN:+ CULTIVATE:= EAT:= SET:= PILOT:= ABANDON:= CRANK:= SUPPLANT:= DICTATE:= LOSE:= STOP:= THROW:= CASE:= COMFORT:= CONTINUE:= HOPE:+ READ:= TEND:+ SKYROCKET:= INVOLVE:= ACCOMPANY:= STRENGTHEN:+ FIND:= CONDUCT:= CONFRONT:= LIE:+ INFUSE:= ESTIMATE:+ PROVIDE:= DOOM:= DANCE:= FUNCTION:= REVOLVE:= LAUGH:= DIMINISH:+ DIFFERENTIATE:+ FORGIVE:= ADD:= SURMISE:= PREACH:= REMIND:+ DETERIORATE:+ DEPEND:= SIMPLIFY:= CITE:= HABITUATE:= INFLUENCE:= CORRUPT:= RESERVE:= PURSE:= PONDER:+ APPROVE:+ LEAD:= EMPATHISE:+ INVITE:= IMPACT:= CLARIFY:= CAUSE:= COMPETE:= JAM:= RECEIVE:= DIGEST:= PURPORT:= REQUEST:= ATTAIN:= SEIZE:= HOUSE:= ENDURE:= ABOLISH:= ACCUSTOM:= CONVENE:= LABOR:= CRITICIZE:+ RESPECT:+ ENROL:= FREEZE:= REMOVE:= DISMANTLE:= 
CONTAMINATE:= CAMP:= ENROLL:= FARM:= CONSERVE:= LIST:= CHERISH:+ SUBSIDIZE:= STUDY:= TIRE:+ SUBMIT:= ATTACH:= DISPLAY:= MURDER:= STUMBLE:= DENOTE:= EXERCISE:= DOUBLE:= RIDICULE:+ NET:= SPIRAL:= DISTINGUISH:+ DEPICT:= SIT:= ABIDE:= REFRAIN:= CODE:= JEOPARDIZE:= BUILD:= INTEREST:+ ENFORCE:= EXTEND:= FASHION:= LAG:= RECOGNIZE:+ INVADE:= PERFORM:= EXCHANGE:= PROJECT:= DOMINATE:= REPAIR:= CALL:= PEER:= ENCODE:= RESULT:= SHOW:= LEASE:= DWELL:= DISDAIN:+ REPUTE:= WORRY:+ EQUATE:= ASSERT:+ MIMIC:= INSIST:+ WITHHOLD:= LISTEN:= EXIST:= PLANT:= INTERVENE:= RAG:= ILLUSTRATE:+ COMMUNICATE:= STARE:= COORDINATE:= HELP:= FUEL:= PATROL:= REGURGITATE:= CONSENT:+ FACE:= QUOTE:= SUBSCRIBE:= OFFEND:+ INSTAL:= DISTRIBUTE:= RECOGNISE:+ EXECUTE:= SHOUT:= PLAN:= MATERIALIZE:= SOCIALIZE:= OVERCOME:= CLUTTER:= MEAN:+ AGREE:+ MISSPELL:= SULLY:= AIM:= INTEGRATE:= PROVE:= BAR:= RELAX:= CONTRIBUTE:= IMPOSE:= SCRIBBLE:= DISCOVER:= CHARGE:= DIFFER:+ POSSESS:= LACK:= OFFSET:= FORGO:= JUDGE:+ BROADCAST:= PLUMMET:= COMMAND:= ACKNOWLEDGE:= BEGIN:= RATE:+ PRODUCE:= CHILL:= INCITE:= SETTLE:= FREQUENT:= DEMOLISH:= UNDERMINE:+ ALIENATE:= PERTAIN:= WATER:= SINGE:= PRINT:= TEMPER:= FIX:= DEDUCE:= OCCUPY:= SUBSIDE:= AFFORD:= TOUCH:+ PRESUPPOSE:+ PRIZE:= CONSTRUCT:= LIMIT:= OBVIATE:= SPECIALIZE:= SACRIFICE:= ADHERE:= HINDER:= ATTACK:= EVOKE:= FIELD:= ERR:= PRESUME:+ SWITCH:= SLIDE:= INTEND:+ ENCROACH:= CONSTITUTE:= CEASE:= DRINK:= DESPISE:+ ENABLE:= ASSURE:= TASTE:+ FORSAKE:= ACCOMPLISH:= CORRELATE:= DRIFT:= TRAVEL:= MIND:+ REIGNITE:= DRESS:= THRIVE:= PHOTOGRAPH:=

[2]
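The verb entries above follow a simple VERB:tag layout, where each tag is either "=" or "+". As a minimal sketch (not the project's actual code), such a file could be parsed into a lookup table like this; the filename and function name are illustrative assumptions:

```python
# Minimal sketch: parse whitespace-separated VERB:tag entries
# (as listed above) into a dict mapping each verb to its tag.
# The data-file name is an assumption; substitute the real one.

def load_verb_tags(text):
    """Map each verb to its tag ('=' or '+') from VERB:tag entries."""
    tags = {}
    for entry in text.split():
        verb, _, tag = entry.partition(":")
        if verb and tag:
            tags[verb] = tag
    return tags

sample = "REFUTE:= RECONCILE:+ OVERREACH:+ PROCEED:="
print(load_verb_tags(sample))
# {'REFUTE': '=', 'RECONCILE': '+', 'OVERREACH': '+', 'PROCEED': '='}
```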

ADJECTIVES DATA

Filename: adj.txt

It contains a list of adjectives used to detect sentiment. Each adjective is followed by a ``GOOD'' or ``BAD'' tag indicating whether the sentiment is positive or negative, respectively. There are a total of 803 adjectives.
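The adjective/tag pairs can be loaded into a sentiment lexicon. Below is a minimal sketch (not the project's actual code); it scans for the GOOD/BAD tags rather than assuming strict word/tag alternation, because a few entries in the listing are multi-word phrases (e.g. ``goo-goo eyes GOOD''):

```python
# Minimal sketch: load "adjective GOOD|BAD" entries from adj.txt
# into a dict mapping each adjective (or phrase) to True (GOOD)
# or False (BAD). Accumulating words until a tag appears handles
# multi-word entries such as "goo-goo eyes GOOD".

def load_sentiment_lexicon(text):
    lexicon = {}
    phrase = []
    for token in text.split():
        if token in ("GOOD", "BAD"):
            if phrase:
                lexicon[" ".join(phrase)] = (token == "GOOD")
            phrase = []
        else:
            phrase.append(token)
    return lexicon

sample = "admirable GOOD abysmal BAD goo-goo eyes GOOD"
lex = load_sentiment_lexicon(sample)
print(lex["admirable"], lex["abysmal"])  # True False
```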

aberrant BAD abhorrent BAD abominable BAD abrasive BAD abysmal BAD acclaimed GOOD accomplished GOOD accredited GOOD admirable GOOD admire GOOD adorable GOOD advantageous GOOD advisable GOOD aesthetic GOOD agitated BAD agonizing BAD agreeable GOOD angry BAD anguish BAD annoyed BAD annoying BAD atrocious BAD attractive GOOD audacious BAD awe GOOD awe-inspiring GOOD awesome GOOD awestruck GOOD awful BAD bad BAD badly-off BAD bad-off BAD bad-tempered BAD baffle BAD balmy GOOD barbaric BAD baseless BAD beauteous GOOD beautiful GOOD bewildered BAD bewildering BAD bitchin GOOD bitchy BAD blemish BAD blessed GOOD blissful GOOD blister BAD blistering BAD blithering BAD bloody BAD bogus BAD bumbling BAD callous BAD calloused BAD callow BAD cantankerous BAD captivating GOOD catastrophe BAD characterless BAD charismatic GOOD charitable BAD charming GOOD cheerful GOOD cheerless GOOD chilling GOOD classless BAD classy GOOD climatic GOOD closeout GOOD clueless BAD clumsy BAD clunky BAD coarse BAD colourful GOOD colourless BAD comical GOOD compassionate GOOD complimentary GOOD worse BAD congratulate GOOD controversial GOOD conventional BAD convincing GOOD crap BAD crappy BAD crass BAD creditable GOOD creditworthy GOOD credulous BAD crude BAD crummy BAD deficient BAD delightful GOOD depressive BAD desirable GOOD dicey BAD dinky BAD dirty BAD disagreeable BAD disappointed BAD disappointing BAD disastrous BAD disbelieve BAD discreditable BAD disdainful BAD disenchanted BAD disgraceful BAD disgruntled BAD disgusting BAD disharmony BAD disheartened BAD disheartening BAD disinclined BAD disinterested BAD displeased BAD disruptive BAD dissatisfied BAD distasteful BAD distinguishable BAD distressing BAD disturbed BAD drawn-out BAD dreaded BAD dreadful BAD dreadfully BAD dreamlike GOOD dreamy GOOD dreary BAD dubious BAD earth-shattering GOOD easy-going GOOD effective GOOD effortless GOOD elaborate GOOD enchanted GOOD energetic GOOD engaging GOOD enraptured GOOD entertaining GOOD enthralled 
GOOD enthralling GOOD enthusiastic GOOD enticing GOOD erroneous BAD exasperating BAD excellent GOOD exceptionable GOOD excitable GOOD excited GOOD exciting GOOD excruciating BAD exemplary GOOD explicable BAD explosive GOOD exquisite GOOD failed BAD extraordinary GOOD eye-catching GOOD fab GOOD fabulous GOOD faltering BAD famed GOOD fantastic GOOD farcical BAD far-flung BAD far-out GOOD fascinating GOOD faultless GOOD faulty BAD favorable GOOD favourable GOOD favourite GOOD fawn GOOD featherbrained BAD featureless BAD feeble BAD feeble-minded BAD feel-good GOOD fey BAD filthy BAD happy GOOD first-class GOOD first-degree BAD first-rate GOOD fishy BAD flabbergasted BAD flaky BAD flattering GOOD flavorful GOOD flavoured GOOD flawed BAD flawless GOOD flimsy BAD fond GOOD fool GOOD foolhardy GOOD foolish GOOD fop BAD foredoomed BAD forgettable BAD fortunate GOOD foul BAD foul-mouthed BAD four-star GOOD foxy GOOD fragile BAD fragment BAD fragmentary BAD frazzled BAD freaky BAD frightful BAD fringe BAD frosty BAD frozen BAD fruitful GOOD fruitless BAD frumpy BAD frustrated BAD frustrating BAD fucked up BAD fucking BAD fuddy-duddy BAD fudgy BAD fug BAD fulfilled GOOD fulfilling GOOD furious BAD futile BAD gallant GOOD gallant GOOD galling BAD garrulous BAD gauche BAD gaudy BAD gawky BAD genial GOOD gentile GOOD gentle GOOD gentlemanly GOOD ghastly BAD gibbering BAD gifted GOOD gimmick BAD glamorous GOOD glaring BAD gleeful GOOD glittering GOOD glitz GOOD glorified BAD glorious GOOD glowing GOOD glum BAD god-awful BAD godforsaken BAD godless BAD godlike GOOD godly GOOD goggle-eyed GOOD good-for-nothing BAD good-humoured GOOD good-looking GOOD goodly GOOD good-natured GOOD goo-goo eyes GOOD gorgeous GOOD graceless BAD gracious BAD grandiloquent GOOD grandiose GOOD great GOOD gratifying GOOD gregarious BAD grieved BAD grievous BAD grimy BAD groove GOOD groovy GOOD gross BAD grotesque BAD groundbreaking GOOD groundless BAD grubby BAD grumpy BAD gruesome BAD grungy BAD 
hair-raising GOOD hairy BAD half-baked BAD half-crazed BAD happy-go-lucky GOOD hard BAD harebrained BAD harmful BAD harmonious GOOD harrowed BAD harrowing BAD harsh BAD harum-scarum GOOD hate BAD hateful BAD heartbreaking BAD heartbroken BAD heartsick BAD heartwarming GOOD heavenly GOOD heaven-sent GOOD heavy-hearted BAD heinous BAD hellish BAD hideous BAD high-minded GOOD high-ranking GOOD hit-and-miss BAD hoity-toity BAD hokey BAD holier-than-thou BAD hollow BAD honky-tonk BAD honorable GOOD honourable GOOD hooked GOOD hopeless BAD horrendous BAD horrible BAD horrid BAD horrific BAD horrify BAD horror-struck GOOD hostile BAD hot-blooded GOOD hotshot GOOD humdrum BAD humiliate BAD humiliating BAD humorous GOOD humourless BAD hunky-dory GOOD icky BAD ideal GOOD idiot BAD iffy BAD igneous GOOD ignorant BAD ill-advised BAD ill-considered BAD ill-defined BAD ill-starred BAD ill-timed BAD impalpable BAD impassable BAD impeccable GOOD improbable BAD improved GOOD inaccessible BAD inadequate BAD inadvisable BAD incapable BAD incoherent BAD incompetent BAD incomplete BAD incomprehensible BAD inconceivable BAD inconsistent BAD incredible GOOD incredulous BAD in-depth GOOD indiscernible BAD ineffectual BAD inefficient BAD inept BAD inexperienced BAD inexpert BAD inexplicable BAD infamous BAD infatuated GOOD inferior BAD inflexible BAD infuriating BAD ingenious GOOD injudicious BAD slow BAD inoffensive GOOD insensible BAD insensitive BAD insignificant BAD insincere BAD insipid BAD insolent BAD inspired GOOD inspiring GOOD instructive GOOD insufferable BAD insufficient BAD intelligent GOOD insulting BAD insupportable BAD intellectual GOOD intelligible GOOD intense GOOD interested GOOD intolerable BAD intolerant BAD intriguing GOOD irksome BAD irredeemable BAD irritating BAD joyful GOOD joyless BAD joyous GOOD jubilant GOOD knockout GOOD lacking BAD leery BAD lifeless BAD likable GOOD likeable GOOD long-lasting BAD longwinded BAD lovable GOOD loveless BAD lovely GOOD low BAD 
low-cal BAD low-class BAD low-down BAD luscious GOOD lush GOOD luxuriant GOOD luxurious GOOD magical GOOD magnetic GOOD majestic GOOD major-league GOOD marvellous GOOD mournful BAD mournful BAD muddle BAD mundane BAD murky BAD mystical GOOD nasty BAD nauseous BAD nice-looking GOOD noncommittal BAD nondescript BAD nonsensical BAD noxious BAD obsequious BAD odious BAD odoriferous BAD off BAD off-balance BAD offbeat GOOD offensive BAD oppressed BAD oppressive BAD outdated BAD overdeveloped BAD overdone BAD overjoyed GOOD painful BAD painless GOOD passive BAD perfect GOOD peevish BAD perfectible GOOD perfectionist GOOD pessimistic BAD picture-perfect GOOD picture-postcard GOOD picturesque GOOD pinched BAD piss-ass BAD pissy BAD pitiful BAD playful GOOD pleasant GOOD pleased GOOD pleasing GOOD pleasurable GOOD poised GOOD poorly BAD poor-spirited BAD positive GOOD prissy BAD pristine GOOD prize-winning GOOD prominent GOOD purposeless BAD puzzling BAD queasy BAD querulous BAD questionable BAD quirky GOOD rancid BAD rancour BAD rapturous GOOD rash BAD raunchy BAD refreshing GOOD regretfully BAD regrettable BAD remarkable GOOD remedial BAD remiss BAD remorse BAD reproving BAD repulsive BAD reputable GOOD resentful BAD respected GOOD respectful GOOD rewarding GOOD roadworthy GOOD rock-bottom BAD rock-hard GOOD rock-solid GOOD rock-steady GOOD rollicking GOOD romantic GOOD rose-coloured GOOD rose-tinted GOOD rotten BAD roughcast BAD rough-hewn BAD roughshod BAD rousing GOOD rubbish BAD rubbishy BAD rude BAD satisfying GOOD savvy GOOD scrappy BAD second-rate BAD second-string BAD seedy BAD senseless BAD shabby BAD shady BAD shaky BAD shallow BAD shameful BAD shining GOOD shit BAD shitty BAD shoddy BAD shopworn BAD showy GOOD shrewish BAD shrill BAD sickening BAD sickly BAD sketchy BAD skew BAD skewed BAD sleek GOOD slippery BAD sloppy BAD slow-witted BAD sluggish BAD slummy BAD small-scale BAD smarmy BAD smashing GOOD smelly BAD snazzy GOOD sneaky BAD snide BAD solid GOOD 
solid-state GOOD sophomoric BAD sought-after GOOD soulless BAD soundless BAD sour BAD sparkling GOOD devoid BAD spartan BAD spectacular GOOD spellbound GOOD spicy GOOD spiffy GOOD spiteful BAD splendid GOOD splendiferous GOOD standalone GOOD starry-eyed GOOD star-studded GOOD stinky BAD stupefying BAD stupendous GOOD stupid BAD stylish GOOD stylistic GOOD stylized GOOD suave GOOD sufficient GOOD sulphurous BAD sumptuous GOOD sunny GOOD superb GOOD supercharger GOOD superduper GOOD superfluous BAD superior GOOD superlative GOOD swashbuckling GOOD swell GOOD tacky BAD tasteless BAD tasty GOOD tattered BAD tedious BAD terrific GOOD third-rate BAD thoughtless BAD thrilling GOOD thunderstruck GOOD top-class GOOD top-drawer GOOD top-notch GOOD top-ranking GOOD top-rated GOOD tortuous BAD trendy GOOD unattractive BAD unbearable BAD unbecoming BAD undeveloped BAD unfashionable BAD unfavourable BAD unimpressed BAD unimpressive BAD uninspiring BAD unintelligible BAD uninterested BAD unique GOOD unlovely BAD unrealistic BAD unreasonable BAD upset BAD upstanding GOOD washed-up BAD withered BAD withering BAD witty GOOD worst BAD worthless BAD yummy GOOD zany GOOD zest GOOD superficial BAD negative BAD stagnant BAD sick BAD stunning GOOD repugnant BAD sterling GOOD substandard BAD smart GOOD supreme GOOD sensational GOOD sold-out GOOD svelte GOOD sublime GOOD riveting GOOD stale BAD tantalizing GOOD stellar GOOD standout GOOD tasteful GOOD succulent GOOD putrid BAD taxing BAD shifty BAD soporific GOOD rave GOOD terrible BAD five-star GOOD slimy BAD sensible GOOD sophisticated GOOD sharp-witted GOOD thought-out GOOD sharp GOOD flagging BAD significant GOOD short-sighted BAD septic BAD sizzling GOOD reprehensible BAD show-stopping GOOD retarded BAD refined GOOD memorable GOOD timeless GOOD satisfactory GOOD momentous GOOD ridiculous BAD fluffy BAD revolting BAD second-class BAD polished GOOD run-of-the-mill BAD pedestrian BAD ponderous BAD plebeian BAD reliable GOOD prestigious 
GOOD primitive BAD powerful GOOD phony BAD fool BAD pretentious BAD pointless BAD plus GOOD poky BAD plum GOOD plucky GOOD foolish BAD patronizing BAD palatable GOOD fraudulent BAD frigging BAD pathetic BAD perturbed BAD outstanding GOOD front-page GOOD opposed BAD frostbite BAD over-the-top BAD passionate GOOD gaga GOOD newsworthy GOOD mortifying BAD narrow-minded BAD generic BAD monotonous BAD monumental GOOD geriatric BAD glad GOOD glassy-eyed GOOD grand GOOD grating BAD grody BAD half-arsed BAD half-hearted BAD handsome GOOD magnificent GOOD headache BAD high-grade GOOD fun GOOD masterful GOOD historical GOOD ho-hum BAD ill BAD iluminating GOOD tip-top GOOD immeasurable GOOD tiresome BAD immense GOOD impressive GOOD vile BAD incipient BAD touch-and-go BAD indecent BAD maddening BAD wary BAD wishy-washy BAD weak BAD ineffective BAD inelegant BAD inexcusable BAD wonderful GOOD with-it GOOD yucky BAD upscale GOOD lurid BAD lame BAD loathsome BAD lousy BAD lukewarm BAD world-class GOOD world-famous GOOD insatiable BAD inspirational GOOD intolerant BAD irresistable GOOD kick-ass GOOD trashy BAD unpopular BAD unimaginative BAD two-star BAD ugly BAD upbeat GOOD unsophisticated BAD unacceptable BAD unappealing BAD unforgettable GOOD underdeveloped BAD unexciting BAD undesirable BAD reasonable GOOD

[3]

OTHER OPINIONS DATA

Filename: other-opinions.txt

It contains a list of words whose presence in a sentence causes that sentence to be treated as an opinion. There are a total of 32 words.

what how why who comfortable vital advantage disadvantage important nowadays impossible doubt actually true main obvious therefore necessary moreover reason conclusion nowadays unfortunately fortunately due leads instance example possible impossible obviously furthermore
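A sentence can then be flagged as an opinion if it contains any of these cue words. The sketch below illustrates one plausible way to do this lookup (the function name and the punctuation handling are assumptions, not the project's actual code):

```python
# Minimal sketch: flag a sentence as an opinion if any of its
# words appears in the cue-word list above (duplicates removed).
OPINION_CUES = {
    "what", "how", "why", "who", "comfortable", "vital", "advantage",
    "disadvantage", "important", "nowadays", "impossible", "doubt",
    "actually", "true", "main", "obvious", "therefore", "necessary",
    "moreover", "reason", "conclusion", "unfortunately", "fortunately",
    "due", "leads", "instance", "example", "possible", "obviously",
    "furthermore",
}

def is_opinion(sentence):
    # Lowercase and strip surrounding punctuation before the lookup.
    words = [w.strip(".,!?;:") for w in sentence.lower().split()]
    return any(w in OPINION_CUES for w in words)

print(is_opinion("Therefore, experience is important."))  # True
print(is_opinion("Graham Bell was born in 1847."))        # False
```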

SAMPLE SCORED ESSAYS

Below are sample essays that have been graded by REGS. The essays were obtained from the UMD-Essay-Corpus[1].

[4]

----------------

Sample Essay-1

----------------

Score: 6.00

Prompt:

Automated essay scoring is unfair to students, since there are many different ways for a student to express ideas intelligently and coherently. A computer program can not be expected to anticipate all of these possibilities, and will therefore grade students more harshly than they deserve. Discuss whether you agree or disagree (partially or totally) with the view expressed providing reasons and examples.

Essay:

It may seem that an automated essay scoring system is not capable of fairly grading a student essay, but I would disagree with the statement.

There is evidence that the scores assigned by an automated system corelates very well with that assigned by a human grader. The consistency in scores assigned to different essays on the same topic is higher that of the humans.

Also, the technology used in these systems has reached a state of maturity where it is capable of anticipating and evaluating the various ways of saying a particular thing. Automated systems, evolving over a period of time, can now identify and extract features of a written essay, that help evaluate it to a high degree of satisfaction. Such features typically include measures indicating complexity of the vocabulary used and errors in grammar, mechanics and style.

Even though the automated systems of today are doing a good job, there is still scope for improvement. Such systems are unable to correctly evaluate the use of technical, domain-specific terminology and colloquial language as well as factual correctness of the essay. In addition, it might be unable to handle a case where an essay is gramatically well-written, but irrelevant to the topic given.

In spite of such shortcommings, I believe that the automated essay scoring systems are in a state of development where they can handle the complexity of language usage and evaluate student essays fairly.

[5]

---------------

Sample Essay-2

---------------

Score: 4.00

Prompt:

In any field of inquiry, the beginner is more likely than the expert to make important discoveries. Also most scientific discoveries happen in the early ages of scientists. Give arguments supporting or against the fact.

Essay:

In most of fields of inquiry, discoveries originate through efforts of experts. A thorough knowledge and experience is essential for making an important discovery. Thus, I totally disagree with speaker's statement that the beginner is more successful in inventing major discoveries than the expert.

First, we will consider what is the difference between an expert and the novice. For becoming expert everyone has to pass through the phase of a beginner. While working in research field, the beginner faces some successes and failure. Gathering experiences from past experiments, gradually, beginner turns to an expert in that field. The tyro will make new discoveries later in his life. Most of the times discoveries are the outcome of empirical results. Thus, only having theoretical knowledge would not lead to become adept in field of inquiry, but, experience must be accompanied with it to make important discoveries.

Furthermore, we can see many examples of great scientists like Einstein, Newton. They worked hard for accomplishment of their discoveries. At start they were also neophytes. It was the experience that taught them how to improve their result and finally, led him to get great discoveries. Wright brothers invented the airplane. This happened early in their career too. First they studied the principle behind glider and prepared their own glider attached with engine, which finally converted into airplane. But as a novice, they have to face failure many times but eventually experience fill up their deficiency and they got success in their discovery. There are other examples too. Graham Bell was born in 1847. His first invention was done at a young age. Graham Bell invented Telephone in 1876.

Admittedly, in some fields of science, like archeology experience does not matter much. In such fields one has to discover by searching fossils, artifacts by digging in some archeological sites. Thus in such fields beginners can discover many new things. But it is undeniable that knowledge of an expert archeologist is much advanced than a neophyte in that field. Galileo died in 1642 at an age of 78. He liked working in his old age. He designed a pendulum clock around 1640. In most cases experience is important.

In sum, as per my opinion, one can make important discoveries if he is expert in that field. A novice can become an expert by performing experiments to gain significant knowledge. Thus, I think experience plays quite an important role in becoming an expert in any field.

[6]

---------------

Sample Essay-3

---------------

Prompt:

Many successful adults recall a time in their life when they were considered a failure at one pursuit or another. Some of these people feel strongly that their previous failures taught them valuable lessons and led to their later successes. Others maintain that they went on to achieve success for entirely different reasons. In your opinion, can failure lead to success? Or is failure simply its own experience?

Essay:

Learning the lessons taught by failure is a sure route to success. America can be seen as a success that emerged from failure. Learning from the weaknesses of the Articles of Confederation, the founding fathers were able to create the Constitution. Google is another example of a success that arose from learning from failure. In this case Google learned from the failures of its competitors. Another example that shows how success can arise from failure is the story of Rod Johnson. He started a recruiting firm that rose out of the ashes of Johnsons personal experience of being laid off.

America, the first great democracy of the modern world, achieved success by studying and learning from earlier failures. Annapolis Convention was convened in 1786. Then the constitution was drafted that created a more powerful central government. The government also maintained the integrity of the states. Also America world war 2 failures success value.

Google has suffered few setbacks in the late 1990s. Google has succeeded by studying the failures of other companies. By their stiudies innovated technology and business model. Google identified and solved the problem of assessing the quality of search results by using the number of links pointing to a page as an indicator of the number of people who find the page valuable. Googles search results became far more accurate and reliable than those from other companies.

The example of Rod Johnsons success as an entrepreneur in the recruiting field also shows how effective learning from mistakes and failure can be. He was laid off. Johnson realized that his failure to find a new job resulted primarily from the inefficiency of the local job placement agencies. Success failure no think, Johnson failure new jobs agencies help. A month later, Johnson created Johnson Staffing to correct this weakness in the job placement sector. Today Johnson Staffing is the largest job placement agency in South Carolina. It is in the process of expanding into a national corporation.

First of all, automated essay grading systems have been used to grade student essays on GMAT and TOEFL tests. There is evidence that the scores assigned by an automated system corelates very well with that assigned by a human grader. The consistency in scores assigned to different essays on the same topic is higher that of the humans.

First, we will consider what is the difference between an expert and the novice. For becoming expert everyone has to pass through the phase of a beginner. While working in research field, the beginner faces some challenges. Gradually, beginner turns to an expert in that field. The tyro will make new discoveries later. Most of the times discoveries are the outcome of empirical results. Thus, only having theoretical knowledge would not lead to become adept in field of inquiry.

Failures no good times some success need. Failures where people stumbling learning to raise and stepping stone of success. Failure often seems embarrassing and people hide it but learn lot. But as in the above examples, learning from failures makes you achieve success. The examples of history and business demonstrate that failure can be the best catalyst of success, but only if people have the courage to face it head on.

[7]

---------------

Sample Essay-4

---------------

Prompt:

Automated essay scoring is unfair to students, since there are many different ways for a student to express ideas intelligently and coherently. A computer program can not be expected to anticipate all of these possibilities, and will therefore grade students more harshly than they deserve. Discuss whether you agree or disagree (partially or totally) with the view expressed providing reasons and examples.

Essay:

Computers are stupid. the shoulnt be allowed to grade peoples essays.

i had this computers once that was so stupid it was stupider than my dog and tats pretty stupid. My dog was so dump that he once ate his own droppings dogs are suptid.

My dog wood grade esays better than a computer even thou he once hate my homework. Teacher did not believe me when I told her that Ace had eated my homework so teachers are stupid too.

Evan stupider than dogs are fish they just swim around in the bowl and don't do anything sometimes they don't even swim much and just sit there i hate fish except for eating them.

I hate computers almost as much as i hate fish. fish should not grade esays and computers shouldnt grade essaiys.