A WordNet Stop List 
 What's a Stop List? 
A stop list is a list of words that are excluded from some
language processing task, usually because they are viewed as
non-informative or potentially misleading. Typically they are
non-content words such as conjunctions, determiners, and
prepositions. These are often called function words.
 
 What's a WordNet Stop List? 
Since WordNet only contains nouns, verbs, adjectives, and
adverbs, you might think that a stop list wouldn't really 
be relevant. However, there are words that are normally
used as function words that have senses (usually obscure)
in WordNet. 
 
For example, consider the humble word "at". According to WordNet,
"at" is a noun that has two senses, one for the chemical element
astatine and the other for a Laotian monetary unit.
 
It is very likely that most systems using WordNet are NOT
using "at" in these senses. Thus, a WordNet stop list will list
those words that are typically used as function words and 
yet have unrelated WordNet senses that are obscure and 
potentially misleading. 
 Finding the WordNet Stop List 
 
This project was undertaken by Satanjeev Banerjee, and arose
in the context of an implementation of Lesk's word sense
disambiguation algorithm, which we expect to yield many interesting
results. The first step was to build a list of candidate stop
words. He found the following:
The initial stop list formed from these lists is shown here.
 
The next step was to determine which of these words have misleading
WordNet senses, which have related WordNet senses, and which have
no WordNet senses at all. 
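
The three-way split described above can be sketched in a few lines of
Python. This is only an illustration, not the procedure Satanjeev
actually used: the lookup function and its toy data are hypothetical
stand-ins for a real WordNet query (only the two senses of "at" come
from this page), and the decision about which senses are misleading is
still made by a person and passed in as a set.

```python
from typing import Callable, Dict, Iterable, List, Set

# Toy stand-in for a WordNet lookup (hypothetical data, apart from "at",
# whose two noun senses are the ones described on this page).
TOY_SENSES: Dict[str, List[str]] = {
    "at": ["the chemical element astatine", "a Laotian monetary unit"],
}


def toy_lookup(word: str) -> List[str]:
    """Return the glosses recorded for a word, or an empty list."""
    return TOY_SENSES.get(word.lower(), [])


def triage(
    candidates: Iterable[str],
    lookup: Callable[[str], List[str]],
    judged_misleading: Set[str],
) -> Dict[str, List[str]]:
    """Sort candidate stop words into the three groups.

    judged_misleading holds the words a human has decided carry
    unrelated, misleading senses; nothing here automates that call.
    """
    groups: Dict[str, List[str]] = {
        "no_senses": [],
        "misleading": [],
        "related": [],
    }
    for word in candidates:
        if not lookup(word):
            groups["no_senses"].append(word)
        elif word.lower() in judged_misleading:
            groups["misleading"].append(word)
        else:
            groups["related"].append(word)
    return groups


groups = triage(["at", "the"], toy_lookup, judged_misleading={"at"})
```

Only the words that end up in the "misleading" group belong on the
WordNet stop list.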
The following words are normally used as function words, but also turn
out to have rather odd (but correct) senses listed in WordNet:

I, a, an, as, at, by, he, his, me, or, thou, us, who.
This is our current WordNet stop list! 
You can view the senses that led us to this conclusion here. The
words in our initial stop list that have no WordNet sense are shown
here. And those function words that also have WordNet senses that
seem to be related are shown here.
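
As a rough illustration of how the list might be used, here is a small
Python sketch that drops the thirteen words above from a token list
before any WordNet lookup is attempted. The function name and the
case-insensitive matching are our own assumptions for the example; a
real system may well want to treat "US" or "I" differently.

```python
# The WordNet stop list given on this page, stored in lower case.
WORDNET_STOP_LIST = frozenset({
    "i", "a", "an", "as", "at", "by", "he",
    "his", "me", "or", "thou", "us", "who",
})


def drop_stop_words(tokens):
    """Keep only tokens that are not on the WordNet stop list.

    Matching ignores case, which is an assumption made for this sketch.
    """
    return [t for t in tokens if t.lower() not in WORDNET_STOP_LIST]


tokens = "He looked at me as a cat looks at a bird".split()
content = drop_stop_words(tokens)
print(content)  # ['looked', 'cat', 'looks', 'bird']
```

Words such as "looked" would still need the usual handling (for
example, reduction to a base form) before being looked up in WordNet.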
 
Please let us know if you have any other candidates for membership
in the WordNet stop list! These lists have been constructed using
our intuitive judgements and are not meant to be taken as anything
more than that!
By:
Ted Pedersen
- tpederse@d.umn.edu