
The following directories and files are included in this directory.

LexSample/   -  includes a directory for each word, where the test and
		training samples as used in senseval-2 are given. 

		each directory is named word.pos, and contains the
		senseval2 data in original xml format:

			word.pos-test.xml       
			word.pos-training.xml   

		and with the xml tags removed:

			word.pos-test.count     
			word.pos-training.count 
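
The layout above can be sketched in a few lines of Python. This is only
an illustration of the naming convention, not a script shipped with this
data: the function name lexsample_files and the example entry "art.n"
are assumptions made for the sketch.

```python
from pathlib import PurePosixPath


def lexsample_files(word_pos, root="LexSample"):
    """Return the four file paths expected for one word.pos directory.

    word_pos is a directory name such as "art.n" (hypothetical example).
    """
    base = PurePosixPath(root) / word_pos
    return [
        str(base / f"{word_pos}-{split}.{ext}")
        for split in ("test", "training")   # the two data splits
        for ext in ("xml", "count")         # original xml, tags removed
    ]


if __name__ == "__main__":
    for path in lexsample_files("art.n"):
        print(path)
```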


fine.key 	this is a key for the senseval2 test data. please
		note that this is a key I have reconstructed from
		the scoring information provided by senseval2, and
		is not the original key. 

sensemap	supplied by senseval2 for scoring purposes

stop.list	the stop list I used in all senseval2 experiments.
		It was created by using stop.pl

token1.txt	the token definition file used to create the data 
		in LexSample

token4.txt      the token definition file used during preprocessing
		of the senseval data. It consists of a single regular
		expression: /\S+/ This means that every maximal string
		of non-whitespace characters is considered a token. This
		is appropriate for preprocessing since we are simply
		splitting up a single large xml file with all
		of the training data into separate directories for
		each word. 
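
To see what the /\S+/ token definition does in practice, here is a
minimal Python sketch. It is not the tool used to prepare this data; it
assumes only that Python's \S matches the same non-whitespace characters
for plain ASCII input, and the sample line (including the instance id)
is made up for illustration.

```python
import re

# Same pattern as token4.txt: one or more non-whitespace characters.
TOKEN_RE = re.compile(r"\S+")


def tokenize(text):
    """Split text into tokens, treating any run of whitespace as a separator."""
    return TOKEN_RE.findall(text)


if __name__ == "__main__":
    # Hypothetical line of xml; the tags are not treated specially.
    sample = '<instance id="art.40001">  the art\tof war </instance>'
    for token in tokenize(sample):
        print(token)
```

Note that xml tags are not recognized as units: a tag with an attribute
is split at the blank into two tokens, and punctuation stays attached to
the neighboring word.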

test.xml 	the original lexical sample file from Senseval-2
		containing evaluation/test data

training.xml 	the original lexical sample file from Senseval-2
		containing sense-tagged training examples


