PRATHEEP RAVEENDRANATHAN
Research Interests
Natural Language Processing & Data Mining
Masters Thesis
Identifying Sets of Related Words from the World Wide Web
Thesis Advisor
Professor Ted Pedersen http://www.d.umn.edu/~tpederse
Background
The overall goal of my thesis research is to use the World Wide Web as a source of information for identifying sets of words that are related in meaning. Methods already exist for identifying related words in fixed, static corpora of text. However, given the huge amounts of text now available via the World Wide Web, it is important to develop methods that can take advantage of this resource. The Web poses a unique set of challenges, including its ever-changing state and the presence of repetitive, noisy, or low-quality data.
We are using the Google search engine to retrieve text from the Web. Google has released an API that allows a programmer to interface with its content and retrieve the data in a more convenient form. We then process that data to find sets of related words.
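To make the processing step concrete, here is a minimal Python sketch of one simple way retrieved text could be turned into a set of related words. It is an illustration under stated assumptions, not the method used in the thesis or in Google-Hack: the snippets are hard-coded stand-ins for text already retrieved from a search engine (no real API is called), and the names `SNIPPETS`, `tokenize`, `related_words`, and `min_count` are hypothetical. The idea shown is to keep the words that occur in several snippets for every seed term, after dropping verbatim duplicate snippets.

```python
import re
from collections import Counter

# A tiny stop-word list, chosen only for illustration.
STOP_WORDS = {"a", "an", "and", "are", "as", "for", "in", "is",
              "of", "on", "or", "the", "to", "with"}

# Hypothetical stand-in for text already retrieved from a search engine,
# keyed by the seed term that was searched for.
SNIPPETS = {
    "rifle": [
        "A rifle is a firearm with a long barrel.",
        "The hunter cleaned the barrel of his rifle.",
        "A rifle is a firearm with a long barrel.",  # verbatim duplicate
        "Rifle ammunition and firearm safety tips.",
    ],
    "pistol": [
        "A pistol is a small firearm with a short barrel.",
        "Pistol barrel length affects accuracy.",
        "Firearm laws for pistol owners.",
    ],
}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens, minus stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOP_WORDS]


def related_words(snippets_by_term, min_count=2):
    """Return words found in at least min_count snippets of every seed term."""
    frequent_sets = []
    for snippets in snippets_by_term.values():
        counts = Counter()
        # set(snippets) drops verbatim duplicates, a common form of Web noise.
        for snippet in set(snippets):
            # Count each word at most once per snippet.
            counts.update(set(tokenize(snippet)))
        frequent_sets.append({w for w, c in counts.items() if c >= min_count})
    if not frequent_sets:
        return []
    shared = set.intersection(*frequent_sets)
    # The seed terms themselves are not informative results.
    seeds = {w for term in snippets_by_term for w in tokenize(term)}
    return sorted(shared - seeds)


if __name__ == "__main__":
    print(related_words(SNIPPETS))  # ['barrel', 'firearm']
```

Requiring a word to appear in several distinct snippets is a crude guard against the noisy and repetitive data mentioned above. A real system would need a fuller stop-word list and a more careful relatedness measure.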
Current Status
Possible use of Google-Hack
Links
Google-Hack v0.13 released on CPAN! (02/23/05) (Click here for more info on Google-Hack)
Contact Information
- Email : rave0029@d.umn.edu