Research Interests
My research interests are in computational linguistics and natural
language processing. These are closely related areas, but I see them as
separate. Computational linguistics seeks to discover properties of human
language via computational methods, while natural language processing
develops tools and techniques that make it possible for computers to use
and understand human language. In general, the methods we develop are
language- and application-independent, although we are also interested
in applying them to the medical domain.
We are engaged in three main areas of research right now, each of which in
some way revolves around assigning meanings to words or phrases, or
organizing words and phrases based on their semantic similarity to each
other.
- We are developing unsupervised methods of clustering similar
contexts in the SenseClusters project.
The goal is to develop methods that group together words or short phrases
based on their contextual similarity.
- We are also developing methods that use the lexical database WordNet to
assign similarity measurements to concepts (WordNet::Similarity) and then
use that information to carry out word sense disambiguation
(WordNet::SenseRelate).
- Finally, I have a long-standing interest in applying supervised learning to the problem of word
sense disambiguation, and that work continues.
Both the unsupervised and supervised methods rely heavily on lexical
features that are identified from corpora using the
Ngram Statistics Package.
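As a rough illustration of what a corpus-derived lexical feature can look like, the sketch below counts adjacent word pairs (bigrams) in a token list and scores each pair by pointwise mutual information. This is only a minimal Python sketch under my own assumptions: the function name `bigram_pmi`, the `min_count` parameter, and the choice of PMI as the association measure are hypothetical and are not taken from the Ngram Statistics Package or its interface.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=1):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration only; it does not reproduce
    the Ngram Statistics Package.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # number of adjacent pairs

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip pairs seen too rarely to score
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI compares the observed pair probability with the
        # probability expected if the two words were independent.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


if __name__ == "__main__":
    text = "word sense disambiguation uses word sense inventories"
    for pair, score in sorted(bigram_pmi(text.split()).items()):
        print(pair, round(score, 3))
```

Pairs with high scores co-occur more often than chance would predict, which is one reason they might be useful as features for either the unsupervised or the supervised methods. PMI is known to favor rare pairs, so a frequency cutoff such as `min_count` is a common safeguard.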
I am always happy to involve motivated undergraduates in my research
efforts. See my UROP page for
more information.
Whenever possible, we like to take our systems and participate in
shared tasks.
Active Research Projects:
Research Related Links:
Other Research Related Activities:
I was a member of the
Editorial Board (2003-2005)
of the journal
Computational Linguistics.
I was the Secretary of the ACL Special Interest Group on Lexical Semantics
(SIGLEX) (2004-2006).
I review for conferences from time to time.
I have co-organized two workshops on parallel text (at
NAACL-2003
with Rada Mihalcea and at
ACL-2005
with Joel Martin and Rada Mihalcea) and the
interactive poster and demonstration session (at
ACL-2005 with Masaki Nagata).
Rada Mihalcea and I gave a
tutorial on Word Sense Disambiguation in 2004 and 2005.
I gave a
tutorial on Unsupervised Clustering of Similar Contexts (i.e.,
SenseClusters) in 2005, 2006, and 2007.
We have interesting visitors at the NLP
group at UMD every now and then.
I aspire to be an interesting visitor to other
places from time to time.
Brian Rassier and I created and maintain the
Registry of Latin American Researchers in NLP and CL.
The UMD Department of Computer Science has a
Colloquia Series that I coordinated (2002-2005).
I am an academic great-great-great-grandson of
Wilfrid Sellars.
Here's
how.
You can also find me in the
AI Genealogy Project.
By:
Ted Pedersen
- tpederse AT d umn edu