Complete List of Publications
This is a complete list of all my publications arranged in chronological
order. This list includes single-author papers by graduate students I
have advised, if the work was done at Duluth. I also provide a
separate listing of all the
PhD dissertations, Master's theses, and Master's projects I have
supervised.
2009
-
UMLS-Interface and UMLS-Similarity : Open Source Software for Measuring
Paths and Semantic Similarity
(McInnes, Pedersen, and Pakhomov) - To appear in the Proceedings of the
Annual Symposium of the American Medical Informatics Association, Nov
14-18, 2009, San Francisco, CA
-
WordNet::SenseRelate::AllWords - A Broad Coverage Word Sense
Tagger that Maximizes Semantic Relatedness
(Pedersen and Kolhatkar) - Appears in the Proceedings of the
North American Chapter of the Association for Computational Linguistics
- Human Language Technologies 2009 Conference, June 1-3, 2009, pp.
17-20, Boulder, CO. (Demonstration System)
-
Improved Unsupervised Name Discrimination with Very Wide Bigrams and
Automatic Cluster Stopping
(Pedersen) - Appears in the Proceedings of the Tenth
International Conference on Intelligent Text Processing and
Computational Linguistics, March 1-7, 2009, pp. 294-305, Mexico City.
[acceptance rate 26%]
2008
-
Learning High Precision Rules to Make Predictions of Morbidities in
Discharge Summaries (Pedersen) - Appears in the Proceedings of the
Second i2b2 Workshop on Challenges in Natural Language Processing for
Clinical Data, Nov 7-8, 2008, Washington, DC.
-
Empiricism is Not a Matter of Faith (Pedersen), Computational
Linguistics, Volume 34, Number 3, pp. 465-470, September 2008.
[Journal Citation Reports Index Factor 2007: 2.367]
-
Computational Approaches to Measuring the Similarity of Short Contexts :
A Review of Applications and Methods (Pedersen), to appear in the
South Asian Language Review
(Also available from CMP-LG E-Print Archive as
0806.3787)
-
Name Discrimination and E-mail Clustering Using Unsupervised Clustering
of Similar Concepts (Kulkarni and Pedersen),
Journal of Intelligent Systems (Special Issue : Recent Advances in
Knowledge-Based Systems and Their Applications), 17(1-3), 37-50, 2008.
2007
-
Using UMLS Concept Unique Identifiers (CUIs) for Word Sense
Disambiguation in the Biomedical Domain (McInnes, Pedersen, and
Carlis) - Appears in the Proceedings of the Annual Symposium of the
American Medical Informatics Association, Nov 10-14, 2007, pp.
533-537, Chicago, IL. [acceptance rate 45%]
-
Measures of Semantic Similarity and Relatedness in the Biomedical
Domain
(Pedersen, Pakhomov, Patwardhan, and Chute),
Journal of Biomedical Informatics, 40(3), 288-299, June 2007.
[Journal Citation Reports Index Factor 2006: 2.346]
-
Determining the Syntactic Structure of Medical Terms in Clinical Notes
(McInnes, Pedersen, and Pakhomov) - Appears in the Proceedings of
BioNLP 2007, June 29, 2007, pp. 9-16, Prague, Czech Republic.
[acceptance rate 29%]
[ppt]
-
UMND1: Unsupervised Word Sense Disambiguation Using Contextual
Semantic Relatedness (Patwardhan, Banerjee, and Pedersen) -
Appears in the Proceedings of SemEval-2007: 4th
International Workshop on Semantic Evaluations, June 23-24, 2007,
pp. 390-393, Prague, Czech Republic.
-
UMND2 : SenseClusters Applied to the Sense Induction Task of
Senseval-4 (Pedersen) - Appears in the Proceedings of SemEval-2007:
4th International Workshop on Semantic Evaluations, June 23-24, 2007,
pp. 394-397, Prague, Czech Republic.
-
Unsupervised Discrimination of Person Names in Web Contexts
(Pedersen and Kulkarni) - Appears in the Proceedings of the Eighth
International Conference on Intelligent Text Processing and
Computational Linguistics, pp. 299-310, February 18-24, 2007, Mexico
City. [acceptance rate 29%]
Download the
data used in this paper (Kulkarni name corpus).
-
Discovering Identities in Web Contexts with Unsupervised Clustering
(Pedersen and Kulkarni) - Appears in the Proceedings of the
IJCAI-2007 Workshop on Analytics for Noisy Unstructured Text Data,
pp. 23-30, January 8, 2007, Hyderabad, India.
Download the
data used in this paper (Kulkarni name corpus).
2006
-
Determining Smoker Status using Supervised and Unsupervised Learning with
Lexical Features (Pedersen) - Appears in the Working Notes of the
i2b2 Workshop on Challenges in Natural Language Processing for Clinical
Data, Nov 10-11, 2006, Washington, DC.
-
A Comparative Study of Supervised Learning as Applied to Acronym
Expansion in Clinical Reports (Joshi, Pakhomov, Pedersen, and Chute)
- Appears in the Proceedings of the Annual Symposium of the American
Medical Informatics Association, pp. 399-403, Nov 11-16, 2006,
Washington, DC. [acceptance rate 41%]
[ppt]
-
Unsupervised Context Discrimination and Automatic Cluster Stopping
(Kulkarni and Pedersen), University of Minnesota Supercomputing Institute
Research Report UMSI 2006/90, August 2006. [Note: This is Anagha's MS
thesis, from July 2006.]
-
How many different "John Smiths", and who are they?
(Kulkarni and Pedersen) - Appears in the Proceedings
of the Twenty-First National Conference on Artificial Intelligence,
pp. 1885-1886, July 19, 2006, Boston, MA. (Student Poster)
-
Kernel Methods for Word Sense Disambiguation and Acronym Expansion
(Joshi, Pedersen, Maclin, and Pakhomov) - Appears in the Proceedings
of the Twenty-First National Conference on Artificial Intelligence,
pp. 1879-1880, July 19, 2006, Boston, MA. (Student Poster)
-
An End-to-End Supervised Target-Word Sense Disambiguation System
(Joshi, Pakhomov, Pedersen, Maclin, and Chute) - Appears in the
Proceedings of the Twenty-First National Conference on Artificial
Intelligence, pp. 1941-1942, July 19, 2006, Boston, MA. (Intelligent
System Demonstration)
-
Unsupervised Corpus Based Methods for WSD (Pedersen), In Agirre, E.
and Edmonds, P. (Editors), Word Sense
Disambiguation : Algorithms and Applications, June 2006, pp.
133-166, Springer.
-
Automatic Cluster Stopping with Criterion Functions and the Gap Statistic
(Pedersen and Kulkarni), Appears in the Proceedings of the
Demonstration Session of the Human Language Technology Conference and the
Sixth Annual Meeting of the North American Chapter of
the Association for Computational Linguistics, pp. 276-279, June 6,
2006, New York City.
-
Selecting the "Right" Number of Senses Based on Clustering Criterion Functions
(Pedersen and Kulkarni), Appears in the Proceedings of the Posters
and Demo Program of the Eleventh Conference of the European Chapter of
the Association for Computational Linguistics, pp. 111-114, April 5-7,
2006, Trento, Italy. [acceptance rate 40%]
-
Using WordNet Based Context Vectors to Estimate the Semantic Relatedness
of Concepts (Patwardhan and Pedersen) - Appears in the Proceedings
of the EACL 2006 Workshop Making Sense of Sense - Bringing Computational
Linguistics and Psycholinguistics Together, pp. 1-8, April 4, 2006,
Trento, Italy.
-
Improving Name Discrimination : A Language Salad Approach (Pedersen,
Kulkarni, Angheluta, Kozareva, and Solorio) - Appears in the Proceedings
of the EACL 2006 Workshop on Cross-Language Knowledge Induction,
pp. 25-32, April 3, 2006, Trento, Italy.
Download the Bulgarian, English, Spanish, and Romanian
data used in this paper!
-
An Unsupervised Language Independent Method of Name Discrimination Using
Second Order Co-occurrence Features (Pedersen, Kulkarni, Angheluta,
Kozareva, and Solorio) - Appears in the Proceedings of the Seventh
International Conference on Intelligent Text Processing and
Computational Linguistics,
pp. 208-222,
February 19-25, 2006, Mexico City. [acceptance rate 30%]
Download the Bulgarian, English, Spanish, and Romanian
data and
stoplists used in this paper.
2005
-
A Comparative Study of Support Vector Machines Applied to the Supervised
Word Sense Disambiguation Problem in the Medical Domain (Joshi,
Pedersen, and Maclin) - Appears in the Proceedings of the Second Indian
International Conference on Artificial Intelligence,
pp. 3449-3468, December 20-22, 2005,
Pune, India. [acceptance rate 35%]
-
Name Discrimination and Email Clustering using Unsupervised Clustering
and Labeling of Similar Contexts (Kulkarni
and Pedersen) - Appears in the Proceedings of the Second Indian
International Conference on Artificial Intelligence,
pp. 703-722, December 20-22, 2005,
Pune, India. [acceptance rate 35%] Download the data
used in this paper.
-
Abbreviation and Acronym Disambiguation in Clinical Discourse
(Pakhomov, Pedersen and Chute) - Appears in the Proceedings of the Annual
Symposium of the American Medical Informatics Association, pp. 589-593,
October 22-26, 2005, Washington, DC. [acceptance rate 37%]
-
Identifying Similar Words and Contexts in Natural Language with
SenseClusters (Pedersen and Kulkarni) - Appears in the Proceedings
of the Twentieth National Conference on Artificial Intelligence,
pp. 1694-1695,
July 12, 2005, Pittsburgh, PA. (Intelligent Systems Demonstration)
Download the data
used in this demo.
-
SenseRelate::TargetWord - A Generalized Framework for Word Sense
Disambiguation (Patwardhan, Banerjee, and Pedersen) - Appears in the
Proceedings of the Twentieth National Conference on Artificial
Intelligence, pp. 1692-1693, July 12, 2005, Pittsburgh, PA.
(Intelligent Systems Demonstration)
-
Word Alignment for Languages with Scarce Resources
(Martin, Mihalcea, and Pedersen) - Appears in the Proceedings of the
ACL Workshop on Building and Using Parallel Texts, pp. 65-74,
June 29-30, 2005, Ann Arbor, MI.
-
Unsupervised Discrimination and Labeling of Ambiguous Names
(Kulkarni) - Appears in the Proceedings of the Student Research Workshop
of the 43rd Annual Meeting of the Association for Computational
Linguistics, pp. 145-150, June 27, 2005, Ann Arbor, MI. [acceptance rate
28%] Download the data
used in this paper.
-
SenseClusters: Unsupervised Clustering and Labeling of Similar Contexts
(Kulkarni and Pedersen) - Appears in the Proceedings of the Demonstration
and Interactive Poster Session of the 43rd Annual Meeting of the
Association for Computational Linguistics, pp. 105-108, June 26, 2005,
Ann Arbor, MI. [acceptance rate 55%]
Download the
data
used in this paper.
-
SenseRelate::TargetWord - A Generalized Framework for Word Sense
Disambiguation
(Patwardhan, Banerjee, and Pedersen) - Appears in the Proceedings of the
Demonstration and Interactive Poster Session of the 43rd Annual Meeting
of the Association for Computational Linguistics, pp. 73-76, June 26,
2005, Ann Arbor, MI. [acceptance rate 55%]
-
Resolving Ambiguities in Biomedical Text with Unsupervised Clustering
Approaches
(Savova, Pedersen, Purandare and Kulkarni) - University of Minnesota
Supercomputing Institute Research Report UMSI 2005/80 and CB Number
2005/21, May.
-
Measures of Semantic Similarity and Relatedness in the Medical Domain
(Pedersen, Pakhomov, and Patwardhan) - University of Minnesota
Digital Technology Center Research Report DTC 2005/12, May. [This is a
preliminary version of the JBI 2007 article].
-
Maximizing Semantic Relatedness to Perform Word Sense Disambiguation
(Pedersen, Banerjee, and Patwardhan) - University of Minnesota
Supercomputing Institute Research Report UMSI 2005/25, March.
-
Name Discrimination by Clustering Similar Contexts (Pedersen, Purandare,
and Kulkarni) - Appears in the Proceedings of the Sixth International
Conference on Intelligent Text Processing and Computational
Linguistics, pp. 220-231, February 13-19, 2005, Mexico City. [acceptance
rate 37%]
Download the data
used in this paper.
2004
-
Improving Word Sense Discrimination with Gloss Augmented Feature Vectors
(Purandare and Pedersen) - Appears in the Proceedings of the Workshop on
Lexical Resources for the Web and Word Sense Disambiguation, pp. 123-130,
November 22, 2004, Puebla, Mexico.
-
Incorporating Ngram Statistics in the Normalization of Clinical
Notes (McInnes, Pakhomov, Pedersen and Chute) - Appears in
MEDINFO 2004 : Proceedings of the 11th World Congress on Medical
Informatics, p. 1882, September 2004, San Francisco, CA.
(Poster)
-
Word Sense Discrimination by Clustering Similar Contexts
(Purandare and Pedersen), University of Minnesota Supercomputing
Institute Research Report UMSI 2004/146, September 2004. [Note: This is
Amruta's MS thesis, from August 2004.]
-
Polysemy: Theoretical and Computational Approaches. By Yael Ravin and
Claudia Leacock. (Pedersen) - Appears in Minds and Machines, Volume
14, Number 3, pp. 419-423. (Book Review)
-
Discriminating Among Word Meanings by Identifying Similar Contexts
(Purandare and Pedersen) - Appears in the Proceedings of the Nineteenth
National Conference on Artificial Intelligence (AAAI-04), pp. 964-965,
July 25-29, 2004, San Jose, CA (Student Abstract)
[ppt]
-
SenseClusters - Finding Clusters that Represent Word Senses
(Purandare and Pedersen) - Appears in the Proceedings of the Nineteenth
National Conference on Artificial Intelligence (AAAI-04), pp. 1030-1031,
July 25-29, 2004, San Jose, CA (Intelligent Systems Demonstration)
-
WordNet::Similarity - Measuring the Relatedness of Concepts
(Pedersen, Patwardhan, and Michelizzi) - Appears in the Proceedings of the
Nineteenth National Conference on Artificial Intelligence (AAAI-04),
pp. 1024-1025, July 25-29, 2004, San Jose, CA (Intelligent Systems Demonstration)
-
The Senseval-3 Multilingual English-Hindi lexical sample task
(Chklovski, Mihalcea, Pedersen, and Purandare) - Appears in the
Proceedings of the Third International Workshop on the Evaluation of
Systems for the Semantic Analysis of Text (Senseval-3),
pp. 5-8, July 25-26, 2004, Barcelona, Spain.
-
The Duluth Lexical Sample Systems in Senseval-3
(Pedersen) - Appears in the Proceedings of the Third International
Workshop on the Evaluation of Systems for the Semantic Analysis of Text
(Senseval-3),
pp. 203-208,
July 25-26, 2004, Barcelona, Spain.
-
Complementarity of Lexical and Simple Syntactic Features: The Syntalex
Approach to Senseval-3
(Mohammad and Pedersen) - Appears in the Proceedings of the
Third International Workshop on the Evaluation of Systems
for the Semantic Analysis of Text (Senseval-3),
pp. 159-162, July 25-26, 2004, Barcelona, Spain.
-
Word Sense Discrimination by Clustering Contexts in Vector and Similarity
Spaces (Purandare and Pedersen) - Appears in the Proceedings of the
Conference on Computational Natural Language Learning (CoNLL),
pp. 41-48, May 6-7, 2004, Boston, MA. [acceptance rate 48%]
-
Combining Lexical and Syntactic Features for Supervised Word Sense
Disambiguation (Mohammad and Pedersen) - Appears in the Proceedings
of the Conference on Computational Natural Language Learning (CoNLL),
pp. 225-32, May 6-7, 2004, Boston, MA. [acceptance rate 48%]
-
SenseClusters - Finding Clusters that Represent Word Senses
(Purandare and Pedersen) - Appears in the Proceedings
of the Fifth Annual Meeting of the North American Chapter of the
Association for Computational Linguistics (NAACL-04),
pp. 26-29, May 3-5, 2004, Boston, MA. (Demonstration System)
-
WordNet::Similarity - Measuring the Relatedness of Concepts
(Pedersen, Patwardhan, and Michelizzi) - Appears in the Proceedings
of the Fifth Annual Meeting of the North American Chapter of the
Association for Computational Linguistics (NAACL-04),
pp. 38-41, May 3-5, 2004, Boston, MA. (Demonstration System)
2003
-
An Evaluation Exercise for Word Alignment
(Mihalcea and Pedersen) - Appears in the Proceedings of the Workshop on
Building and Using Parallel Texts: Data Driven Machine Translation and
Beyond, pp. 1-10, May 31, 2003, Edmonton, Canada.
Also available in postscript
-
The Duluth Word Alignment System
(Thomson-McInnes and Pedersen) - Appears in the Proceedings of the
Workshop on Building and Using Parallel Texts: Data Driven Machine
Translation and Beyond, pp. 40-43, May 31, 2003, Edmonton, Canada.
Also available in postscript
-
Guaranteed Pre-tagging for the Brill Tagger
(Mohammad and Pedersen) - Appears in the Proceedings of the Fourth
International Conference on Intelligent Text Processing and Computational
Linguistics, pp. 148-157,
February 17-21, 2003, Mexico City. [acceptance rate 46%]
Also available in postscript
2002
2001
-
Materials from the EM algorithm Panel Discussion at EMNLP-01, June
2001, in Pittsburgh, PA.
The EM Algorithm : Selected Readings
is a short literature review.
I've also posted my slides as
powerpoint or
handouts
from my short introduction to
EM. It works through a simple example of EM for a multinomial
distribution with hidden data.
-
Lexical Semantic Ambiguity Resolution with Bigram Based Decision Trees
(Pedersen) - Appears in the Proceedings of the Second
International Conference on Intelligent
Text Processing and Computational Linguistics
(CICLING-01),
pp. 157-168,
February 18-24, 2001, Mexico City. [acceptance rate 57%] [This is a
preliminary version of the NAACL 2001 paper.]
2000
1999
1998
-
Knowledge Lean Word Sense Disambiguation
(Pedersen & Bruce) - Appears in the Proceedings of the Fifteenth
National Conference on Artificial Intelligence (AAAI-98), pp. 800-805,
July 28-30, 1998, Madison, WI [acceptance rate 30%]
-
Raw Corpus Word Sense Disambiguation
(Pedersen) - Appears in the Proceedings of the Fifteenth
National Conference on Artificial Intelligence (AAAI-98),
p. 1198, July 28-30, 1998, Madison, WI (Student Poster)
-
Dependent Bigram Identification
(Pedersen) - Appears in the Proceedings of the Fifteenth
National Conference on Artificial Intelligence (AAAI-98),
p. 1197,
July 28-30,
1998, Madison, WI (Student Poster)
1997
-
Distinguishing Word Senses in Untagged Text
(Pedersen & Bruce) - Appears in the Proceedings of the Second
Conference on Empirical Methods in Natural Language Processing
(EMNLP-2),
pp. 197-207,
August 1-2, 1997, Providence, RI. [acceptance rate 35%]
(Also available from CMP-LG E-Print Archive as #9706008)
-
Naive Mixes for Word Sense Disambiguation
(Pedersen) - Appears in the Proceedings of the Fourteenth
National Conference on Artificial Intelligence (AAAI-97),
p. 841, July 27-31, 1997, Providence, RI (Student Poster)
-
Knowledge Lean Word Sense Disambiguation
(Pedersen) - Appears in the Proceedings of the Fourteenth
National Conference on Artificial Intelligence (AAAI-97),
p. 814, July 27-31, 1997, Providence, RI (Doctoral Consortium)
1996
-
Fishing for Exactness
(Pedersen) - Appears in the Proceedings of the South-Central
SAS Users Group Conference (SCSUG-96),
pp. 188-200,
Oct 27-29, 1996,
Austin, TX (Also available from CMP-LG E-Print Archive as #9608010)
-
Significant Lexical Relationships
(Pedersen, Kayaalp, & Bruce) - Appears in the Proceedings of the
Thirteenth National Conference on Artificial Intelligence (AAAI-96),
pp. 455-460, August 4-8, 1996, Portland, OR. [acceptance rate 30%]
-
The Measure of a Model
(Bruce, Wiebe, & Pedersen) - Appears in the Proceedings of the
Conference on Empirical Methods in Natural Language Processing,
pp. 101-112, May 17-18, 1996, Philadelphia, PA. [acceptance rate 30%]
(Also available from CMP-LG E-Print Archive as #9604018)
1995
-
Lexical Acquisition via Constraint Solving
(Pedersen & Chen) - Appears in the Working Notes of the
AAAI Spring Symposium on Representation and Acquisition
of Lexical Knowledge, pp. 118-122, March 27-29, 1995, Palo Alto, CA
(Also available from CMP-LG E-Print Archive as
#9502028)
By:
Ted Pedersen
- tpederse AT d umn edu