Text::Similarity

Text::Similarity is a Perl module that measures the similarity of two files or two strings based on the number of overlapping (shared) words, scaled by the lengths of the inputs. It computes the F-Measure, the Dice Coefficient, the Cosine, and the Lesk measure.
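
To make the idea of overlap scaled by length concrete, here is a minimal Python sketch. It is an illustration only, not the module's own code or API. The function names `tokenize` and `overlap_scores` are hypothetical. The overlap is assumed to be a simple count of shared word tokens, and the formulas are the usual overlap-based definitions of precision, recall, F-Measure, Dice, and Cosine. The Lesk measure is left out because it depends on how shared phrases are matched, which this sketch does not attempt.

```python
from collections import Counter
from math import sqrt


def tokenize(text):
    # Assumption: lowercase and split on whitespace; real tokenization may differ.
    return text.lower().split()


def overlap_scores(text1, text2):
    """Return overlap-based similarity scores for two strings (illustrative sketch)."""
    words1 = tokenize(text1)
    words2 = tokenize(text2)

    # Shared words, counted with multiplicity (multiset intersection).
    overlap = sum((Counter(words1) & Counter(words2)).values())

    if overlap == 0:
        # Also covers empty inputs, so no division by zero below.
        return {"raw": 0, "precision": 0.0, "recall": 0.0,
                "F": 0.0, "dice": 0.0, "cosine": 0.0}

    precision = overlap / len(words2)  # overlap scaled by length of second input
    recall = overlap / len(words1)     # overlap scaled by length of first input
    f_measure = 2 * precision * recall / (precision + recall)
    dice = 2 * overlap / (len(words1) + len(words2))
    cosine = overlap / sqrt(len(words1) * len(words2))

    return {"raw": overlap, "precision": precision, "recall": recall,
            "F": f_measure, "dice": dice, "cosine": cosine}


if __name__ == "__main__":
    scores = overlap_scores("the cat sat on the mat", "the cat lay on the rug")
    for name, value in scores.items():
        print(name, value)
```

Under these definitions, the F-Measure and the Dice Coefficient always give the same value, and when both inputs have the same length the Cosine matches them too.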

We have mailing lists for users, news, and developers.

Download the current version (v0.10, released June 26, 2013) from CPAN or SourceForge.

Documentation

See the README and CHANGES files. Browse the current CVS version.

Text-Similarity Development Team

Acknowledgments

The development of Text-Similarity has been supported by a National Science Foundation Faculty Early Career Development (CAREER) Program award (#0092784, 2001-2007), by a Grant in Aid of Research, Artistry and Scholarship from the Graduate School of the University of Minnesota (2003-2004), and by the Digital Technology Initiative of the Digital Technology Center of the University of Minnesota (2004-2005).

By: Ted Pedersen - tpederse AT d umn edu