I want to do cluster analysis on text files. (Any time you can calculate a distance metric between the elements of a set, you can perform clustering on that set.)
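
To make that concrete, here is a minimal sketch in Python using only the standard library. It is not any particular module's method: `distance` and `cluster` are hypothetical names invented for this example, the strings stand in for file contents, and the 0.3 cutoff is an arbitrary assumption. The distance is `1 - difflib.SequenceMatcher.ratio()`, which is a handy dissimilarity score but not guaranteed to be a true metric (the triangle inequality may not hold). The clustering is plain single-linkage with a distance cutoff, which amounts to finding connected components among "close enough" pairs.

```python
from difflib import SequenceMatcher
from itertools import combinations


def distance(a: str, b: str) -> float:
    """Dissimilarity in [0, 1]: 0.0 for identical strings, 1.0 for nothing in common."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def cluster(texts, threshold=0.3, dist=distance):
    """Single-linkage clustering with a cutoff (threshold is an assumed value).

    Any two texts whose distance is below `threshold` end up in the same
    cluster. Returns a sorted list of clusters, each a list of indices
    into `texts`.
    """
    parent = list(range(len(texts)))  # union-find forest, one node per text

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair once: O(n^2) distance computations.
    for i, j in combinations(range(len(texts)), 2):
        if dist(texts[i], texts[j]) < threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(texts)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    docs = [
        "apple pie recipe",
        "apple pie recipes",
        "gnu linux kernel",
        "gnu linux kernels",
    ]
    for members in cluster(docs):
        print([docs[i] for i in members])
```

The `dist` parameter is the point of the exercise: swap in any other function that maps two elements to a number and the clustering code does not change. The all-pairs loop is also why this approach gets slow as the number of files grows.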

modules that do both metric+cluster

modules that just do string metrics

modules that just do clustering, and work with any metric algorithm

algorithms on Wikipedia

similar fields

How do we do text-file clustering, but FASTER? Random ideas: