paperlined.org
document updated 11 years ago, on May 12, 2012

I was using this to try to cluster text files, averaging about 10 KB each, to find the ones that are closely related. Clustering 10 files took about a minute. With 100 files, I gave up waiting. I hadn't realized clustering consumed so much CPU.
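One possible explanation for the jump, as a back-of-the-envelope sketch: many clustering approaches compare every item against every other item, so the work grows with the number of pairs rather than the number of files. Whether the module used here actually works that way is an assumption, not something verified, and the function names below are made up for illustration. The only measured figure is the roughly one minute for 10 files.

```python
def pair_count(n):
    """Number of unordered pairs among n files: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def projected_minutes(n, base_n=10, base_minutes=1.0):
    """Scale the measured time (about 1 minute for 10 files) by the growth
    in pair count. ASSUMPTION: run time is proportional to the number of
    pairwise comparisons, which is a guess about the module's behavior."""
    return base_minutes * pair_count(n) / pair_count(base_n)


for n in (10, 100):
    # prints: file count, pair count, projected minutes
    print(n, pair_count(n), round(projected_minutes(n)))
```

Under that assumption, 100 files means 4950 pairs instead of 45, or about 110 minutes. If the algorithm does more than one pass over the pairs, the real figure would be higher still.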

Some ways to make this process faster that might be worth exploring (caveat: I don't know much about clustering):