A database I am working on has a lot of records (40M+), and the guys mining it currently use Python to process it. This is very powerful (NLTK) but also quite slow. One of the guys has found that you can do fuzzy string matching directly in Postgres and