I had no idea why Google would buy reCAPTCHA, a company that does captchas. For those of you who don’t know, captchas are the little squiggly bits of text that people type in to prove they are human. The word “captcha” actually stands for Completely Automated Public Turing test to tell Computers and Humans Apart.
With these things all over the internet, why would Google buy this specific company? I found out a few reasons. First, they are the original gangsters: it turns out the guy who invented captchas is the founder of reCAPTCHA. Second, the way they do the captcha words is quite innovative. Check this out: reCAPTCHA takes scans of news clippings, articles and old books that machines can’t read reliably (because they are scans), then feeds them to humans in a captcha one word at a time, each paired with another word that it already knows. The user enters both words. The word that reCAPTCHA knows is the test: if the user gets it right, reCAPTCHA trusts their answer for the other word too, and it now has an additional word to use in other challenges. This is how they build up their database of words from scans.
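To make the two-word trick concrete, here is a minimal Python sketch of the idea. It is a toy model, not reCAPTCHA's actual code: the class name `WordPairChallenge`, the image ids, and the `votes_needed` threshold are all made up for illustration. The description above has a word learned from a single trusted answer; the threshold is an added assumption so that one typo doesn't get stored, and setting it to 1 mirrors the description exactly.

```python
from collections import Counter, defaultdict


class WordPairChallenge:
    """Toy model of the two-word scheme (hypothetical, not reCAPTCHA's real code)."""

    def __init__(self, known_words, votes_needed=2):
        # image id -> verified text; these are the words the system can check
        self.known = dict(known_words)
        # unknown image id -> tally of guesses from users who passed the check
        self.guesses = defaultdict(Counter)
        # assumed agreement threshold; 1 means "trust a single answer"
        self.votes_needed = votes_needed

    def submit(self, control_id, control_answer, unknown_id, unknown_answer):
        """Return True if the user passes; record their guess for the unknown word."""
        if control_answer.strip().lower() != self.known[control_id].lower():
            # Failed the word we can verify, so the other guess is ignored.
            return False
        guess = unknown_answer.strip().lower()
        self.guesses[unknown_id][guess] += 1
        # Once enough users agree, the unknown word becomes a known word
        # that can be used to test future users.
        if self.guesses[unknown_id][guess] >= self.votes_needed:
            self.known[unknown_id] = guess
            del self.guesses[unknown_id]
        return True
```

A caller would create `WordPairChallenge({"img1": "morning"})` and call `submit("img1", "morning", "img2", "harbor")` each time a user answers. Once enough users agree on the text for `img2`, it moves into `known` and can serve as the test word in later challenges.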
Google has been scanning books like crazy for the past 6 years. They have millions of books scanned. What they don’t have is the text of those books in a form that can be searched. The thought is that if you use captchas to surface the words of those books one at a time, you get a massive crowdsourcing project that builds a database of literature. Very interesting experiment. I rarely hear of such clever business development deals. I love it.