Google acquires ReCaptcha as book-scanning aid

By Tom Krazit, CNET News.com
Thursday, September 17, 2009 10:34 AM

Google has acquired ReCaptcha, one of the companies behind the distorted text boxes at the bottom of many Web site sign-in pages.

Terms of the deal were not disclosed, but Google plans to use ReCaptcha's technology both as a security measure within certain Google sites and to make its massive book-scanning project a little smarter, the company said in a blog post. ReCaptcha is an offshoot of Carnegie Mellon University's School of Computer Science, and puts a twist on the traditional captcha: a string of letters in squiggly text meant to confuse spam bots and other nonhuman Web pests.

The idea behind a captcha is to confuse a computer, but computers are also confused by some words written in fonts used long ago. ReCaptcha presents two words: one is a captcha whose answer it already knows, and the other is a word it doesn't know. The thinking is that if you get the first word right, you're likely a human and you're also probably going to get the second one right.

ReCaptcha can then pool all the answers for the second word and declare with reasonable certainty that the word is what most people think it is, thereby updating the vocabulary of participating book scanners. This is of obvious interest to Google, which is currently bent on scanning as many books as it can find.
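
The two-word scheme described above can be sketched in a few lines of Python. This is an illustration only, not ReCaptcha's actual code: the function names, the three-vote minimum, and the 60 percent agreement threshold are all assumptions made for the example.

```python
from collections import Counter


def record_answer(votes, control_answer, control_guess, unknown_guess):
    """Count the guess for the unknown word only if the known word was typed correctly."""
    if control_guess.strip().lower() == control_answer.strip().lower():
        votes[unknown_guess.strip().lower()] += 1
        return True   # passed the known word, so treated as human
    return False      # failed the known word, so the other guess is discarded


def consensus(votes, min_votes=3, min_share=0.6):
    """Return the most common guess once enough people agree, else None.

    min_votes and min_share are hypothetical thresholds chosen for this sketch.
    """
    total = sum(votes.values())
    if total < min_votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count / total >= min_share else None


votes = Counter()
record_answer(votes, "morning", "morning", "upon")
record_answer(votes, "morning", "Morning", "upon")
record_answer(votes, "morning", "mourning", "spam")  # wrong known word, ignored
record_answer(votes, "morning", "morning", "upon")
print(consensus(votes))  # prints "upon"
```

In this sketch the guess from the visitor who mistyped the known word never reaches the tally, which mirrors the reasoning that only people who pass the first word are trusted on the second.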

This article was first published as a blog post on CNET News.


