Google backs character-recognition research

By Caroline McCarthy, CNET News.com
Thursday, April 12, 2007 07:22 AM

Google is sponsoring an artificial-intelligence research group's work to develop advanced technologies for character recognition.

The open-source project, called Ocropus, has several goals, including developing a high-level, easy-to-use handwriting recognition system that can convert handwritten documents to computer text, assisting in the creation of electronic libraries, analyzing historical documents and helping vision-impaired people access information. The "ocr" in Ocropus stands for optical character recognition.

The project is headquartered at the Image Understanding and Pattern Recognition (IUPR) research group at the German Research Center for Artificial Intelligence (DFKI) in Kaiserslautern, Germany. DFKI Professor Thomas Breuel is leading the project.

Breuel made the announcement on Tuesday through a post on the Google Code blog. In addition to Google's sponsorship, Ocropus is getting funds from several German government agencies and other public and private entities.

The Ocropus team expects the project to last three years, and it will support three Ph.D. students or postdoctoral researchers. IUPR is basing the software primarily on two research projects: a handwriting recognition system developed in the mid-1990s for use by the U.S. Census Bureau, and newer layout analysis methods for character recognition.

Other resources include Tesseract, a decades-old engine for optical character recognition originally developed by Hewlett-Packard Labs and re-released by Google last year as an open-source system.

A preview of the Ocropus system is available on the project's Web site under an Apache license, and the IUPR is soliciting open-source contributions in order to complete a number of goals. These include creating a desktop application for the system, adding third-party tools and adapting Ocropus to a variety of languages. It's currently English-only.

