The human factor in Google translations

Summary

The tech giant is readying a service to help people get documents translated. Might the service also help train Google's machine translation tech?

Google looks set to launch a beta test of a document translation service, a new move in the company's efforts to break down language barriers.

With the service, the company will connect people who need documents translated with humans who will be paid to do so, according to the Google Translation Center information page. The site was spotted by sharp eyes at the Google Blogoscoped blog.

"Google Translation Center is the fast and easy way to get translations for your content. Simply upload your document, choose your translation language, and choose from our registry of professional and volunteer translators. If a translator accepts, you should receive your translated content back as soon as it's ready," the site said.

Google prefers to rely on computer algorithms rather than humans, so at first glance the Google Translation Center looks somewhat anomalous, even though Google is only playing a middleman role. But it is possible that the human translators might be gradually improving Google's machine translation technology as they work, in effect helping to put themselves out of a job.

That is because Google's translation system uses a statistical model that works better the more it can compare the same text in two different languages. And Google evidently will track translation work in its database; according to the center's introduction for translators, "our translation search feature matches your current translation with previous translations, so you don't have to translate over and over again".
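The "translation search" feature quoted above resembles what translators commonly call a translation memory: a store of past segment translations that is searched for close matches. The following is a minimal sketch of that idea, not Google's implementation; the `MEMORY` contents, the `suggest` function and the 0.8 similarity threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical store of previously translated segments (source -> target).
MEMORY = {
    "Click the Save button.": "Haga clic en el botón Guardar.",
    "Your file has been uploaded.": "Su archivo se ha subido.",
}

def suggest(segment, memory, threshold=0.8):
    """Return (stored_source, stored_translation, similarity) for the
    closest stored segment, or None if nothing reaches the threshold."""
    best = None
    for source, target in memory.items():
        # Character-level similarity in [0, 1]; case is ignored.
        ratio = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if ratio >= threshold and (best is None or ratio > best[2]):
            best = (source, target, ratio)
    return best

if __name__ == "__main__":
    print(suggest("Click the Save button", MEMORY))
```

A near-identical segment gets the earlier translation back as a suggestion, while an unrelated segment returns `None` and would have to be translated from scratch.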

Google is fervently interested in better machine translation. With it, the company can use its search technology to link people with data around the world, regardless of language barriers, making its search engine significantly more powerful.

Wanted: More Rosetta Stones
Google's translation technique essentially relies on having as many Rosetta Stone-like documents as possible. The more documents it has in two languages, the better able it is to match words and phrases from one language to another, according to a recent speech by Jeff Dean, a Google fellow who works on Google's computing infrastructure.

"By computing statistics over all words and phrases, you...get a model of word-by-word and phrase-by-phrase replacements," Dean said. Machine translation often produces awkward results today, but "the impact of having a really large language model makes the sentences flow a lot more easily".
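To make the idea of "computing statistics over all words" concrete, here is a toy sketch that counts how often words co-occur in aligned sentence pairs and scores each pairing with the Dice coefficient. It is an illustration under stated assumptions, not the model Dean describes: the six-sentence English/Spanish corpus and the function names are hypothetical, it handles single words only, and it has no language model.

```python
from collections import Counter
from itertools import product

# Hypothetical toy parallel corpus (English, Spanish).
PARALLEL = [
    ("the house", "la casa"),
    ("the green house", "la casa verde"),
    ("a house", "una casa"),
    ("the book", "el libro"),
    ("the green book", "el libro verde"),
    ("a book", "un libro"),
]

def build_dice_table(pairs):
    """Score every (source word, target word) pairing seen in the same
    sentence pair with the Dice coefficient 2*joint / (count_s + count_t)."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)
        tgt_count.update(t_words)
        joint.update(product(s_words, t_words))
    return {
        (s, t): 2 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in joint.items()
    }

def best_translation(word, table):
    """Return the highest-scoring target word for `word`, or None if unseen."""
    candidates = {t: score for (s, t), score in table.items() if s == word}
    return max(candidates, key=candidates.get) if candidates else None

if __name__ == "__main__":
    table = build_dice_table(PARALLEL)
    for w in ("house", "green", "book"):
        print(w, "->", best_translation(w, table))
```

Even in this tiny corpus, "house" pairs more strongly with "casa" than with "la" only because a sentence with a different article is present. That is a small-scale version of why more bilingual text helps.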

The screenshot below, from Google, shows the online interface a Google translator apparently will see. It shows text in two languages, with the passage broken down into chunks of text. It also suggests a previous translation of one chunk, offering a "use suggestion" button to employ it. It is not clear if the previous translation draws just on that individual translator's work or a larger collection.

Based on the Bilingual Evaluation Understudy (BLEU) method for rating translation accuracy, Google scored first place in a 2005 evaluation by the National Institute of Standards and Technology.
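For readers unfamiliar with the metric, the sketch below shows the core of a BLEU-style score for one candidate sentence against one reference: clipped n-gram precision, combined by geometric mean and scaled by a brevity penalty. It is a simplified, single-reference illustration with no smoothing, and it is not the scoring code used in the evaluation; the function names and example sentences are assumptions.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU with a single reference."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each n-gram count by how often it appears in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(clipped / total))
    # Penalize candidates that are shorter than the reference.
    brevity = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)

if __name__ == "__main__":
    print(bleu("the cat sat", "the cat sat down", max_n=2))
```

In the usage example every word and word pair of the candidate appears in the reference, so the score is reduced only by the brevity penalty for the missing final word.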

Google was mum about the project. "We're always looking at new ways of providing tools for users to connect with each other, share information, and improve access to information on the internet, but we don't have any new details to share at this time," the company said in a statement.

Paying the middleman
It is a time-tested business to be the middleman who connects those offering a product or service with customers willing to pay for it, but the Internet has taken the role to new heights by more easily enabling that process on a national and sometimes global scale. For example, Amazon.com's Mechanical Turk, Serebra Connect, and Elance can help companies that need tasks done find people who can do them.

But the Google Translation Center seems to have a different approach. Translators get access to free Google tools, and it appears Google is not involved in any payment transactions, according to the site.

"Google Translation Center provides a venue for you to enter into and complete translation transactions. Except when you use Google Translation Center as provided in Section 4, Google is not involved in any transactions in Google Translation Center. Your interaction with any third party participant(s) or user(s) within Google Translation Center, including payment and delivery of goods and services...are solely between you and such third party participant(s) or user(s) and Google is not involved in such dealings," according to the terms of service. Section 4, titled "Google Participation", says just that "Google and/or its subsidiaries and affiliates may use Google Translation Center from time to time".

So what is in it for Google?
Of course, Google has a strong search-ad business that it uses to subsidize any number of efforts that may not be profitable for years, if indeed ever. After all, Google's mission is "to organize the world's information and make it universally accessible and useful".

But even if Google does not charge a percentage, improving automated translation could be a powerful incentive as Google tries to keep its core product, the search engine, competitive.

Google's translation technology is available through the Google Translate site, but the company also has technology called Cross Language Information Retrieval (CLIR) that builds translation into its search engine.

Search increasingly is the gateway by which people discover what is on the Internet, so building automated two-way translation into the process could open up the very parts of the Internet that today are available but effectively hidden by language barriers.

CLIR can translate a search query into a foreign tongue, then translate the answer back into the search results. Clicking a link produces the translated version of that page.

For example, a search in Russian for Tony Blair's biography will present an option, in Russian and presented at the bottom of the search results page, to search pages written in English. Clicking on a link then translates the English page into Russian.
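The round trip described above (translate the query, search pages in the other language, translate the hits back) can be sketched as a three-step pipeline. Everything here is a stand-in chosen for illustration: the word-for-word Spanish/English dictionaries replace a real translation system, `ENGLISH_PAGES` replaces a search index, and none of the names reflect Google's actual CLIR design.

```python
# Toy word-for-word dictionary standing in for a real translation system.
ES_TO_EN = {"biografía": "biography", "de": "of", "tony": "tony", "blair": "blair"}
EN_TO_ES = {en: es for es, en in ES_TO_EN.items()}

# Hypothetical English-language "index": just a list of page texts.
ENGLISH_PAGES = ["biography of tony blair", "history of london"]

def word_for_word(text, table):
    """Translate each lowercased word via the table; unknown words pass through."""
    return " ".join(table.get(w, w) for w in text.lower().split())

def search_english(query):
    """Return pages that contain every query term."""
    terms = set(query.split())
    return [page for page in ENGLISH_PAGES if terms <= set(page.split())]

def cross_language_search(query):
    english_query = word_for_word(query, ES_TO_EN)       # 1. translate the query
    hits = search_english(english_query)                 # 2. search foreign-language pages
    return [word_for_word(h, EN_TO_ES) for h in hits]    # 3. translate the results back

if __name__ == "__main__":
    print(cross_language_search("biografía de Tony Blair"))
```

The user never sees the English query or the English pages, only the translated results, which is the point of building translation into search rather than offering it as a separate step.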

Google executives have given indications recently about just how grand the company's ambitions are for automated language translation. The company wants speakers of any major language to be able to understand any other.

"We will eventually do 100 by 100 languages, to take this set of languages and convert to another," Google Chief Executive Eric Schmidt said in a June talk. "That alone will have a phenomenal impact on an open society," he said, a reference to concerns many have expressed about Google's censored search results in countries such as China.

This article was first published as a blog on CNET News.com.
