Five more languages on translate.google.com – Google Translate Blog

At Google, we are always trying to make information more accessible, whether by adding auto-captioning on YouTube and virtual keyboards to search or by providing free translation of text, websites and documents with Google Translate. In 2009, we announced the addition of our first “alpha” language, Persian, on Google Translate. Today, we are excited to add five more alpha languages: Azerbaijani, Armenian, Basque, Urdu and Georgian — bringing the total number of languages on Google Translate to 57.

These languages are available while still in alpha status. You can expect translations to be less fluent than for our other languages, but they should still help you understand the multilingual web. We are working hard to “graduate” these new languages out of alpha status, just as we did some time ago with Persian. You can help us improve translation quality as well. If you notice an incorrect translation, we invite you to click “Contribute a better translation”. If you are a translator, you can contribute translation memories with the Translator Toolkit. This helps us build better machine translation systems, especially for languages that are not well represented on the web.

Collectively, Armenian, Azerbaijani, Basque, Georgian and Urdu have roughly 100 million speakers. We hope that these speakers can now more easily access the entire multilingual web in their own language. Try translating these and other languages at translate.google.com. Here are some phrases from the new alpha languages to get you started:

Baietz lehenengoan

میں خوش قسمت محسوس کر رہا ہوں

բախտաւոր եմ զգում

Mən şanslıyam

იღბალს მივენდობი

via Five more languages on translate.google.com – Google Translate Blog.

Chinese, Japanese or Korean Documents Have You Perplexed?

中文,日语或朝鲜语文件有你困惑?

あなたは困惑し、中国語、日本語、韓国語のドキュメントがありますか?

당신은 당황하게 중국어, 일본어 또는 한국어 문서가 있나요?

Documents containing Chinese, Japanese and Korean (“CJK”) languages and character sets have become intertwined with many different legal matters, ranging from international arbitration to intellectual property litigation to administrative investigations. However, the solutions typically used to manage CJK documents have not kept pace with demand and remain slow, cumbersome and expensive. Most firms, corporations and vendors rely on automated machine translation or certified document translation to understand CJK documents, with the former often producing gibberish and the latter often resulting in extremely high costs for the end client.

Asia Legal Technologies – a joint venture between Global EDD Group and Data Management Corporation – provides innovative custom solutions to clients with CJK document collections.  Each solution is designed to be efficient in both time and cost while leveraging specialized technology, knowledge and human resources to provide multi-lingual services.

  • Data Collection & Preservation
  • Scanning / OCR
  • E-Discovery Processing
  • Automated Language Identification
  • Standard Coding (Chinese, Japanese, Korean to English)
  • Document Summaries (Chinese, Japanese, Korean to English)
  • Translation (Machine, Hybrid, Certified)
  • Document Review

Documents containing East Asian languages such as Chinese, Japanese or Korean no longer need to be a perplexing problem with complicated, expensive solutions. To learn more about the multi-lingual services of Asia Legal Technologies, visit AsiaLegalTech.com or email information@asialegaltech.com.

More love for our multilingual Toolbar users – Google Translate Blog

Last July we enabled automatic page translations in Google Toolbar and we’ve been thrilled by the positive response. Today, we’re taking another step to make automatic translation easier. Now, if Google Toolbar’s default language is set to one of our supported languages, you can use our new Word Translator feature to hover over a word with your mouse and get an automatic instant translation. If you want Toolbar to translate into a different language, you can change it in the Toolbar Options menu.

Entire page translations are great if you have little knowledge of a given language. However, if you’re a multi-lingual user who just needs certain words translated, hovering is a lot quicker than searching word-by-word on Google Translate.

The new Word Translator feature is available for Internet Explorer and Firefox. And if you use Google Chrome, automatic page translation is already built in, and we're working to build more Translate features.

We hope this helps you browse pages in non-native languages faster, regardless of your language proficiency. Install the latest Toolbar version and give it a try!

via More love for our multilingual Toolbar users – Google Translate Blog.

A brabhsálaí gréasáin ilteangach (or, a multilingual web browser) | Official Google Blog

Since announcing the latest Google Chrome beta earlier this month, we’ve been excited to receive feedback from our beta users on the browser’s new translation and privacy features. Today, we’re introducing these features in the stable channel, so that they’re widely available to everyone who uses Google Chrome on Windows.

Google Chrome’s translation feature is the latest step in the evolution of translation tools across Google. Just a few years ago, Google’s translation tools consisted of a site where you had to copy and paste text into a box, and it only worked for a handful of languages. Today, our translation technology works across 52 languages and can automatically detect and translate entire websites in less than a second. Chrome’s translation feature automatically detects if the language of the webpage you’re on is different from your preferred language setting. The browser will then display a prompt asking if you’d like the page to be translated using Google Translate. With one click, you can instantly translate the page, and all of its text will appear in your preferred language.

via Official Google Blog: A brabhsálaí gréasáin ilteangach (or, a multilingual web browser).

A Primer on Foreign Language E-Discovery | THE REVENUE HERALD

While e-discovery may be Greek to many, it is those documents written in Chinese, Japanese, Korean and Russian that cause much of the trouble. These “multi-byte” languages have far more characters than the 26 letters and handful of punctuation marks that Latin-alphabet languages like English, Spanish, French and German need. In fact, the number of Chinese characters included in the Kangxi dictionary is over 47,000 (though only 3,000 to 4,000 are reportedly necessary for full literacy). The impact on e-discovery is significant considering the increased sophistication necessary for case evaluation.

At the most basic level, computers think in ones and zeros, with each one or zero being a bit. Eight bits make a byte. A single byte can hold 256 different values (2 to the 8th power). For languages that are not based solely on letters, i.e., those where symbols represent a concept or a syllable, you need to add bytes: two bytes give 256 x 256, which equals 65,536. That is the essence of multi-byte vs. single-byte languages: single-byte languages have 256 possible combinations, while two-byte languages have 65,536.
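The byte arithmetic can be checked directly. The short Python sketch below is an illustration only: it computes how many values one and two bytes can hold, then shows how many bytes a few sample characters occupy in two common Unicode encodings. The helper name `encoded_lengths` and the sample characters are assumptions chosen for this example, not anything taken from the article.

```python
# Number of distinct values representable in one and two bytes.
one_byte = 2 ** 8      # 8 bits
two_bytes = 2 ** 16    # 16 bits, the same as 256 * 256


def encoded_lengths(ch):
    """Return (UTF-8 byte count, UTF-16 byte count) for a single character."""
    # "utf-16-le" is used so that no byte-order mark is added to the count.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


# Hypothetical sample: a Latin letter, a Cyrillic letter, a Chinese
# ideograph and a Korean syllable.
for ch in ["A", "я", "中", "한"]:
    utf8_len, utf16_len = encoded_lengths(ch)
    print(ch, "UTF-8 bytes:", utf8_len, "UTF-16 bytes:", utf16_len)

print("one byte:", one_byte, "values; two bytes:", two_bytes, "values")
```

Note that the byte count depends on the encoding as well as the character, which is why the same document can change size when it is converted.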

Confused? Then let’s address encodings. An encoding is the programmatic mapping between the bytes stored on disk and the characters you see on the screen. The problem arises when multiple encodings are in play. For example, when an Outlook 2000 e-mail file (PST format) created under a Japanese operating system is moved to an English-language machine for review, there will be problems because the native Japanese data is read with the wrong encoding and appears corrupted.
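A simplified way to see this kind of mismatch, leaving the PST format aside entirely, is to encode a Japanese string with a legacy Japanese encoding and then decode the same bytes as if they were Western European text. The sketch below assumes Shift_JIS on the Japanese side and Windows-1252 on the English side. Both are plausible but hypothetical choices for illustration; the article does not name specific encodings.

```python
# A short Japanese string ("contract document"), chosen for illustration.
original = "契約書"

# Bytes as a Japanese-configured system might store them (assumed Shift_JIS).
raw = original.encode("shift_jis")

# Decoding with the wrong encoding does not fail loudly; it silently
# produces unrelated characters. errors="replace" covers the few byte
# values that Windows-1252 leaves undefined.
wrong = raw.decode("cp1252", errors="replace")

# Decoding with the encoding that was actually used restores the text.
right = raw.decode("shift_jis")

print("stored bytes:   ", raw.hex())
print("wrong decoding: ", wrong)
print("right decoding: ", right)
```

The stored bytes are identical in both cases. Only the interpretation differs, which is why recording the source encoding at collection time matters.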

Unicode was created to solve some of these problems and offer a universal solution; however, it is only available for files created on newer systems, making legacy data a continuing area of concern. “Each language family has its own unique set of problems and solutions,” says Thomas Barnett, Special Counsel for Sullivan & Cromwell, LLP.

In fact, “in some parts of the world, you are not allowed to take the data out of the country due to local data protection laws,” adds Brian Kim of PriceWaterhouseCoopers LLP. He highlights that certain countries also have native applications that are more popular than those commonly used in the United States, requiring additional evaluation of your program inventory.

Whether your data is in Unicode or not, proper preservation is the key. While Microsoft Windows NT, 2000, XP and subsequent versions support Unicode, many archiving or compression tools do not support it. This could result in missing files that may or may not be reported in the error logs. For that reason, you must test carefully, notes Kim. Also, to ensure correct extraction, properly align the regional settings.
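One way to test carefully is a round-trip check: archive a small set of files whose names contain non-Latin characters, extract them again, and compare what comes back. The sketch below does this in memory with Python's standard `zipfile` module. The function name `missing_after_roundtrip` and the sample file names are hypothetical, and a passing result says nothing about other archiving tools, each of which would need its own test.

```python
import io
import zipfile


def missing_after_roundtrip(files):
    """Return the names whose content did not survive a zip write/read cycle.

    `files` maps file names to their byte content.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)

    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        recovered = {name: archive.read(name) for name in archive.namelist()}

    # A file counts as missing if its name is absent or its bytes differ.
    return sorted(name for name, data in files.items()
                  if recovered.get(name) != data)


# Hypothetical sample set: one Japanese file name and one ASCII file name.
sample = {
    "報告書.txt": "内容".encode("utf-8"),
    "memo.txt": b"hello",
}
missing = missing_after_roundtrip(sample)
print("missing after round trip:", missing)
```

Comparing against the original list, rather than relying on the tool's error log, catches files that are dropped without being reported.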

via A Primer on Foreign Language E-Discovery | THE REVENUE HERALD.

Foreign Language Documents Have You Perplexed?

Documents containing foreign languages have become a critical part of many e-discovery and disclosure projects, often adding complex questions to the review process.  What language is it?  Does this read right to left or left to right?  How do we review these documents?  Is this searchable?  Is this a menu or is it a memo?  Fortunately there are a variety of different solutions to these questions, ranging from machine translation to multi-lingual first pass review to certified document translation.
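The first two questions can be roughly approximated in code. The sketch below uses Python's standard `unicodedata` module to guess the dominant script of a text and whether it reads right to left. The function names are hypothetical. It identifies writing systems rather than languages, so it cannot tell Chinese from Japanese text written in ideographs, or Urdu from Persian.

```python
import unicodedata
from collections import Counter


def dominant_script(text):
    """Guess the most common script, using the first word of each
    character's Unicode name (e.g. LATIN, CJK, HANGUL, ARABIC)."""
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    return counts.most_common(1)[0][0] if counts else None


def is_right_to_left(text):
    """Return True if the first strongly directional character is
    right-to-left (bidirectional class R or AL)."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):
            return True
        if bidi == "L":
            return False
    return False


for sample in ["Is this a menu or a memo?", "당신은", "中文文件", "میں خوش"]:
    print(sample, "|", dominant_script(sample), "|",
          "RTL" if is_right_to_left(sample) else "LTR")
```

A check like this is only a first-pass filter for routing documents to the right reviewers, not a substitute for proper language identification.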

No longer is it a requirement to spend significant time and money on the translation of foreign language document collections word for word, regardless of document type or relevance to the case at hand.  We are implementing hybrid solutions that combine best-of-breed technology and multi-lingual legal staffing resources to streamline the review of foreign language documents, reducing cost through efficiency and the implementation of certified linguist translation on a focused subset of relevant documents rather than the entire corpus.

Additional information on Global EDD Group’s Foreign Language Solutions – including mobile machine translation – is available here.