Thursday, May 03, 2012

Translation On Demand

We have long awaited an intuitive voice interface that gives us back our hands, captures our thoughts as we say them and offers audio playback in a voice, pitch and speed that we can easily digest. An intuitive user interface changes not only how we digest information and media but also how it is constructed and served.

Watching my mother-in-law command interfaces with the software installed for her by the RNIB (Royal National Institute of Blind People) has long been a marvel. However, it is also daunting to watch, and it makes you realise that it isn’t just about the media content but about all the commands needed to access and control it.

We have seen the great strides taken by the likes of Apple with its Siri voice recognition and command interface, and we enjoy and often use the Google voice search facility on our smartphones. We have also seen and frequently use the Google translation facilities, and although there is still the odd error, the market has moved on significantly.

Now Google are introducing a Gmail facility that automatically translates messages, so that users can have mail written in other languages translated into their own. Users can choose to have every message in a given language translated automatically by selecting "Always Translate", or they can translate individual messages on demand using a translate message option. They can also turn off translation for a language.

Many speech translations remain flat and generate a somewhat synthetic voice. However, experimental new software developed by Microsoft can not only translate between 26 different languages and play back the speech in the user’s own voice, but does so complete with the inflections the user used when speaking in their own language. Microsoft have demonstrated the facility by using the software to translate speech into English, Italian and Mandarin and read it out. Play the recording and read more at Technology Review.

Obviously users must first spend time training it to recognize and reproduce their voice. Once that's been accomplished, the software applies that user-specific speech model to a generic text-to-speech model for the desired output language. Individual sounds of the user's voice are selected from the training session, then strung together and appropriately altered, in order to create a natural-sounding translation.
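
For readers who like to see the idea in code, here is a toy sketch of the "select and string together" step described above. It is an illustration only, not Microsoft's implementation: the function name, the file names and the idea of falling back to a generic voice for sounds the user never recorded are all assumptions made for the example, and the "appropriately altered" step is left out entirely.

```python
# Toy illustration only: this is NOT Microsoft's implementation.
# Every name and file below is hypothetical.

def synthesize(target_sounds, user_units, generic_units):
    """Build the output for the target language one sound at a time.

    Prefer a sound captured from the user's training session; fall back to
    the generic text-to-speech voice when the user never produced that sound
    (the fallback is an assumption of this sketch).
    """
    output = []
    for sound in target_sounds:
        if sound in user_units:
            output.append(("user", user_units[sound]))
        else:
            output.append(("generic", generic_units[sound]))
    return output


# Sounds recorded while the user trained the system (hypothetical data).
user_units = {"h": "user_h.wav", "o": "user_o.wav", "l": "user_l.wav"}

# Sounds available in the generic voice for the output language (hypothetical data).
generic_units = {
    "h": "gen_h.wav",
    "o": "gen_o.wav",
    "l": "gen_l.wav",
    "a": "gen_a.wav",
}

# The word to speak in the output language, broken into sounds.
result = synthesize(["h", "o", "l", "a"], user_units, generic_units)
```

In this sketch the first three sounds come from the user's own recordings and only the last is borrowed from the generic voice. A real system would also need to smooth the joins and adjust pitch and timing, which is where most of the difficulty lies.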

So we take another huge step forward, to a point where language itself, whether text or speech, can be converted and played back in one’s own language and even in one’s own voice. Text translation into your own language becomes an automatic feature as this technology starts to become mainstream. This starts to offer huge opportunities that were previously restricted and only achievable through translators. In theory, translation of text becomes the flick of a switch and speech translation is personalised. When we couple these developments with the vision of Pranav Mistry and the XRay specs we previously wrote about, we start to envisage a very Brave New World. This is not just about how we interact with technology but also about how we construct our media to exploit it and add value to reach all.



Unknown said...

I am also using audio translation for my blog. It works really well and is effective for visitors, even though I am using the free version. Be happy to use this one. :)

Unknown said...

Thanks for sharing this information. I found it very informative as I have been researching a lot lately on practical matters such as you talk about.