Topical items and views on the impact of digitisation on publishing and its content and the issues that make the news. This blog follows the report 'Brave New World'
(http://www.ewidgetsonline.com/vcil/bravenewworld.html), published by the Booksellers Association of the UK and Ireland and authored by Martyn Daniels. The views and comments expressed are those of the author.
Thursday, May 03, 2012
Translation On Demand
We have long awaited that intuitive voice interface
which would give us back our hands, capture our thoughts as we say them and offer
audio playback in a voice, pitch and speed that we can easily digest. An
intuitive user interface not only changes how we digest information and media
but also how it is constructed and served.
Watching my mother-in-law as she commands interfaces
with her RNIB (Royal National Institute of Blind People) installed software has long
been a marvel. However, it is daunting to watch and makes you realise that it
isn’t just about the media content, but all the commands to access and control it.
We have seen the great strides taken by the likes of
Apple with its Siri voice recognition and command interface,
and we enjoy and often use the Google voice search facility on our smartphone.
We have also seen and frequently use the Google translation facilities and,
although there is sometimes the odd error, the market has moved on.
Now Google are introducing a Gmail facility which
will automatically translate messages, enabling users to read mail written in other
languages in their own language. Users can choose to
have every message in a given language translated automatically by
selecting "Always Translate", or they can translate a single message on demand using
a translate message option. They can also turn off translation for a language.
Many speech translations remain flat and generate a somewhat
synthetic voice. However, experimental new software developed by Microsoft is now
able not only to translate between 26 different languages and play back the
speech in the user’s own voice, but also to do so complete with the inflections the
user used when speaking in their own language. Microsoft have demonstrated the
facility by using the software to translate speech into English,
Italian and Mandarin and read it out. Play the recording and read more at Technology Review.
Obviously, users must first spend time training the software to
recognise and reproduce their voice. Once that's been accomplished, the
software applies that user-specific speech model to a generic text-to-speech
model for the desired output language. Individual sounds of the user's voice
are selected from the training session, then strung together and appropriately
altered, in order to create a natural-sounding translation.
So we take another huge step forward, where language
itself, whether text or speech, can be converted to play back in one’s own language
and even one’s own voice. Text translation into your own language becomes an automatic
feature as this technology starts to become mainstream. This starts to offer
huge opportunities which were previously restricted and only achievable through
translators. In theory, translation of text becomes the flick of a switch and speech
translation is personalised. When we couple these developments with the vision of
Pranav Mistry and the XRay specs we previously wrote about, we start to envisage
a very Brave New World. This is not just about how we interact with technology
but also about how we construct our media to exploit it and add value to reach