Urdu Translation Software Understands Nuance
Rohini Srihari began working on her software in the hope of improving computerized translations of Urdu -- a linguistic blend of Hindi and Persian that is widely spoken in Pakistan and by many Muslims in India. Urdu is a particularly difficult language for computers to translate because its grammar differs so much from that of most Western languages, and its Arabic script can convey important subtleties in meaning. As a result, most digital translations of Urdu remain comparatively basic and literal -- something that Srihari wanted to change.
"What I want is to determine who are the people, places and things being talked about," Srihari told NPR. "Is there an opinion being expressed? Is it a positive or negative opinion being expressed?" According to the researcher, her new software has effectively "learned" Urdu and can even analyze and capture the language's nuances. With Srihari's program, users can move their mouse over Urdu text, and the computer will identify the sentiment behind it, highlighting the excerpt in red if it is negative and in green if it is positive.
Srihari's work is of particular importance following the recent upheaval in the Middle East, where dissidents and demonstrators used social media such as Twitter and Facebook to organize protests and disseminate information. With a more accurate translation tool, outsiders may eventually be able to get a better grasp of what's going on in different nations through less filtered translations.
Ernest Tucker, a history professor at the U.S. Naval Academy, told NPR that the software could have far-reaching implications for political scientists and historians, as well. "That's the goal of all historians anywhere," he explained, "to try to get the voices of more and more people into the conversation, and anything that can do that, particularly this kind of thing, is a wonderful gift."