As reported by the Hindustan Times, the University of Edinburgh in Scotland and the Technology Development for Indian Languages (TDIL) of the Indian government have hosted the first Hindi to Punjabi machine translation software, overcoming several of Moses' initial limitations.

The web-interface program has not yet been released to the public. It is the product of a long effort by Mr. Ajit Singh, a faculty member from the Punjabi University and an assistant professor at the MM Modi College, with help from the University’s Computer Science Department.

The software was developed recently and tested at both organizations. It took several months to develop because translate.cgi expected many copies of daemon.pl to be running, each listening on a different port and each wrapping a different instance of Moses. The script had been written before Moses had threads, so the web-based translation system could not be built on the latest versions of Moses, which are all multi-threaded. The program had to be multi-process.
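The one-process-per-port layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Daemon` class, the `plan_daemons` and `round_robin` helpers, the base port 9000 and the round-robin choice are all hypothetical, since the report does not say how translate.cgi picks a daemon or which ports were used.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator, List


@dataclass(frozen=True)
class Daemon:
    """Stand-in for one daemon.pl process wrapping one Moses instance (hypothetical)."""
    host: str
    port: int


def plan_daemons(base_port: int, count: int, host: str = "localhost") -> List[Daemon]:
    """Assign each daemon its own port so no two processes share a socket."""
    if count < 1:
        raise ValueError("need at least one daemon")
    return [Daemon(host, base_port + i) for i in range(count)]


def round_robin(daemons: List[Daemon]) -> Iterator[Daemon]:
    """Cycle through the daemons so successive requests go to different processes.

    Round-robin is an assumption made for this sketch, not the documented
    behaviour of translate.cgi.
    """
    return cycle(daemons)


if __name__ == "__main__":
    daemons = plan_daemons(base_port=9000, count=3)  # port number is illustrative
    picker = round_robin(daemons)
    for sentence in ["पहला वाक्य", "दूसरा वाक्य", "तीसरा वाक्य", "चौथा वाक्य"]:
        target = next(picker)
        print(f"{sentence} -> {target.host}:{target.port}")
```

With three daemons, the fourth request wraps around to the first port again. A multi-threaded server would instead expose a single port and handle concurrency internally, which is why the two designs could not simply be swapped.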

Initially, Mr Singh installed the Moses server and the web server on a single machine running Linux and confirmed that the system worked on the local host.
However, when he installed the system on a web server for public use, part of the system worked fine but most of the input text was not being translated. Instead, the input text was only transliterated by the post-processing script, transliterate.pl.

Hindi to Punjabi Machine Translation System

Although there is no “from English” or “into English” translation, developers cannot hide their satisfaction. Users of the software are not proficient in English anyway, which makes this development unique. Vishal Goyal, assistant professor at the department of Computer Science at the Punjabi University, confirmed that “the software has been made available online on the servers of the Edinburgh University and the TDIL.” In 2011, over 33 million people spoke Punjabi in India, whereas around 50% of the population in Pakistan (some 78 million) are native Punjabi speakers. Punjabi is written in two scripts: the Gurumukhī script and the Shahmukhī script.

The system provides a Devanagari Typing Pad on the screen, although users can also type from their own system keyboard. In that case, they need to choose a Keyboard Mapping for typing. In Pakistan, Punjabi is generally written using the Shahmukhī script, created from a modification of the Persian Nastaʿlīq script. In India, Punjabi is most frequently rendered in Gurumukhī, but it is also written in the Devanagari script or Latin script due to influence from Hindi and English, respectively.

When the input is in an encoding other than Unicode, the user has to select the font of the input text, and the text is automatically converted from the non-Unicode encoding to Unicode. The software also provides font conversion and can translate several file formats. The right side of the panel provides a website conversion feature so that websites in Hindi can be translated into Punjabi, although at the time of reporting the service was not available.
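As a rough illustration of this conversion step, the Python sketch below checks whether the input already contains Unicode Devanagari (the block U+0900 to U+097F) and, if not, applies a table for the font the user selected. The table `TOY_FONT_MAP` and the function names are hypothetical, invented for this example. They are not the mapping of any real font, nor the system's actual code.

```python
from typing import Dict

# Unicode Devanagari block boundaries.
DEVANAGARI_START, DEVANAGARI_END = 0x0900, 0x097F

# Hypothetical legacy-font table, for illustration only.
TOY_FONT_MAP: Dict[str, str] = {
    "d": "क",
    "e": "म",
    "y": "ल",
    "k": "ा",
    "[k": "ख",  # a two-character key, to show longest-match lookup
}


def has_devanagari(text: str) -> bool:
    """Return True if any character falls in the Unicode Devanagari block."""
    return any(DEVANAGARI_START <= ord(ch) <= DEVANAGARI_END for ch in text)


def legacy_to_unicode(text: str, font_map: Dict[str, str]) -> str:
    """Replace legacy key sequences with Unicode, trying longer keys first."""
    keys = sorted(font_map, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(font_map[key])
                i += len(key)
                break
        else:
            # No key matches here: copy the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


def normalize_input(text: str, font_map: Dict[str, str]) -> str:
    """Leave Unicode Devanagari alone; convert anything else via the table."""
    return text if has_devanagari(text) else legacy_to_unicode(text, font_map)


if __name__ == "__main__":
    print(normalize_input("dey", TOY_FONT_MAP))
    print(normalize_input("कमल", TOY_FONT_MAP))
```

A simple lookup like this would not be enough for real legacy fonts, which usually also require some vowel signs to be reordered. That is one reason the user has to name the font rather than have the system guess it.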

Hindi font conversion

This Hindi to Punjabi machine translation system is almost 95% accurate according to Prof Vishal Goyal, which is an extremely high rate of accuracy for machine translation, even between two related languages. Both Punjabi and Hindi are Indo-Aryan languages, and Hindi developed from Khariboli, the vernacular dialect of Delhi and the surrounding area in Uttar Pradesh. In the 1600s, Hindi was known as “Urdu” due to the Persian influences received during the Mughal Empire. Nowadays, Hindi uses the Devanagari script and draws on Sanskrit for its vocabulary, whereas Urdu was and is written using the Persian script and uses more Persian words.

Punjabi University vice-chancellor Jaspal Singh congratulated the department and applauded the faculty’s efforts in bringing international recognition to the institution.