When is a text suitable for post-edited machine translation and would you recommend it?
Machine translation depends on algorithms that find patterns. The more regular and “controlled” the written language, the better the output. Having said that, engines trained on a specific domain are always the ones that provide the best results. We, at Pangeanic, have built literally hundreds of engines for specific language domains (automotive, legal, economics, to name a few). As the algorithms recognise the language patterns of a domain, they can better predict its structure and vocabulary.
Journalism tends not to perform well (with exceptions), as style plays an important role. For general reports, tenders, etc., we use general machine translation engines that we have created with several kinds of data: data made public by the EU’s DGT, the UN, the ECB, Sony Europe and TAUS (Translation Automation User Society), which covers many fields and includes bilingual texts from Dell, Adobe, Microsoft, OpenOffice, legal services, etc.; medical and pharmaceutical data from relevant European organisations; and data we have crawled from several multilingual websites using our own technologies. (You can see more in Pangeanic’s Translation Technology section.)
These are high-quality, ready-made, general-language machine translation engines, and they tend to perform well with general documentation. Users tend to obtain better results into English, even with raw machine translation, due to the particularities of the language: little inflexion, almost no cases, fairly straightforward conjugation and fairly regular sentence structures.
You can use them in two ways:
- Uploading docx (Word), pptx (PowerPoint) or xlsx (Excel) files, that is, the formats used from Office 2007 onwards, when Microsoft adopted XML-based document formats. Alternatively, you can upload any XML files (OpenOffice and similar). InDesign files are also supported. Documents can also be batch-translated using a zip file.
- Providing your documents to us using our client portal. Here, documents are uploaded directly onto our server using a GlobalSign Domain Validation high encryption CA – SHA256 – G2. All data uploaded by our clients is kept securely on our company servers, which have redundant backups created every 24 hours.
All information created as a result of Pangeanic’s relationship with the client is covered by our confidentiality agreements, which comply with the ISO 9001 and EN 15038 standards. Terminology datasets and translation memories that need to be shared with linguists are queried in an encrypted fashion from our own servers and not from cloud services. Translators cannot keep a copy of their translation memories, as their work feeds directly into our central online TM. Thus, you can safely upload pre-release or internal documentation onto our servers and have a Project Manager assigned who is familiar with your translation needs and with the pre-translation and machine translation process.
The client receives an email with a link to download the data securely and directly from our servers, again using our GlobalSign Domain Validation high encryption.
This is ideal for clients who are looking for a secure translation environment, where data is transmitted to their translation vendor frequently and no information leaks can be permitted.
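For the batch option mentioned above, a folder of documents can be bundled into a single zip file before upload. The following is only a minimal sketch using Python's standard library: the function name `bundle_for_batch_translation`, the folder layout and the exact extension list are assumptions made for illustration, not part of Pangeanic's tooling, and the upload step itself is not shown.

```python
import zipfile
from pathlib import Path

# Extensions taken from the formats listed in this post.
# The exact set is an assumption; adjust it to what your vendor accepts.
SUPPORTED_EXTENSIONS = {".docx", ".pptx", ".xlsx", ".xml"}


def bundle_for_batch_translation(source_dir, archive_path):
    """Zip every supported document under source_dir into archive_path.

    Returns the list of archive member names that were added.
    (Hypothetical helper, for illustration only.)
    """
    source = Path(source_dir)
    added = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS:
                # Store paths relative to source_dir so the folder layout is preserved
                member = path.relative_to(source).as_posix()
                archive.write(path, arcname=member)
                added.append(member)
    return added
```

Files with other extensions are skipped rather than rejected, so the returned list is worth checking before sending the archive.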
When would you recommend post-editing of machine-translation?
Machine Translation (MT) is not recommended for every situation. However, when it is applied correctly and with the right configuration, there are powerful reasons to consider automation as a translation tool. The benefits include:
- lower translation costs
- higher productivity
- faster time-to-market
- even real-time translations using our own PangeaMT translation panel.
Remember that all this happens in a secure environment, from the moment you upload your documents for pre-translation by one of our engines, through all data transmission, to the final data download.
The fully managed solutions can be hosted by Pangeanic in our secure datacentre. Some clients may prefer to host the engines themselves on one of their own servers (we call this “onsite installation”).
Professionally post-edited content can be fed back into a client’s customised MT engine. This helps engines grow and improve over time, as in-domain data keeps improving the statistics and the engines learn the language patterns. We can also build engines from your specific data, using your own glossaries and previously translated documents. PangeaMT can integrate into your standard translation process for gisting (understanding what documents in a foreign language are saying) or as a means of enhancing your productivity if you have internal translation resources.
Pangeanic has years of accumulated experience in post-editing and has been an approved post-editor for the European Union. That experience rests on professional linguists who post-edit content that has been machine-translated. Here, MT serves as an efficiency tool: the aim is to deliver the same linguistic quality as a human translation in significantly less time and at significantly lower cost. These benefits improve over time as MT is deployed more and more, since the engine is capable of learning, memorising and adapting to both old and new content.
On average, a professional human translator can translate around 300-400 words in an hour, whereas raw MT can translate between 80,000 and 100,000 words in one hour. Savings can be as much as 90% in terms of costs.

Machine Translation (MT) is the use of a software program based on natural language processing algorithms to translate text or speech-to-text from one natural language into another. The main reasons for the existence of MT are to speed up translation processes and to handle, for information purposes, translation jobs that would be too expensive or too slow to be translated by humans. MT should not be used for serious publication purposes as a mere tool to lower translation costs if the output has not been post-edited.
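To put the throughput figures quoted above in perspective, the short sketch below works out the speed ratio and the time needed for a single job. The words-per-hour ranges come from this post; the 50,000-word job size and the function names are assumptions chosen only for illustration. It covers raw MT only, so it does not include post-editing time and says nothing about the cost saving.

```python
# Throughput ranges quoted in this post, in words per hour
HUMAN_WPH = (300, 400)
RAW_MT_WPH = (80_000, 100_000)

# Hypothetical job size, chosen only for illustration
JOB_WORDS = 50_000


def hours_needed(word_count, words_per_hour):
    """Hours required to process word_count at a given throughput."""
    return word_count / words_per_hour


def speedup_range(human=HUMAN_WPH, machine=RAW_MT_WPH):
    """Smallest and largest ratio of raw MT throughput to human throughput."""
    smallest = machine[0] / human[1]  # slowest MT against fastest human
    largest = machine[1] / human[0]   # fastest MT against slowest human
    return smallest, largest


if __name__ == "__main__":
    low, high = speedup_range()
    print(f"Raw MT throughput is about {low:.0f}x to {high:.0f}x that of a human translator")

    human_fast = hours_needed(JOB_WORDS, HUMAN_WPH[1])
    human_slow = hours_needed(JOB_WORDS, HUMAN_WPH[0])
    print(f"Human translation of {JOB_WORDS} words: {human_fast:.1f} to {human_slow:.1f} hours")

    mt_fast = hours_needed(JOB_WORDS, RAW_MT_WPH[1]) * 60
    mt_slow = hours_needed(JOB_WORDS, RAW_MT_WPH[0]) * 60
    print(f"Raw MT of {JOB_WORDS} words: {mt_fast:.1f} to {mt_slow:.1f} minutes")
```

On these figures the raw throughput gap is between 200 and roughly 333 times. The real turnaround for a post-edited job depends on how much editing the output needs.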
For businesses that require fast translations but are on a limited budget, MT may well be the logical solution.