Q2 – Why Statistical MT and not Rule-Based MT? What are the advantages and disadvantages?

Any experienced MT user (or at least any reader or post-editor of MT output) will tell you that Statistical MT flows much better than traditional rule-based (RB) systems. Anyone who has studied or implemented SMT will tell you that implementation and development times are much shorter (and thus ROI comes sooner). RB is usually bought as a cheaper package once a company has done all the programming of rules and built in the syntax. The package is closed, and customization (or hybridizing) is a longer process. Statistical MT can improve by coupling reordering and decoding, and by applying many other mathematical and statistical formulas that estimate how likely a word (or a series or combination of words) is to occur together with other words. Read below if you need a comprehensive listing.

  • SMT only needs a parallel corpus to learn from in order to generate a translation engine. In contrast, RBMT needs a great deal of knowledge external to the corpus that only linguistic experts can generate, e.g. surface categorization, syntax and semantics of all the words of one language, in addition to the transfer rules between languages. These latter rules depend entirely on the language pair involved and are generally less well studied than the characterization of each separate language. Defining general transfer rules is not easy, so multiple rules need to be defined for individual cases, especially between languages with very different structures and/or when the source language allows greater flexibility in arranging the structural elements of a sentence.
  • An SMT system is developed rapidly if the appropriate corpus is available, making it more profitable. An RBMT system, in turn, requires great development and customization costs until it reaches the desired quality threshold. Packaged RBMT systems have already been developed by the time the user purchases them: most users approach MT by purchasing “out of the box” or “server ready” programs. The program works and will keep working in a certain way, but it is extremely difficult to reprogram its models and equivalences. Above all, RBMT deployment is generally a much longer process involving more human resources. This is one key issue when companies calculate the full implementation cost.
  • SMT can be retrained automatically to handle situations not seen before (hitherto unknown words, new expressions that are translated differently from the way they were previously translated, etc.). RBMT is ‘re-trained’ by adding new rules and vocabulary, among other things, which in turn means more time and more handling by “expert humans”.
  • SMT generates more fluent translations, although pure statistical systems may offer less consistent and less predictable results if the training corpus is too broad for the purpose. RBMT, however, may fail to find the surface or syntactic information it needs to analyze the source text, or may simply not know a word. Either case prevents it from finding an appropriate rule.
  • While statistical machine translation works well for translations in a specific domain when the engine is trained on a bilingual corpus from that domain, RBMT may work better for more general domains.
  • SMT clearly needs powerful computing hardware to train the models. Billions of calculations take place during the training of the engine, and the hardware and computing knowledge required for it are highly specialized. However, training time can be reduced nowadays thanks to the wider availability of more powerful computers. RBMT requires a longer deployment and compilation time by experts so that, in principle, building costs are also higher.
  • SMT generates statistical patterns automatically and learns exceptions to rules well. The transfer rules of RBMT systems can certainly be seen as special cases of statistical patterns. Nevertheless, they generalize too much and cannot handle exceptions.
  • Finally, SMT systems can be upgraded with syntactic and even semantic information, like RBMT systems. In that case, the statistical patterns that an SMT system would learn can be seen as a more general type of transfer rule, although including such information in current systems does not yet provide significant improvements.
  • An SMT engine can generate improved translations if it is retrained or adapted. In contrast, RBMT generates very similar translations from one version to the next.
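
To make the "learns from a parallel corpus" and "can be retrained" points above more concrete, here is a minimal sketch in Python of one classic way a statistical system can estimate word translation probabilities: the EM procedure of IBM Model 1, simplified here by leaving out the NULL word. It is an illustration only, not how any particular commercial engine works. The toy German-English sentence pairs and the function names (train_ibm_model1, most_likely) are assumptions made up for this example.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate t(target_word | source_word) with EM (IBM Model 1, no NULL word).

    `pairs` is a list of (source_tokens, target_tokens) tuples.
    Returns a dict-like mapping (target_word, source_word) -> probability.
    """
    target_vocab = {w for _, tgt in pairs for w in tgt}
    uniform = 1.0 / len(target_vocab)
    # Start with no knowledge: every translation is equally likely.
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) links
        total = defaultdict(float)  # expected count of links per source word
        # E-step: share each target word among the source words in its sentence.
        for src, tgt in pairs:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac
        # M-step: turn expected counts back into probabilities.
        for (tw, sw), c in count.items():
            t[(tw, sw)] = c / total[sw]
    return t


def most_likely(t, source_word):
    """Return the highest-probability translation, or None for an unknown word."""
    candidates = {tw: p for (tw, sw), p in t.items() if sw == source_word}
    return max(candidates, key=candidates.get) if candidates else None


# Hypothetical toy parallel corpus (source: German, target: English).
corpus = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]
model = train_ibm_model1(corpus)

# "Retraining": add a sentence pair with a previously unseen word and train again.
extended_corpus = corpus + [("ein Auto".split(), "a car".split())]
retrained = train_ibm_model1(extended_corpus)

for word in ["das", "Haus", "Buch", "ein", "Auto"]:
    print(word, "->", most_likely(model, word), "|", most_likely(retrained, word))
```

No dictionary or grammar rule is supplied anywhere in this sketch: the word pairings come purely from which words keep appearing together across sentence pairs. The first model has no entry for "Auto", and it picks one up only after the new sentence pair is added and the model is trained again, with no rule written by hand. A real SMT engine adds phrase tables, a language model, reordering and a decoder on top of this kind of estimate.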