Feature

Current Typical MT System

Anusaaraka

Goal
Current typical MT system: To produce a natural translation; on failure, to produce a 'rough' translation (what counts as 'rough' is not well defined).
Anusaaraka: To provide access to the source language.
Unit of input
Current typical MT system: Currently, independent sentences.
Anusaaraka: A complete XML document.
System Components
Current typical MT system: Morph analysers, POS taggers, parsers, sense disambiguation modules, generator.
Anusaaraka: Same as in MT, plus the Anusaaraka user interface.
Sequence of Operations
Current typical MT system: Outputs are cascaded, so errors get cascaded too.
Anusaaraka: The basic tasks are processed independently. Price paid: duplication of effort.
Transparency
Current typical MT system: The working is not transparent to the end user.
Anusaaraka: The working is transparent to the end user.
Access
Current typical MT system: The user has access only to the final output.
Anusaaraka: The user has access to the output at each level.
Guidelines for Linguists
Current typical MT system: No specific guidelines.
Anusaaraka: First write an algorithm for 'human beings', and not necessarily for 'computers'!
Principle
Current typical MT system: Ad hoc.
Anusaaraka: "Information Dynamics".
Approaches
Current typical MT system:
-- EBMT
-- Rule-based
-- Statistical
-- Hybrid

Anusaaraka: Eclectic. Choose the best of each of these approaches, and use the best of the resources available under the GPL.

CONSEQUENCES
Current typical MT system:
  • Later modules are affected by the errors of earlier modules.
  • 'Rough' is not well defined, so users may be misled.
  • The user cannot participate in the development process.
  • Linguists end up reinventing the wheel again and again.

Anusaaraka:
  • Parallel processing ensures that different modules do not interfere with one another.
  • 'Roughness' is well defined, so in theory the user cannot be misled.
  • The user can participate in the development activity.
  • The linguist prepares data only once.
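
The contrast drawn in the Sequence of Operations and Access rows can be illustrated with a minimal sketch. It is not the Anusaaraka implementation: the two modules (broken_morph, count_words) and the runner functions are hypothetical names invented for this example, and real modules would produce linguistic analyses rather than strings. The sketch only shows how one faulty module corrupts a cascaded result, whereas independent processing confines the error to that module's own output and keeps every level visible.

```python
from typing import Callable, Dict, List

# A module is modelled as a function from text to text (an assumption of this sketch).
Module = Callable[[str], str]


def broken_morph(text: str) -> str:
    """Hypothetical faulty module: simulates a failure by losing its input."""
    return ""


def count_words(text: str) -> str:
    """Hypothetical module: reports the number of whitespace-separated words."""
    return str(len(text.split()))


def run_cascaded(text: str, modules: List[Module]) -> str:
    """Each module consumes the previous module's output; only the final result survives."""
    for module in modules:
        text = module(text)
    return text


def run_independent(text: str, modules: Dict[str, Module]) -> Dict[str, str]:
    """Each module works on the original input; every level's output is kept."""
    return {name: module(text) for name, module in modules.items()}


sentence = "Rama eats fruit"

# Cascaded: the word counter only ever sees the faulty module's empty output.
final_only = run_cascaded(sentence, [broken_morph, count_words])

# Independent: the word counter sees the source sentence, and both levels are inspectable.
per_level = run_independent(sentence, {"morph": broken_morph, "count": count_words})

print(final_only)
print(per_level)
```

In the cascaded run the word count comes out as "0" because the error of the first module is passed on; in the independent run the count is "3" and the empty morph output is visible on its own level. The price noted in the table also shows up here: each independent module has to work from the raw input itself, which is where the duplication of effort comes from.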