Matador v 0.9

Spanish-English Generation-Heavy
Hybrid Machine Translation


What is Matador?

Matador is a Spanish-English machine translation system that follows the Generation-Heavy Hybrid approach to Machine Translation (GHMT). GHMT addresses the lack of resource symmetry between source and target languages: it exploits symbolic and statistical target language resources in source-poor/target-rich language pairs. Expected source language resources include a syntactic parser and a simple one-to-many translation dictionary. No transfer rules or complex interlingual representations are used. Rich target language symbolic resources, such as word lexical semantics, categorial variations and subcategorization frames, are used to overgenerate multiple structural variations from a target-glossed syntactic dependency representation of each source language sentence. This symbolic overgeneration, which accounts for possible translation divergences, is constrained by multiple statistical target language models, including surface n-grams and structural n-grams. Because of this source-target asymmetry, systems developed with this approach are more easily retargetable (re-source-able) to new source languages, provided that a source language parser and a translation dictionary are available.

The basic intuition of the GHMT approach parallels the experience of most language learners whose lack of symmetrical knowledge impairs their ability to translate into their newly learned language but does not hinder them as much when translating from the foreign language into their native tongue (where they are assisted by rich resources).
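
The overgenerate-then-constrain idea described above can be illustrated with a small toy sketch. This is not Matador's implementation: it only expands a one-to-many dictionary word for word and ranks the results with an add-one smoothed surface bigram model. The structural variations, dependency representation and structural n-grams are left out entirely. The dictionary entries, the bigram counts and the function names (`overgenerate`, `log_score`, `best_translation`) are all hypothetical and exist only for illustration.

```python
import math
from itertools import product

# Hypothetical one-to-many translation dictionary (toy data, not Matador's lexicon).
DICTIONARY = {
    "el": ["the"],
    "banco": ["bank", "bench"],
    "cierra": ["closes", "shuts"],
    "temprano": ["early"],
}

# Hypothetical bigram counts standing in for a target language model.
BIGRAM_COUNTS = {
    ("<s>", "the"): 20,
    ("the", "bank"): 6,
    ("the", "bench"): 2,
    ("bank", "closes"): 5,
    ("bank", "shuts"): 1,
    ("closes", "early"): 4,
    ("shuts", "early"): 1,
    ("early", "</s>"): 3,
}

# Every word seen in the toy counts, including the sentence boundary markers.
VOCABULARY = {word for bigram in BIGRAM_COUNTS for word in bigram}


def overgenerate(source_tokens):
    """Return every word-for-word combination of dictionary glosses."""
    options = [DICTIONARY[token] for token in source_tokens]
    return [tuple(choice) for choice in product(*options)]


def log_score(candidate):
    """Score a candidate with an add-one smoothed bigram model."""
    tokens = ["<s>", *candidate, "</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        pair_count = BIGRAM_COUNTS.get((prev, word), 0)
        context_count = sum(c for (p, _), c in BIGRAM_COUNTS.items() if p == prev)
        total += math.log((pair_count + 1) / (context_count + len(VOCABULARY)))
    return total


def best_translation(source_tokens):
    """Overgenerate candidates, then keep the one the model prefers."""
    return max(overgenerate(source_tokens), key=log_score)


if __name__ == "__main__":
    print(" ".join(best_translation(["el", "banco", "cierra", "temprano"])))
```

With these toy counts, the four generated candidates are ranked so that "the bank closes early" comes out on top. All disambiguation work is done by target language statistics, while the source side contributes only a token sequence and a dictionary lookup.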

For more information, see the Publications section.


Matador Demo


Explicit Diacritics
EXERGE Options:
    Thematic Linking
    Structural Expansion
    Max Conflations
    Max Inflations
    Structural N-gram Pruning
HALogen Options:
    Language Model
    N-gram
Spanish is parsed with Conexor (on-line demo).

Demo Options

Explicit Diacritics

This option allows users to input Spanish diacritized characters (e.g. á or ñ) when no Spanish keyboard is available. The following table describes how these characters can be specified:
Diacritized Character    Explicitly Diacritized Character
á                        a'
é                        e'
í                        i'
ó                        o'
ú                        u'
ñ                        n~
ü                        u"
Á                        A'
É                        E'
Í                        I'
Ó                        O'
Ú                        U'
Ñ                        N~
Ü                        U"
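
As a rough illustration of the mapping in the table above, the following sketch converts explicitly diacritized input into proper Spanish characters. It is an assumption about how such a conversion could be written, not the demo's actual code; the names `EXPLICIT_TO_DIACRITIC` and `decode_explicit_diacritics` are hypothetical. The page does not say how ambiguous input is handled (for example, an apostrophe used as a quotation mark right after a vowel), so this sketch simply replaces every matching two-character sequence.

```python
# Mapping copied from the table above: explicit sequence -> diacritized character.
EXPLICIT_TO_DIACRITIC = {
    "a'": "á", "e'": "é", "i'": "í", "o'": "ó", "u'": "ú",
    "n~": "ñ", 'u"': "ü",
    "A'": "Á", "E'": "É", "I'": "Í", "O'": "Ó", "U'": "Ú",
    "N~": "Ñ", 'U"': "Ü",
}


def decode_explicit_diacritics(text: str) -> str:
    """Replace each explicit two-character sequence with its diacritized form.

    Assumption: every occurrence is converted, scanning left to right;
    no escaping mechanism is modelled.
    """
    result = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in EXPLICIT_TO_DIACRITIC:
            result.append(EXPLICIT_TO_DIACRITIC[pair])
            i += 2  # consume both the letter and its marker
        else:
            result.append(text[i])
            i += 1
    return "".join(result)


if __name__ == "__main__":
    print(decode_explicit_diacritics("El nin~o comio' en el jardi'n"))
```

For example, the input `El nin~o comio' en el jardi'n` is decoded to `El niño comió en el jardín`.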

Publications


Credits


Contacts


Matador ©2003 Copyright University of Maryland. All Rights Reserved.