CMU-CS-03-138
Computer Science Department
School of Computer Science, Carnegie Mellon University



Maximal Lattice Overlap
in Example-Based Machine Translation

Rebecca Hutchinson, Paul N. Bennett, Jaime Carbonell,
Peter Jansen, Ralf Brown

June 2003

Also appears as Language Technologies Institute Technical Report
CMU-LTI-03-174


CMU-CS-03-138.ps
CMU-CS-03-138.pdf


Keywords: Machine translation, example-based machine translation, boundary friction, mutual reinforcement


Example-Based Machine Translation (EBMT) retrieves pre-translated phrases from a sentence-aligned bilingual training corpus to translate new input sentences. EBMT uses long pre-translated phrases effectively but is subject to disfluencies at phrasal translation boundaries. We address this problem by introducing a novel method that exploits overlapping phrasal translations and the increased confidence in translation accuracy they imply. We specify an efficient algorithm for producing translations using overlap. Finally, our empirical analysis indicates that this approach produces higher-quality translations than the standard method of EBMT in a peak-to-peak comparison.

12 pages

