Improved statistical machine translation using monolingual paraphrases

Research output: Contribution to journalArticle

20 Citations (Scopus)

Abstract

We propose a novel monolingual sentence paraphrasing method for augmenting the training data for statistical machine translation systems “for free” – by creating it from data that is already available rather than having to create more aligned data. Starting with a syntactic tree, we recursively generate new sentence variants where noun compounds are paraphrased using suitable prepositions, and vice-versa – preposition-containing noun phrases are turned into noun compounds. The evaluation shows an improvement equivalent to 33%-50% of that of doubling the amount of training data.

Original languageEnglish
Pages (from-to)338-342
Number of pages5
JournalFrontiers in Artificial Intelligence and Applications
Volume178
DOIs
Publication statusPublished - 1 Jun 2008

    Fingerprint

ASJC Scopus subject areas

  • Artificial Intelligence

Cite this