Improved statistical machine translation using monolingual paraphrases

Research output: Contribution to journal › Article

19 Citations (Scopus)

Abstract

We propose a novel monolingual sentence paraphrasing method for augmenting the training data for statistical machine translation systems “for free” – by creating it from data that is already available rather than having to create more aligned data. Starting with a syntactic tree, we recursively generate new sentence variants where noun compounds are paraphrased using suitable prepositions, and vice-versa – preposition-containing noun phrases are turned into noun compounds. The evaluation shows an improvement equivalent to 33%-50% of that of doubling the amount of training data.
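
The two paraphrasing directions described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the paper works recursively over a syntactic tree and chooses suitable prepositions, whereas this sketch works on plain strings, handles only two-word compounds, and lists every candidate from a small fixed set of prepositions without filtering. The names `PREPOSITIONS`, `compound_to_prepositional`, `prepositional_to_compound`, and `sentence_variants` are hypothetical and introduced only for illustration.

```python
from typing import List, Optional

# Assumed, illustrative preposition inventory (not taken from the paper).
PREPOSITIONS = ("of", "for", "in")


def compound_to_prepositional(modifier: str, head: str) -> List[str]:
    """Paraphrase the compound 'modifier head' as 'head PREP modifier'.

    Returns one candidate per preposition; no suitability filtering is done.
    """
    return [f"{head} {prep} {modifier}" for prep in PREPOSITIONS]


def prepositional_to_compound(phrase: str) -> Optional[str]:
    """Turn 'head PREP modifier' into 'modifier head'; None if it does not match."""
    words = phrase.split()
    if len(words) == 3 and words[1] in PREPOSITIONS:
        head, _, modifier = words
        return f"{modifier} {head}"
    return None


def sentence_variants(sentence: str, modifier: str, head: str) -> List[str]:
    """Create sentence variants by paraphrasing the first occurrence of a known compound."""
    compound = f"{modifier} {head}"
    if compound not in sentence:
        return []
    return [
        sentence.replace(compound, paraphrase, 1)
        for paraphrase in compound_to_prepositional(modifier, head)
    ]


if __name__ == "__main__":
    for variant in sentence_variants("the data collection began", "data", "collection"):
        print(variant)
    print(prepositional_to_compound("collection of data"))
```

In a training-data augmentation setting, each generated variant of a source sentence would be paired with the same target-side translation as the original. That pairing step is an assumption drawn from the abstract's description and is not shown here.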

Original language: English
Pages (from-to): 338-342
Number of pages: 5
Journal: Frontiers in Artificial Intelligence and Applications
Volume: 178
DOI: 10.3233/978-1-58603-891-5-338
Publication status: Published - 1 Jun 2008

Fingerprint

Syntactics

ASJC Scopus subject areas

  • Artificial Intelligence

Cite this

Improved statistical machine translation using monolingual paraphrases. / Nakov, Preslav.

In: Frontiers in Artificial Intelligence and Applications, Vol. 178, 01.06.2008, p. 338-342.

@article{f68eac7d76924cab8a9e4012d73992ee,
title = "Improved statistical machine translation using monolingual paraphrases",
abstract = "We propose a novel monolingual sentence paraphrasing method for augmenting the training data for statistical machine translation systems ``for free'' -- by creating it from data that is already available rather than having to create more aligned data. Starting with a syntactic tree, we recursively generate new sentence variants where noun compounds are paraphrased using suitable prepositions, and vice-versa -- preposition-containing noun phrases are turned into noun compounds. The evaluation shows an improvement equivalent to 33{\%}-50{\%} of that of doubling the amount of training data.",
author = "Preslav Nakov",
year = "2008",
month = "6",
day = "1",
doi = "10.3233/978-1-58603-891-5-338",
language = "English",
volume = "178",
pages = "338--342",
journal = "Frontiers in Artificial Intelligence and Applications",
issn = "0922-6389",
publisher = "IOS Press",
}

TY  - JOUR
T1  - Improved statistical machine translation using monolingual paraphrases
AU  - Nakov, Preslav
PY  - 2008/6/1
Y1  - 2008/6/1
AB  - We propose a novel monolingual sentence paraphrasing method for augmenting the training data for statistical machine translation systems “for free” – by creating it from data that is already available rather than having to create more aligned data. Starting with a syntactic tree, we recursively generate new sentence variants where noun compounds are paraphrased using suitable prepositions, and vice-versa – preposition-containing noun phrases are turned into noun compounds. The evaluation shows an improvement equivalent to 33%-50% of that of doubling the amount of training data.
UR  - http://www.scopus.com/inward/record.url?scp=85052000071&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85052000071&partnerID=8YFLogxK
U2  - 10.3233/978-1-58603-891-5-338
DO  - 10.3233/978-1-58603-891-5-338
M3  - Article
VL  - 178
SP  - 338
EP  - 342
JO  - Frontiers in Artificial Intelligence and Applications
JF  - Frontiers in Artificial Intelligence and Applications
SN  - 0922-6389
ER  -