Optimizing for sentence-level BLEU+1 yields short translations

Preslav Nakov, Francisco Guzmán, Stephan Vogel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

24 Citations (Scopus)

Abstract

We study a problem with pairwise ranking optimization (PRO): it tends to yield translations that are too short. We find that this is partially due to the inadequate smoothing in PRO's BLEU+1, which boosts the precision component of BLEU but leaves the brevity penalty unchanged, thus destroying the balance between the two compared to BLEU. It is also partially due to PRO optimizing for a sentence-level score without a global view of the overall length, which introduces a bias towards short translations; we show that letting PRO optimize a corpus-level BLEU yields a perfect length. Finally, we find some residual bias due to the interaction of PRO with BLEU+1; such a bias does not exist for a version of MIRA with sentence-level BLEU+1. We propose several ways to fix the length problem of PRO, including smoothing the brevity penalty, scaling the effective reference length, grounding the precision component, and unclipping the brevity penalty, which yield sizable improvements in test BLEU on two Arabic-English datasets: IWSLT (+0.65) and NIST (+0.37).
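To make the imbalance described above concrete, here is a minimal sketch (not the authors' implementation) of sentence-level BLEU with the standard add-one smoothing of the n-gram precisions, while the brevity penalty is left exactly as in plain BLEU; the helper sentence_bleu and the toy sentences are illustrative assumptions.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(hyp, ref, max_n=4, add_one=False):
    """Sentence-level BLEU against a single reference.

    With add_one=True, each n-gram precision is smoothed as
    (matches + 1) / (candidates + 1), i.e. BLEU+1 smoothing,
    but the brevity penalty is left unchanged.
    """
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        candidates = max(len(hyp) - n + 1, 0)
        if add_one:
            p = (matches + 1) / (candidates + 1)
        else:
            p = matches / candidates if candidates > 0 else 0.0
        if p == 0.0:
            return 0.0  # unsmoothed BLEU collapses on any zero precision
        log_prec += math.log(p) / max_n
    # Brevity penalty: the same whether or not the precisions were smoothed.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1.0 - len(ref) / len(hyp))
    return bp * math.exp(log_prec)

ref = "the cat sat on the mat".split()
short_hyp = "the cat sat".split()  # half the reference length
print(sentence_bleu(short_hyp, ref))                          # 0.0 (zero 4-gram precision)
print(round(sentence_bleu(short_hyp, ref, add_one=True), 3))  # ~0.368
```

On this toy pair, plain BLEU scores the short hypothesis 0 because its 4-gram precision is zero, whereas BLEU+1 gives it about 0.37 under the very same brevity penalty: the smoothing boosts the precision side for short outputs without a matching change on the length side, which is the kind of imbalance the abstract refers to.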

Original language: English
Title of host publication: 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers
Pages: 1979-1994
Number of pages: 16
Publication status: Published - 1 Dec 2012
Event: 24th International Conference on Computational Linguistics, COLING 2012 - Mumbai, India
Duration: 8 Dec 2012 - 15 Dec 2012

Other

Other: 24th International Conference on Computational Linguistics, COLING 2012
Country: India
City: Mumbai
Period: 8/12/12 - 15/12/12

Keywords

  • MERT
  • MIRA
  • Parameter optimization
  • PRO
  • Statistical machine translation

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Language and Linguistics
  • Linguistics and Language

Cite this

Nakov, P., Guzmán, F., & Vogel, S. (2012). Optimizing for sentence-level BLEU+1 yields short translations. In 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers (pp. 1979-1994).
