Preference grammars

Softening syntactic constraints to improve statistical machine translation

Ashish Venugopal, Andreas Zollmann, Noah A. Smith, Stephan Vogel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

20 Citations (Scopus)

Abstract

We propose a novel probabilistic synchronous context-free grammar formalism for statistical machine translation, in which syntactic nonterminal labels are represented as "soft" preferences rather than as "hard" matching constraints. This formalism allows us to efficiently score unlabeled synchronous derivations without forgoing traditional syntactic constraints. Using this score as a feature in a log-linear model, we are able to approximate the selection of the most likely unlabeled derivation. This helps reduce fragmentation of probability across differently labeled derivations of the same translation. It also allows the importance of syntactic preferences to be learned alongside other features (e.g., the language model) and for particular labeling procedures. We show improvements in translation quality on small- and medium-sized Chinese-to-English translation tasks.
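The fragmentation effect mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's algorithm (which scores unlabeled derivations efficiently and approximately inside a decoder); it simply enumerates a handful of labeled derivations and compares two selection rules. All names and probabilities are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (unlabeled derivation, labeling, probability).
# Derivation A has its probability mass split across three labelings.
labeled_derivations = [
    ("derivation A", "labeling 1", 0.2),
    ("derivation A", "labeling 2", 0.2),
    ("derivation A", "labeling 3", 0.2),
    ("derivation B", "labeling 1", 0.4),
]


def best_by_single_labeling(derivations):
    """Return the derivation behind the single most probable labeled entry."""
    return max(derivations, key=lambda d: d[2])[0]


def best_by_summed_labelings(derivations):
    """Sum probability over all labelings of each derivation, then pick the best."""
    totals = defaultdict(float)
    for derivation, _labeling, prob in derivations:
        totals[derivation] += prob
    return max(totals, key=totals.get)


print(best_by_single_labeling(labeled_derivations))   # derivation B
print(best_by_summed_labelings(labeled_derivations))  # derivation A
```

Under these assumed numbers, picking the single best labeled derivation favors B (0.4), even though A carries more total probability (about 0.6) once its labelings are summed. Selecting the most likely unlabeled derivation is meant to avoid this kind of split.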

Original language: English
Title of host publication: NAACL HLT 2009 - Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Conference
Pages: 236-244
Number of pages: 9
Publication status: Published - 1 Dec 2009
Externally published: Yes
Event: Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL HLT 2009 - Boulder, CO, United States
Duration: 31 May 2009 to 5 Jun 2009

Fingerprint

grammar
linear model
fragmentation
Formalism
Statistical Machine Translation
Grammar
Syntactic Constraints
Syntax
language
Fragmentation
English Translation
Language Model
Labeling

ASJC Scopus subject areas

  • Language and Linguistics
  • Social Sciences (miscellaneous)

Cite this

Venugopal, A., Zollmann, A., Smith, N. A., & Vogel, S. (2009). Preference grammars: Softening syntactic constraints to improve statistical machine translation. In NAACL HLT 2009 - Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Conference (pp. 236-244)

Preference grammars: Softening syntactic constraints to improve statistical machine translation. / Venugopal, Ashish; Zollmann, Andreas; Smith, Noah A.; Vogel, Stephan.

NAACL HLT 2009 - Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Conference. 2009. p. 236-244.

Venugopal, A, Zollmann, A, Smith, NA & Vogel, S 2009, Preference grammars: Softening syntactic constraints to improve statistical machine translation. in NAACL HLT 2009 - Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Conference. pp. 236-244, Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL HLT 2009, Boulder, CO, United States, 31/5/09.
Venugopal A, Zollmann A, Smith NA, Vogel S. Preference grammars: Softening syntactic constraints to improve statistical machine translation. In NAACL HLT 2009 - Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Conference. 2009. p. 236-244
Venugopal, Ashish; Zollmann, Andreas; Smith, Noah A.; Vogel, Stephan. / Preference grammars: Softening syntactic constraints to improve statistical machine translation. NAACL HLT 2009 - Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Conference. 2009. pp. 236-244
@inproceedings{88a4e8a46f47472aa74717b65f1ec2f8,
title = "Preference grammars: Softening syntactic constraints to improve statistical machine translation",
abstract = "We propose a novel probabilistic synchronous context-free grammar formalism for statistical machine translation, in which syntactic nonterminal labels are represented as {"}soft{"} preferences rather than as {"}hard{"} matching constraints. This formalism allows us to efficiently score unlabeled synchronous derivations without forgoing traditional syntactic constraints. Using this score as a feature in a log-linear model, we are able to approximate the selection of the most likely unlabeled derivation. This helps reduce fragmentation of probability across differently labeled derivations of the same translation. It also allows the importance of syntactic preferences to be learned alongside other features (e.g., the language model) and for particular labeling procedures. We show improvements in translation quality on small- and medium-sized Chinese-to-English translation tasks.",
author = "Ashish Venugopal and Andreas Zollmann and Smith, {Noah A.} and Stephan Vogel",
year = "2009",
month = "12",
day = "1",
language = "English",
isbn = "9781932432411",
pages = "236--244",
booktitle = "NAACL HLT 2009 - Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Conference",

}

TY - GEN

T1 - Preference grammars

T2 - Softening syntactic constraints to improve statistical machine translation

AU - Venugopal, Ashish

AU - Zollmann, Andreas

AU - Smith, Noah A.

AU - Vogel, Stephan

PY - 2009/12/1

Y1 - 2009/12/1

AB - We propose a novel probabilistic synchronous context-free grammar formalism for statistical machine translation, in which syntactic nonterminal labels are represented as "soft" preferences rather than as "hard" matching constraints. This formalism allows us to efficiently score unlabeled synchronous derivations without forgoing traditional syntactic constraints. Using this score as a feature in a log-linear model, we are able to approximate the selection of the most likely unlabeled derivation. This helps reduce fragmentation of probability across differently labeled derivations of the same translation. It also allows the importance of syntactic preferences to be learned alongside other features (e.g., the language model) and for particular labeling procedures. We show improvements in translation quality on small- and medium-sized Chinese-to-English translation tasks.

UR - http://www.scopus.com/inward/record.url?scp=84857524432&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84857524432&partnerID=8YFLogxK

M3 - Conference contribution

SN - 9781932432411

SP - 236

EP - 244

BT - NAACL HLT 2009 - Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Conference

ER -