An efficient two-pass approach to synchronous-CFG driven statistical MT

Ashish Venugopal, Andreas Zollmann, Stephan Vogel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

30 Citations (Scopus)

Abstract

We present an efficient, novel two-pass approach to mitigate the computational impact resulting from online intersection of an n-gram language model (LM) and a probabilistic synchronous context-free grammar (PSCFG) for statistical machine translation. In first-pass CYK-style decoding, we consider first-best chart item approximations, generating a hypergraph of sentence-spanning target-language derivations. In the second pass, we instantiate specific alternative derivations from this hypergraph, using the LM to drive this search process and recovering from search errors made in the first pass. Model search errors in our approach are comparable to those made by the state-of-the-art "Cube Pruning" approach of Chiang (2007) under comparable pruning conditions, evaluated on both hierarchical and syntax-based grammars.
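The two-pass idea in the abstract can be illustrated with a toy sketch. This is not the authors' decoder: the forest, the cost values, and the names FOREST, BIGRAM_COST, first_pass, and second_pass are all hypothetical. The first pass here ignores the LM entirely, and exhaustive enumeration stands in for the LM-driven search of the second pass. It only shows how a second pass over the stored alternatives can recover from a choice that looked best before the LM was applied.

```python
from itertools import product

# Hypothetical toy forest: node -> list of (translation cost, template).
# A template mixes target words (str) and child node references (tuple).
FOREST = {
    ("X", 0, 1): [(1.0, ["the"]), (1.2, ["a"])],
    ("X", 1, 2): [(1.0, ["house"]), (0.8, ["home"])],
    ("S", 0, 2): [(0.0, [("X", 0, 1), ("X", 1, 2)])],
}

# Hypothetical bigram costs (negative log-probabilities).
BIGRAM_COST = {
    ("<s>", "the"): 0.5, ("<s>", "a"): 0.7,
    ("the", "house"): 0.3, ("the", "home"): 2.5,
    ("a", "house"): 1.0, ("a", "home"): 1.5,
    ("house", "</s>"): 0.2, ("home", "</s>"): 0.2,
}


def lm_cost(words):
    """Sum bigram costs over the padded sentence; unseen pairs cost 5.0."""
    padded = ["<s>"] + words + ["</s>"]
    return sum(BIGRAM_COST.get(pair, 5.0) for pair in zip(padded, padded[1:]))


def first_pass(node):
    """Keep only the single cheapest LM-free derivation of each node."""
    best = None
    for tm_cost, template in FOREST[node]:
        cost, words = tm_cost, []
        for sym in template:
            if isinstance(sym, tuple):  # child node: reuse its first-best item
                child_cost, child_words = first_pass(sym)
                cost += child_cost
                words += child_words
            else:  # plain target word
                words.append(sym)
        if best is None or cost < best[0]:
            best = (cost, words)
    return best


def derivations(node):
    """Yield every (translation cost, words) derivation stored in the forest."""
    for tm_cost, template in FOREST[node]:
        options = [
            list(derivations(sym)) if isinstance(sym, tuple) else [(0.0, [sym])]
            for sym in template
        ]
        for combo in product(*options):
            total = tm_cost + sum(cost for cost, _ in combo)
            yield total, [word for _, ws in combo for word in ws]


def second_pass(root):
    """Rescore all stored alternatives with the LM and return the cheapest."""
    scored = ((tm + lm_cost(ws), ws) for tm, ws in derivations(root))
    return min(scored, key=lambda item: item[0])


root = ("S", 0, 2)
print(first_pass(root)[1])   # LM-free first-best choice
print(second_pass(root)[1])  # choice after LM rescoring of alternatives
```

In this toy forest the first pass prefers "home" because its translation cost is lower, while the second pass switches to "house" once the bigram costs are included. A real decoder would search the hypergraph selectively rather than enumerate it.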

Original language: English
Title of host publication: NAACL HLT 2007 - Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Main Conference
Pages: 500-507
Number of pages: 8
Publication status: Published - 1 Dec 2007
Externally published: Yes
Event: Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, NAACL HLT 2007 - Rochester, NY, United States
Duration: 22 Apr 2007 to 27 Apr 2007

Other

Other: Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, NAACL HLT 2007
Country: United States
City: Rochester, NY
Period: 22/4/07 to 27/4/07

Fingerprint

Language Model
Grammar
Computational
N-gram
Language
Statistical Machine Translation
Charts
Decoding
Syntax
Approximation

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Venugopal, A., Zollmann, A., & Vogel, S. (2007). An efficient two-pass approach to synchronous-CFG driven statistical MT. In NAACL HLT 2007 - Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Main Conference (pp. 500-507)

@inproceedings{9857029023d9447989529f9cae93f658,
title = "An efficient two-pass approach to synchronous-CFG driven statistical MT",
abstract = "We present an efficient, novel two-pass approach to mitigate the computational impact resulting from online intersection of an n-gram language model (LM) and a probabilistic synchronous context-free grammar (PSCFG) for statistical machine translation. In first pass CYK-style decoding, we consider first-best chart item approximations, generating a hypergraph of sentence spanning target language derivations. In the second stage, we instantiate specific alternative derivations from this hypergraph, using the LM to drive this search process, recovering from search errors made in the first pass. Model search errors in our approach are comparable to those made by the state-of-the-art {"}Cube Pruning{"} approach in (Chiang, 2007) under comparable pruning conditions evaluated on both hierarchical and syntax-based grammars.",
author = "Ashish Venugopal and Andreas Zollmann and Stephan Vogel",
year = "2007",
month = "12",
day = "1",
language = "English",
pages = "500--507",
booktitle = "NAACL HLT 2007 - Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Main Conference",

}

TY - GEN

T1 - An efficient two-pass approach to synchronous-CFG driven statistical MT

AU - Venugopal, Ashish

AU - Zollmann, Andreas

AU - Vogel, Stephan

PY - 2007/12/1

Y1 - 2007/12/1

AB - We present an efficient, novel two-pass approach to mitigate the computational impact resulting from online intersection of an n-gram language model (LM) and a probabilistic synchronous context-free grammar (PSCFG) for statistical machine translation. In first pass CYK-style decoding, we consider first-best chart item approximations, generating a hypergraph of sentence spanning target language derivations. In the second stage, we instantiate specific alternative derivations from this hypergraph, using the LM to drive this search process, recovering from search errors made in the first pass. Model search errors in our approach are comparable to those made by the state-of-the-art "Cube Pruning" approach in (Chiang, 2007) under comparable pruning conditions evaluated on both hierarchical and syntax-based grammars.

UR - http://www.scopus.com/inward/record.url?scp=84858386182&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84858386182&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84858386182

SP - 500

EP - 507

BT - NAACL HLT 2007 - Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, Proceedings of the Main Conference

ER -