Distributed language modeling for N-best list re-ranking

Ying Zhang, Almut Silja Hildebrand, Stephan Vogel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

31 Citations (Scopus)

Abstract

In this paper we describe a novel distributed language model for N-best list re-ranking. The model is based on the client/server paradigm where each server hosts a portion of the data and provides information to the client. This model allows for using an arbitrarily large corpus in a very efficient way. It also provides a natural platform for relevance weighting and selection. We applied this model to a 2.97 billion-word corpus and re-ranked the N-best list from Hiero, a state-of-the-art phrase-based system. Using BLEU as a metric, the re-ranked translation achieves a relative improvement of 4.8%, significantly better than the model-best translation.
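As an illustration of the approach the abstract sketches, the following is a minimal, single-process Python sketch of the client/server idea: each server hosts one shard of the corpus and answers n-gram count queries, and the client sums counts across shards to score and re-rank N-best hypotheses. All names here are hypothetical, and the in-process "servers" and crude backoff smoothing are assumptions of this sketch, not the paper's implementation, which adds relevance weighting and selection and serves a 2.97 billion-word corpus.

# Illustrative sketch only (not the paper's code): each CorpusShardServer
# stands in for a remote server hosting one portion of the corpus; the
# client sums n-gram counts across shards and scores hypotheses with a
# crude relative-frequency model plus backoff. All names are hypothetical.
from collections import Counter
from math import log

class CorpusShardServer:
    """Holds one corpus shard and answers n-gram count queries.
    In the real system this would be a separate networked process."""
    def __init__(self, sentences, max_order=3):
        self.counts = Counter()
        self.num_tokens = 0
        for sent in sentences:
            tokens = ["<s>"] + sent.split() + ["</s>"]
            self.num_tokens += len(tokens)
            for n in range(1, max_order + 1):
                for i in range(len(tokens) - n + 1):
                    self.counts[tuple(tokens[i:i + n])] += 1

    def count(self, ngram):
        return self.counts[ngram]  # would be an RPC in practice

class DistributedLMClient:
    """Queries every shard and combines the counts into an LM score."""
    def __init__(self, servers, order=3):
        self.servers = servers
        self.order = order

    def _total(self, ngram):
        # One query per shard per n-gram in this toy version.
        return sum(s.count(ngram) for s in self.servers)

    def score(self, hypothesis):
        tokens = ["<s>"] + hypothesis.split() + ["</s>"]
        vocab_mass = sum(s.num_tokens for s in self.servers)
        logprob = 0.0
        for i in range(1, len(tokens)):
            # Back off from the longest history that was actually seen.
            for n in range(min(self.order, i + 1), 1, -1):
                ngram = tuple(tokens[i - n + 1:i + 1])
                num, den = self._total(ngram), self._total(ngram[:-1])
                if num > 0 and den > 0:
                    logprob += log(num / den)
                    break
            else:  # unigram fallback with an add-one floor
                uni = self._total((tokens[i],))
                logprob += log((uni + 1) / (vocab_mass + 1))
        return logprob

# Toy usage: shard a tiny corpus across two "servers", re-rank a 2-best list.
shard_a = CorpusShardServer(["the cat sat on the mat", "the dog sat"])
shard_b = CorpusShardServer(["the cat ran", "a dog ran on the mat"])
client = DistributedLMClient([shard_a, shard_b])
nbest = ["the cat sat on the mat", "cat the sat mat on the"]
print(max(nbest, key=client.score))  # the fluent hypothesis scores higher

In a real deployment each count query would cross the network, so some form of batching (for example, sending all n-grams of a sentence or N-best list in one request) would presumably be needed to keep latency manageable.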

Original language: English
Title of host publication: COLING/ACL 2006 - EMNLP 2006: 2006 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
Pages: 216-223
Number of pages: 8
ISBN (Print): 1932432736, 9781932432732
Publication status: Published - 1 Dec 2006
Externally published: Yes
Event: 11th Conference on Empirical Methods in Natural Language Processing, EMNLP 2006, Held in Conjunction with COLING/ACL 2006 - Sydney, NSW, Australia
Duration: 22 Jul 2006 - 23 Jul 2006



ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems

Cite this

Zhang, Y., Hildebrand, A. S., & Vogel, S. (2006). Distributed language modeling for N-best list re-ranking. In COLING/ACL 2006 - EMNLP 2006: 2006 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (pp. 216-223).

@inproceedings{f697ab2d66b54c06ab784be0f631daef,
title = "Distributed language modeling for N-best list re-ranking",
abstract = "In this paper we describe a novel distributed language model for N-best list re-ranking. The model is based on the client/server paradigm where each server hosts a portion of the data and provides information to the client. This model allows for using an arbitrarily large corpus in a very efficient way. It also provides a natural platform for relevance weighting and selection. We applied this model on a 2.97 billion-word corpus and re-ranked the N-best list from Hiero, a state-of-theart phrase-based system. Using BLEU as a metric, the re-ranked translation achieves a relative improvement of 4.8{\%}, significantly better than the model-best translation.",
author = "Ying Zhang and Hildebrand, {Almut Silja} and Stephan Vogel",
year = "2006",
month = "12",
day = "1",
language = "English",
isbn = "1932432736",
pages = "216--223",
booktitle = "COLING/ACL 2006 - EMNLP 2006: 2006 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference",

}
