Consolidation-based speech translation and evaluation approach

Chiori Hori, Bing Zhao, Stephan Vogel, Alex Waibel, Hideki Kashioka, Satoshi Nakamura

Research output: Contribution to journal (Article)

Abstract

The performance of speech translation systems that combine automatic speech recognition (ASR) and machine translation (MT) is degraded by redundant and irrelevant information caused by speaker disfluencies and recognition errors. This paper proposes a new approach to translating speech recognition results through speech consolidation, which removes ASR errors and disfluencies and extracts meaningful phrases. The consolidation approach is derived from speech summarization by word extraction from the ASR 1-best hypothesis. We extended the consolidation approach to confusion networks (CNs), tested it on TED speech, and confirmed that the consolidation results preserved more meaningful phrases than the original ASR results. We then applied the consolidation technique to speech translation. To test the performance of consolidation-based speech translation, Chinese broadcast news (BN) speech from RT04 was recognized, consolidated, and then translated. Because consolidation-based translations are partial translations, they cannot be compared directly with gold standards in which all words in the speech are translated. We therefore propose a new evaluation framework for partial translations, which compares each partial translation with the most similar set of words extracted from a word network created by merging gradual summarizations of the gold-standard translation. The performance of consolidation-based MT results was evaluated using BLEU. We also propose Information Preservation Accuracy (IPAccy) and Meaning Preservation Accuracy (MPAccy) to evaluate consolidation and consolidation-based MT. We confirmed that consolidation contributed to the performance of speech translation.
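
The idea of scoring a partial translation against its closest reference can be pictured with a small sketch. This is a deliberately simplified illustration and not the authors' implementation: rather than merging the gradual summarizations into a word network and searching it, the sketch treats each summarization as a separate candidate reference and selects the one with the smallest word-level edit distance to the partial translation. The function names, the edit-distance criterion, and the example sentences are all assumptions made for illustration. The abstract does not define IPAccy or MPAccy, so neither is computed here.

```python
def edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # drop hypothesis word h
                            curr[j - 1] + 1,      # add reference word r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def closest_reference(hypothesis, summaries):
    """Return the summarization closest to the hypothesis (assumed criterion)."""
    hyp = hypothesis.split()
    return min(summaries, key=lambda s: edit_distance(hyp, s.split()))


# Hypothetical gradual summarizations of one gold-standard translation,
# ordered from the full sentence to the most compressed version.
summaries = [
    "the prime minister said on monday that the talks will resume next week",
    "the prime minister said the talks will resume next week",
    "prime minister said talks will resume",
]

# Hypothetical partial translation produced after consolidation.
partial = "prime minister said talks resume"

best = closest_reference(partial, summaries)
distance = edit_distance(partial.split(), best.split())
print(best, distance)
```

In this toy case the most compressed summarization is selected, so the partial translation is penalized only for the one word it misses relative to that reference, not for every word absent from the full gold-standard sentence.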

Original language: English
Pages (from-to): 477-488
Number of pages: 12
Journal: IEICE Transactions on Information and Systems
Volume: E92-D
Issue number: 3
DOIs: 10.1587/transinf.E92.D.477
Publication status: Published - 1 Dec 2009
Externally published: Yes

Fingerprint

Consolidation
Speech recognition
Merging

Keywords

  • BLEU
  • Chinese broadcast news speech
  • Chinese-English translation
  • IPAccy
  • MPAccy
  • Speech consolidation
  • Speech translation
  • TED speech

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Software
  • Artificial Intelligence
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition

Cite this

Consolidation-based speech translation and evaluation approach. / Hori, Chiori; Zhao, Bing; Vogel, Stephan; Waibel, Alex; Kashioka, Hideki; Nakamura, Satoshi.

In: IEICE Transactions on Information and Systems, Vol. E92-D, No. 3, 01.12.2009, p. 477-488.

@article{1be6b5fc1e7441f79c2caff58ffd711c,
title = "Consolidation-based speech translation and evaluation approach",
abstract = "The performance of speech translation systems that combine automatic speech recognition (ASR) and machine translation (MT) is degraded by redundant and irrelevant information caused by speaker disfluencies and recognition errors. This paper proposes a new approach to translating speech recognition results through speech consolidation, which removes ASR errors and disfluencies and extracts meaningful phrases. The consolidation approach is derived from speech summarization by word extraction from the ASR 1-best hypothesis. We extended the consolidation approach to confusion networks (CNs), tested it on TED speech, and confirmed that the consolidation results preserved more meaningful phrases than the original ASR results. We then applied the consolidation technique to speech translation. To test the performance of consolidation-based speech translation, Chinese broadcast news (BN) speech from RT04 was recognized, consolidated, and then translated. Because consolidation-based translations are partial translations, they cannot be compared directly with gold standards in which all words in the speech are translated. We therefore propose a new evaluation framework for partial translations, which compares each partial translation with the most similar set of words extracted from a word network created by merging gradual summarizations of the gold-standard translation. The performance of consolidation-based MT results was evaluated using BLEU. We also propose Information Preservation Accuracy (IPAccy) and Meaning Preservation Accuracy (MPAccy) to evaluate consolidation and consolidation-based MT. We confirmed that consolidation contributed to the performance of speech translation.",
keywords = "BLEU, Chinese broadcast news speech, Chinese-English translation, IPAccy, MPAccy, Speech consolidation, Speech translation, TED speech",
author = "Chiori Hori and Bing Zhao and Stephan Vogel and Alex Waibel and Hideki Kashioka and Satoshi Nakamura",
year = "2009",
month = "12",
day = "1",
doi = "10.1587/transinf.E92.D.477",
language = "English",
volume = "E92-D",
pages = "477--488",
journal = "IEICE Transactions on Information and Systems",
issn = "0916-8532",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "3",

}

TY - JOUR

T1 - Consolidation-based speech translation and evaluation approach

AU - Hori, Chiori

AU - Zhao, Bing

AU - Vogel, Stephan

AU - Waibel, Alex

AU - Kashioka, Hideki

AU - Nakamura, Satoshi

PY - 2009/12/1

Y1 - 2009/12/1

N2 - The performance of speech translation systems that combine automatic speech recognition (ASR) and machine translation (MT) is degraded by redundant and irrelevant information caused by speaker disfluencies and recognition errors. This paper proposes a new approach to translating speech recognition results through speech consolidation, which removes ASR errors and disfluencies and extracts meaningful phrases. The consolidation approach is derived from speech summarization by word extraction from the ASR 1-best hypothesis. We extended the consolidation approach to confusion networks (CNs), tested it on TED speech, and confirmed that the consolidation results preserved more meaningful phrases than the original ASR results. We then applied the consolidation technique to speech translation. To test the performance of consolidation-based speech translation, Chinese broadcast news (BN) speech from RT04 was recognized, consolidated, and then translated. Because consolidation-based translations are partial translations, they cannot be compared directly with gold standards in which all words in the speech are translated. We therefore propose a new evaluation framework for partial translations, which compares each partial translation with the most similar set of words extracted from a word network created by merging gradual summarizations of the gold-standard translation. The performance of consolidation-based MT results was evaluated using BLEU. We also propose Information Preservation Accuracy (IPAccy) and Meaning Preservation Accuracy (MPAccy) to evaluate consolidation and consolidation-based MT. We confirmed that consolidation contributed to the performance of speech translation.

KW - BLEU

KW - Chinese broadcast news speech

KW - Chinese-English translation

KW - IPAccy

KW - MPAccy

KW - Speech consolidation

KW - Speech translation

KW - TED speech

UR - http://www.scopus.com/inward/record.url?scp=77950329279&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77950329279&partnerID=8YFLogxK

U2 - 10.1587/transinf.E92.D.477

DO - 10.1587/transinf.E92.D.477

M3 - Article

VL - E92-D

SP - 477

EP - 488

JO - IEICE Transactions on Information and Systems

JF - IEICE Transactions on Information and Systems

SN - 0916-8532

IS - 3

ER -