Discourse structure in machine translation evaluation

Research output: Contribution to journal › Article

Abstract

In this article, we explore the potential of using sentence-level discourse structure for machine translation evaluation. We first design discourse-aware similarity measures, which use all-subtree kernels to compare discourse parse trees in accordance with the Rhetorical Structure Theory (RST). Then, we show that a simple linear combination with these measures can help improve various existing machine translation evaluation metrics regarding correlation with human judgments both at the segment level and at the system level. This suggests that discourse information is complementary to the information used by many of the existing evaluation metrics, and thus it could be taken into account when developing richer evaluation metrics, such as the WMT-14 winning combined metric DiscoTK-party. We also provide a detailed analysis of the relevance of various discourse elements and relations from the RST parse trees for machine translation evaluation. In particular, we show that (i) all aspects of the RST tree are relevant, (ii) nuclearity is more useful than relation type, and (iii) the similarity of the translation RST tree to the reference RST tree is positively correlated with translation quality.
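As a rough illustration of how a subtree kernel can compare two discourse parse trees, the sketch below counts shared tree fragments in the style of a Collins-Duffy convolution kernel and normalizes the count to the range [0, 1]. It is not the authors' implementation: the tuple encoding of trees, the node labels, the `decay` parameter, and all function names are assumptions made for this example only.

```python
# Hypothetical sketch: a tree is a nested tuple (label, child, child, ...);
# a leaf is a 1-tuple such as ("EDU",). None of this comes from the article.

def production(node):
    """Return the node label together with the labels of its children."""
    return (node[0], tuple(child[0] for child in node[1:]))

def nodes(tree):
    """Yield every node of the tree in pre-order."""
    yield tree
    for child in tree[1:]:
        yield from nodes(child)

def common_fragments(n1, n2, decay=1.0):
    """Weighted count of tree fragments rooted at both n1 and n2."""
    if len(n1) == 1 or len(n2) == 1:
        return 0.0  # leaves root no fragments
    if production(n1) != production(n2):
        return 0.0  # different expansions share nothing at this node
    score = decay
    for c1, c2 in zip(n1[1:], n2[1:]):
        # each child is either cut off here (1) or expanded further
        score *= 1.0 + common_fragments(c1, c2, decay)
    return score

def tree_kernel(t1, t2, decay=1.0):
    """Sum the shared-fragment counts over all pairs of nodes."""
    return sum(common_fragments(a, b, decay)
               for a in nodes(t1) for b in nodes(t2))

def normalized_kernel(t1, t2, decay=1.0):
    """Scale the kernel so that identical trees score 1.0."""
    denom = (tree_kernel(t1, t1, decay) * tree_kernel(t2, t2, decay)) ** 0.5
    return tree_kernel(t1, t2, decay) / denom if denom else 0.0

# Toy trees with made-up labels: a reference and a translation whose
# second span has a different nuclearity status.
reference = ("Elaboration", ("Nucleus", ("EDU",)), ("Satellite", ("EDU",)))
translation = ("Elaboration", ("Nucleus", ("EDU",)), ("Nucleus", ("EDU",)))

if __name__ == "__main__":
    print(normalized_kernel(reference, reference))
    print(normalized_kernel(reference, translation))
```

A score like this could then enter a weighted sum with an existing metric's score, which is the kind of simple linear combination the abstract refers to; the weights would have to be tuned against human judgments.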

Original language: English
Pages (from-to): 683-722
Number of pages: 40
Journal: Computational Linguistics
Volume: 43
Issue number: 4
DOIs: 10.1162/COLI_a_00298
Publication status: Published - 1 Dec 2017

Fingerprint

Discourse
Evaluation
Rhetorical Structure Theory
Discourse Structure
Machine Translation
Kernel

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
  • Computer Science Applications
  • Artificial Intelligence

Cite this

Discourse structure in machine translation evaluation. / Rayhan Joty, Shafiq; Guzmán, Francisco; Marques, Lluis; Nakov, Preslav.

In: Computational Linguistics, Vol. 43, No. 4, 01.12.2017, p. 683-722.

@article{740c861548ba4fbf81a1fd48af810203,
title = "Discourse structure in machine translation evaluation",
abstract = "In this article, we explore the potential of using sentence-level discourse structure for machine translation evaluation. We first design discourse-aware similarity measures, which use all-subtree kernels to compare discourse parse trees in accordance with the Rhetorical Structure Theory (RST). Then, we show that a simple linear combination with these measures can help improve various existing machine translation evaluation metrics regarding correlation with human judgments both at the segment level and at the system level. This suggests that discourse information is complementary to the information used by many of the existing evaluation metrics, and thus it could be taken into account when developing richer evaluation metrics, such as the WMT-14 winning combined metric DiscoTK-party. We also provide a detailed analysis of the relevance of various discourse elements and relations from the RST parse trees for machine translation evaluation. In particular, we show that (i) all aspects of the RST tree are relevant, (ii) nuclearity is more useful than relation type, and (iii) the similarity of the translation RST tree to the reference RST tree is positively correlated with translation quality.",
author = "{Rayhan Joty}, Shafiq and Francisco Guzm{\'a}n and Lluis Marques and Preslav Nakov",
year = "2017",
month = "12",
day = "1",
doi = "10.1162/COLI_a_00298",
language = "English",
volume = "43",
pages = "683--722",
journal = "Computational Linguistics",
issn = "0891-2017",
publisher = "MIT Press Journals",
number = "4",

}

TY - JOUR

T1 - Discourse structure in machine translation evaluation

AU - Rayhan Joty, Shafiq

AU - Guzmán, Francisco

AU - Marques, Lluis

AU - Nakov, Preslav

PY - 2017/12/1

Y1 - 2017/12/1

AB - In this article, we explore the potential of using sentence-level discourse structure for machine translation evaluation. We first design discourse-aware similarity measures, which use all-subtree kernels to compare discourse parse trees in accordance with the Rhetorical Structure Theory (RST). Then, we show that a simple linear combination with these measures can help improve various existing machine translation evaluation metrics regarding correlation with human judgments both at the segment level and at the system level. This suggests that discourse information is complementary to the information used by many of the existing evaluation metrics, and thus it could be taken into account when developing richer evaluation metrics, such as the WMT-14 winning combined metric DiscoTK-party. We also provide a detailed analysis of the relevance of various discourse elements and relations from the RST parse trees for machine translation evaluation. In particular, we show that (i) all aspects of the RST tree are relevant, (ii) nuclearity is more useful than relation type, and (iii) the similarity of the translation RST tree to the reference RST tree is positively correlated with translation quality.

UR - http://www.scopus.com/inward/record.url?scp=85047270339&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85047270339&partnerID=8YFLogxK

U2 - 10.1162/COLI_a_00298

DO - 10.1162/COLI_a_00298

M3 - Article

VL - 43

SP - 683

EP - 722

JO - Computational Linguistics

JF - Computational Linguistics

SN - 0891-2017

IS - 4

ER -