Generating abstractive summaries from meeting transcripts

Siddhartha Banerjee, Prasenjit Mitra, Kazunari Sugiyama

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

Meeting summaries are important because they convey the essential content of discussions in a concise form. Both participants and non-participants use meeting summaries to plan their future work. Reading and understanding whole meeting documents is generally time consuming, and readers are interested only in the important content of the discussions. In this work, we address the task of meeting document summarization. Automatic summarization systems for meeting conversations developed so far have been primarily extractive, resulting in summaries that are hard to read: the extracted utterances contain disfluencies that degrade the quality of the extractive summaries. To make summaries more readable, we propose an approach to generating abstractive summaries by fusing important content from several utterances. We first separate meeting transcripts into topic segments, and then identify the important utterances in each segment using a supervised learning approach. The important utterances are then combined to generate a one-sentence summary. In the text generation step, the dependency parses of the utterances in each segment are merged into a directed graph. The most informative and well-formed sub-graph, obtained by integer linear programming (ILP), is selected to generate a one-sentence summary for each topic segment. The ILP formulation reduces disfluencies by leveraging grammatical relations that are more prominent in non-conversational text, and therefore generates summaries that are comparable to human-written abstractive summaries. Experimental results show that our method generates more informative summaries than the baselines. In addition, readability assessments by human judges, as well as log-likelihood estimates obtained from the dependency parser, show that our generated summaries are readable and well-formed.
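The generation step described in the abstract (merge the dependency parses of a segment into a directed graph, then select the best sub-graph under constraints) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two toy parses, the relation weights, the two constraints, and the exhaustive search over 0/1 edge variables that stands in for a real ILP solver.

```python
from collections import Counter
from itertools import product

# Hypothetical dependency edges (head, relation, dependent) for two utterances
# in one topic segment. "um" and "uh" are disfluencies.
PARSES = [
    [("ROOT", "root", "discussed"), ("discussed", "nsubj", "team"),
     ("discussed", "dobj", "budget"), ("discussed", "discourse", "um")],
    [("ROOT", "root", "discussed"), ("discussed", "nsubj", "team"),
     ("discussed", "dobj", "design"), ("design", "amod", "remote"),
     ("discussed", "discourse", "uh")],
]

# Assumed weights: relations typical of written text score positively,
# conversational "discourse" relations score negatively.
RELATION_WEIGHT = {"root": 1.0, "nsubj": 1.0, "dobj": 1.0,
                   "amod": 0.5, "discourse": -1.0}
# Assumed constraint: a head takes at most one dependent for these relations.
UNIQUE_RELATIONS = {"root", "nsubj", "dobj"}


def merge_parses(parses):
    """Merge parses into one directed graph; count utterances per edge."""
    counts = Counter()
    for edges in parses:
        counts.update(set(edges))
    return counts


def is_feasible(chosen):
    """Check the two illustrative constraints on a candidate edge set."""
    dependents = {dep for _, _, dep in chosen}
    # Simplified attachment check: every head is ROOT or is itself a
    # dependent of another chosen edge (sufficient for acyclic toy data).
    if any(head != "ROOT" and head not in dependents for head, _, _ in chosen):
        return False
    slots = Counter((head, rel) for head, rel, _ in chosen
                    if rel in UNIQUE_RELATIONS)
    return all(n <= 1 for n in slots.values())


def select_subgraph(counts):
    """Enumerate all 0/1 assignments and keep the best feasible edge set."""
    edges = sorted(counts)
    best_score, best = float("-inf"), frozenset()
    for mask in product((0, 1), repeat=len(edges)):
        chosen = [e for e, x in zip(edges, mask) if x]
        if not is_feasible(chosen):
            continue
        score = sum(RELATION_WEIGHT[rel] * counts[(head, rel, dep)]
                    for head, rel, dep in chosen)
        if score > best_score:
            best_score, best = score, frozenset(chosen)
    return best, best_score


if __name__ == "__main__":
    subgraph, score = select_subgraph(merge_parses(PARSES))
    for edge in sorted(subgraph):
        print(edge)
    print("score:", score)
```

Exhaustive enumeration is only workable for a handful of edges; with realistic graphs the same objective and constraints would be handed to an ILP solver. The negative weight on the `discourse` relation is the sketch's stand-in for the idea that favouring relations common in non-conversational text pushes disfluencies out of the selected sub-graph.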

Original language: English
Title of host publication: DocEng 2015 - Proceedings of the 2015 ACM Symposium on Document Engineering
Publisher: Association for Computing Machinery, Inc
Pages: 51-60
Number of pages: 10
ISBN (Print): 9781450333078
DOI: https://doi.org/10.1145/2682571.2797061
Publication status: Published - 8 Sep 2015
Event: ACM Symposium on Document Engineering, DocEng 2015 - Lausanne, Switzerland
Duration: 8 Sep 2015 to 11 Sep 2015

Other

Other: ACM Symposium on Document Engineering, DocEng 2015
Country: Switzerland
City: Lausanne
Period: 8/9/15 to 11/9/15

Fingerprint

Linear programming
Directed graphs
Supervised learning

Keywords

  • Abstractive meeting summarization
  • Integer linear programming
  • Topic segmentation

ASJC Scopus subject areas

  • Information Systems
  • Software

Cite this

Banerjee, S., Mitra, P., & Sugiyama, K. (2015). Generating abstractive summaries from meeting transcripts. In DocEng 2015 - Proceedings of the 2015 ACM Symposium on Document Engineering (pp. 51-60). Association for Computing Machinery, Inc. https://doi.org/10.1145/2682571.2797061

@inproceedings{0273a762b70e46c286000d94851d54f9,
title = "Generating abstractive summaries from meeting transcripts",
keywords = "Abstractive meeting summarization, Integer linear programming, Topic segmentation",
author = "Siddhartha Banerjee and Prasenjit Mitra and Kazunari Sugiyama",
year = "2015",
month = "9",
day = "8",
doi = "10.1145/2682571.2797061",
language = "English",
isbn = "9781450333078",
pages = "51--60",
booktitle = "DocEng 2015 - Proceedings of the 2015 ACM Symposium on Document Engineering",
publisher = "Association for Computing Machinery, Inc",
}

TY - GEN

T1 - Generating abstractive summaries from meeting transcripts

AU - Banerjee, Siddhartha

AU - Mitra, Prasenjit

AU - Sugiyama, Kazunari

PY - 2015/9/8

Y1 - 2015/9/8

KW - Abstractive meeting summarization

KW - Integer linear programming

KW - Topic segmentation

UR - http://www.scopus.com/inward/record.url?scp=84959231684&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84959231684&partnerID=8YFLogxK

U2 - 10.1145/2682571.2797061

DO - 10.1145/2682571.2797061

M3 - Conference contribution

SN - 9781450333078

SP - 51

EP - 60

BT - DocEng 2015 - Proceedings of the 2015 ACM Symposium on Document Engineering

PB - Association for Computing Machinery, Inc

ER -