Concept segmentation and labeling for conversational speech

Marco Dinarelli, Alessandro Moschitti, Giuseppe Riccardi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

6 Citations (Scopus)

Abstract

Spoken Language Understanding (SLU) performs automatic concept labeling and segmentation of speech utterances. Many approaches to this task have been proposed, based on both generative and discriminative models. While all these methods have shown remarkable accuracy on manual transcriptions of spoken utterances, robustness to noisy automatic transcriptions is still an open issue. In this paper we study SLU algorithms that combine complementary learning models: Stochastic Finite State Transducers produce a list of hypotheses, which are then re-ranked by a discriminative algorithm based on kernel methods. Our experiments on two spoken dialog corpora, MEDIA and LUNA, show that the combined generative-discriminative model matches state-of-the-art models such as Conditional Random Fields (CRF) on manual transcriptions and is robust to noisy automatic transcriptions, in some cases outperforming the state of the art.
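
The two-stage idea in the abstract (a generative model proposes an n-best list, a kernel-based discriminative model re-ranks it) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the n-gram overlap kernel stands in for whatever kernel the authors use, the kernel perceptron stands in for their discriminative learner, and the toy hypotheses and concept labels are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Tuple

# A hypothesis is a sequence of (word, concept label) pairs.
Hypothesis = List[Tuple[str, str]]


def features(hyp: Hypothesis, n: int = 2) -> Counter:
    """Stand-in feature map: word/label unigrams plus label n-grams."""
    labels = [label for _, label in hyp]
    feats = Counter(f"{word}/{label}" for word, label in hyp)
    feats.update(tuple(labels[i:i + n]) for i in range(len(labels) - n + 1))
    return feats


def kernel(a: Hypothesis, b: Hypothesis) -> float:
    """Dot product of the two feature-count vectors (illustrative kernel)."""
    fa, fb = features(a), features(b)
    return float(sum(fa[k] * fb[k] for k in fa.keys() & fb.keys()))


def train_reranker(
    pairs: List[Tuple[Hypothesis, Hypothesis]], epochs: int = 5
) -> Callable[[Hypothesis], float]:
    """Kernel perceptron over (better, worse) hypothesis pairs.

    Returns a scoring function; higher scores mean a preferred hypothesis.
    """
    support: List[Tuple[Hypothesis, Hypothesis]] = []

    def score(hyp: Hypothesis) -> float:
        return sum(kernel(hyp, good) - kernel(hyp, bad) for good, bad in support)

    for _ in range(epochs):
        for good, bad in pairs:
            # Mistake-driven update: store the pair when it is mis-ordered.
            if score(good) <= score(bad):
                support.append((good, bad))
    return score


def rerank(nbest: List[Hypothesis], score: Callable[[Hypothesis], float]) -> Hypothesis:
    """Pick the hypothesis from the n-best list that the re-ranker prefers."""
    return max(nbest, key=score)


if __name__ == "__main__":
    # Hypothetical training pair: the first annotation is the better one.
    good = [("hotel", "OBJECT"), ("in", "O"), ("paris", "CITY")]
    bad = [("hotel", "OBJECT"), ("in", "O"), ("paris", "NAME")]
    score = train_reranker([(good, bad)])

    # Hypothetical n-best list, in the order a generative model might emit it.
    nbest = [
        [("room", "OBJECT"), ("in", "O"), ("paris", "NAME")],
        [("room", "OBJECT"), ("in", "O"), ("paris", "CITY")],
    ]
    print(rerank(nbest, score))
```

In this sketch the re-ranker ignores the generative model's own score; a practical system would likely combine both, but how that is done is not stated in the abstract.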

Original language: English
Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Pages: 2747-2750
Number of pages: 4
Publication status: Published - 2009
Externally published: Yes
Event: 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009 - Brighton, United Kingdom
Duration: 6 Sep 2009 - 10 Sep 2009

Other

Other: 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009
Country: United Kingdom
City: Brighton
Period: 6/9/09 - 10/9/09

Keywords

  • Discriminative learning
  • Kernel methods
  • Spoken language understanding

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Sensory Systems

Cite this

Dinarelli, M., Moschitti, A., & Riccardi, G. (2009). Concept segmentation and labeling for conversational speech. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 2747-2750)

@inproceedings{d7f25c9e7d474c508601ac59b723ba15,
title = "Concept segmentation and labeling for conversational speech",
abstract = "Spoken Language Understanding performs automatic concept labeling and segmentation of speech utterances. For this task, many approaches have been proposed based on both generative and discriminative models. While all these methods have shown remarkable accuracy on manual transcription of spoken utterances, robustness to noisy automatic transcription is still an open issue. In this paper we study algorithms for Spoken Language Understanding combining complementary learning models: Stochastic Finite State Transducers produce a list of hypotheses, which are re-ranked using a discriminative algorithm based on kernel methods. Our experiments on two different spoken dialog corpora, MEDIA and LUNA, show that the combined generative-discriminative model reaches the state-of-the-art such as Conditional Random Fields (CRF) on manual transcriptions, and it is robust to noisy automatic transcriptions, outperforming, in some cases, the state-of-the-art.",
keywords = "Discriminative learning, Kernel methods, Spoken language understanding",
author = "Marco Dinarelli and Alessandro Moschitti and Giuseppe Riccardi",
year = "2009",
language = "English",
pages = "2747--2750",
booktitle = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",

}

TY - GEN

T1 - Concept segmentation and labeling for conversational speech

AU - Dinarelli, Marco

AU - Moschitti, Alessandro

AU - Riccardi, Giuseppe

PY - 2009

Y1 - 2009

AB - Spoken Language Understanding performs automatic concept labeling and segmentation of speech utterances. For this task, many approaches have been proposed based on both generative and discriminative models. While all these methods have shown remarkable accuracy on manual transcription of spoken utterances, robustness to noisy automatic transcription is still an open issue. In this paper we study algorithms for Spoken Language Understanding combining complementary learning models: Stochastic Finite State Transducers produce a list of hypotheses, which are re-ranked using a discriminative algorithm based on kernel methods. Our experiments on two different spoken dialog corpora, MEDIA and LUNA, show that the combined generative-discriminative model reaches the state-of-the-art such as Conditional Random Fields (CRF) on manual transcriptions, and it is robust to noisy automatic transcriptions, outperforming, in some cases, the state-of-the-art.

KW - Discriminative learning

KW - Kernel methods

KW - Spoken language understanding

UR - http://www.scopus.com/inward/record.url?scp=70450214505&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70450214505&partnerID=8YFLogxK

M3 - Conference contribution

SP - 2747

EP - 2750

BT - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

ER -