Using dependency parsing and machine learning for factoid question answering on spoken documents

Pere R. Comas, Jordi Turmo, Lluis Marques

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

This paper presents our experiments in question answering (QA) for speech corpora. These experiments focus on improving the answer extraction step of the QA process. We present two approaches to answer extraction that apply machine learning to improve the coverage and precision of the extraction. The first is a reranker that uses only lexical information; the second uses dependency parsing to score robust similarity between syntactic structures. Our experimental results show that the proposed learning models improve on our previous results, which were obtained using only hand-made ranking rules with little syntactic information. Moreover, these results also show that a dependency parser can be useful for speech transcripts even if it was trained on written text from a news collection. We evaluate the system on manual transcripts of speech from the EPPS English corpus and a set of questions transcribed from spontaneous oral questions. This data belongs to the CLEF 2009 track on QA on speech transcripts (QAst).

Original language: English
Title of host publication: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
Pages: 1265-1268
Number of pages: 4
Publication status: Published - 1 Dec 2010
Externally published: Yes
Event: 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010 - Makuhari, Chiba, Japan
Duration: 26 Sep 2010 - 30 Sep 2010

Other

Other: 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010
Country: Japan
City: Makuhari, Chiba
Period: 26/9/10 - 30/9/10

Fingerprint

Question Answering
Machine Learning
Parsing
Dependency (Psychology)
Learning
Experiment
News
Learning Model
CLEF
Syntactic Structure
Syntax
Ranking

Keywords

  • Answer extraction
  • Oral question answering
  • Question answering
  • Speech transcriptions

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing

Cite this

Comas, P. R., Turmo, J., & Marques, L. (2010). Using dependency parsing and machine learning for factoid question answering on spoken documents. In Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010 (pp. 1265-1268)

@inproceedings{1ad1b3e4409148afb79bfd066219b711,
title = "Using dependency parsing and machine learning for factoid question answering on spoken documents",
abstract = "This paper presents our experiments in question answering (QA) for speech corpora. These experiments focus on improving the answer extraction step of the QA process. We present two approaches to answer extraction that apply machine learning to improve the coverage and precision of the extraction. The first is a reranker that uses only lexical information; the second uses dependency parsing to score robust similarity between syntactic structures. Our experimental results show that the proposed learning models improve on our previous results, which were obtained using only hand-made ranking rules with little syntactic information. Moreover, these results also show that a dependency parser can be useful for speech transcripts even if it was trained on written text from a news collection. We evaluate the system on manual transcripts of speech from the EPPS English corpus and a set of questions transcribed from spontaneous oral questions. This data belongs to the CLEF 2009 track on QA on speech transcripts (QAst).",
keywords = "Answer extraction, Oral question answering, Question answering, Speech transcriptions",
author = "Comas, {Pere R.} and Jordi Turmo and Lluis Marques",
year = "2010",
month = "12",
day = "1",
language = "English",
pages = "1265--1268",
booktitle = "Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010",

}

TY - GEN

T1 - Using dependency parsing and machine learning for factoid question answering on spoken documents

AU - Comas, Pere R.

AU - Turmo, Jordi

AU - Marques, Lluis

PY - 2010/12/1

Y1 - 2010/12/1

N2 - This paper presents our experiments in question answering (QA) for speech corpora. These experiments focus on improving the answer extraction step of the QA process. We present two approaches to answer extraction that apply machine learning to improve the coverage and precision of the extraction. The first is a reranker that uses only lexical information; the second uses dependency parsing to score robust similarity between syntactic structures. Our experimental results show that the proposed learning models improve on our previous results, which were obtained using only hand-made ranking rules with little syntactic information. Moreover, these results also show that a dependency parser can be useful for speech transcripts even if it was trained on written text from a news collection. We evaluate the system on manual transcripts of speech from the EPPS English corpus and a set of questions transcribed from spontaneous oral questions. This data belongs to the CLEF 2009 track on QA on speech transcripts (QAst).

KW - Answer extraction

KW - Oral question answering

KW - Question answering

KW - Speech transcriptions

UR - http://www.scopus.com/inward/record.url?scp=79959848786&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79959848786&partnerID=8YFLogxK

M3 - Conference contribution

SP - 1265

EP - 1268

BT - Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010

ER -