Using dependency parsing and machine learning for factoid question answering on spoken documents

Pere R. Comas, Jordi Turmo, Lluís Marquez

Research output: Contribution to conference (Paper)

Abstract

This paper presents our experiments in question answering (QA) for speech corpora. These experiments focus on improving the answer extraction step of the QA process. We present two approaches to answer extraction that apply machine learning to improve the coverage and precision of the extraction. The first is a reranker that uses only lexical information; the second uses dependency parsing to score robust similarity between syntactic structures. Our experimental results show that the proposed learning models improve on our previous results, which were obtained using only hand-made ranking rules with little syntactic information. Moreover, these results also show that a dependency parser can be useful for speech transcripts even if it was trained on written text from a news collection. We evaluate the system on manual transcripts of speech from the EPPS English corpus and on a set of questions transcribed from spontaneous oral questions. These data belong to the CLEF 2009 track on QA on speech transcripts (QAst).

Original language: English
Pages: 1265-1268
Number of pages: 4
Publication status: Published - 1 Dec 2010
Event: 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010 - Makuhari, Chiba, Japan
Duration: 26 Sep 2010 - 30 Sep 2010

Other

Other: 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010
Country: Japan
City: Makuhari, Chiba
Period: 26/9/10 - 30/9/10

Keywords

  • Answer extraction
  • Oral question answering
  • Question answering
  • Speech transcriptions

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing

Cite this

Comas, P. R., Turmo, J., & Marquez, L. (2010). Using dependency parsing and machine learning for factoid question answering on spoken documents. 1265-1268. Paper presented at 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010, Makuhari, Chiba, Japan.