Underspecified query refinement via natural language question generation

Hassan Sajjad, Patrick Pantel, Michael Gamon

Research output: Contribution to conference (Paper)

2 Citations (Scopus)

Abstract

Underspecified queries are common in vertical search engines, leading to large result sets that are difficult for users to navigate. In this paper, we show that we can automatically guide users to their target results by engaging them in a dialog consisting of well-formed binary questions mined from unstructured data. We propose a system that extracts candidate attribute-value question terms from unstructured descriptions of records in a database. These terms are then filtered using a Maximum Entropy classifier to identify those that are suitable for question formation given a user query. We then select question terms via a novel ranking function that aims to minimize the number of question turns necessary for a user to find her target result. We evaluate the quality of system-generated questions for grammaticality and refinement effectiveness. Our final system shows best results in effectiveness, percentage of well-formed questions, and percentage of answerable questions over three baseline systems.

Original language: English
Pages: 2341-2356
Number of pages: 16
Publication status: Published - 1 Dec 2012
Event: 24th International Conference on Computational Linguistics, COLING 2012 - Mumbai, India
Duration: 8 Dec 2012 to 15 Dec 2012

Keywords

  • Query refinement
  • Question generation
  • Search as a dialog

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Language and Linguistics
  • Linguistics and Language

Cite this

Sajjad, H., Pantel, P., & Gamon, M. (2012). Underspecified query refinement via natural language question generation. 2341-2356. Paper presented at 24th International Conference on Computational Linguistics, COLING 2012, Mumbai, India.