Underspecified query refinement via natural language question generation

Hassan Sajjad, Patrick Pantel, Michael Gamon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Underspecified queries are common in vertical search engines, leading to large result sets that are difficult for users to navigate. In this paper, we show that we can automatically guide users to their target results by engaging them in a dialog consisting of well-formed binary questions mined from unstructured data. We propose a system that extracts candidate attribute-value question terms from unstructured descriptions of records in a database. These terms are then filtered using a Maximum Entropy classifier to identify those that are suitable for question formation given a user query. We then select question terms via a novel ranking function that aims to minimize the number of question turns necessary for a user to find her target result. We evaluate the quality of system-generated questions for grammaticality and refinement effectiveness. Our final system shows best results in effectiveness, percentage of well-formed questions, and percentage of answerable questions over three baseline systems.

Original language: English
Title of host publication: 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers
Pages: 2341-2356
Number of pages: 16
Publication status: Published - 1 Dec 2012
Event: 24th International Conference on Computational Linguistics, COLING 2012 - Mumbai, India
Duration: 8 Dec 2012 – 15 Dec 2012

Other

Other: 24th International Conference on Computational Linguistics, COLING 2012
Country: India
City: Mumbai
Period: 8/12/12 – 15/12/12

Keywords

  • Query refinement
  • Question generation
  • Search as a dialog

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Language and Linguistics
  • Linguistics and Language

Cite this

Sajjad, H., Pantel, P., & Gamon, M. (2012). Underspecified query refinement via natural language question generation. In 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers (pp. 2341-2356)