Patent query reduction using pseudo relevance feedback

Debasis Ganguly, Johannes Leveling, Walid Magdy, Gareth J.F. Jones

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

30 Citations (Scopus)

Abstract

Queries in patent prior art search are full patent applications and much longer than standard ad hoc search and web search topics. Standard information retrieval (IR) techniques are not entirely effective for patent prior art search because of ambiguous terms in these massive queries. Reducing patent queries by extracting key terms has been shown to be ineffective mainly because it is not clear what the focus of the query is. An optimal query reduction algorithm must thus seek to retain the useful terms for retrieval favouring recall of relevant patents, but remove terms which impair IR effectiveness. We propose a new query reduction technique decomposing a patent application into constituent text segments and computing the Language Modeling (LM) similarities by calculating the probability of generating each segment from the top ranked documents. We reduce a patent query by removing the least similar segments from the query, hypothesising that removal of these segments can increase the precision of retrieval, while still retaining the useful context to achieve high recall. Experiments on the patent prior art search collection CLEF-IP 2010 show that the proposed method outperforms standard pseudo-relevance feedback (PRF) and a naive method of query reduction based on removal of unit frequency terms (UFTs).
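The segment-removal idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that an initial retrieval with the full query has already produced the top-ranked (pseudo-relevant) documents, and it scores each query segment by its length-normalised log-probability under a unigram language model built from those documents. The Jelinek-Mercer smoothing, the weight `lam`, the length normalisation, the `drop_fraction` parameter, and all function names are illustrative assumptions; the abstract does not specify them.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def segment_score(tokens, fb_counts, fb_len, bg_counts, bg_len, lam=0.5):
    """Average per-term log-probability of generating `tokens` from the
    feedback-document model, mixed with a background model.
    (Jelinek-Mercer smoothing is an assumption, not taken from the paper.)"""
    if not tokens:
        return float("-inf")
    total = 0.0
    for t in tokens:
        p_fb = fb_counts[t] / fb_len if fb_len else 0.0
        # Add-one background estimate keeps every probability non-zero.
        p_bg = (bg_counts[t] + 1) / (bg_len + len(bg_counts) + 1)
        total += math.log(lam * p_fb + (1 - lam) * p_bg)
    return total / len(tokens)


def reduce_query(segments, feedback_docs, background_docs, drop_fraction=0.25):
    """Drop the least similar segments and return the rest in original order."""
    fb_counts = Counter(t for d in feedback_docs for t in tokenize(d))
    bg_counts = Counter(t for d in background_docs for t in tokenize(d))
    fb_len = sum(fb_counts.values())
    bg_len = sum(bg_counts.values())
    scores = [
        segment_score(tokenize(s), fb_counts, fb_len, bg_counts, bg_len)
        for s in segments
    ]
    n_drop = int(len(segments) * drop_fraction)
    # Indices of the n_drop lowest-scoring (least similar) segments.
    ranked = sorted(range(len(segments)), key=lambda i: scores[i])
    dropped = set(ranked[:n_drop])
    return [s for i, s in enumerate(segments) if i not in dropped]
```

In this sketch, `segments` would be the text segments of the patent application used as the query, `feedback_docs` the top-ranked documents from the initial retrieval, and `background_docs` any larger sample of the collection used only for smoothing. The reduced query is the concatenation of the returned segments.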

Original language: English
Title of host publication: CIKM'11 - Proceedings of the 2011 ACM International Conference on Information and Knowledge Management
Pages: 1953-1956
Number of pages: 4
DOIs: https://doi.org/10.1145/2063576.2063863
Publication status: Published - 13 Dec 2011
Event: 20th ACM Conference on Information and Knowledge Management, CIKM'11 - Glasgow, United Kingdom
Duration: 24 Oct 2011 to 28 Oct 2011

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Other

Other: 20th ACM Conference on Information and Knowledge Management, CIKM'11
Country: United Kingdom
City: Glasgow
Period: 24/10/11 to 28/10/11

Keywords

  • patent search
  • pseudo-relevance feedback
  • query reduction

ASJC Scopus subject areas

  • Decision Sciences(all)
  • Business, Management and Accounting(all)

Cite this

Ganguly, D., Leveling, J., Magdy, W., & Jones, G. J. F. (2011). Patent query reduction using pseudo relevance feedback. In CIKM'11 - Proceedings of the 2011 ACM International Conference on Information and Knowledge Management (pp. 1953-1956). (International Conference on Information and Knowledge Management, Proceedings). https://doi.org/10.1145/2063576.2063863