A study on query expansion methods for patent retrieval

Walid Magdy, Gareth J F Jones

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

30 Citations (Scopus)

Abstract

Patent retrieval is a recall-oriented search task where the objective is to find all possible relevant documents. Queries in patent retrieval are typically very long since they take the form of a patent claim or even a full patent application in the case of prior-art patent search. Nevertheless, there is generally a significant mismatch between the query and the relevant documents, often leading to low retrieval effectiveness. Some previous work has tried to address this mismatch through the application of query expansion (QE) techniques, which have generally shown effectiveness for many other retrieval tasks. However, results of QE on patent search have been found to be very disappointing. We present a review of previous investigations of QE in patent retrieval, and explore some of these techniques on a prior-art patent search task. In addition, a novel method for QE using an automatically generated synonym set is presented. While previous QE techniques fail to improve over baseline retrieval, our new approach shows statistically better retrieval precision over the baseline, although not for recall. In addition, it proves to be significantly more efficient than existing techniques. An extensive analysis of the results is presented, which seeks to better understand situations where these QE techniques succeed or fail.
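
As a rough illustration of the general idea of synonym-set-based QE mentioned in the abstract, the sketch below adds down-weighted synonyms to a query. It is not the authors' method: the `SYNSETS` table, the `expand_query` function, and the `synonym_weight` value are all hypothetical and invented for this example, whereas the paper generates its synonym sets automatically.

```python
from collections import Counter

# Hypothetical synonym sets; the entries are invented purely for illustration.
SYNSETS = {
    "vehicle": {"car", "automobile"},
    "fastener": {"screw", "bolt"},
}

def expand_query(terms, synsets, synonym_weight=0.5):
    """Return a term -> weight map holding the query terms plus weighted synonyms."""
    weights = Counter()
    # Original query terms keep full weight (repeated terms accumulate).
    for term in terms:
        weights[term] += 1.0
    for term in terms:
        for synonym in synsets.get(term, ()):
            # Skip synonyms that already appear in the query itself.
            if synonym not in terms:
                weights[synonym] += synonym_weight
    return dict(weights)

expanded = expand_query(["vehicle", "fastener", "housing"], SYNSETS)
```

Giving synonyms a lower weight than the original terms is one common way to limit query drift; the value 0.5 here is an arbitrary assumption, not a figure from the paper.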

Original language: English
Title of host publication: International Conference on Information and Knowledge Management, Proceedings
Pages: 19-24
Number of pages: 6
DOIs: https://doi.org/10.1145/2064975.2064982
Publication status: Published - 15 Dec 2011
Externally published: Yes
Event: 4th Workshop on Patent Information Retrieval, PaIR'11 - Glasgow, United Kingdom
Duration: 24 Oct 2011 → 24 Oct 2011

Other

Other: 4th Workshop on Patent Information Retrieval, PaIR'11
Country: United Kingdom
City: Glasgow
Period: 24/10/11 → 24/10/11

Fingerprint

Query expansion
Patent retrieval
Patents
Mismatch
Query
Art

Keywords

  • patent retrieval
  • query expansion
  • synset

ASJC Scopus subject areas

  • Business, Management and Accounting(all)
  • Decision Sciences(all)

Cite this

Magdy, W., & Jones, G. J. F. (2011). A study on query expansion methods for patent retrieval. In International Conference on Information and Knowledge Management, Proceedings (pp. 19-24). https://doi.org/10.1145/2064975.2064982

A study on query expansion methods for patent retrieval. / Magdy, Walid; Jones, Gareth J F.

International Conference on Information and Knowledge Management, Proceedings. 2011. p. 19-24.

Magdy, W & Jones, GJF 2011, A study on query expansion methods for patent retrieval. in International Conference on Information and Knowledge Management, Proceedings. pp. 19-24, 4th Workshop on Patent Information Retrieval, PaIR'11, Glasgow, United Kingdom, 24/10/11. https://doi.org/10.1145/2064975.2064982
Magdy W, Jones GJF. A study on query expansion methods for patent retrieval. In International Conference on Information and Knowledge Management, Proceedings. 2011. p. 19-24. https://doi.org/10.1145/2064975.2064982
Magdy, Walid ; Jones, Gareth J F. / A study on query expansion methods for patent retrieval. International Conference on Information and Knowledge Management, Proceedings. 2011. pp. 19-24
@inproceedings{362ef5f237e34abba3874e49debf297c,
title = "A study on query expansion methods for patent retrieval",
abstract = "Patent retrieval is a recall-oriented search task where the objective is to find all possible relevant documents. Queries in patent retrieval are typically very long since they take the form of a patent claim or even a full patent application in the case of prior-art patent search. Nevertheless, there is generally a significant mismatch between the query and the relevant documents, often leading to low retrieval effectiveness. Some previous work has tried to address this mismatch through the application of query expansion (QE) techniques, which have generally shown effectiveness for many other retrieval tasks. However, results of QE on patent search have been found to be very disappointing. We present a review of previous investigations of QE in patent retrieval, and explore some of these techniques on a prior-art patent search task. In addition, a novel method for QE using an automatically generated synonym set is presented. While previous QE techniques fail to improve over baseline retrieval, our new approach shows statistically better retrieval precision over the baseline, although not for recall. In addition, it proves to be significantly more efficient than existing techniques. An extensive analysis of the results is presented, which seeks to better understand situations where these QE techniques succeed or fail.",
keywords = "patent retrieval, query expansion, synset",
author = "Magdy, Walid and Jones, {Gareth J F}",
year = "2011",
month = "12",
day = "15",
doi = "10.1145/2064975.2064982",
language = "English",
isbn = "9781450309554",
pages = "19--24",
booktitle = "International Conference on Information and Knowledge Management, Proceedings",
}

TY - GEN

T1 - A study on query expansion methods for patent retrieval

AU - Magdy, Walid

AU - Jones, Gareth J F

PY - 2011/12/15

Y1 - 2011/12/15

N2 - Patent retrieval is a recall-oriented search task where the objective is to find all possible relevant documents. Queries in patent retrieval are typically very long since they take the form of a patent claim or even a full patent application in the case of prior-art patent search. Nevertheless, there is generally a significant mismatch between the query and the relevant documents, often leading to low retrieval effectiveness. Some previous work has tried to address this mismatch through the application of query expansion (QE) techniques, which have generally shown effectiveness for many other retrieval tasks. However, results of QE on patent search have been found to be very disappointing. We present a review of previous investigations of QE in patent retrieval, and explore some of these techniques on a prior-art patent search task. In addition, a novel method for QE using an automatically generated synonym set is presented. While previous QE techniques fail to improve over baseline retrieval, our new approach shows statistically better retrieval precision over the baseline, although not for recall. In addition, it proves to be significantly more efficient than existing techniques. An extensive analysis of the results is presented, which seeks to better understand situations where these QE techniques succeed or fail.

AB - Patent retrieval is a recall-oriented search task where the objective is to find all possible relevant documents. Queries in patent retrieval are typically very long since they take the form of a patent claim or even a full patent application in the case of prior-art patent search. Nevertheless, there is generally a significant mismatch between the query and the relevant documents, often leading to low retrieval effectiveness. Some previous work has tried to address this mismatch through the application of query expansion (QE) techniques, which have generally shown effectiveness for many other retrieval tasks. However, results of QE on patent search have been found to be very disappointing. We present a review of previous investigations of QE in patent retrieval, and explore some of these techniques on a prior-art patent search task. In addition, a novel method for QE using an automatically generated synonym set is presented. While previous QE techniques fail to improve over baseline retrieval, our new approach shows statistically better retrieval precision over the baseline, although not for recall. In addition, it proves to be significantly more efficient than existing techniques. An extensive analysis of the results is presented, which seeks to better understand situations where these QE techniques succeed or fail.

KW - patent retrieval

KW - query expansion

KW - synset

UR - http://www.scopus.com/inward/record.url?scp=83255174131&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=83255174131&partnerID=8YFLogxK

U2 - 10.1145/2064975.2064982

DO - 10.1145/2064975.2064982

M3 - Conference contribution

AN - SCOPUS:83255174131

SN - 9781450309554

SP - 19

EP - 24

BT - International Conference on Information and Knowledge Management, Proceedings

ER -