Exploring structured documents and query formulation techniques for patent retrieval

Walid Magdy, Johannes Leveling, Gareth J F Jones

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

13 Citations (Scopus)

Abstract

This paper presents the experiments and results of DCU in CLEF-IP 2009. Our work applied standard information retrieval (IR) techniques to patent search. The experiments tested several methods for patent retrieval, including query formulation, structured indexing, weighted fields, document filtering, and blind relevance feedback. Some methods, such as blind relevance feedback, did not achieve the expected retrieval effectiveness, while others showed acceptable performance. Query formulation was the key to achieving better retrieval effectiveness, and it was performed by assigning higher weights to certain document fields. Further experiments showed that longer queries achieve better results, but at the expense of additional computation. Even for the best runs, retrieval effectiveness is still lower than in IR applications for other domains, illustrating the difficulty of patent search. The official results show that, among fifteen participants, we ranked seventh in terms of mean average precision (MAP) and fourth in terms of recall.
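The field-weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the field names, the weight values, the stopword list, and the helper names (`FIELD_WEIGHTS`, `tokenize`, `formulate_query`) are all assumptions made for illustration, since the abstract does not say which fields were weighted or by how much. The sketch builds a query from a structured patent topic by giving terms from some fields more weight than others, and `max_terms` stands in for the query-length trade-off mentioned above.

```python
from collections import Counter

# Hypothetical field weights; the abstract does not give the actual fields or values.
FIELD_WEIGHTS = {"title": 3.0, "abstract": 2.0, "claims": 1.0}

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "for", "of"}


def tokenize(text):
    """Lowercase, split on whitespace, keep alphabetic non-stopword tokens."""
    return [t for t in text.lower().split() if t.isalpha() and t not in STOPWORDS]


def formulate_query(patent, field_weights=FIELD_WEIGHTS, max_terms=5):
    """Return the top-weighted (term, weight) pairs from a structured patent.

    Each occurrence of a term adds the weight of the field it appears in,
    so terms from highly weighted fields dominate the query.
    """
    scores = Counter()
    for field, weight in field_weights.items():
        for term in tokenize(patent.get(field, "")):
            scores[term] += weight
    # Sort by descending weight, then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:max_terms]


# Toy topic document, invented for this example.
topic = {
    "title": "solar panel mount",
    "abstract": "a mount for a solar panel",
    "claims": "the mount comprises a bracket",
}
query = formulate_query(topic, max_terms=3)
print(query)  # [('mount', 6.0), ('panel', 5.0), ('solar', 5.0)]
```

Raising `max_terms` yields a longer query, which in a real system would mean more postings to score per topic; that is one plausible reading of the extra computation the abstract associates with longer queries.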

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 410-417
Number of pages: 8
Volume: 6241 LNCS
DOIs: https://doi.org/10.1007/978-3-642-15754-7_48
Publication status: Published - 5 Nov 2010
Externally published: Yes
Event: 10th Workshop of the Cross-Language Evaluation Forum, CLEF 2009 - Corfu, Greece
Duration: 30 Sep 2009 to 2 Oct 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6241 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 10th Workshop of the Cross-Language Evaluation Forum, CLEF 2009
Country: Greece
City: Corfu
Period: 30/9/09 to 2/10/09

Fingerprint

Patents
Retrieval
Query
Relevance Feedback
Formulation
Information retrieval
Experiments
Feedback
Filtering

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Magdy, W., Leveling, J., & Jones, G. J. F. (2010). Exploring structured documents and query formulation techniques for patent retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6241 LNCS, pp. 410-417). https://doi.org/10.1007/978-3-642-15754-7_48

@inproceedings{ea06a740250441c8a37e470a3fdec926,
title = "Exploring structured documents and query formulation techniques for patent retrieval",
author = "Walid Magdy and Johannes Leveling and Jones, {Gareth J F}",
year = "2010",
month = "11",
day = "5",
doi = "10.1007/978-3-642-15754-7_48",
language = "English",
isbn = "364215753X",
volume = "6241 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "410--417",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Exploring structured documents and query formulation techniques for patent retrieval

AU - Magdy, Walid

AU - Leveling, Johannes

AU - Jones, Gareth J F

PY - 2010/11/5

Y1 - 2010/11/5

UR - http://www.scopus.com/inward/record.url?scp=78049341185&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78049341185&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-15754-7_48

DO - 10.1007/978-3-642-15754-7_48

M3 - Conference contribution

SN - 364215753X

SN - 9783642157530

VL - 6241 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 410

EP - 417

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -