A figure search engine architecture for a chemistry digital library

Sagnik Ray Choudhury, Suppawong Tuarob, Prasenjit Mitra, Lior Rokach, Andi Kirk, Silvia Szep, Donald Pellegrino, Sue Jones, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

17 Citations (Scopus)

Abstract

Academic papers contain multiple figures representing important findings and experimental results; we present a search engine specifically focused on figures in academic documents. This search engine allows users to search on figures in approximately 150,000 chemistry journal articles, though the method is easily extendable to other domains. Our system indexes figure captions and mentions extracted from the PDF documents using a custom-built extractor. Recall and precision of figure extraction are in the 80 to 90% range. We give the framework for the extraction algorithm, architecture, and ranking function.

Original language: English
Title of host publication: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
Pages: 369-370
Number of pages: 2
DOIs: https://doi.org/10.1145/2467696.2467757
Publication status: Published - 2013
Externally published: Yes
Event: 13th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL 2013 - Indianapolis, IN
Duration: 22 Jul 2013 to 26 Jul 2013

Other

Other: 13th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL 2013
City: Indianapolis, IN
Period: 22/7/13 to 26/7/13

Fingerprint

Digital libraries
Search engines

Keywords

  • Figure search
  • Information extraction

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Choudhury, S. R., Tuarob, S., Mitra, P., Rokach, L., Kirk, A., Szep, S., ... Giles, C. L. (2013). A figure search engine architecture for a chemistry digital library. In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries (pp. 369-370). https://doi.org/10.1145/2467696.2467757

A figure search engine architecture for a chemistry digital library. / Choudhury, Sagnik Ray; Tuarob, Suppawong; Mitra, Prasenjit; Rokach, Lior; Kirk, Andi; Szep, Silvia; Pellegrino, Donald; Jones, Sue; Giles, C. Lee.

Proceedings of the ACM/IEEE Joint Conference on Digital Libraries. 2013. p. 369-370.

Choudhury, SR, Tuarob, S, Mitra, P, Rokach, L, Kirk, A, Szep, S, Pellegrino, D, Jones, S & Giles, CL 2013, A figure search engine architecture for a chemistry digital library. in Proceedings of the ACM/IEEE Joint Conference on Digital Libraries. pp. 369-370, 13th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL 2013, Indianapolis, IN, 22/7/13. https://doi.org/10.1145/2467696.2467757
Choudhury SR, Tuarob S, Mitra P, Rokach L, Kirk A, Szep S et al. A figure search engine architecture for a chemistry digital library. In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries. 2013. p. 369-370 https://doi.org/10.1145/2467696.2467757
Choudhury, Sagnik Ray ; Tuarob, Suppawong ; Mitra, Prasenjit ; Rokach, Lior ; Kirk, Andi ; Szep, Silvia ; Pellegrino, Donald ; Jones, Sue ; Giles, C. Lee / A figure search engine architecture for a chemistry digital library. Proceedings of the ACM/IEEE Joint Conference on Digital Libraries. 2013. pp. 369-370
@inproceedings{69ad186571a84a35b2c2240d5c769fe1,
title = "A figure search engine architecture for a chemistry digital library",
abstract = "Academic papers contain multiple figures representing important findings and experimental results; we present a search engine specifically focused on figures in academic documents. This search engine allows users to search on figures in approximately 150,000 chemistry journal articles, though the method is easily extendable to other domains. Our system indexes figure captions and mentions extracted from the PDF documents using a custom-built extractor. Recall and precision of figure extraction are in the 80 to 90{\%} range. We give the framework for the extraction algorithm, architecture, and ranking function.",
keywords = "Figure search, Information extraction",
author = "Choudhury, {Sagnik Ray} and Suppawong Tuarob and Prasenjit Mitra and Lior Rokach and Andi Kirk and Silvia Szep and Donald Pellegrino and Sue Jones and Giles, {C. Lee}",
year = "2013",
doi = "10.1145/2467696.2467757",
language = "English",
isbn = "9781450320764",
pages = "369--370",
booktitle = "Proceedings of the ACM/IEEE Joint Conference on Digital Libraries",

}

TY - GEN

T1 - A figure search engine architecture for a chemistry digital library

AU - Choudhury, Sagnik Ray

AU - Tuarob, Suppawong

AU - Mitra, Prasenjit

AU - Rokach, Lior

AU - Kirk, Andi

AU - Szep, Silvia

AU - Pellegrino, Donald

AU - Jones, Sue

AU - Giles, C. Lee

PY - 2013

Y1 - 2013

N2 - Academic papers contain multiple figures representing important findings and experimental results; we present a search engine specifically focused on figures in academic documents. This search engine allows users to search on figures in approximately 150,000 chemistry journal articles, though the method is easily extendable to other domains. Our system indexes figure captions and mentions extracted from the PDF documents using a custom-built extractor. Recall and precision of figure extraction are in the 80 to 90% range. We give the framework for the extraction algorithm, architecture, and ranking function.

AB - Academic papers contain multiple figures representing important findings and experimental results; we present a search engine specifically focused on figures in academic documents. This search engine allows users to search on figures in approximately 150,000 chemistry journal articles, though the method is easily extendable to other domains. Our system indexes figure captions and mentions extracted from the PDF documents using a custom-built extractor. Recall and precision of figure extraction are in the 80 to 90% range. We give the framework for the extraction algorithm, architecture, and ranking function.

KW - Figure search

KW - Information extraction

UR - http://www.scopus.com/inward/record.url?scp=84882283062&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84882283062&partnerID=8YFLogxK

U2 - 10.1145/2467696.2467757

DO - 10.1145/2467696.2467757

M3 - Conference contribution

AN - SCOPUS:84882283062

SN - 9781450320764

SP - 369

EP - 370

BT - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries

ER -