CMIC at INEX 2007: Book Search track

Walid Magdy, Kareem Darwish

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

With massive book digitization efforts underway, effective retrieval of books and of pages within books has become an important problem. This paper describes our submissions to the INEX 2007 Book Search track. We explored book-specific features, such as table-of-contents pages, index pages, and headers, alongside non-book-specific features. Our results show that indexing the entire contents of books together with their headers was the most effective retrieval strategy.

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 175-182
Number of pages: 8
Volume: 4862 LNCS
DOI: https://doi.org/10.1007/978-3-540-85902-4_16
Publication status: Published - 22 Sep 2008
Externally published: Yes
Event: 6th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2007 - Dagstuhl Castle, Germany
Duration: 17 Dec 2007 to 19 Dec 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4862 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 6th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2007
Country: Germany
City: Dagstuhl Castle
Period: 17/12/07 to 19/12/07

Keywords

  • Book search
  • OCR retrieval

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

Magdy, W., & Darwish, K. (2008). CMIC at INEX 2007: Book Search track. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4862 LNCS, pp. 175-182). https://doi.org/10.1007/978-3-540-85902-4_16

@inproceedings{c36bf0f8e3c843b197b39ad5888e4df5,
title = "CMIC at INEX 2007: Book Search track",
abstract = "With massive book digitization efforts underway, the need for effective retrieval of books and pages in books is an important problem. This paper describes our submissions to the INEX 2007 Book Search track. We explored using book specific features such as table of content and index pages and headers along with non-book specific features. Our results show that indexing the entire contents of books and headers provided the most effective retrieval strategy.",
keywords = "Book search, OCR retrieval",
author = "Walid Magdy and Kareem Darwish",
year = "2008",
month = "9",
day = "22",
doi = "10.1007/978-3-540-85902-4_16",
language = "English",
isbn = "3540859012",
volume = "4862 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "175--182",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
}

TY  - GEN
T1  - CMIC at INEX 2007
T2  - Book Search track
AU  - Magdy, Walid
AU  - Darwish, Kareem
PY  - 2008/9/22
Y1  - 2008/9/22
N2  - With massive book digitization efforts underway, the need for effective retrieval of books and pages in books is an important problem. This paper describes our submissions to the INEX 2007 Book Search track. We explored using book specific features such as table of content and index pages and headers along with non-book specific features. Our results show that indexing the entire contents of books and headers provided the most effective retrieval strategy.
AB  - With massive book digitization efforts underway, the need for effective retrieval of books and pages in books is an important problem. This paper describes our submissions to the INEX 2007 Book Search track. We explored using book specific features such as table of content and index pages and headers along with non-book specific features. Our results show that indexing the entire contents of books and headers provided the most effective retrieval strategy.
KW  - Book search
KW  - OCR retrieval
UR  - http://www.scopus.com/inward/record.url?scp=51849158842&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=51849158842&partnerID=8YFLogxK
U2  - 10.1007/978-3-540-85902-4_16
DO  - 10.1007/978-3-540-85902-4_16
M3  - Conference contribution
SN  - 3540859012
SN  - 9783540859017
VL  - 4862 LNCS
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 175
EP  - 182
BT  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER  -