Searching for tables in digital documents

Ying Liu, Kun Bai, Prasenjit Mitra, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

Tables are ubiquitous. In scientific documents, tables are widely used to present experimental results or statistical data in a condensed fashion. Current search engines do not allow the end-user to search for relevant tables. In this paper, we describe TableSeer, an automatic table extraction and search engine system. TableSeer crawls scientific documents, identifies documents with tables, extracts tables from documents, indexes them, and enables end-users to search for tables. We also propose an extensive set of medium-independent metadata for table representation. Given a query, TableSeer ranks the returned results using an innovative ranking algorithm, TableRank. Our results show that TableSeer outperforms popular search engines, such as Google Scholar, when the end-user searches for tables.

Original language: English
Title of host publication: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
Pages: 934-938
Number of pages: 5
Volume: 2
DOI: https://doi.org/10.1109/ICDAR.2007.4377052
Publication status: Published - 2007
Externally published: Yes
Event: 9th International Conference on Document Analysis and Recognition, ICDAR 2007 - Curitiba
Duration: 23 Sep 2007 to 26 Sep 2007

Fingerprint

Search engines
Metadata

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition

Cite this

Liu, Y., Bai, K., Mitra, P., & Giles, C. L. (2007). Searching for tables in digital documents. In Proceedings of the International Conference on Document Analysis and Recognition, ICDAR (Vol. 2, pp. 934-938). [4377052] https://doi.org/10.1109/ICDAR.2007.4377052

Searching for tables in digital documents. / Liu, Ying; Bai, Kun; Mitra, Prasenjit; Giles, C. Lee.

Proceedings of the International Conference on Document Analysis and Recognition, ICDAR. Vol. 2 2007. p. 934-938 4377052.

Liu, Y, Bai, K, Mitra, P & Giles, CL 2007, Searching for tables in digital documents. in Proceedings of the International Conference on Document Analysis and Recognition, ICDAR. vol. 2, 4377052, pp. 934-938, 9th International Conference on Document Analysis and Recognition, ICDAR 2007, Curitiba, 23/9/07. https://doi.org/10.1109/ICDAR.2007.4377052

Liu Y, Bai K, Mitra P, Giles CL. Searching for tables in digital documents. In Proceedings of the International Conference on Document Analysis and Recognition, ICDAR. Vol. 2. 2007. p. 934-938. 4377052 https://doi.org/10.1109/ICDAR.2007.4377052

Liu, Ying ; Bai, Kun ; Mitra, Prasenjit ; Giles, C. Lee / Searching for tables in digital documents. Proceedings of the International Conference on Document Analysis and Recognition, ICDAR. Vol. 2 2007. pp. 934-938

@inproceedings{6e74d932d8534711904611ea2d40ccac,
title = "Searching for tables in digital documents",
abstract = "Tables are ubiquitous. In scientific documents, tables are widely used to present experimental results or statistical data in a condensed fashion. Current search engines do not allow the end-user to search for relevant tables. In this paper, we describe TableSeer, an automatic table extraction and search engine system. TableSeer crawls scientific documents, identifies documents with tables, extracts tables from documents, indexes them, and enables end-users to search for tables. We also propose an extensive set of medium-independent metadata for table representation. Given a query, TableSeer ranks the returned results using an innovative ranking algorithm, TableRank. Our results show that TableSeer outperforms popular search engines, such as Google Scholar, when the end-user searches for tables.",
author = "Ying Liu and Kun Bai and Prasenjit Mitra and C. Lee Giles",
year = "2007",
doi = "10.1109/ICDAR.2007.4377052",
language = "English",
isbn = "0769528228",
volume = "2",
pages = "934--938",
booktitle = "Proceedings of the International Conference on Document Analysis and Recognition, ICDAR",
}

TY  - GEN
T1  - Searching for tables in digital documents
AU  - Liu, Ying
AU  - Bai, Kun
AU  - Mitra, Prasenjit
AU  - Giles, C. Lee
PY  - 2007
Y1  - 2007
N2  - Tables are ubiquitous. In scientific documents, tables are widely used to present experimental results or statistical data in a condensed fashion. Current search engines do not allow the end-user to search for relevant tables. In this paper, we describe TableSeer, an automatic table extraction and search engine system. TableSeer crawls scientific documents, identifies documents with tables, extracts tables from documents, indexes them, and enables end-users to search for tables. We also propose an extensive set of medium-independent metadata for table representation. Given a query, TableSeer ranks the returned results using an innovative ranking algorithm, TableRank. Our results show that TableSeer outperforms popular search engines, such as Google Scholar, when the end-user searches for tables.
UR  - http://www.scopus.com/inward/record.url?scp=51149113056&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=51149113056&partnerID=8YFLogxK
U2  - 10.1109/ICDAR.2007.4377052
DO  - 10.1109/ICDAR.2007.4377052
M3  - Conference contribution
SN  - 0769528228
SN  - 9780769528229
VL  - 2
SP  - 934
EP  - 938
BT  - Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
ER  - 