Arabic information retrieval

Kareem Darwish, Walid Magdy

Research output: Contribution to journal (Article)

37 Citations (Scopus)

Abstract

In the past several years, Arabic Information Retrieval (IR) has garnered significant attention. The main research interests have focused on retrieval of formal language, mostly in the news domain, with ad hoc retrieval, OCR document retrieval, and cross-language retrieval. The literature on other aspects of retrieval continues to be sparse or non-existent, though some of these aspects have been investigated by industry. Other aspects of Arabic retrieval that have received attention include document image retrieval, speech search, social media and web search, and filtering. However, efforts on different aspects of Arabic retrieval continue to be deficient and severely lagging behind efforts in other languages. The survey covers: 1) general properties of the Arabic language; 2) some of the aspects of Arabic that affect retrieval; 3) Arabic processing necessary for effective Arabic retrieval; 4) Arabic retrieval in public IR evaluations; 5) specialized retrieval problems, namely Arabic-English CLIR, Arabic Document Image Retrieval, Arabic Social Search, Arabic Web Search, Question Answering, Image retrieval, and Arabic Speech Search; 6) Arabic IR and NLP resources; and 7) open IR problems that require further attention.

Original language: English
Pages (from-to): 239-342
Number of pages: 104
Journal: Foundations and Trends in Information Retrieval
Volume: 7
Issue number: 4
DOI: 10.1561/1500000031
Publication status: Published - 1 Dec 2013

Fingerprint

Information retrieval
Image retrieval
Optical character recognition
Formal languages
Processing
Industry

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Information Systems

Cite this

Arabic information retrieval. / Darwish, Kareem; Magdy, Walid.

In: Foundations and Trends in Information Retrieval, Vol. 7, No. 4, 01.12.2013, p. 239-342.

Darwish, Kareem ; Magdy, Walid. / Arabic information retrieval. In: Foundations and Trends in Information Retrieval. 2013 ; Vol. 7, No. 4. pp. 239-342.
@article{94b5a3443a77494c80a8cedef55c0e51,
title = "Arabic information retrieval",
abstract = "In the past several years, Arabic Information Retrieval (IR) has garnered significant attention. The main research interests have focused on retrieval of formal language, mostly in the news domain, with ad hoc retrieval, OCR document retrieval, and cross-language retrieval. The literature on other aspects of retrieval continues to be sparse or non-existent, though some of these aspects have been investigated by industry. Other aspects of Arabic retrieval that have received attention include document image retrieval, speech search, social media and web search, and filtering. However, efforts on different aspects of Arabic retrieval continue to be deficient and severely lagging behind efforts in other languages. The survey covers: 1) general properties of the Arabic language; 2) some of the aspects of Arabic that affect retrieval; 3) Arabic processing necessary for effective Arabic retrieval; 4) Arabic retrieval in public IR evaluations; 5) specialized retrieval problems, namely Arabic-English CLIR, Arabic Document Image Retrieval, Arabic Social Search, Arabic Web Search, Question Answering, Image retrieval, and Arabic Speech Search; 6) Arabic IR and NLP resources; and 7) open IR problems that require further attention.",
author = "Kareem Darwish and Walid Magdy",
year = "2013",
month = "12",
day = "1",
doi = "10.1561/1500000031",
language = "English",
volume = "7",
pages = "239--342",
journal = "Foundations and Trends in Information Retrieval",
issn = "1554-0669",
publisher = "Now Publishers Inc",
number = "4",

}

TY - JOUR

T1 - Arabic information retrieval

AU - Darwish, Kareem

AU - Magdy, Walid

PY - 2013/12/1

Y1 - 2013/12/1

N2 - In the past several years, Arabic Information Retrieval (IR) has garnered significant attention. The main research interests have focused on retrieval of formal language, mostly in the news domain, with ad hoc retrieval, OCR document retrieval, and cross-language retrieval. The literature on other aspects of retrieval continues to be sparse or non-existent, though some of these aspects have been investigated by industry. Other aspects of Arabic retrieval that have received attention include document image retrieval, speech search, social media and web search, and filtering. However, efforts on different aspects of Arabic retrieval continue to be deficient and severely lagging behind efforts in other languages. The survey covers: 1) general properties of the Arabic language; 2) some of the aspects of Arabic that affect retrieval; 3) Arabic processing necessary for effective Arabic retrieval; 4) Arabic retrieval in public IR evaluations; 5) specialized retrieval problems, namely Arabic-English CLIR, Arabic Document Image Retrieval, Arabic Social Search, Arabic Web Search, Question Answering, Image retrieval, and Arabic Speech Search; 6) Arabic IR and NLP resources; and 7) open IR problems that require further attention.

UR - http://www.scopus.com/inward/record.url?scp=84893559162&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84893559162&partnerID=8YFLogxK

U2 - 10.1561/1500000031

DO - 10.1561/1500000031

M3 - Article

VL - 7

SP - 239

EP - 342

JO - Foundations and Trends in Information Retrieval

JF - Foundations and Trends in Information Retrieval

SN - 1554-0669

IS - 4

ER -