GRAIL

A scalable index for reachability queries in very large graphs

Hilmi Yildirim, Vineet Chaoji, Mohammed J. Zaki

Research output: Contribution to journal › Article

43 Citations (Scopus)

Abstract

Given a large directed graph, rapidly answering reachability queries between source and target nodes is an important problem. Existing methods for reachability trade off indexing time and space versus query time performance. However, the biggest limitation of existing methods is that they do not scale to very large real-world graphs. We present a simple yet scalable reachability index, called GRAIL, that is based on the idea of randomized interval labeling and that can effectively handle very large graphs. Based on an extensive set of experiments, we show that while more sophisticated methods work better on small graphs, GRAIL is the only index that can scale to millions of nodes and edges. GRAIL has linear indexing time and space, and the query time ranges from constant time to being linear in the graph order and size. Our reference C++ implementations are open source and available for download at http://www.code.google.com/p/grail/.
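The abstract names randomized interval labeling without spelling it out. The following is a minimal Python sketch of the general idea, not the authors' reference C++ implementation. It assumes the input is already a DAG (cyclic graphs would first need their strongly connected components collapsed), and the names `build_labels`, `contains`, and `reachable` are hypothetical. Each of the `d` randomized post-order traversals gives every node an interval; if the target's intervals are not all contained in the source's, the answer is "no" in constant time, and otherwise a pruned depth-first search settles the query.

```python
import random


def build_labels(graph, d=2, seed=0):
    """Assign d random interval labels to every node of a DAG.

    graph maps each node to a list of its successors. The recursion is
    kept for brevity; a deep graph would need an iterative traversal.
    """
    rng = random.Random(seed)
    nodes = set(graph)
    for succs in graph.values():
        nodes.update(succs)
    with_parent = {c for succs in graph.values() for c in succs}
    roots = sorted(n for n in nodes if n not in with_parent)
    labels = {n: [] for n in nodes}

    for _ in range(d):
        interval = {}
        rank = 0

        def visit(u):
            nonlocal rank
            if u in interval:
                return
            children = list(graph.get(u, ()))
            rng.shuffle(children)  # random child order per traversal
            for c in children:
                visit(c)
            rank += 1  # post-order rank of u
            start = min([rank] + [interval[c][0] for c in children])
            interval[u] = (start, rank)

        order = roots[:]
        rng.shuffle(order)
        for r in order:
            visit(r)
        for n in nodes:
            labels[n].append(interval[n])
    return labels


def contains(outer, inner):
    """True if every interval of inner lies inside the matching one of outer."""
    return all(so <= si and ei <= eo
               for (so, eo), (si, ei) in zip(outer, inner))


def reachable(graph, labels, u, v):
    """Answer whether v is reachable from u, pruning with the labels."""
    if u == v:
        return True
    if not contains(labels[u], labels[v]):
        return False  # non-containment proves unreachability
    stack, seen = [u], {u}
    while stack:
        x = stack.pop()
        for c in graph.get(x, ()):
            if c == v:
                return True
            if c not in seen and contains(labels[c], labels[v]):
                seen.add(c)
                stack.append(c)
    return False


g = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "e": ["d"], "d": []}
labels = build_labels(g, d=3, seed=1)
print(reachable(g, labels, "a", "d"))  # True
print(reachable(g, labels, "d", "a"))  # False
```

Containment is only a necessary condition for reachability, which is why a positive containment test still falls back to a search. Using several independent random traversals makes it more likely that at least one of them exposes an unreachable pair.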

Original language: English
Pages (from-to): 509-534
Number of pages: 26
Journal: VLDB Journal
Volume: 21
Issue number: 4
DOIs: 10.1007/s00778-011-0256-4
Publication status: Published - 1 Aug 2012
Externally published: Yes

Fingerprint

Directed graphs
Labeling
Experiments

Keywords

  • Graph query processing
  • Reachability queries
  • Scalable graph indexing

ASJC Scopus subject areas

  • Hardware and Architecture
  • Information Systems

Cite this

GRAIL: A scalable index for reachability queries in very large graphs. / Yildirim, Hilmi; Chaoji, Vineet; Zaki, Mohammed J.

In: VLDB Journal, Vol. 21, No. 4, 01.08.2012, p. 509-534.

Yildirim, Hilmi; Chaoji, Vineet; Zaki, Mohammed J. / GRAIL: A scalable index for reachability queries in very large graphs. In: VLDB Journal. 2012; Vol. 21, No. 4. pp. 509-534.
@article{349c1856f2764f1196e65fc4371ba7aa,
title = "GRAIL: A scalable index for reachability queries in very large graphs",
abstract = "Given a large directed graph, rapidly answering reachability queries between source and target nodes is an important problem. Existing methods for reachability trade off indexing time and space versus query time performance. However, the biggest limitation of existing methods is that they do not scale to very large real-world graphs. We present a simple yet scalable reachability index, called GRAIL, that is based on the idea of randomized interval labeling and that can effectively handle very large graphs. Based on an extensive set of experiments, we show that while more sophisticated methods work better on small graphs, GRAIL is the only index that can scale to millions of nodes and edges. GRAIL has linear indexing time and space, and the query time ranges from constant time to being linear in the graph order and size. Our reference C++ implementations are open source and available for download at http://www.code.google.com/p/grail/.",
keywords = "Graph query processing, Reachability queries, Scalable graph indexing",
author = "Hilmi Yildirim and Vineet Chaoji and Zaki, {Mohammed J.}",
year = "2012",
month = "8",
day = "1",
doi = "10.1007/s00778-011-0256-4",
language = "English",
volume = "21",
pages = "509--534",
journal = "VLDB Journal",
issn = "1066-8888",
publisher = "Springer New York",
number = "4",

}

TY - JOUR

T1 - GRAIL

T2 - A scalable index for reachability queries in very large graphs

AU - Yildirim, Hilmi

AU - Chaoji, Vineet

AU - Zaki, Mohammed J.

PY - 2012/8/1

Y1 - 2012/8/1

AB - Given a large directed graph, rapidly answering reachability queries between source and target nodes is an important problem. Existing methods for reachability trade off indexing time and space versus query time performance. However, the biggest limitation of existing methods is that they do not scale to very large real-world graphs. We present a simple yet scalable reachability index, called GRAIL, that is based on the idea of randomized interval labeling and that can effectively handle very large graphs. Based on an extensive set of experiments, we show that while more sophisticated methods work better on small graphs, GRAIL is the only index that can scale to millions of nodes and edges. GRAIL has linear indexing time and space, and the query time ranges from constant time to being linear in the graph order and size. Our reference C++ implementations are open source and available for download at http://www.code.google.com/p/grail/.

KW - Graph query processing

KW - Reachability queries

KW - Scalable graph indexing

UR - http://www.scopus.com/inward/record.url?scp=84864278510&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84864278510&partnerID=8YFLogxK

U2 - 10.1007/s00778-011-0256-4

DO - 10.1007/s00778-011-0256-4

M3 - Article

VL - 21

SP - 509

EP - 534

JO - VLDB Journal

JF - VLDB Journal

SN - 1066-8888

IS - 4

ER -