Mining and indexing graphs for supergraph search

Dayu Yuan, Prasenjit Mitra, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

12 Citations (Scopus)

Abstract

We study supergraph search (SPS), that is, given a query graph q and a graph database G that contains a collection of graphs, return graphs that have q as a supergraph from G. SPS has broad applications in bioinformatics, cheminformatics and other scientific and commercial fields. Determining whether a graph is a subgraph (or supergraph) of another is an NP-complete problem. Hence, it is intractable to compute SPS for large graph databases. Two separate indexing methods, a "filter + verify"-based method and a "prefix-sharing"-based method, have been studied to efficiently compute SPS. To implement the above two methods, subgraph patterns are mined from the graph database to build an index. Those subgraphs are mined to optimize either the filtering gain or the prefix-sharing gain. However, no single subgraph-mining algorithm considers both gains. This work is the first one to mine subgraphs to optimize both the filtering gain and the prefix-sharing gain while processing SPS queries. First, we show that the subgraph-mining problem is NP-hard. Then, we propose two polynomial-time algorithms to solve the problem with an approximation ratio of 1-1/e and 1/4 respectively. In addition, we construct a lattice-like index, LW-index, to organize the selected subgraph patterns for fast index-lookup. Our experiments show that our approach improves the query processing time for SPS queries by a factor of 3 to 10.
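To make the "filter + verify" idea concrete, here is a minimal Python sketch. It is not the paper's LW-index or its mining algorithms. It only illustrates the general principle that if an indexed subgraph pattern is not contained in the query q, then every database graph containing that pattern cannot be a subgraph of q and can be skipped before verification. The graph representation, the helper names (`make_graph`, `is_subgraph`, `build_index`, `supergraph_search`), the toy data, and the brute-force containment test are all assumptions made for illustration.

```python
from itertools import permutations

def make_graph(labels, edges):
    """A toy labeled graph: (vertex -> label dict, set of undirected edges)."""
    return labels, {frozenset(e) for e in edges}

def is_subgraph(small, big):
    """Brute-force labeled subgraph test (exponential; tiny graphs only)."""
    s_labels, s_edges = small
    b_labels, b_edges = big
    s_nodes = list(s_labels)
    # Try every injective mapping of small's vertices into big's vertices.
    for image in permutations(b_labels, len(s_nodes)):
        mapping = dict(zip(s_nodes, image))
        labels_ok = all(s_labels[v] == b_labels[mapping[v]] for v in s_nodes)
        edges_ok = all(frozenset(mapping[v] for v in e) in b_edges
                       for e in s_edges)
        if labels_ok and edges_ok:
            return True
    return False

def build_index(features, database):
    """For each feature, record the ids of database graphs that contain it."""
    return [(f, {gid for gid, g in database.items() if is_subgraph(f, g)})
            for f in features]

def supergraph_search(query, database, index):
    """Return ids of database graphs that are subgraphs of the query."""
    candidates = set(database)
    # Filter: a feature missing from the query rules out every graph
    # that contains that feature.
    for feature, containing in index:
        if not is_subgraph(feature, query):
            candidates -= containing
    # Verify: run the expensive containment test only on the survivors.
    return sorted(gid for gid in candidates
                  if is_subgraph(database[gid], query))

# Hypothetical toy data: vertex labels are atom symbols.
q = make_graph({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)])
db = {
    "g1": make_graph({0: "C", 1: "C"}, [(0, 1)]),
    "g2": make_graph({0: "C", 1: "O"}, [(0, 1)]),
    "g3": make_graph({0: "C", 1: "N"}, [(0, 1)]),
    "g4": make_graph({0: "C", 1: "C", 2: "N"}, [(0, 1), (1, 2)]),
}
index = build_index([make_graph({0: "C", 1: "N"}, [(0, 1)])], db)
result = supergraph_search(q, db, index)
```

In this toy run the single indexed pattern (a C-N edge) is absent from q, so g3 and g4 are discarded without a verification test, and only g1 and g2 are verified. Which patterns to index is exactly the mining problem the abstract describes; this sketch simply takes the pattern list as given.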

Original language: English
Title of host publication: Proceedings of the VLDB Endowment
Pages: 829-840
Number of pages: 12
Volume: 6
Edition: 10
Publication status: Published - Aug 2013
Externally published: Yes

Fingerprint

Bioinformatics
Computational complexity
Polynomials
Processing

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science (all)

Cite this

Yuan, D., Mitra, P., & Giles, C. L. (2013). Mining and indexing graphs for supergraph search. In Proceedings of the VLDB Endowment (10 ed., Vol. 6, pp. 829-840)
