Adding regular expressions to graph reachability and pattern queries

Wenfei Fan, Jianzhong Li, Shuai Ma, Nan Tang, Yinghui Wu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

83 Citations (Scopus)

Abstract

It is increasingly common to find graphs in which edges bear different types, indicating a variety of relationships. For such graphs we propose a class of reachability queries and a class of graph patterns, in which an edge is specified with a regular expression of a certain form, expressing the connectivity in a data graph via edges of various types. In addition, we define graph pattern matching based on a revised notion of graph simulation. On graphs in emerging applications such as social networks, we show that these queries are capable of finding more sensible information than their traditional counterparts. Better still, their increased expressive power does not come with extra complexity. Indeed, (1) we investigate their containment and minimization problems, and show that these fundamental problems are in quadratic time for reachability queries and are in cubic time for pattern queries. (2) We develop an algorithm for answering reachability queries, in quadratic time as for their traditional counterpart. (3) We provide two cubic-time algorithms for evaluating graph pattern queries based on extended graph simulation, as opposed to the NP-completeness of graph pattern matching via subgraph isomorphism. (4) The effectiveness, efficiency and scalability of these algorithms are experimentally verified using real-life data and synthetic data.
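To make the idea of a reachability query constrained by edge types concrete, here is a minimal Python sketch. It is not the authors' quadratic-time algorithm. The abstract says only that edges are specified by "a regular expression of a certain form", so the query shape used here is an assumption made for illustration: a sequence of steps, each an edge type repeated at least once and at most `bound` times, with `None` meaning no upper limit. The function name `regex_reachable` and the sample graph are hypothetical.

```python
from collections import deque


def regex_reachable(edges, source, target, steps):
    """Return True if a path from source to target matches the step sequence.

    edges: iterable of (u, edge_type, v) triples for a directed graph.
    steps: non-empty list of (edge_type, bound); each step consumes between
           1 and `bound` consecutive edges of that type (bound=None: no limit).
    This query form is an illustrative assumption, not the paper's definition.
    """
    if not steps:
        raise ValueError("steps must contain at least one (edge_type, bound) pair")

    # Adjacency list keyed by source node.
    adj = {}
    for u, t, v in edges:
        adj.setdefault(u, []).append((t, v))

    last = len(steps) - 1
    # A search state is (node, index of current step, edges used in that step).
    start = (source, 0, 0)
    seen = {start}
    queue = deque([start])

    while queue:
        node, i, used = queue.popleft()
        # Accept once the final step has consumed at least one edge.
        if node == target and i == last and used >= 1:
            return True
        etype, bound = steps[i]
        for t, nxt in adj.get(node, []):
            candidates = []
            # Option 1: stay in the current step if its bound allows another edge.
            if t == etype and (bound is None or used < bound):
                # For unbounded steps the exact count is irrelevant; cap it at 1.
                new_used = 1 if bound is None else used + 1
                candidates.append((nxt, i, new_used))
            # Option 2: the current step is satisfied, so begin the next one.
            if used >= 1 and i < last and t == steps[i + 1][0]:
                candidates.append((nxt, i + 1, 1))
            for state in candidates:
                if state not in seen:
                    seen.add(state)
                    queue.append(state)
    return False


# Hypothetical social graph: a -friend-> b -friend-> c -colleague-> d
sample_edges = [
    ("a", "friend", "b"),
    ("b", "friend", "c"),
    ("c", "colleague", "d"),
]
print(regex_reachable(sample_edges, "a", "d", [("friend", 2), ("colleague", 1)]))
print(regex_reachable(sample_edges, "a", "d", [("friend", 1), ("colleague", 1)]))
```

The search is a breadth-first traversal over pairs of graph node and query position, which is one common way to evaluate path constraints of this kind. A traditional reachability query corresponds to ignoring the edge types altogether.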

Original language: English
Title of host publication: Proceedings - International Conference on Data Engineering
Pages: 39-50
Number of pages: 12
DOI: https://doi.org/10.1109/ICDE.2011.5767858
Publication status: Published - 6 Jun 2011
Externally published: Yes
Event: 2011 IEEE 27th International Conference on Data Engineering, ICDE 2011 - Hannover, Germany
Duration: 11 Apr 2011 to 16 Apr 2011

Fingerprint

Pattern matching
Scalability

ASJC Scopus subject areas

  • Information Systems
  • Signal Processing
  • Software

Cite this

Fan, W., Li, J., Ma, S., Tang, N., & Wu, Y. (2011). Adding regular expressions to graph reachability and pattern queries. In Proceedings - International Conference on Data Engineering (pp. 39-50). [5767858] https://doi.org/10.1109/ICDE.2011.5767858

@inproceedings{185eaaf6409948b08d8cc490aece8031,
title = "Adding regular expressions to graph reachability and pattern queries",
author = "Wenfei Fan and Jianzhong Li and Shuai Ma and Nan Tang and Yinghui Wu",
year = "2011",
month = "6",
day = "6",
doi = "10.1109/ICDE.2011.5767858",
language = "English",
isbn = "9781424489589",
pages = "39--50",
booktitle = "Proceedings - International Conference on Data Engineering",

}

TY - GEN
T1 - Adding regular expressions to graph reachability and pattern queries
AU - Fan, Wenfei
AU - Li, Jianzhong
AU - Ma, Shuai
AU - Tang, Nan
AU - Wu, Yinghui
PY - 2011/6/6
Y1 - 2011/6/6
UR - http://www.scopus.com/inward/record.url?scp=79957828078&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79957828078&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2011.5767858
DO - 10.1109/ICDE.2011.5767858
M3 - Conference contribution
AN - SCOPUS:79957828078
SN - 9781424489589
SP - 39
EP - 50
BT - Proceedings - International Conference on Data Engineering
ER -