Graph stream summarization: From big bang to big crunch

Nan Tang, Qing Chen, Prasenjit Mitra

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

16 Citations (Scopus)

Abstract

A graph stream, that is, a graph whose edges are updated sequentially in the form of a stream, has important applications in cyber security and social networks. Due to the sheer volume and highly dynamic nature of graph streams, the practical way of handling them is by summarization. Given a graph stream G, directed or undirected, the problem of graph stream summarization is to summarize G as SG with a much smaller (sublinear) space, linear construction time and constant maintenance cost for each edge update, such that SG allows many queries over G to be answered approximately and efficiently. The widely used practice of summarizing data streams is to treat each stream element independently, e.g., by hash- or sample-based methods, without maintaining the connections (or relationships) between elements. Hence, existing methods can only solve ad-hoc problems and do not support diversified and complicated analytics over graph streams. We present TCM, a novel graph stream summary. Given an incoming edge, it summarizes both node and edge information in constant time. Consequently, the summary forms a graphical sketch where edges capture the connections inside elements, and nodes maintain relationships across elements. We discuss a wide range of supported queries and establish some error bounds. In addition, we experimentally show that TCM can effectively and efficiently support analytics over graph streams beyond the power of existing sketches, which demonstrates its potential to start a new line of research and applications in graph stream management.
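To make the idea of a "graphical sketch" concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes one common way to realize such a summary: each node is hashed into one of a fixed number of buckets, and edge weights are aggregated in a bucket-by-bucket matrix, with several independent hash functions to reduce the effect of collisions. The class name `GraphSketch`, the parameters `width` and `depth`, and the use of SHA-256 for hashing are all assumptions made for this example.

```python
import hashlib


class GraphSketch:
    """Toy hash-based graph stream summary (illustrative only)."""

    def __init__(self, width, depth):
        self.width = width    # number of node buckets per matrix (assumed parameter)
        self.depth = depth    # number of independent hash functions (assumed parameter)
        # One width x width matrix of aggregated edge weights per hash function.
        self.matrices = [
            [[0] * width for _ in range(width)] for _ in range(depth)
        ]

    def _bucket(self, node, i):
        # Deterministic hash of the node label, salted by the matrix index i.
        digest = hashlib.sha256(f"{i}:{node}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add_edge(self, src, dst, weight=1):
        # One cell update per matrix: constant work for a fixed depth.
        for i, matrix in enumerate(self.matrices):
            matrix[self._bucket(src, i)][self._bucket(dst, i)] += weight

    def edge_weight(self, src, dst):
        # Collisions can only add weight, so the minimum is the tightest estimate.
        return min(
            matrix[self._bucket(src, i)][self._bucket(dst, i)]
            for i, matrix in enumerate(self.matrices)
        )

    def out_weight(self, src):
        # Estimated total weight of edges leaving src: a row sum in each matrix.
        return min(
            sum(matrix[self._bucket(src, i)])
            for i, matrix in enumerate(self.matrices)
        )


sketch = GraphSketch(width=16, depth=3)
for s, d, w in [("a", "b", 2), ("a", "c", 1), ("a", "b", 1)]:
    sketch.add_edge(s, d, w)
print(sketch.edge_weight("a", "b"), sketch.out_weight("a"))
```

Because the matrix keeps source and destination buckets together, node-level queries such as `out_weight` can be answered from the same structure as edge-level queries. With non-negative weights, every estimate in this toy version is an upper bound on the true value, since colliding edges can only add to a cell.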

Original language: English
Title of host publication: SIGMOD 2016 - Proceedings of the 2016 International Conference on Management of Data
Publisher: Association for Computing Machinery
Pages: 1481-1496
Number of pages: 16
Volume: 26-June-2016
ISBN (Electronic): 9781450335317
DOIs: https://doi.org/10.1145/2882903.2915223
Publication status: Published - 26 Jun 2016
Event: 2016 ACM SIGMOD International Conference on Management of Data, SIGMOD 2016 - San Francisco, United States
Duration: 26 Jun 2016 - 1 Jul 2016

Other

Other: 2016 ACM SIGMOD International Conference on Management of Data, SIGMOD 2016
Country: United States
City: San Francisco
Period: 26/6/16 - 1/7/16

Keywords

  • Data streams
  • Graph streams
  • Sketch
  • Summarization

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Tang, N., Chen, Q., & Mitra, P. (2016). Graph stream summarization: From big bang to big crunch. In SIGMOD 2016 - Proceedings of the 2016 International Conference on Management of Data (Vol. 26-June-2016, pp. 1481-1496). Association for Computing Machinery. https://doi.org/10.1145/2882903.2915223