GUISE

A uniform sampler for constructing frequency histogram of graphlets

Mahmudur Rahman, Mansurul Alam Bhuiyan, Mahmuda Rahman, Mohammad Al Hasan

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

Graphlet frequency distribution (GFD) has recently become popular for characterizing large networks. However, computing the GFD of a network requires the exact count of embedded graphlets in that network, which is computationally expensive. As a result, it is practically infeasible to compute the GFD for even a moderately large network. In this paper, we propose Guise, which uses a Markov Chain Monte Carlo sampling method to construct the approximate GFD of a large network. Our experiments on networks with millions of nodes show that Guise obtains the GFD with a very low error rate within a few minutes, whereas the exhaustive counting-based approach takes several days.
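
To make the sampling idea in the abstract concrete, here is a minimal sketch of a Metropolis-Hastings walk that samples connected 3-node subgraphs uniformly and compares the estimated frequencies with an exhaustive count. It is not the authors' implementation. The restriction to 3-node graphlets, the "swap one node" neighborhood, the use of plain relative frequencies as the histogram, and all function names (`edge_count`, `neighbors`, `exact_gfd`, `sample_gfd`) are assumptions made for illustration; this record does not describe the paper's actual neighborhood or normalization. The sketch also assumes a connected, simple, undirected graph with more than three nodes.

```python
import random
from collections import Counter
from itertools import combinations


def edge_count(adj, nodes):
    """Number of graph edges among the given nodes."""
    return sum(1 for u, v in combinations(nodes, 2) if v in adj[u])


def graphlet_type(adj, nodes):
    """Three connected nodes form a path (2 edges) or a triangle (3 edges)."""
    return "triangle" if edge_count(adj, nodes) == 3 else "path"


def neighbors(adj, state):
    """Connected 3-node sets that differ from `state` in exactly one node."""
    found = set()
    for dropped in state:
        kept = state - {dropped}
        # A replacement node must touch at least one of the two kept nodes.
        candidates = set().union(*(adj[v] for v in kept)) - state
        for new in candidates:
            candidate = kept | {new}
            if edge_count(adj, candidate) >= 2:  # 3 nodes, >= 2 edges: connected
                found.add(candidate)
    return sorted(found, key=sorted)


def exact_gfd(adj):
    """Exhaustive baseline: inspect every 3-node combination (cubic cost)."""
    counts = Counter()
    for nodes in combinations(sorted(adj), 3):
        if edge_count(adj, nodes) >= 2:
            counts[graphlet_type(adj, nodes)] += 1
    total = sum(counts.values())
    return {kind: counts[kind] / total for kind in ("path", "triangle")}


def sample_gfd(adj, start, steps, seed=0):
    """Estimate the same frequencies with a Metropolis-Hastings random walk."""
    rng = random.Random(seed)
    state = frozenset(start)
    nbrs = neighbors(adj, state)
    counts = Counter()
    for _ in range(steps):
        proposal = rng.choice(nbrs)
        proposal_nbrs = neighbors(adj, proposal)
        # Accepting with min(1, d(current) / d(proposal)) cancels the bias
        # toward states with many neighbors, so the target is uniform.
        if rng.random() < min(1.0, len(nbrs) / len(proposal_nbrs)):
            state, nbrs = proposal, proposal_nbrs
        counts[graphlet_type(adj, state)] += 1
    total = sum(counts.values())
    return {kind: counts[kind] / total for kind in ("path", "triangle")}


if __name__ == "__main__":
    # Hypothetical toy graph: two triangles joined by the edge 2-3.
    edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    print("exact:  ", exact_gfd(adj))
    print("sampled:", sample_gfd(adj, start=(0, 1, 2), steps=20000))
```

The exhaustive baseline touches every node triple, which is what becomes infeasible on large networks, while each step of the walk only inspects the local neighborhood of the current subgraph. The walk's estimate is only as good as its mixing, so the number of steps is a tuning choice rather than a guarantee.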

Original language: English
Pages (from-to): 511-536
Number of pages: 26
Journal: Knowledge and Information Systems
Volume: 38
Issue number: 3
DOI: 10.1007/s10115-013-0673-3
Publication status: Published - 1 Jan 2014

Fingerprint

Markov processes
Sampling
Experiments

Keywords

  • Graph analysis
  • Graph mining
  • Graphlet counting
  • Graphlet degree distribution
  • Graphlet sampling
  • MCMC sampling
  • Subgraph concentration
  • Uniform sampling

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Information Systems
  • Hardware and Architecture
  • Human-Computer Interaction

Cite this

GUISE: A uniform sampler for constructing frequency histogram of graphlets. / Rahman, Mahmudur; Bhuiyan, Mansurul Alam; Rahman, Mahmuda; Hasan, Mohammad Al.

In: Knowledge and Information Systems, Vol. 38, No. 3, 01.01.2014, p. 511-536.

@article{c96892507481479cb3823547181d7b9c,
title = "GUISE: A uniform sampler for constructing frequency histogram of graphlets",
abstract = "Graphlet frequency distribution (GFD) has recently become popular for characterizing large networks. However, computing the GFD of a network requires the exact count of embedded graphlets in that network, which is computationally expensive. As a result, it is practically infeasible to compute the GFD for even a moderately large network. In this paper, we propose Guise, which uses a Markov Chain Monte Carlo sampling method to construct the approximate GFD of a large network. Our experiments on networks with millions of nodes show that Guise obtains the GFD with a very low error rate within a few minutes, whereas the exhaustive counting-based approach takes several days.",
keywords = "Graph analysis, Graph mining, Graphlet counting, Graphlet degree distribution, Graphlet sampling, MCMC sampling, Subgraph concentration, Uniform sampling",
author = "Mahmudur Rahman and Bhuiyan, {Mansurul Alam} and Mahmuda Rahman and Hasan, {Mohammad Al}",
year = "2014",
month = "1",
day = "1",
doi = "10.1007/s10115-013-0673-3",
language = "English",
volume = "38",
pages = "511--536",
journal = "Knowledge and Information Systems",
issn = "0219-1377",
publisher = "Springer London",
number = "3",

}

TY - JOUR

T1 - GUISE

T2 - A uniform sampler for constructing frequency histogram of graphlets

AU - Rahman, Mahmudur

AU - Bhuiyan, Mansurul Alam

AU - Rahman, Mahmuda

AU - Hasan, Mohammad Al

PY - 2014/1/1

Y1 - 2014/1/1

N2 - Graphlet frequency distribution (GFD) has recently become popular for characterizing large networks. However, computing the GFD of a network requires the exact count of embedded graphlets in that network, which is computationally expensive. As a result, it is practically infeasible to compute the GFD for even a moderately large network. In this paper, we propose Guise, which uses a Markov Chain Monte Carlo sampling method to construct the approximate GFD of a large network. Our experiments on networks with millions of nodes show that Guise obtains the GFD with a very low error rate within a few minutes, whereas the exhaustive counting-based approach takes several days.

KW - Graph analysis

KW - Graph mining

KW - Graphlet counting

KW - Graphlet degree distribution

KW - Graphlet sampling

KW - MCMC sampling

KW - Subgraph concentration

KW - Uniform sampling

UR - http://www.scopus.com/inward/record.url?scp=84894649226&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84894649226&partnerID=8YFLogxK

U2 - 10.1007/s10115-013-0673-3

DO - 10.1007/s10115-013-0673-3

M3 - Article

VL - 38

SP - 511

EP - 536

JO - Knowledge and Information Systems

JF - Knowledge and Information Systems

SN - 0219-1377

IS - 3

ER -