GUISE: Uniform sampling of graphlets for large graph analysis

Mansurul A. Bhuiyan, Mahmudur Rahman, Mahmuda Rahman, Mohammad Al Hasan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

38 Citations (Scopus)

Abstract

Graphlet frequency distribution (GFD) has recently become popular for characterizing large networks. However, the computation of GFD for a network requires the exact count of embedded graphlets in that network, which is a computationally expensive task. As a result, it is practically infeasible to compute the GFD for even a moderately large network. In this paper, we propose GUISE, which uses a Markov Chain Monte Carlo (MCMC) sampling method for constructing the approximate GFD of a large network. Our experiments on networks with millions of nodes show that GUISE obtains the GFD within a few minutes, whereas the exhaustive counting-based approach takes several days.
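The sampling idea in the abstract can be illustrated with a much-reduced sketch. The abstract does not describe the actual chain used by GUISE, so everything below is an assumption made for illustration: the walk is restricted to connected 3-node induced subgraphs, a move swaps one vertex for another, and a Metropolis-Hastings acceptance step makes the stationary distribution uniform. All function names and the demo graph are hypothetical. The estimate is compared against brute-force enumeration, which is only feasible because the demo graph is tiny.

```python
import random
from itertools import combinations


def is_connected(adj, nodes):
    """Return True if the subgraph induced by `nodes` is connected."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def is_triangle(adj, nodes):
    a, b, c = nodes
    return b in adj[a] and c in adj[a] and c in adj[b]


def swap_neighbors(adj, state):
    """Connected 3-node sets that differ from `state` in exactly one vertex."""
    result = set()
    for out in state:
        kept = state - {out}
        candidates = set().union(*(adj[u] for u in kept)) - state
        for new in candidates:
            nxt = frozenset(kept | {new})
            if is_connected(adj, nxt):
                result.add(nxt)
    # Sort for a reproducible proposal order under a fixed seed.
    return sorted(result, key=sorted)


def estimate_triangle_fraction(adj, start, steps, seed=0):
    """Metropolis-Hastings walk with a uniform target over connected triples.

    Assumes the graph is connected and has more than one connected triple,
    so every state has at least one neighbor to propose.
    """
    rng = random.Random(seed)
    state = frozenset(start)
    nbrs = swap_neighbors(adj, state)
    triangles = 0
    for _ in range(steps):
        proposal = rng.choice(nbrs)
        prop_nbrs = swap_neighbors(adj, proposal)
        # Uniform target: accept with probability min(1, d(x) / d(y)).
        if rng.random() < min(1.0, len(nbrs) / len(prop_nbrs)):
            state, nbrs = proposal, prop_nbrs
        if is_triangle(adj, state):
            triangles += 1
    return triangles / steps


def exact_triangle_fraction(adj):
    """Brute-force baseline: enumerate every connected 3-node subgraph."""
    connected = [s for s in combinations(sorted(adj), 3) if is_connected(adj, s)]
    return sum(is_triangle(adj, s) for s in connected) / len(connected)


# Hypothetical demo graph: two triangles sharing an edge, plus a pendant vertex.
demo_adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2, 4}, 4: {3}}

if __name__ == "__main__":
    print("sampled:", estimate_triangle_fraction(demo_adj, (0, 1, 2), 20000))
    print("exact:  ", exact_triangle_fraction(demo_adj))
```

The acceptance ratio d(x)/d(y), where d is the number of swap neighbors, corrects for the fact that a plain random walk would visit well-connected subgraphs more often. Without it, the visit frequencies would be biased toward high-degree states rather than uniform.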

Original language: English
Title of host publication: Proceedings - IEEE International Conference on Data Mining, ICDM
Pages: 91-100
Number of pages: 10
DOIs: https://doi.org/10.1109/ICDM.2012.87
Publication status: Published - 1 Dec 2012
Event: 12th IEEE International Conference on Data Mining, ICDM 2012 - Brussels, Belgium
Duration: 10 Dec 2012 to 13 Dec 2012

Fingerprint

Markov processes
Sampling
Experiments

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Bhuiyan, M. A., Rahman, M., Rahman, M., & Al Hasan, M. (2012). GUISE: Uniform sampling of graphlets for large graph analysis. In Proceedings - IEEE International Conference on Data Mining, ICDM (pp. 91-100). [6413912] https://doi.org/10.1109/ICDM.2012.87

@inproceedings{44fca606abe64268917e40e4f9d53428,
title = "GUISE: Uniform sampling of graphlets for large graph analysis",
abstract = "Graphlet frequency distribution (GFD) has recently become popular for characterizing large networks. However, the computation of GFD for a network requires the exact count of embedded graphlets in that network, which is a computationally expensive task. As a result, it is practically infeasible to compute the GFD for even a moderately large network. In this paper, we propose GUISE, which uses a Markov Chain Monte Carlo (MCMC) sampling method for constructing the approximate GFD of a large network. Our experiments on networks with millions of nodes show that GUISE obtains the GFD within a few minutes, whereas the exhaustive counting-based approach takes several days.",
author = "Bhuiyan, {Mansurul A.} and Mahmudur Rahman and Mahmuda Rahman and {Al Hasan}, Mohammad",
year = "2012",
month = "12",
day = "1",
doi = "10.1109/ICDM.2012.87",
language = "English",
isbn = "9780769549057",
pages = "91--100",
booktitle = "Proceedings - IEEE International Conference on Data Mining, ICDM",

}

TY - GEN

T1 - GUISE

T2 - Uniform sampling of graphlets for large graph analysis

AU - Bhuiyan, Mansurul A.

AU - Rahman, Mahmudur

AU - Rahman, Mahmuda

AU - Al Hasan, Mohammad

PY - 2012/12/1

Y1 - 2012/12/1

N2 - Graphlet frequency distribution (GFD) has recently become popular for characterizing large networks. However, the computation of GFD for a network requires the exact count of embedded graphlets in that network, which is a computationally expensive task. As a result, it is practically infeasible to compute the GFD for even a moderately large network. In this paper, we propose GUISE, which uses a Markov Chain Monte Carlo (MCMC) sampling method for constructing the approximate GFD of a large network. Our experiments on networks with millions of nodes show that GUISE obtains the GFD within a few minutes, whereas the exhaustive counting-based approach takes several days.

UR - http://www.scopus.com/inward/record.url?scp=84874032150&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84874032150&partnerID=8YFLogxK

U2 - 10.1109/ICDM.2012.87

DO - 10.1109/ICDM.2012.87

M3 - Conference contribution

AN - SCOPUS:84874032150

SN - 9780769549057

SP - 91

EP - 100

BT - Proceedings - IEEE International Conference on Data Mining, ICDM

ER -