Abstract
We propose a novel algorithm, FURS (Fast and Unique Representative Subset selection), to deterministically select a set of nodes from a given graph that retains the underlying community structure. FURS greedily selects nodes with high degree centrality from most or all of the communities in the network. The nodes with high degree centrality in each community are usually located at its center rather than its periphery and can better capture the community structure. The nodes are selected such that they are not isolated but can form disconnected components. FURS is evaluated using quality measures such as coverage, clustering coefficients, degree distributions and variation of information. Empirically, we observe that the nodes are selected such that most or all of the communities in the original network are retained. We compare our proposed technique with state-of-the-art methods such as the SlashBurn, Forest-Fire, Metropolis and Snowball Expansion sampling techniques. We evaluate FURS on several synthetic and real-world networks of varying size to demonstrate the high quality of the selected subset and its preservation of the community structure. The subset generated by FURS can be effectively used by model-based approaches with out-of-sample extension properties to infer the community affiliation of large-scale networks. A consequence of FURS is that the selected subset is also a good candidate set for a simple diffusion model. We compare the spread of information over time on several real-world networks using FURS against random node selection, hub selection, spoke selection, and representative subset selection based on high eigenvector centrality, high PageRank, high betweenness centrality and low betweenness centrality.
Original language | English |
---|---|
Pages (from-to) | 1075-1095 |
Number of pages | 21 |
Journal | Social Network Analysis and Mining |
Volume | 3 |
Issue number | 4 |
DOIs | https://doi.org/10.1007/s13278-013-0144-6 |
Publication status | Published - 1 Jan 2013 |
Externally published | Yes |
Keywords
- Community detection
- Hubs
- Node subset selection
- Simple diffusion model
ASJC Scopus subject areas
- Information Systems
- Communication
- Media Technology
- Human-Computer Interaction
- Computer Science Applications
Cite this
FURS: Fast and Unique Representative Subset selection retaining large-scale community structure. / Mall, Raghvendra; Langone, Rocco; Suykens, Johan A.K.
In: Social Network Analysis and Mining, Vol. 3, No. 4, 01.01.2013, p. 1075-1095.
Research output: Contribution to journal › Article
TY - JOUR
T1 - FURS
T2 - Fast and Unique Representative Subset selection retaining large-scale community structure
AU - Mall, Raghvendra
AU - Langone, Rocco
AU - Suykens, Johan A.K.
PY - 2013/1/1
Y1 - 2013/1/1
AB - We propose a novel algorithm, FURS (Fast and Unique Representative Subset selection), to deterministically select a set of nodes from a given graph that retains the underlying community structure. FURS greedily selects nodes with high degree centrality from most or all of the communities in the network. The nodes with high degree centrality in each community are usually located at its center rather than its periphery and can better capture the community structure. The nodes are selected such that they are not isolated but can form disconnected components. FURS is evaluated using quality measures such as coverage, clustering coefficients, degree distributions and variation of information. Empirically, we observe that the nodes are selected such that most or all of the communities in the original network are retained. We compare our proposed technique with state-of-the-art methods such as the SlashBurn, Forest-Fire, Metropolis and Snowball Expansion sampling techniques. We evaluate FURS on several synthetic and real-world networks of varying size to demonstrate the high quality of the selected subset and its preservation of the community structure. The subset generated by FURS can be effectively used by model-based approaches with out-of-sample extension properties to infer the community affiliation of large-scale networks. A consequence of FURS is that the selected subset is also a good candidate set for a simple diffusion model. We compare the spread of information over time on several real-world networks using FURS against random node selection, hub selection, spoke selection, and representative subset selection based on high eigenvector centrality, high PageRank, high betweenness centrality and low betweenness centrality.
KW - Community detection
KW - Hubs
KW - Node subset selection
KW - Simple diffusion model
UR - http://www.scopus.com/inward/record.url?scp=84921713163&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84921713163&partnerID=8YFLogxK
U2 - 10.1007/s13278-013-0144-6
DO - 10.1007/s13278-013-0144-6
M3 - Article
AN - SCOPUS:84921713163
VL - 3
SP - 1075
EP - 1095
JO - Social Network Analysis and Mining
JF - Social Network Analysis and Mining
SN - 1869-5450
IS - 4
ER -