Integration of scientific and social networks

Mahmood Neshati, Djoerd Hiemstra, Ehsaneddin Asgari, Hamid Beigy

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

In this paper, we address the problem of scientific-social network integration to find a matching relationship between members of these networks (i.e., the DBLP publication network and the Twitter social network). This task is a crucial step toward building a multi-environment expert finding system, which has recently attracted much attention in the Information Retrieval community. We divide the problem of social and scientific network integration into two subproblems. The first concerns finding those profiles in one network that presumably have a corresponding profile in the other network; the second concerns name disambiguation, that is, finding the true matching profiles among a set of candidate profiles. Utilizing several name similarity patterns and contextual properties of these networks, we design a focused crawler to find highly probable matching pairs. The problem of name disambiguation is then reduced to predicting the label of each candidate pair as either a true or a false match. Because the labels of these candidate pairs are not independent, state-of-the-art classification methods such as logistic regression and decision trees, which classify each instance separately, are unsuitable for this task. By defining a matching dependency graph, we propose a joint label prediction model to determine the labels of all candidate pairs simultaneously. Two main types of dependencies among candidate pairs, both quite intuitive and general, are considered in designing the joint label prediction model. Using discriminative approaches, we utilize various feature sets to train our proposed classifiers. An extensive set of experiments has been conducted on six test collections collected from the DBLP and Twitter networks to show the effectiveness of the proposed joint label prediction model.
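
The contrast the abstract draws between labeling each candidate pair separately and labeling all pairs jointly can be illustrated with a small toy sketch. This is not the authors' model: the similarity measure (Python's `difflib.SequenceMatcher`), the threshold values, the function names, and the one-to-one matching constraint are all assumptions chosen only for illustration, and the paper's actual dependency types and features are not reproduced here.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1].

    A stand-in for the paper's name similarity patterns (assumption).
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_pairs(dblp_names, twitter_names, threshold=0.6):
    """Return (dblp, twitter, score) triples whose similarity reaches the threshold."""
    pairs = []
    for d in dblp_names:
        for t in twitter_names:
            score = name_similarity(d, t)
            if score >= threshold:
                pairs.append((d, t, score))
    return pairs


def independent_labels(pairs, cutoff=0.6):
    """Label every pair from its own score alone, ignoring all other pairs."""
    return {(d, t): score >= cutoff for d, t, score in pairs}


def joint_labels(pairs, cutoff=0.6):
    """Label all pairs together under an ASSUMED one-to-one constraint.

    Pairs are visited from highest to lowest score; a pair is accepted only
    if neither of its profiles already belongs to an accepted pair, so the
    label of one pair depends on the labels of its competitors.
    """
    labels = {}
    used_dblp, used_twitter = set(), set()
    for d, t, score in sorted(pairs, key=lambda p: p[2], reverse=True):
        accepted = score >= cutoff and d not in used_dblp and t not in used_twitter
        labels[(d, t)] = accepted
        if accepted:
            used_dblp.add(d)
            used_twitter.add(t)
    return labels


if __name__ == "__main__":
    # Hypothetical profile names, for illustration only.
    pairs = candidate_pairs(["Djoerd Hiemstra"], ["djoerd hiemstra", "djoerd hiemstr"])
    print(independent_labels(pairs))  # each pair judged in isolation
    print(joint_labels(pairs))        # competitors of an accepted pair are rejected
```

In this toy setting the independent labeler accepts both Twitter candidates for the same DBLP author, while the joint labeler accepts only the best one, which is the kind of inconsistency that motivates predicting the labels of all candidate pairs simultaneously.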

Original language: English
Pages (from-to): 1-29
Number of pages: 29
Journal: World Wide Web
DOIs: https://doi.org/10.1007/s11280-013-0229-1
Publication status: Accepted/In press - 24 Jun 2013

Fingerprint

Labels
Decision trees
Information retrieval
Logistics
Classifiers
Experiments

Keywords

  • Collective classification
  • DBLP
  • Social network integration
  • Twitter

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

Neshati, M., Hiemstra, D., Asgari, E., & Beigy, H. (Accepted/In press). Integration of scientific and social networks. World Wide Web, 1-29. https://doi.org/10.1007/s11280-013-0229-1

@article{c6f9f2163fbe4c86aa28e0d76b2fc207,
title = "Integration of scientific and social networks",
abstract = "In this paper, we address the problem of scientific-social network integration to find a matching relationship between members of these networks (i.e., the DBLP publication network and the Twitter social network). This task is a crucial step toward building a multi-environment expert finding system, which has recently attracted much attention in the Information Retrieval community. We divide the problem of social and scientific network integration into two subproblems. The first concerns finding those profiles in one network that presumably have a corresponding profile in the other network; the second concerns name disambiguation, that is, finding the true matching profiles among a set of candidate profiles. Utilizing several name similarity patterns and contextual properties of these networks, we design a focused crawler to find highly probable matching pairs. The problem of name disambiguation is then reduced to predicting the label of each candidate pair as either a true or a false match. Because the labels of these candidate pairs are not independent, state-of-the-art classification methods such as logistic regression and decision trees, which classify each instance separately, are unsuitable for this task. By defining a matching dependency graph, we propose a joint label prediction model to determine the labels of all candidate pairs simultaneously. Two main types of dependencies among candidate pairs, both quite intuitive and general, are considered in designing the joint label prediction model. Using discriminative approaches, we utilize various feature sets to train our proposed classifiers. An extensive set of experiments has been conducted on six test collections collected from the DBLP and Twitter networks to show the effectiveness of the proposed joint label prediction model.",
keywords = "Collective classification, DBLP, Social network integration, Twitter",
author = "Mahmood Neshati and Djoerd Hiemstra and Ehsaneddin Asgari and Hamid Beigy",
year = "2013",
month = "6",
day = "24",
doi = "10.1007/s11280-013-0229-1",
language = "English",
pages = "1--29",
journal = "World Wide Web",
issn = "1386-145X",
publisher = "Springer New York",

}

TY - JOUR

T1 - Integration of scientific and social networks

AU - Neshati, Mahmood

AU - Hiemstra, Djoerd

AU - Asgari, Ehsaneddin

AU - Beigy, Hamid

PY - 2013/6/24

Y1 - 2013/6/24

AB - In this paper, we address the problem of scientific-social network integration to find a matching relationship between members of these networks (i.e., the DBLP publication network and the Twitter social network). This task is a crucial step toward building a multi-environment expert finding system, which has recently attracted much attention in the Information Retrieval community. We divide the problem of social and scientific network integration into two subproblems. The first concerns finding those profiles in one network that presumably have a corresponding profile in the other network; the second concerns name disambiguation, that is, finding the true matching profiles among a set of candidate profiles. Utilizing several name similarity patterns and contextual properties of these networks, we design a focused crawler to find highly probable matching pairs. The problem of name disambiguation is then reduced to predicting the label of each candidate pair as either a true or a false match. Because the labels of these candidate pairs are not independent, state-of-the-art classification methods such as logistic regression and decision trees, which classify each instance separately, are unsuitable for this task. By defining a matching dependency graph, we propose a joint label prediction model to determine the labels of all candidate pairs simultaneously. Two main types of dependencies among candidate pairs, both quite intuitive and general, are considered in designing the joint label prediction model. Using discriminative approaches, we utilize various feature sets to train our proposed classifiers. An extensive set of experiments has been conducted on six test collections collected from the DBLP and Twitter networks to show the effectiveness of the proposed joint label prediction model.

KW - Collective classification

KW - DBLP

KW - Social network integration

KW - Twitter

UR - http://www.scopus.com/inward/record.url?scp=84879064521&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84879064521&partnerID=8YFLogxK

U2 - 10.1007/s11280-013-0229-1

DO - 10.1007/s11280-013-0229-1

M3 - Article

SP - 1

EP - 29

JO - World Wide Web

JF - World Wide Web

SN - 1386-145X

ER -