Estimating number of citations using author reputation

Carlos Castillo, Debora Donato, Aristides Gionis

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

36 Citations (Scopus)

Abstract

We study the problem of predicting the popularity of items in a dynamic environment in which authors continuously post new items and provide feedback on existing items. This problem can be applied to predict the popularity of blog posts, rank photographs in a photo-sharing system, or predict the citations of a scientific article using author information and monitoring the items of interest for a short period of time after their creation. As a case study, we show how to estimate the number of citations for an academic paper using information about past articles written by the same author(s). If we use only the citation information over a short period of time, we obtain a predicted value that has a correlation of r = 0.57 with the actual value. This is our baseline prediction. Our best-performing system improves that prediction by adding features extracted from the past publishing history of the paper's authors, increasing the correlation between the actual and the predicted values to r = 0.81.

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 107-117
Number of pages: 11
Volume: 4726 LNCS
Publication status: Published - 1 Dec 2007
Externally published: Yes
Event: 14th International Symposium on String Processing and Information Retrieval, SPIRE 2007 - Santiago, Chile
Duration: 29 Oct 2007 to 31 Oct 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4726 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 14th International Symposium on String Processing and Information Retrieval, SPIRE 2007
Country: Chile
City: Santiago
Period: 29/10/07 to 31/10/07

Fingerprint

Citations
Blogging
Period of time
Blogs
Predict
Prediction
Dynamic Environment
Feedback
Monitoring
Baseline
Sharing
Estimate
Reputation

ASJC Scopus subject areas

  • Computer Science(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Theoretical Computer Science

Cite this

Castillo, C., Donato, D., & Gionis, A. (2007). Estimating number of citations using author reputation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4726 LNCS, pp. 107-117). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4726 LNCS).

@inproceedings{003493e5b07047d5ac0f0b47e85fde34,
title = "Estimating number of citations using author reputation",
abstract = "We study the problem of predicting the popularity of items in a dynamic environment in which authors post continuously new items and provide feedback on existing items. This problem can be applied to predict popularity of blog posts, rank photographs in a photo-sharing system, or predict the citations of a scientific article using author information and monitoring the items of interest for a short period of time after their creation. As a case study, we show how to estimate the number of citations for an academic paper using information about past articles written by the same author(s) of the paper. If we use only the citation information over a short period of time, we obtain a predicted value that has a correlation of r = 0.57 with the actual value. This is our baseline prediction. Our best-performing system can improve that prediction by adding features extracted from the past publishing history of its authors, increasing the correlation between the actual and the predicted values to r = 0.81.",
author = "Carlos Castillo and Debora Donato and Aristides Gionis",
year = "2007",
month = "12",
day = "1",
language = "English",
isbn = "9783540755296",
volume = "4726 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "107--117",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN
T1 - Estimating number of citations using author reputation
AU - Castillo, Carlos
AU - Donato, Debora
AU - Gionis, Aristides
PY - 2007/12/1
Y1 - 2007/12/1
AB - We study the problem of predicting the popularity of items in a dynamic environment in which authors post continuously new items and provide feedback on existing items. This problem can be applied to predict popularity of blog posts, rank photographs in a photo-sharing system, or predict the citations of a scientific article using author information and monitoring the items of interest for a short period of time after their creation. As a case study, we show how to estimate the number of citations for an academic paper using information about past articles written by the same author(s) of the paper. If we use only the citation information over a short period of time, we obtain a predicted value that has a correlation of r = 0.57 with the actual value. This is our baseline prediction. Our best-performing system can improve that prediction by adding features extracted from the past publishing history of its authors, increasing the correlation between the actual and the predicted values to r = 0.81.
UR - http://www.scopus.com/inward/record.url?scp=38049086853&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=38049086853&partnerID=8YFLogxK
M3 - Conference contribution
SN - 9783540755296
VL - 4726 LNCS
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 107
EP - 117
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -