Who tags what? An analysis framework

Mahashweta Das, Saravanan Thirumuruganathan, Sihem Amer-Yahia, Gautam Das, Cong Yu

Research output: Contribution to journal (Article)

13 Citations (Scopus)

Abstract

The rise of Web 2.0 is signaled by sites such as Flickr, del.icio.us, and YouTube, and social tagging is essential to their success. A typical tagging action involves three components: user, item (e.g., photos in Flickr), and tags (i.e., words or phrases). Analyzing how tags are assigned by certain users to certain items has important implications in helping users search for desired information. In this paper, we explore common analysis tasks and propose a dual mining framework for social tagging behavior mining. This framework is centered around two opposing measures, similarity and diversity, being applied to one or more tagging components, and therefore enables a wide range of analysis scenarios such as characterizing similar users tagging diverse items with similar tags, or diverse users tagging similar items with diverse tags, etc. By adopting different concrete measures for similarity and diversity in the framework, we show that a wide range of concrete analysis problems can be defined and they are NP-Complete in general. We design efficient algorithms for solving many of those problems and demonstrate, through comprehensive experiments over real data, that our algorithms significantly outperform the exact brute-force approach without compromising analysis result quality.
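To make the "similar users tagging diverse items with similar tags" style of scenario concrete, the following is a minimal sketch and not the authors' algorithm. It assumes Jaccard overlap as the similarity measure, defines diversity as one minus mean pairwise similarity, and uses exhaustive enumeration in the spirit of the exact brute-force baseline the abstract mentions. The names `jaccard`, `mean_similarity`, `best_group`, and the sample `ACTIONS` data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical tagging actions as (user, item, tag) triples.
ACTIONS = [
    ("alice", "photo1", "sunset"), ("alice", "photo2", "beach"),
    ("bob", "photo3", "sunset"), ("bob", "photo4", "beach"),
    ("carol", "photo1", "cat"), ("carol", "photo2", "dog"),
]


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure); 1.0 if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_similarity(sets):
    """Mean pairwise Jaccard similarity; 1.0 when there are fewer than two sets."""
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def best_group(actions, k, min_tag_similarity):
    """Exhaustively search for k users whose tag sets are similar enough
    and whose item sets are as diverse as possible.

    Returns (group, item_diversity), or (None, -1.0) if no group qualifies.
    The number of candidate groups grows combinatorially with the user count.
    """
    items, tags = defaultdict(set), defaultdict(set)
    for user, item, tag in actions:
        items[user].add(item)
        tags[user].add(tag)

    best, best_score = None, -1.0
    for group in combinations(sorted(items), k):
        # Constraint: the group's tags must be similar enough.
        if mean_similarity([tags[u] for u in group]) < min_tag_similarity:
            continue
        # Objective: diversity of tagged items, taken as 1 - similarity.
        score = 1.0 - mean_similarity([items[u] for u in group])
        if score > best_score:
            best, best_score = group, score
    return best, best_score


if __name__ == "__main__":
    print(best_group(ACTIONS, 2, 0.5))  # (('alice', 'bob'), 1.0)
```

In this toy data, alice and bob use identical tags on disjoint photos, so they form the only pair that satisfies the tag-similarity constraint while maximizing item diversity. Swapping which component is constrained and which is optimized yields the other scenarios the abstract lists.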

Original language: English
Pages (from-to): 1567-1578
Number of pages: 12
Journal: Proceedings of the VLDB Endowment
Volume: 5
Issue number: 11
DOI: 10.14778/2350229.2350270
Publication status: Published - 1 Jan 2012
Externally published: Yes

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science (all)

Cite this

Who tags what? An analysis framework. / Das, Mahashweta; Thirumuruganathan, Saravanan; Amer-Yahia, Sihem; Das, Gautam; Yu, Cong.

In: Proceedings of the VLDB Endowment, Vol. 5, No. 11, 01.01.2012, p. 1567-1578.

@article{7b3090f0df55498ea86c896149b24ceb,
title = "Who tags what? An analysis framework",
abstract = "The rise of Web 2.0 is signaled by sites such as Flickr, del.icio.us, and YouTube, and social tagging is essential to their success. A typical tagging action involves three components: user, item (e.g., photos in Flickr), and tags (i.e., words or phrases). Analyzing how tags are assigned by certain users to certain items has important implications in helping users search for desired information. In this paper, we explore common analysis tasks and propose a dual mining framework for social tagging behavior mining. This framework is centered around two opposing measures, similarity and diversity, being applied to one or more tagging components, and therefore enables a wide range of analysis scenarios such as characterizing similar users tagging diverse items with similar tags, or diverse users tagging similar items with diverse tags, etc. By adopting different concrete measures for similarity and diversity in the framework, we show that a wide range of concrete analysis problems can be defined and they are NP-Complete in general. We design efficient algorithms for solving many of those problems and demonstrate, through comprehensive experiments over real data, that our algorithms significantly outperform the exact brute-force approach without compromising analysis result quality.",
author = "Mahashweta Das and Saravanan Thirumuruganathan and Sihem Amer-Yahia and Gautam Das and Cong Yu",
year = "2012",
month = "1",
day = "1",
doi = "10.14778/2350229.2350270",
language = "English",
volume = "5",
pages = "1567--1578",
journal = "Proceedings of the VLDB Endowment",
issn = "2150-8097",
publisher = "Very Large Data Base Endowment Inc.",
number = "11",

}

TY - JOUR

T1 - Who tags what? An analysis framework

AU - Das, Mahashweta

AU - Thirumuruganathan, Saravanan

AU - Amer-Yahia, Sihem

AU - Das, Gautam

AU - Yu, Cong

PY - 2012/1/1

Y1 - 2012/1/1

AB - The rise of Web 2.0 is signaled by sites such as Flickr, del.icio.us, and YouTube, and social tagging is essential to their success. A typical tagging action involves three components: user, item (e.g., photos in Flickr), and tags (i.e., words or phrases). Analyzing how tags are assigned by certain users to certain items has important implications in helping users search for desired information. In this paper, we explore common analysis tasks and propose a dual mining framework for social tagging behavior mining. This framework is centered around two opposing measures, similarity and diversity, being applied to one or more tagging components, and therefore enables a wide range of analysis scenarios such as characterizing similar users tagging diverse items with similar tags, or diverse users tagging similar items with diverse tags, etc. By adopting different concrete measures for similarity and diversity in the framework, we show that a wide range of concrete analysis problems can be defined and they are NP-Complete in general. We design efficient algorithms for solving many of those problems and demonstrate, through comprehensive experiments over real data, that our algorithms significantly outperform the exact brute-force approach without compromising analysis result quality.

UR - http://www.scopus.com/inward/record.url?scp=84872916234&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84872916234&partnerID=8YFLogxK

U2 - 10.14778/2350229.2350270

DO - 10.14778/2350229.2350270

M3 - Article

VL - 5

SP - 1567

EP - 1578

JO - Proceedings of the VLDB Endowment

JF - Proceedings of the VLDB Endowment

SN - 2150-8097

IS - 11

ER -