Who tags what? An analysis framework

Mahashweta Das, Saravanan Thirumuruganathan, Sihem Amer-Yahia, Gautam Das, Cong Yu

Research output: Contribution to journal (Article)

13 Citations (Scopus)


The rise of Web 2.0 is signaled by sites such as Flickr, del.icio.us, and YouTube, and social tagging is essential to their success. A typical tagging action involves three components: user, item (e.g., photos in Flickr), and tags (i.e., words or phrases). Analyzing how tags are assigned by certain users to certain items has important implications in helping users search for desired information. In this paper, we explore common analysis tasks and propose a dual mining framework for social tagging behavior mining. This framework is centered around two opposing measures, similarity and diversity, applied to one or more tagging components, and therefore enables a wide range of analysis scenarios, such as characterizing similar users tagging diverse items with similar tags, or diverse users tagging similar items with diverse tags. By adopting different concrete measures for similarity and diversity in the framework, we show that a wide range of concrete analysis problems can be defined and that they are NP-Complete in general. We design efficient algorithms for solving many of those problems and demonstrate, through comprehensive experiments over real data, that our algorithms significantly outperform the exact brute-force approach without compromising analysis result quality.
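The interplay of similarity and diversity across the three tagging components can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the names `TaggingAction`, `jaccard`, and `score`, the choice of Jaccard overlap as the similarity measure, and the definition of diversity as one minus similarity are all assumptions made here for illustration, since the abstract does not fix concrete measures.

```python
from itertools import combinations
from typing import NamedTuple


class TaggingAction(NamedTuple):
    """One tagging action: a user assigns a tag to an item (hypothetical type)."""
    user: str
    item: str
    tag: str


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; an assumed stand-in for a similarity measure."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def component_values(group, component):
    """Distinct values of one component ('user', 'item', or 'tag') in a group."""
    return {getattr(action, component) for action in group}


def mean_pairwise_similarity(groups, component):
    """Average Jaccard similarity over all pairs of groups for one component."""
    pairs = list(combinations(groups, 2))
    if not pairs:
        return 1.0
    total = sum(
        jaccard(component_values(g1, component), component_values(g2, component))
        for g1, g2 in pairs
    )
    return total / len(pairs)


def score(groups, goals):
    """Sum, per component, similarity or diversity (assumed to be 1 - similarity).

    `goals` maps each component name to "similar" or "diverse".
    """
    total = 0.0
    for component, goal in goals.items():
        sim = mean_pairwise_similarity(groups, component)
        total += sim if goal == "similar" else 1.0 - sim
    return total


# Two toy groups: the same users apply the same tags to different items.
g1 = [TaggingAction("alice", "photo1", "sunset"),
      TaggingAction("bob", "photo1", "beach")]
g2 = [TaggingAction("alice", "photo2", "sunset"),
      TaggingAction("bob", "photo2", "beach")]

goals = {"user": "similar", "item": "diverse", "tag": "similar"}
result = score([g1, g2], goals)
print(result)
```

Under these assumed measures, the toy groups score the maximum for the scenario "similar users tagging diverse items with similar tags". Searching over all candidate sets of groups for the best score is the part the abstract reports as NP-Complete in general, which this sketch does not attempt.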

Original language: English
Pages (from-to): 1567-1578
Number of pages: 12
Journal: Proceedings of the VLDB Endowment
Issue number: 11
Publication status: Published - Jul 2012


ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science (all)
