Beyond itemsets

Mining frequent featuresets over structured items

Saravanan Thirumuruganathan, Habibur Rahman, Sofiane Abbar, Gautam Das

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We assume a dataset of transactions generated by a set of users over structured items, where each item can be described by a set of features. In this paper, we are interested in identifying the frequent featuresets (sets of features) by mining item transactions. For example, on a news website, items correspond to news articles, the features are the named entities/topics in the articles, and an item transaction is the set of news articles read by a user within the same session. We show that mining frequent featuresets over structured item transactions is a novel problem and that straightforward extensions of existing frequent itemset mining techniques provide unsatisfactory results. This is because, while users are drawn to each item in the transaction by a subset of its features, the transaction by itself does not provide any information about these underlying preferred features. To overcome this hurdle, we propose a featureset uncertainty model in which each item transaction could have been generated by various featuresets with different probabilities. We describe a novel approach to transform item transactions into uncertain transactions over featuresets and estimate their probabilities using a constrained least-squares approach. We propose diverse algorithms to mine frequent featuresets. Our experimental evaluation provides a comparative analysis of the different approaches proposed.
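The featureset uncertainty idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract says the probabilities are estimated with a constrained least-squares approach, whereas the code below swaps in a uniform probability over candidate featuresets purely as a placeholder. The definition of a "candidate" (any featureset of bounded size contained in at least one item of the transaction), the function names, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def candidate_featuresets(transaction, item_features, max_size=2):
    """Featuresets of size <= max_size contained in at least one item.

    Assumption: this candidate definition is a simplification chosen
    for the sketch, not the definition used in the paper.
    """
    candidates = set()
    for item in transaction:
        feats = sorted(item_features[item])
        for k in range(1, max_size + 1):
            for combo in combinations(feats, k):
                candidates.add(frozenset(combo))
    return candidates


def expected_support(transactions, item_features, max_size=2):
    """Sum, over transactions, of the probability assigned to each featureset."""
    support = defaultdict(float)
    for transaction in transactions:
        cands = candidate_featuresets(transaction, item_features, max_size)
        if not cands:
            continue
        # Placeholder: uniform probability over candidates. The paper
        # estimates these probabilities with constrained least squares.
        p = 1.0 / len(cands)
        for fs in cands:
            support[fs] += p
    return dict(support)


def frequent_featuresets(transactions, item_features, min_support, max_size=2):
    """Featuresets whose expected support reaches min_support."""
    support = expected_support(transactions, item_features, max_size)
    return {fs: s for fs, s in support.items() if s >= min_support}


# Hypothetical news-site data: articles mapped to named entities/topics.
item_features = {
    "a1": {"Obama", "election"},
    "a2": {"Obama", "healthcare"},
    "a3": {"football"},
}
# Each transaction is the set of articles read in one session.
transactions = [["a1", "a2"], ["a1"], ["a3"]]

for fs, s in sorted(
    frequent_featuresets(transactions, item_features, min_support=0.5).items(),
    key=lambda kv: -kv[1],
):
    print(sorted(fs), round(s, 3))
```

Because each transaction spreads a total probability of 1 across its candidates, sessions with many plausible featuresets contribute less to any single one. Replacing the uniform placeholder with estimated probabilities is where the approach described in the abstract would come in.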

Original language: English
Title of host publication: Proceedings of the VLDB Endowment
Publisher: Association for Computing Machinery
Pages: 257-268
Number of pages: 12
Volume: 8
Edition: 3
Publication status: Published - 1 Nov 2014
Event: 3rd Workshop on Spatio-Temporal Database Management, STDBM 2006, Co-located with the 32nd International Conference on Very Large Data Bases, VLDB 2006 - Seoul, Korea, Republic of
Duration: 11 Sep 2006 → 11 Sep 2006

Other

Other: 3rd Workshop on Spatio-Temporal Database Management, STDBM 2006, Co-located with the 32nd International Conference on Very Large Data Bases, VLDB 2006
Country: Korea, Republic of
City: Seoul
Period: 11/9/06 → 11/9/06

Fingerprint

Websites
Uncertainty

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science (all)

Cite this

Thirumuruganathan, S., Rahman, H., Abbar, S., & Das, G. (2014). Beyond itemsets: Mining frequent featuresets over structured items. In Proceedings of the VLDB Endowment (3 ed., Vol. 8, pp. 257-268). Association for Computing Machinery.

Beyond itemsets : Mining frequent featuresets over structured items. / Thirumuruganathan, Saravanan; Rahman, Habibur; Abbar, Sofiane; Das, Gautam.

Proceedings of the VLDB Endowment. Vol. 8 3. ed. Association for Computing Machinery, 2014. p. 257-268.

Thirumuruganathan, S, Rahman, H, Abbar, S & Das, G 2014, Beyond itemsets: Mining frequent featuresets over structured items. in Proceedings of the VLDB Endowment. 3 edn, vol. 8, Association for Computing Machinery, pp. 257-268, 3rd Workshop on Spatio-Temporal Database Management, STDBM 2006, Co-located with the 32nd International Conference on Very Large Data Bases, VLDB 2006, Seoul, Korea, Republic of, 11/9/06.
Thirumuruganathan S, Rahman H, Abbar S, Das G. Beyond itemsets: Mining frequent featuresets over structured items. In Proceedings of the VLDB Endowment. 3 ed. Vol. 8. Association for Computing Machinery. 2014. p. 257-268
Thirumuruganathan, Saravanan ; Rahman, Habibur ; Abbar, Sofiane ; Das, Gautam. / Beyond itemsets : Mining frequent featuresets over structured items. Proceedings of the VLDB Endowment. Vol. 8 3. ed. Association for Computing Machinery, 2014. pp. 257-268
@inproceedings{703c45fc7dad46d386f6e14f3b00a96f,
title = "Beyond itemsets: Mining frequent featuresets over structured items",
abstract = "We assume a dataset of transactions generated by a set of users over structured items, where each item can be described by a set of features. In this paper, we are interested in identifying the frequent featuresets (sets of features) by mining item transactions. For example, on a news website, items correspond to news articles, the features are the named entities/topics in the articles, and an item transaction is the set of news articles read by a user within the same session. We show that mining frequent featuresets over structured item transactions is a novel problem and that straightforward extensions of existing frequent itemset mining techniques provide unsatisfactory results. This is because, while users are drawn to each item in the transaction by a subset of its features, the transaction by itself does not provide any information about these underlying preferred features. To overcome this hurdle, we propose a featureset uncertainty model in which each item transaction could have been generated by various featuresets with different probabilities. We describe a novel approach to transform item transactions into uncertain transactions over featuresets and estimate their probabilities using a constrained least-squares approach. We propose diverse algorithms to mine frequent featuresets. Our experimental evaluation provides a comparative analysis of the different approaches proposed.",
author = "Saravanan Thirumuruganathan and Habibur Rahman and Sofiane Abbar and Gautam Das",
year = "2014",
month = "11",
day = "1",
language = "English",
volume = "8",
pages = "257--268",
booktitle = "Proceedings of the VLDB Endowment",
publisher = "Association for Computing Machinery",
edition = "3",

}

TY - GEN

T1 - Beyond itemsets

T2 - Mining frequent featuresets over structured items

AU - Thirumuruganathan, Saravanan

AU - Rahman, Habibur

AU - Abbar, Sofiane

AU - Das, Gautam

PY - 2014/11/1

Y1 - 2014/11/1

N2 - We assume a dataset of transactions generated by a set of users over structured items, where each item can be described by a set of features. In this paper, we are interested in identifying the frequent featuresets (sets of features) by mining item transactions. For example, on a news website, items correspond to news articles, the features are the named entities/topics in the articles, and an item transaction is the set of news articles read by a user within the same session. We show that mining frequent featuresets over structured item transactions is a novel problem and that straightforward extensions of existing frequent itemset mining techniques provide unsatisfactory results. This is because, while users are drawn to each item in the transaction by a subset of its features, the transaction by itself does not provide any information about these underlying preferred features. To overcome this hurdle, we propose a featureset uncertainty model in which each item transaction could have been generated by various featuresets with different probabilities. We describe a novel approach to transform item transactions into uncertain transactions over featuresets and estimate their probabilities using a constrained least-squares approach. We propose diverse algorithms to mine frequent featuresets. Our experimental evaluation provides a comparative analysis of the different approaches proposed.

UR - http://www.scopus.com/inward/record.url?scp=84938081846&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84938081846&partnerID=8YFLogxK

M3 - Conference contribution

VL - 8

SP - 257

EP - 268

BT - Proceedings of the VLDB Endowment

PB - Association for Computing Machinery

ER -