Skyline query processing for incomplete data

Mohamed E. Khalefa, Mohamed Mokbel, Justin J. Levandoski

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

83 Citations (Scopus)

Abstract

Recently, there has been much interest in processing skyline queries for various applications that include decision making, personalized services, and search pruning. Skyline queries aim to prune a search space of large numbers of multi-dimensional data items to a small set of interesting items by eliminating items that are dominated by others. Existing skyline algorithms assume that all dimensions are available for all data items. This paper goes beyond this restrictive assumption as we address the more practical case of incomplete data items (i.e., data items missing values in some of their dimensions). In contrast to the case of complete data, where the dominance relation is transitive, incomplete data suffer from a non-transitive dominance relation, which may lead to cyclic dominance behavior. We first propose two algorithms, namely "Replacement" and "Bucket", that use traditional skyline algorithms for incomplete data. Then, we propose the "ISkyline" algorithm, which is designed specifically for the case of incomplete data. The "ISkyline" algorithm employs two optimization techniques, namely virtual points and shadow skylines, to tolerate cyclic dominance relations. Experimental evidence shows that the "ISkyline" algorithm significantly outperforms variations of traditional skyline algorithms.
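The cyclic dominance behavior mentioned in the abstract can be illustrated with a small sketch. It is not the paper's ISkyline algorithm; it only assumes a common definition of dominance for incomplete data (compare two items on the dimensions both have, with smaller values preferred). The names `dominates`, `naive_skyline`, and the three sample points are hypothetical and chosen for illustration.

```python
from typing import List, Optional, Sequence

# A data item; None marks a missing dimension (assumed encoding).
Point = Sequence[Optional[float]]


def dominates(p: Point, q: Point) -> bool:
    """Assumed definition: p dominates q if, on the dimensions known in
    both items, p is no worse everywhere and strictly better somewhere.
    Smaller values are treated as better."""
    common = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    if not common:
        # No shared dimension: the items are incomparable.
        return False
    return all(a <= b for a, b in common) and any(a < b for a, b in common)


def naive_skyline(points: List[Point]) -> List[Point]:
    """Keep every item that no other item dominates (quadratic scan)."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]


# Three items whose pairwise comparisons each use a different dimension.
a = (1, None, 3)
b = (2, 1, None)
c = (None, 2, 2)

print(dominates(a, b), dominates(b, c), dominates(c, a))  # True True True
print(dominates(a, c))  # False: transitivity does not hold
print(naive_skyline([a, b, c]))  # [] because every item is dominated
```

Under this assumed definition each item is dominated by another one, so a straightforward skyline scan returns nothing. This is the kind of situation that the abstract says the virtual points and shadow skylines techniques are meant to tolerate.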

Original language: English
Title of host publication: Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, ICDE'08
Pages: 556-565
Number of pages: 10
DOI: https://doi.org/10.1109/ICDE.2008.4497464
Publication status: Published - 1 Oct 2008
Externally published: Yes
Event: 2008 IEEE 24th International Conference on Data Engineering, ICDE'08 - Cancun, Mexico
Duration: 7 Apr 2008 - 12 Apr 2008


Fingerprint

Query processing
Decision making

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems

Cite this

Khalefa, M. E., Mokbel, M., & Levandoski, J. J. (2008). Skyline query processing for incomplete data. In Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, ICDE'08 (pp. 556-565). [4497464] https://doi.org/10.1109/ICDE.2008.4497464

@inproceedings{65f1fea6ab144daf84a73252d8f6818c,
title = "Skyline query processing for incomplete data",
abstract = "Recently, there has been much interest in processing skyline queries for various applications that include decision making, personalized services, and search pruning. Skyline queries aim to prune a search space of large numbers of multi-dimensional data items to a small set of interesting items by eliminating items that are dominated by others. Existing skyline algorithms assume that all dimensions are available for all data items. This paper goes beyond this restrictive assumption as we address the more practical case of involving incomplete data items (i.e., data items missing values in some of their dimensions). In contrast to the case of complete data where the dominance relation is transitive, incomplete data suffer from non-transitive dominance relation which may lead to a cyclic dominance behavior. We first propose two algorithms, namely, {"}Replacement{"} and {"}Bucket{"} that use traditional skyline algorithms for incomplete data. Then, we propose the {"}ISkyline{"} algorithm that is designed specifically for the case of incomplete data. The {"}ISkyline{"} algorithm employs two optimization techniques, namely, virtual points and shadow skylines to tolerate cyclic dominance relations. Experimental evidence shows that the {"}ISkyline{"} algorithm significantly outperforms variations of traditional skyline algorithms.",
author = "Khalefa, {Mohamed E.} and Mokbel, Mohamed and Levandoski, {Justin J.}",
year = "2008",
month = "10",
day = "1",
doi = "10.1109/ICDE.2008.4497464",
language = "English",
isbn = "9781424418374",
pages = "556--565",
booktitle = "Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, ICDE'08",
}

TY - GEN

T1 - Skyline query processing for incomplete data

AU - Khalefa, Mohamed E.

AU - Mokbel, Mohamed

AU - Levandoski, Justin J.

PY - 2008/10/1

Y1 - 2008/10/1

AB - Recently, there has been much interest in processing skyline queries for various applications that include decision making, personalized services, and search pruning. Skyline queries aim to prune a search space of large numbers of multi-dimensional data items to a small set of interesting items by eliminating items that are dominated by others. Existing skyline algorithms assume that all dimensions are available for all data items. This paper goes beyond this restrictive assumption as we address the more practical case of involving incomplete data items (i.e., data items missing values in some of their dimensions). In contrast to the case of complete data where the dominance relation is transitive, incomplete data suffer from non-transitive dominance relation which may lead to a cyclic dominance behavior. We first propose two algorithms, namely, "Replacement" and "Bucket" that use traditional skyline algorithms for incomplete data. Then, we propose the "ISkyline" algorithm that is designed specifically for the case of incomplete data. The "ISkyline" algorithm employs two optimization techniques, namely, virtual points and shadow skylines to tolerate cyclic dominance relations. Experimental evidence shows that the "ISkyline" algorithm significantly outperforms variations of traditional skyline algorithms.

UR - http://www.scopus.com/inward/record.url?scp=52649139904&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=52649139904&partnerID=8YFLogxK

U2 - 10.1109/ICDE.2008.4497464

DO - 10.1109/ICDE.2008.4497464

M3 - Conference contribution

SN - 9781424418374

SP - 556

EP - 565

BT - Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, ICDE'08

ER -