An incremental data-stream sketch using sparse random projections

Aditya Krishna Menon, Gia Vinh Anh Pham, Sanjay Chawla, Anastasios Viglas

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

We propose the use of random projections with a sparse matrix to maintain a sketch of a collection of high-dimensional data-streams that are updated asynchronously. This sketch allows us to estimate L2 (Euclidean) distances and dot-products with high accuracy. We verify the validity of this sketch by applying it to an online clustering problem, where we compare our results to the offline algorithm and an existing L2 sketch, and observe comparable results in terms of accuracy, and a reduced runtime cost.
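The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the projection matrix, so the code assumes an Achlioptas-style sparse distribution (entries of sqrt(3) times +1, 0, -1 with probabilities 1/6, 2/3, 1/6), and the class name `SparseProjectionSketch` and its methods are hypothetical. Because the projection is linear, an asynchronous update (stream, index, delta) only needs to add a scaled row of the matrix to that stream's sketch.

```python
import math
import random


class SparseProjectionSketch:
    """Illustrative incremental sketch (hypothetical names, assumed distribution)."""

    def __init__(self, n_streams, dim, k, seed=0):
        rng = random.Random(seed)
        self.k = k
        scale = math.sqrt(3.0)
        # Row j of the sparse matrix R (dim x k), stored as {column: value}.
        # Assumed entries: +sqrt(3) w.p. 1/6, -sqrt(3) w.p. 1/6, 0 w.p. 2/3,
        # so each entry has zero mean and unit variance.
        self.rows = []
        for _ in range(dim):
            row = {}
            for col in range(k):
                u = rng.random()
                if u < 1.0 / 6.0:
                    row[col] = scale
                elif u < 1.0 / 3.0:
                    row[col] = -scale
            self.rows.append(row)
        # One k-dimensional sketch per stream, all starting at zero.
        self.sketch = [[0.0] * k for _ in range(n_streams)]

    def update(self, stream, index, delta):
        """Apply 'component `index` of `stream` changed by `delta`'."""
        inv = 1.0 / math.sqrt(self.k)
        s = self.sketch[stream]
        # Only the non-zero entries of row `index` are touched.
        for col, val in self.rows[index].items():
            s[col] += delta * val * inv

    def distance(self, a, b):
        """Estimate the L2 distance between streams a and b."""
        sa, sb = self.sketch[a], self.sketch[b]
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(sa, sb)))

    def dot(self, a, b):
        """Estimate the dot product between streams a and b."""
        return sum(x * y for x, y in zip(self.sketch[a], self.sketch[b]))


if __name__ == "__main__":
    sk = SparseProjectionSketch(n_streams=2, dim=1000, k=128, seed=42)
    sk.update(0, 10, 5.0)   # stream 0, component 10, +5
    sk.update(1, 10, 2.0)   # stream 1, component 10, +2
    sk.update(0, 999, -1.0)
    print("estimated L2 distance:", sk.distance(0, 1))
    print("estimated dot product:", sk.dot(0, 1))
```

In this sketch each update costs time proportional to the number of non-zero entries in one row (about k/3 on average under the assumed distribution), which is one plausible reading of the reduced runtime cost mentioned above.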

Original language: English
Title of host publication: Proceedings of the 7th SIAM International Conference on Data Mining
Pages: 563-568
Number of pages: 6
Publication status: Published - 1 Dec 2007
Event: 7th SIAM International Conference on Data Mining - Minneapolis, MN, United States
Duration: 26 Apr 2007 to 28 Apr 2007

Publication series

Name: Proceedings of the 7th SIAM International Conference on Data Mining

Conference

Conference: 7th SIAM International Conference on Data Mining
Country: United States
City: Minneapolis, MN
Period: 26/4/07 to 28/4/07

ASJC Scopus subject areas

  • Engineering (all)

Cite this

Menon, A. K., Pham, G. V. A., Chawla, S., & Viglas, A. (2007). An incremental data-stream sketch using sparse random projections. In Proceedings of the 7th SIAM International Conference on Data Mining (pp. 563-568). (Proceedings of the 7th SIAM International Conference on Data Mining).