Thread cooperation in multicore architectures for frequency counting over multiple data streams

Sudipto Das, Shyam Antony, Divyakant Agrawal, Amr El Abbadi

Research output: Chapter in Book/Report/Conference proceeding > Chapter

39 Citations (Scopus)

Abstract

Many real-world data stream analysis applications such as network monitoring, click stream analysis, and others require combining multiple streams of data arriving from multiple sources. This is referred to as multi-stream analysis. To deal with high stream arrival rates, it is desirable that such systems be capable of supporting very high processing throughput. The advent of multicore processors and powerful servers driven by these processors calls for efficient parallel designs that can effectively utilize the parallelism of the multicores, since performance improvement is possible only through effective parallelism. In this paper, we address the problem of parallelizing multi-stream analysis in the context of multicore processors. Specifically, we concentrate on parallelizing frequent elements, top-k, and frequency counting over multiple streams. We discuss the challenges in designing an efficient parallel system for multi-stream processing. Our evaluation and analysis reveal that traditional "contention" based locking results in excessive overhead and waiting, which in turn leads to severe performance degradation in modern multicore architectures. Based on our analysis, we propose a "cooperation" based locking paradigm for efficient parallelization of frequency counting. The proposed "cooperation" based paradigm removes waits associated with synchronization, and allows replacing locks by much cheaper atomic synchronization primitives. Our implementation of the proposed paradigm to parallelize a well-known frequency counting algorithm shows the benefits of the proposed "cooperation" based locking paradigm when compared to the traditional "contention" based locking paradigm. In our experiments, the proposed "cooperation" based design outperforms the traditional "contention" based design by a factor of 2-5.5X for synthetic Zipfian data sets.
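
The contrast the abstract draws between waiting on a lock and cooperating can be illustrated with a small sketch. This is not the authors' implementation: the class `CooperativeCounter`, its methods, and the helper `count_streams` are hypothetical names chosen for illustration, and the hand-off queue is an assumed way of avoiding waits. A thread never blocks on the lock. It appends its update to a shared queue and applies pending updates only if a non-blocking acquire succeeds; otherwise whichever thread holds the lock applies the update on its behalf. Python has no direct hardware atomics, so the non-blocking `Lock.acquire(blocking=False)` call and the thread-safe `deque` stand in for the cheaper atomic primitives the abstract mentions.

```python
import threading
from collections import Counter, deque


class CooperativeCounter:
    """Hypothetical sketch: threads hand off updates instead of waiting."""

    def __init__(self):
        self._counts = Counter()
        self._pending = deque()        # deque append/popleft are thread-safe
        self._lock = threading.Lock()

    def add(self, item):
        # Hand the update off first, then help apply updates if the lock is free.
        self._pending.append(item)
        self._drain_if_free()

    def _drain(self):
        # Caller must hold the lock: apply every queued update.
        while True:
            try:
                item = self._pending.popleft()
            except IndexError:
                return
            self._counts[item] += 1

    def _drain_if_free(self):
        # Non-blocking acquire: on failure, the current holder re-checks the
        # queue after releasing, so the queued update is not lost.
        while self._pending and self._lock.acquire(blocking=False):
            try:
                self._drain()
            finally:
                self._lock.release()

    def snapshot(self):
        # Intended for use after producers finish; applies any leftovers.
        with self._lock:
            self._drain()
            return dict(self._counts)


def count_streams(streams):
    """Count item frequencies over several streams, one thread per stream."""
    counter = CooperativeCounter()

    def worker(stream):
        for item in stream:
            counter.add(item)

    threads = [threading.Thread(target=worker, args=(s,)) for s in streams]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.snapshot()
```

For example, `count_streams([["a", "b", "a"], ["b", "a"]])` counts three occurrences of `"a"` and two of `"b"`. A contention-based variant would instead wrap each increment in `with self._lock:`, forcing every thread to wait its turn. The sketch keeps exact counts in a `Counter` for simplicity, whereas the paper targets a bounded-memory frequency counting algorithm.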

Original language: English
Title of host publication: Proceedings of the VLDB Endowment
Pages: 217-228
Number of pages: 12
Volume: 2
Edition: 1
Publication status: Published - 2009
Externally published: Yes

Fingerprint

Synchronization
Processing
Servers
Throughput
Degradation
Monitoring
Experiments

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science (all)

Cite this

Das, S., Antony, S., Agrawal, D., & El Abbadi, A. (2009). Thread cooperation in multicore architectures for frequency counting over multiple data streams. In Proceedings of the VLDB Endowment (1 ed., Vol. 2, pp. 217-228)

@inbook{acf114ceb27845caba65057e445a095f,
title = "Thread cooperation in multicore architectures for frequency counting over multiple data streams",
abstract = "Many real-world data stream analysis applications such as network monitoring, click stream analysis, and others require combining multiple streams of data arriving from multiple sources. This is referred to as multi-stream analysis. To deal with high stream arrival rates, it is desirable that such systems be capable of supporting very high processing throughput. The advent of multicore processors and powerful servers driven by these processors calls for efficient parallel designs that can effectively utilize the parallelism of the multicores, since performance improvement is possible only through effective parallelism. In this paper, we address the problem of parallelizing multi-stream analysis in the context of multicore processors. Specifically, we concentrate on parallelizing frequent elements, top-k, and frequency counting over multiple streams. We discuss the challenges in designing an efficient parallel system for multi-stream processing. Our evaluation and analysis reveals that traditional {"}contention{"} based locking results in excessive overhead and wait, which in turn leads to severe performance degradation in modern multicore architectures. Based on our analysis, we propose a {"}cooperation{"} based locking paradigm for efficient parallelization of frequency counting. The proposed {"}cooperation{"} based paradigm removes waits associated with synchronization, and allows replacing locks by much cheaper atomic synchronization primitives. Our implementation of the proposed paradigm to parallelize a well known frequency counting algorithm shows the benefits of the proposed {"}cooperation{"} based locking paradigm when compared to the traditional {"}contention{"} based locking paradigm. In our experiments, the proposed {"}cooperation{"} based design outperforms the traditional {"}contention{"} based design by a factor of 2 - 5.5X for synthetic zipfian data sets.",
author = "Sudipto Das and Shyam Antony and Divyakant Agrawal and {El Abbadi}, Amr",
year = "2009",
language = "English",
volume = "2",
pages = "217--228",
booktitle = "Proceedings of the VLDB Endowment",
edition = "1",

}

TY - CHAP

T1 - Thread cooperation in multicore architectures for frequency counting over multiple data streams

AU - Das, Sudipto

AU - Antony, Shyam

AU - Agrawal, Divyakant

AU - El Abbadi, Amr

PY - 2009

Y1 - 2009

N2 - Many real-world data stream analysis applications such as network monitoring, click stream analysis, and others require combining multiple streams of data arriving from multiple sources. This is referred to as multi-stream analysis. To deal with high stream arrival rates, it is desirable that such systems be capable of supporting very high processing throughput. The advent of multicore processors and powerful servers driven by these processors calls for efficient parallel designs that can effectively utilize the parallelism of the multicores, since performance improvement is possible only through effective parallelism. In this paper, we address the problem of parallelizing multi-stream analysis in the context of multicore processors. Specifically, we concentrate on parallelizing frequent elements, top-k, and frequency counting over multiple streams. We discuss the challenges in designing an efficient parallel system for multi-stream processing. Our evaluation and analysis reveals that traditional "contention" based locking results in excessive overhead and wait, which in turn leads to severe performance degradation in modern multicore architectures. Based on our analysis, we propose a "cooperation" based locking paradigm for efficient parallelization of frequency counting. The proposed "cooperation" based paradigm removes waits associated with synchronization, and allows replacing locks by much cheaper atomic synchronization primitives. Our implementation of the proposed paradigm to parallelize a well known frequency counting algorithm shows the benefits of the proposed "cooperation" based locking paradigm when compared to the traditional "contention" based locking paradigm. In our experiments, the proposed "cooperation" based design outperforms the traditional "contention" based design by a factor of 2 - 5.5X for synthetic zipfian data sets.

UR - http://www.scopus.com/inward/record.url?scp=77952771518&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77952771518&partnerID=8YFLogxK

M3 - Chapter

AN - SCOPUS:77952771518

VL - 2

SP - 217

EP - 228

BT - Proceedings of the VLDB Endowment

ER -