SAMOA

A platform for mining big data streams

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

40 Citations (Scopus)

Abstract

Social media and user-generated content are causing an ever-growing data deluge. The rate at which we produce data is growing steadily, thus creating larger and larger streams of continuously evolving data. Online news, micro-blogs, and search queries are just a few examples of these continuous streams of user activities. The value of these streams lies in their freshness and relatedness to ongoing events. However, current (de facto standard) solutions for big data analysis are not designed to deal with evolving streams. In this talk, we offer a sneak preview of SAMOA, an upcoming platform for mining big data streams. SAMOA is a platform for online mining in a cluster/cloud environment. It features a pluggable architecture that allows it to run on several distributed stream processing engines such as S4 and Storm. SAMOA includes algorithms for the most common machine learning tasks such as classification and clustering. Finally, SAMOA will soon be open sourced in order to foster collaboration and research on big data stream mining.

Original language: English
Title of host publication: WWW 2013 Companion - Proceedings of the 22nd International Conference on World Wide Web
Pages: 777-778
Number of pages: 2
Publication status: Published - 2013
Externally published: Yes
Event: 22nd International Conference on World Wide Web, WWW 2013 - Rio de Janeiro, Brazil
Duration: 13 May 2013 to 17 May 2013

Other

Other: 22nd International Conference on World Wide Web, WWW 2013
Country: Brazil
City: Rio de Janeiro
Period: 13/5/13 to 17/5/13

Fingerprint

Blogs
Data mining
Learning systems
Engines
Processing
Big data

Keywords

  • Big data
  • Data streams
  • Distributed computing
  • Machine learning
  • Open source
  • Stream mining

ASJC Scopus subject areas

  • Computer Networks and Communications

Cite this

Morales, G. (2013). SAMOA: A platform for mining big data streams. In WWW 2013 Companion - Proceedings of the 22nd International Conference on World Wide Web (pp. 777-778)

SAMOA: A platform for mining big data streams. / Morales, Gianmarco.

WWW 2013 Companion - Proceedings of the 22nd International Conference on World Wide Web. 2013. p. 777-778.

Morales, G 2013, SAMOA: A platform for mining big data streams. in WWW 2013 Companion - Proceedings of the 22nd International Conference on World Wide Web. pp. 777-778, 22nd International Conference on World Wide Web, WWW 2013, Rio de Janeiro, Brazil, 13/5/13.
Morales G. SAMOA: A platform for mining big data streams. In WWW 2013 Companion - Proceedings of the 22nd International Conference on World Wide Web. 2013. p. 777-778
Morales, Gianmarco. / SAMOA: A platform for mining big data streams. WWW 2013 Companion - Proceedings of the 22nd International Conference on World Wide Web. 2013. pp. 777-778
@inproceedings{10cb2079b8384a73a4c600b3057a2b06,
title = "SAMOA: A platform for mining big data streams",
abstract = "Social media and user-generated content are causing an ever-growing data deluge. The rate at which we produce data is growing steadily, thus creating larger and larger streams of continuously evolving data. Online news, micro-blogs, and search queries are just a few examples of these continuous streams of user activities. The value of these streams lies in their freshness and relatedness to ongoing events. However, current (de facto standard) solutions for big data analysis are not designed to deal with evolving streams. In this talk, we offer a sneak preview of SAMOA, an upcoming platform for mining big data streams. SAMOA is a platform for online mining in a cluster/cloud environment. It features a pluggable architecture that allows it to run on several distributed stream processing engines such as S4 and Storm. SAMOA includes algorithms for the most common machine learning tasks such as classification and clustering. Finally, SAMOA will soon be open sourced in order to foster collaboration and research on big data stream mining.",
keywords = "Big data, Data streams, Distributed computing, Machine learning, Open source, Stream mining",
author = "Gianmarco Morales",
year = "2013",
language = "English",
isbn = "9781450320382",
pages = "777--778",
booktitle = "WWW 2013 Companion - Proceedings of the 22nd International Conference on World Wide Web",

}

TY - GEN

T1 - SAMOA

T2 - A platform for mining big data streams

AU - Morales, Gianmarco

PY - 2013

Y1 - 2013

N2 - Social media and user-generated content are causing an ever-growing data deluge. The rate at which we produce data is growing steadily, thus creating larger and larger streams of continuously evolving data. Online news, micro-blogs, and search queries are just a few examples of these continuous streams of user activities. The value of these streams lies in their freshness and relatedness to ongoing events. However, current (de facto standard) solutions for big data analysis are not designed to deal with evolving streams. In this talk, we offer a sneak preview of SAMOA, an upcoming platform for mining big data streams. SAMOA is a platform for online mining in a cluster/cloud environment. It features a pluggable architecture that allows it to run on several distributed stream processing engines such as S4 and Storm. SAMOA includes algorithms for the most common machine learning tasks such as classification and clustering. Finally, SAMOA will soon be open sourced in order to foster collaboration and research on big data stream mining.

AB - Social media and user-generated content are causing an ever-growing data deluge. The rate at which we produce data is growing steadily, thus creating larger and larger streams of continuously evolving data. Online news, micro-blogs, and search queries are just a few examples of these continuous streams of user activities. The value of these streams lies in their freshness and relatedness to ongoing events. However, current (de facto standard) solutions for big data analysis are not designed to deal with evolving streams. In this talk, we offer a sneak preview of SAMOA, an upcoming platform for mining big data streams. SAMOA is a platform for online mining in a cluster/cloud environment. It features a pluggable architecture that allows it to run on several distributed stream processing engines such as S4 and Storm. SAMOA includes algorithms for the most common machine learning tasks such as classification and clustering. Finally, SAMOA will soon be open sourced in order to foster collaboration and research on big data stream mining.

KW - Big data

KW - Data streams

KW - Distributed computing

KW - Machine learning

KW - Open source

KW - Stream mining

UR - http://www.scopus.com/inward/record.url?scp=84893053113&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84893053113&partnerID=8YFLogxK

M3 - Conference contribution

SN - 9781450320382

SP - 777

EP - 778

BT - WWW 2013 Companion - Proceedings of the 22nd International Conference on World Wide Web

ER -