SAMOA

Scalable advanced massive online analysis

Gianmarco Morales, Albert Bifet

Research output: Contribution to journal › Article

76 Citations (Scopus)

Abstract

SAMOA (Scalable Advanced Massive Online Analysis) is a platform for mining big data streams. It provides a collection of distributed streaming algorithms for the most common data mining and machine learning tasks such as classification, clustering, and regression, as well as programming abstractions to develop new algorithms. It features a pluggable architecture that allows it to run on several distributed stream processing engines such as Storm, S4, and Samza. SAMOA is written in Java, is open source, and is available at http://samoa-project.net under the Apache Software License version 2.0.

Original language: English
Pages (from-to): 149-153
Number of pages: 5
Journal: Journal of Machine Learning Research
Volume: 16
Publication status: Published - 2015
Externally published: Yes

Keywords

  • Classification
  • Clustering
  • Data streams
  • Distributed systems
  • Machine learning
  • Regression
  • Toolbox

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Statistics and Probability
  • Artificial Intelligence

Cite this

SAMOA: Scalable advanced massive online analysis. / Morales, Gianmarco; Bifet, Albert.

In: Journal of Machine Learning Research, Vol. 16, 2015, p. 149-153.

@article{23fcaa0557404a3a998b4cf77c96e888,
title = "SAMOA: Scalable advanced massive online analysis",
abstract = "SAMOA (Scalable Advanced Massive Online Analysis) is a platform for mining big data streams. It provides a collection of distributed streaming algorithms for the most common data mining and machine learning tasks such as classification, clustering, and regression, as well as programming abstractions to develop new algorithms. It features a pluggable architecture that allows it to run on several distributed stream processing engines such as Storm, S4, and Samza. SAMOA is written in Java, is open source, and is available at http://samoa-project.net under the Apache Software License version 2.0.",
keywords = "Classification, Clustering, Data streams, Distributed systems, Machine learning, Regression, Toolbox",
author = "Gianmarco Morales and Albert Bifet",
year = "2015",
language = "English",
volume = "16",
pages = "149--153",
journal = "Journal of Machine Learning Research",
issn = "1532-4435",
publisher = "Microtome Publishing",
}

TY  - JOUR
T1  - SAMOA
T2  - Scalable advanced massive online analysis
AU  - Morales, Gianmarco
AU  - Bifet, Albert
PY  - 2015
Y1  - 2015
AB  - SAMOA (Scalable Advanced Massive Online Analysis) is a platform for mining big data streams. It provides a collection of distributed streaming algorithms for the most common data mining and machine learning tasks such as classification, clustering, and regression, as well as programming abstractions to develop new algorithms. It features a pluggable architecture that allows it to run on several distributed stream processing engines such as Storm, S4, and Samza. SAMOA is written in Java, is open source, and is available at http://samoa-project.net under the Apache Software License version 2.0.
KW  - Classification
KW  - Clustering
KW  - Data streams
KW  - Distributed systems
KW  - Machine learning
KW  - Regression
KW  - Toolbox
UR  - http://www.scopus.com/inward/record.url?scp=84923923168&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84923923168&partnerID=8YFLogxK
M3  - Article
VL  - 16
SP  - 149
EP  - 153
JO  - Journal of Machine Learning Research
JF  - Journal of Machine Learning Research
SN  - 1532-4435
ER  -