Random forests of very fast decision trees on GPU for mining evolving big data streams

Diego Marron, Albert Bifet, Gianmarco Morales

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

Random Forest is a classical ensemble method used to improve the performance of single tree classifiers. It obtains superior performance by increasing the diversity of the single classifiers. However, in the more challenging context of evolving data streams, the classifier must also be adaptive and work under very strict constraints of space and time. Furthermore, the computational load of using a large number of classifiers can make its application extremely expensive. In this work, we present a method for building Random Forests that use Very Fast Decision Trees for data streams on GPUs. We show how this method can benefit from the massively parallel architecture of GPUs, which are becoming an efficient hardware alternative to large clusters of computers. Moreover, our algorithm minimizes the communication between CPU and GPU by building the trees directly inside the GPU. We run an empirical evaluation and compare our method to two well-known machine learning frameworks, VFML and MOA. Random Forests on the GPU are at least 300x faster while maintaining a similar accuracy.
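
The abstract does not spell out the split rule, but Very Fast Decision Trees as generally described decide when to split a leaf using the Hoeffding bound. The following is a minimal CPU-side sketch of that test only, not the authors' GPU implementation. The function names, the tie_threshold parameter and its default value are assumptions made for illustration.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound epsilon: with probability 1 - delta, the true mean of a
    variable with the given range is within epsilon of the mean observed over
    n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, n_classes, delta, n, tie_threshold=0.05):
    """Illustrative split test (hypothetical helper, not from the paper).

    Splits when the best attribute beats the runner-up by more than epsilon,
    or when epsilon is so small that the two are effectively tied.
    """
    # Information gain is bounded by log2(number of classes).
    value_range = math.log2(n_classes)
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold


if __name__ == "__main__":
    # Clear winner after 1000 examples: split.
    print(should_split(0.30, 0.10, n_classes=2, delta=1e-7, n=1000))
    # Close contest after 1000 examples: keep collecting examples.
    print(should_split(0.30, 0.28, n_classes=2, delta=1e-7, n=1000))
```

Because each leaf only needs counts and this bound, the test can be evaluated from sufficient statistics without storing the stream, which is what makes the approach suit the space and time constraints mentioned above.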

Original language: English
Title of host publication: ECAI 2014 - 21st European Conference on Artificial Intelligence, Including Prestigious Applications of Intelligent Systems, PAIS 2014, Proceedings
Publisher: IOS Press
Pages: 615-620
Number of pages: 6
Volume: 263
ISBN (Electronic): 9781614994183
DOIs: https://doi.org/10.3233/978-1-61499-419-0-615
Publication status: Published - 2014
Externally published: Yes
Event: 21st European Conference on Artificial Intelligence, ECAI 2014 - Prague, Czech Republic
Duration: 18 Aug 2014 - 22 Aug 2014

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 263
ISSN (Print): 0922-6389

Other

Other: 21st European Conference on Artificial Intelligence, ECAI 2014
Country: Czech Republic
City: Prague
Period: 18/8/14 - 22/8/14

Fingerprint

Decision trees
Classifiers
Parallel architectures
Computer hardware
Program processors
Learning systems
Big data
Graphics processing unit
Communication

ASJC Scopus subject areas

  • Artificial Intelligence

Cite this

Marron, D., Bifet, A., & Morales, G. (2014). Random forests of very fast decision trees on GPU for mining evolving big data streams. In ECAI 2014 - 21st European Conference on Artificial Intelligence, Including Prestigious Applications of Intelligent Systems, PAIS 2014, Proceedings (Vol. 263, pp. 615-620). (Frontiers in Artificial Intelligence and Applications; Vol. 263). IOS Press. https://doi.org/10.3233/978-1-61499-419-0-615

Random forests of very fast decision trees on GPU for mining evolving big data streams. / Marron, Diego; Bifet, Albert; Morales, Gianmarco.

ECAI 2014 - 21st European Conference on Artificial Intelligence, Including Prestigious Applications of Intelligent Systems, PAIS 2014, Proceedings. Vol. 263 IOS Press, 2014. p. 615-620 (Frontiers in Artificial Intelligence and Applications; Vol. 263).

Marron, D, Bifet, A & Morales, G 2014, Random forests of very fast decision trees on GPU for mining evolving big data streams. in ECAI 2014 - 21st European Conference on Artificial Intelligence, Including Prestigious Applications of Intelligent Systems, PAIS 2014, Proceedings. vol. 263, Frontiers in Artificial Intelligence and Applications, vol. 263, IOS Press, pp. 615-620, 21st European Conference on Artificial Intelligence, ECAI 2014, Prague, Czech Republic, 18/8/14. https://doi.org/10.3233/978-1-61499-419-0-615
Marron D, Bifet A, Morales G. Random forests of very fast decision trees on GPU for mining evolving big data streams. In ECAI 2014 - 21st European Conference on Artificial Intelligence, Including Prestigious Applications of Intelligent Systems, PAIS 2014, Proceedings. Vol. 263. IOS Press. 2014. p. 615-620. (Frontiers in Artificial Intelligence and Applications). https://doi.org/10.3233/978-1-61499-419-0-615
Marron, Diego ; Bifet, Albert ; Morales, Gianmarco. / Random forests of very fast decision trees on GPU for mining evolving big data streams. ECAI 2014 - 21st European Conference on Artificial Intelligence, Including Prestigious Applications of Intelligent Systems, PAIS 2014, Proceedings. Vol. 263 IOS Press, 2014. pp. 615-620 (Frontiers in Artificial Intelligence and Applications).
@inproceedings{abd55dac3ab94549a534f83a3d3aa5ba,
title = "Random forests of very fast decision trees on GPU for mining evolving big data streams",
abstract = "Random Forest is a classical ensemble method used to improve the performance of single tree classifiers. It obtains superior performance by increasing the diversity of the single classifiers. However, in the more challenging context of evolving data streams, the classifier must also be adaptive and work under very strict constraints of space and time. Furthermore, the computational load of using a large number of classifiers can make its application extremely expensive. In this work, we present a method for building Random Forests that use Very Fast Decision Trees for data streams on GPUs. We show how this method can benefit from the massively parallel architecture of GPUs, which are becoming an efficient hardware alternative to large clusters of computers. Moreover, our algorithm minimizes the communication between CPU and GPU by building the trees directly inside the GPU. We run an empirical evaluation and compare our method to two well-known machine learning frameworks, VFML and MOA. Random Forests on the GPU are at least 300x faster while maintaining a similar accuracy.",
author = "Diego Marron and Albert Bifet and Gianmarco Morales",
year = "2014",
doi = "10.3233/978-1-61499-419-0-615",
language = "English",
volume = "263",
series = "Frontiers in Artificial Intelligence and Applications",
publisher = "IOS Press",
pages = "615--620",
booktitle = "ECAI 2014 - 21st European Conference on Artificial Intelligence, Including Prestigious Applications of Intelligent Systems, PAIS 2014, Proceedings",

}

TY - GEN

T1 - Random forests of very fast decision trees on GPU for mining evolving big data streams

AU - Marron, Diego

AU - Bifet, Albert

AU - Morales, Gianmarco

PY - 2014

Y1 - 2014

AB - Random Forest is a classical ensemble method used to improve the performance of single tree classifiers. It obtains superior performance by increasing the diversity of the single classifiers. However, in the more challenging context of evolving data streams, the classifier must also be adaptive and work under very strict constraints of space and time. Furthermore, the computational load of using a large number of classifiers can make its application extremely expensive. In this work, we present a method for building Random Forests that use Very Fast Decision Trees for data streams on GPUs. We show how this method can benefit from the massively parallel architecture of GPUs, which are becoming an efficient hardware alternative to large clusters of computers. Moreover, our algorithm minimizes the communication between CPU and GPU by building the trees directly inside the GPU. We run an empirical evaluation and compare our method to two well-known machine learning frameworks, VFML and MOA. Random Forests on the GPU are at least 300x faster while maintaining a similar accuracy.

UR - http://www.scopus.com/inward/record.url?scp=84923205468&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84923205468&partnerID=8YFLogxK

U2 - 10.3233/978-1-61499-419-0-615

DO - 10.3233/978-1-61499-419-0-615

M3 - Conference contribution

AN - SCOPUS:84923205468

VL - 263

T3 - Frontiers in Artificial Intelligence and Applications

SP - 615

EP - 620

BT - ECAI 2014 - 21st European Conference on Artificial Intelligence, Including Prestigious Applications of Intelligent Systems, PAIS 2014, Proceedings

PB - IOS Press

ER -