VHT: Vertical Hoeffding Tree

Nicolas Kourtellis, Gianmarco Morales, Albert Bifet, Arinto Murdopo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

15 Citations (Scopus)


IoT big data requires new machine learning methods that can scale to large volumes of data arriving at high speed. Decision trees are popular machine learning models because they are effective yet easy to interpret and visualize. The literature offers distributed algorithms for learning decision trees, as well as streaming algorithms, but none that combine both features. In this paper we present the Vertical Hoeffding Tree (VHT), the first distributed streaming algorithm for learning decision trees. It features a novel way of distributing decision trees via vertical parallelism. The algorithm is implemented on top of Apache SAMOA, a platform for mining big data streams, and is thus able to run on real-world clusters. Our experiments on the accuracy and throughput of VHT show that it scales while attaining superior performance compared to sequential decision trees.
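
As a rough illustration of the two ideas named in the abstract, the following Python sketch shows the standard Hoeffding bound that Hoeffding trees use to decide when enough stream instances have been seen to split a leaf, together with one possible way of assigning attributes to workers under vertical parallelism. This is a minimal sketch under stated assumptions, not the authors' implementation and not the Apache SAMOA API: the function names, the round-robin attribute assignment, and the example numbers are all hypothetical.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Standard Hoeffding bound: with probability 1 - delta, the true mean of a
    random variable with the given range is within epsilon of the mean
    observed over n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split a leaf when the best attribute beats the runner-up by more than
    the Hoeffding bound (tie-breaking heuristics are omitted here)."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


def partition_attributes(num_attributes: int, num_workers: int) -> list:
    """Hypothetical round-robin assignment of attribute indices to workers.
    Under vertical parallelism each worker would track statistics only for
    its own subset of attributes; the real assignment scheme may differ."""
    return [list(range(w, num_attributes, num_workers))
            for w in range(num_workers)]


if __name__ == "__main__":
    # Assumed example values: gain range 1.0, delta 1e-7, 1000 instances.
    print(hoeffding_bound(1.0, 1e-7, 1000))
    print(should_split(0.30, 0.15, 1.0, 1e-7, 1000))
    print(partition_attributes(5, 2))
```

In this sketch, each worker would compute the best split candidates for its own attributes, and a coordinating component would compare the top two candidates overall using `should_split`. How VHT actually coordinates these steps is described in the paper itself.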

Original language: English
Title of host publication: Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 8
ISBN (Electronic): 9781467390040
Publication status: Published - 2 Feb 2017
Event: 4th IEEE International Conference on Big Data, Big Data 2016 - Washington, United States
Duration: 5 Dec 2016 to 8 Dec 2016


Other: 4th IEEE International Conference on Big Data, Big Data 2016
Country: United States



Keywords

  • Apache SAMOA
  • big data
  • distributed streaming decision tree
  • Hoeffding tree
  • IoT
  • vertical parallelism

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Hardware and Architecture

Cite this

Kourtellis, N., Morales, G., Bifet, A., & Murdopo, A. (2017). VHT: Vertical Hoeffding Tree. In Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016 (pp. 915-922). [7840687] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BigData.2016.7840687