COP: Planning conflicts for faster parallel transactional machine learning

Faisal Nawab, Divyakant Agrawal, Amr El Abbadi, Sanjay Chawla

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Machine learning techniques are essential to extracting knowledge from data. The volume of data encourages the use of parallelization techniques to extract knowledge faster. However, schemes to parallelize machine learning tasks face the trade-off between obeying strict consistency constraints and performance. Existing consistency schemes require expensive coordination between worker threads to detect conflicts, leading to poor performance. In this work, we consider the problem of improving the performance of multi-core machine learning while preserving strong consistency guarantees. We propose Conflict Order Planning (COP), a consistency scheme that exploits special properties of machine learning workloads to reduce the overhead of coordination. What is special about machine learning workloads is that the dataset is often known prior to the execution of the machine learning algorithm and is reused multiple times with different settings. We exploit this prior knowledge of the dataset to plan a partial order for concurrent execution. This planning reduces the cost of consistency significantly because it allows the use of a light-weight conflict detection operation that we call ReadWait. We demonstrate the use of COP on a Stochastic Gradient Descent algorithm for Support Vector Machines and observe better scalability and a speedup factor between 2-6x when compared to other consistency schemes.
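
As a rough illustration of the idea in the abstract, the sketch below plans, before execution, which version of each model parameter every training sample should read, and then lets worker threads run concurrently while a ReadWait-style operation blocks until the planned version is available. It is not the authors' implementation. The names `plan`, `Param`, `read_wait`, `sgd_step`, `run_serial`, and `run_planned` are hypothetical, the per-parameter version counter is an assumed mechanism, the planned order is simply the dataset order, and the update is a plain hinge-loss SGD step without regularization.

```python
import threading


def plan(dataset):
    """Planning pass (assumed mechanism): for each sample, record the
    version of every parameter it touches that it must read, following
    dataset order."""
    next_version = {}
    planned = []
    for features, _label in dataset:
        versions = {}
        for j in features:
            versions[j] = next_version.get(j, 0)
            next_version[j] = versions[j] + 1
        planned.append(versions)
    return planned


class Param:
    """One model parameter with a version counter."""

    def __init__(self):
        self.value = 0.0
        self.version = 0
        self.cond = threading.Condition()

    def read_wait(self, planned_version):
        # Block until the parameter has reached the planned version.
        with self.cond:
            self.cond.wait_for(lambda: self.version == planned_version)
            return self.value

    def write(self, value):
        # Every planned transaction writes, so the version always advances.
        with self.cond:
            self.value = value
            self.version += 1
            self.cond.notify_all()


def sgd_step(w, features, label, lr=0.1):
    """Hinge-loss SGD step on the touched weights only (no regularization).
    Returns new values for every touched weight, changed or not."""
    margin = label * sum(w[j] * x for j, x in features.items())
    if margin < 1.0:
        return {j: w[j] + lr * label * x for j, x in features.items()}
    return dict(w)


def run_serial(dataset, n_params):
    """Reference execution: one sample at a time, in dataset order."""
    w = [0.0] * n_params
    for features, label in dataset:
        local = {j: w[j] for j in features}
        for j, v in sgd_step(local, features, label).items():
            w[j] = v
    return w


def run_planned(dataset, n_params, n_threads=4):
    """Concurrent execution that follows the planned partial order."""
    planned = plan(dataset)
    params = [Param() for _ in range(n_params)]

    def worker(k):
        # Thread k handles samples k, k + n_threads, ... in order.
        for i in range(k, len(dataset), n_threads):
            features, label = dataset[i]
            local = {j: params[j].read_wait(planned[i][j]) for j in features}
            for j, v in sgd_step(local, features, label).items():
                params[j].write(v)

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [p.value for p in params]
```

In this sketch each sample is a pair of a `{feature_index: value}` dictionary and a label of +1 or -1. Because every sample both reads and writes the parameters it touches, waiting on the planned version is enough to make `run_planned` produce the same weights as `run_serial`, while samples that share no parameters never wait on each other.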

Original language: English
Title of host publication: Advances in Database Technology - EDBT 2017
Subtitle of host publication: 20th International Conference on Extending Database Technology, Proceedings
Publisher: OpenProceedings.org
Pages: 132-143
Number of pages: 12
Volume: 2017-March
ISBN (Electronic): 9783893180738
DOIs: https://doi.org/10.5441/002/edbt.2017.13
Publication status: Published - 1 Jan 2017
Event: 20th International Conference on Extending Database Technology, EDBT 2017 - Venice, Italy
Duration: 21 Mar 2017 to 24 Mar 2017

Other

Other: 20th International Conference on Extending Database Technology, EDBT 2017
Country: Italy
City: Venice
Period: 21/3/17 to 24/3/17

Fingerprint

Learning systems
Planning
Learning algorithms
Support vector machines
Scalability
Costs

ASJC Scopus subject areas

  • Information Systems
  • Software
  • Computer Science Applications

Cite this

Nawab, F., Agrawal, D., El Abbadi, A., & Chawla, S. (2017). COP: Planning conflicts for faster parallel transactional machine learning. In Advances in Database Technology - EDBT 2017: 20th International Conference on Extending Database Technology, Proceedings (Vol. 2017-March, pp. 132-143). OpenProceedings.org. https://doi.org/10.5441/002/edbt.2017.13

@inproceedings{cee495b845954bb79c5c59788ad66deb,
title = "{COP}: Planning conflicts for faster parallel transactional machine learning",
abstract = "Machine learning techniques are essential to extracting knowledge from data. The volume of data encourages the use of parallelization techniques to extract knowledge faster. However, schemes to parallelize machine learning tasks face the trade-off between obeying strict consistency constraints and performance. Existing consistency schemes require expensive coordination between worker threads to detect conflicts, leading to poor performance. In this work, we consider the problem of improving the performance of multi-core machine learning while preserving strong consistency guarantees. We propose Conflict Order Planning (COP), a consistency scheme that exploits special properties of machine learning workloads to reduce the overhead of coordination. What is special about machine learning workloads is that the dataset is often known prior to the execution of the machine learning algorithm and is reused multiple times with different settings. We exploit this prior knowledge of the dataset to plan a partial order for concurrent execution. This planning reduces the cost of consistency significantly because it allows the use of a light-weight conflict detection operation that we call ReadWait. We demonstrate the use of COP on a Stochastic Gradient Descent algorithm for Support Vector Machines and observe better scalability and a speedup factor between 2-6x when compared to other consistency schemes.",
author = "Nawab, Faisal and Agrawal, Divyakant and {El Abbadi}, Amr and Chawla, Sanjay",
year = "2017",
month = "1",
day = "1",
doi = "10.5441/002/edbt.2017.13",
language = "English",
volume = "2017-March",
pages = "132--143",
booktitle = "Advances in Database Technology - EDBT 2017",
publisher = "OpenProceedings.org",
}

TY - GEN
T1 - COP
T2 - Planning conflicts for faster parallel transactional machine learning
AU - Nawab, Faisal
AU - Agrawal, Divyakant
AU - El Abbadi, Amr
AU - Chawla, Sanjay
PY - 2017/1/1
Y1 - 2017/1/1
AB - Machine learning techniques are essential to extracting knowledge from data. The volume of data encourages the use of parallelization techniques to extract knowledge faster. However, schemes to parallelize machine learning tasks face the trade-off between obeying strict consistency constraints and performance. Existing consistency schemes require expensive coordination between worker threads to detect conflicts, leading to poor performance. In this work, we consider the problem of improving the performance of multi-core machine learning while preserving strong consistency guarantees. We propose Conflict Order Planning (COP), a consistency scheme that exploits special properties of machine learning workloads to reduce the overhead of coordination. What is special about machine learning workloads is that the dataset is often known prior to the execution of the machine learning algorithm and is reused multiple times with different settings. We exploit this prior knowledge of the dataset to plan a partial order for concurrent execution. This planning reduces the cost of consistency significantly because it allows the use of a light-weight conflict detection operation that we call ReadWait. We demonstrate the use of COP on a Stochastic Gradient Descent algorithm for Support Vector Machines and observe better scalability and a speedup factor between 2-6x when compared to other consistency schemes.
UR - http://www.scopus.com/inward/record.url?scp=85046399019&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046399019&partnerID=8YFLogxK
U2 - 10.5441/002/edbt.2017.13
DO - 10.5441/002/edbt.2017.13
M3 - Conference contribution
VL - 2017-March
SP - 132
EP - 143
BT - Advances in Database Technology - EDBT 2017
PB - OpenProceedings.org
ER -