Scalable Matrix inversion using MapReduce

Jingen Xiang, Huangdong Meng, Ashraf Aboulnaga

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

15 Citations (Scopus)

Abstract

Matrix operations are a fundamental building block of many computational tasks in fields as diverse as scientific computing, machine learning, and data mining. Matrix inversion is an important matrix operation, but it is difficult to implement in today's popular parallel dataflow programming systems, such as MapReduce. The reason is that each element in the inverse of a matrix depends on multiple elements in the input matrix, so the computation is not easily partitionable. In this paper, we present a scalable and efficient technique for matrix inversion in MapReduce. Our technique relies on computing the LU decomposition of the input matrix and using that decomposition to compute the required matrix inverse. We present a technique for computing the LU decomposition and the matrix inverse using a pipeline of MapReduce jobs. We also present optimizations of this technique in the context of Hadoop. To the best of our knowledge, our technique is the first matrix inversion technique using MapReduce. We show experimentally that our technique has good scalability, enabling us to invert a 10^5 × 10^5 matrix in 5 hours on Amazon EC2. We also show that our technique outperforms ScaLAPACK, a state-of-the-art linear algebra package that uses MPI.
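
To make the LU-based route to an inverse concrete, here is a minimal single-machine sketch in pure Python. It is not the paper's MapReduce pipeline and makes no claim about how the authors partition the work; the function names `lu_decompose` and `invert` are hypothetical, and the use of partial pivoting is an assumption of this sketch. It factors P·A = L·U and then solves one pair of triangular systems per column of the identity matrix.

```python
def lu_decompose(a):
    """Factor a square matrix so that P*A = L*U (partial pivoting).

    Returns (perm, l, u), where row i of P*A is row perm[i] of A.
    Hypothetical helper for illustration only.
    """
    n = len(a)
    u = [[float(v) for v in row] for row in a]
    l = [[0.0] * n for _ in range(n)]
    perm = list(range(n))
    for k in range(n):
        # Pick the row with the largest entry in column k as the pivot.
        p = max(range(k, n), key=lambda i: abs(u[i][k]))
        if u[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            u[k], u[p] = u[p], u[k]
            l[k], l[p] = l[p], l[k]
            perm[k], perm[p] = perm[p], perm[k]
        l[k][k] = 1.0
        # Eliminate column k below the pivot, recording multipliers in L.
        for i in range(k + 1, n):
            f = u[i][k] / u[k][k]
            l[i][k] = f
            for j in range(k, n):
                u[i][j] -= f * u[k][j]
    return perm, l, u


def invert(a):
    """Invert a square matrix by solving L*U*x = P*e_j for each column j."""
    n = len(a)
    perm, l, u = lu_decompose(a)
    inv = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Column j of the identity, with rows permuted the same way as A.
        b = [1.0 if perm[i] == j else 0.0 for i in range(n)]
        # Forward substitution: L*y = b (L has a unit diagonal).
        y = [0.0] * n
        for i in range(n):
            y[i] = b[i] - sum(l[i][k] * y[k] for k in range(i))
        # Back substitution: U*x = y.
        x = [0.0] * n
        for i in reversed(range(n)):
            s = sum(u[i][k] * x[k] for k in range(i + 1, n))
            x[i] = (y[i] - s) / u[i][i]
        for i in range(n):
            inv[i][j] = x[i]
    return inv
```

Calling `invert([[4, 3], [6, 3]])` returns the inverse as a list of rows. In this sketch the column solves are independent of one another once L and U are known, while the factorization itself carries step-to-step dependencies.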

Original language: English
Title of host publication: HPDC 2014 - Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing
Publisher: Association for Computing Machinery
Pages: 177-189
Number of pages: 13
ISBN (Print): 9781450327480
DOIs: https://doi.org/10.1145/2600212.2600220
Publication status: Published - 1 Jan 2014
Event: 23rd ACM Symposium on High-Performance Parallel and Distributed Computing, HPDC 2014 - Vancouver, BC, Canada
Duration: 23 Jun 2014 to 27 Jun 2014

Other

Other: 23rd ACM Symposium on High-Performance Parallel and Distributed Computing, HPDC 2014
Country: Canada
City: Vancouver, BC
Period: 23/6/14 to 27/6/14

Fingerprint

Decomposition
Natural sciences computing
Computer systems programming
Linear algebra
Data mining
Learning systems
Scalability
Pipelines

Keywords

  • Analytics
  • Hadoop
  • Linear algebra
  • MapReduce
  • Matrix inversion

ASJC Scopus subject areas

  • Software

Cite this

Xiang, J., Meng, H., & Aboulnaga, A. (2014). Scalable Matrix inversion using MapReduce. In HPDC 2014 - Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing (pp. 177-189). Association for Computing Machinery. https://doi.org/10.1145/2600212.2600220
