### Abstract

Matrix operations are a fundamental building block of many computational tasks in fields as diverse as scientific computing, machine learning, and data mining. Matrix inversion is an important matrix operation, but it is difficult to implement in today's popular parallel dataflow programming systems, such as MapReduce. The reason is that each element in the inverse of a matrix depends on multiple elements in the input matrix, so the computation is not easily partitionable. In this paper, we present a scalable and efficient technique for matrix inversion in MapReduce. Our technique relies on computing the LU decomposition of the input matrix and using that decomposition to compute the required matrix inverse. We present a technique for computing the LU decomposition and the matrix inverse using a pipeline of MapReduce jobs. We also present optimizations of this technique in the context of Hadoop. To the best of our knowledge, our technique is the first matrix inversion technique using MapReduce. We show experimentally that our technique has good scalability, enabling us to invert a $10^{5} \times 10^{5}$ matrix in 5 hours on Amazon EC2. We also show that our technique outperforms ScaLAPACK, a state-of-the-art linear algebra package that uses MPI.
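The idea of inverting a matrix through its LU decomposition can be illustrated on a single machine. The sketch below is not the paper's MapReduce pipeline, which the abstract does not detail. It is a minimal pure-Python illustration, assuming a small dense matrix held in memory and using partial pivoting (an assumption made here for numerical safety). The function names `lu_decompose` and `invert` are hypothetical. It factors the input as PA = LU and then solves one triangular system pair per column of the identity matrix.

```python
def lu_decompose(a):
    """Factor a square matrix so that P*A = L*U (partial pivoting).

    Returns (perm, lower, upper), where perm[i] is the index of the
    original row that ends up in row i after the row swaps.
    """
    n = len(a)
    upper = [[float(v) for v in row] for row in a]  # working copy, becomes U
    lower = [[0.0] * n for _ in range(n)]
    perm = list(range(n))
    for k in range(n):
        # Choose the row with the largest absolute entry in column k.
        pivot = max(range(k, n), key=lambda r: abs(upper[r][k]))
        if upper[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        if pivot != k:
            upper[k], upper[pivot] = upper[pivot], upper[k]
            lower[k], lower[pivot] = lower[pivot], lower[k]
            perm[k], perm[pivot] = perm[pivot], perm[k]
        lower[k][k] = 1.0
        # Eliminate the entries below the pivot and record the multipliers.
        for i in range(k + 1, n):
            factor = upper[i][k] / upper[k][k]
            lower[i][k] = factor
            for j in range(k, n):
                upper[i][j] -= factor * upper[k][j]
    return perm, lower, upper


def invert(a):
    """Invert a square matrix by solving L*U*x = P*e_col for every column."""
    n = len(a)
    perm, lower, upper = lu_decompose(a)
    inverse = [[0.0] * n for _ in range(n)]
    for col in range(n):
        # Forward substitution: solve L*y = P*e_col.
        y = [0.0] * n
        for i in range(n):
            rhs = 1.0 if perm[i] == col else 0.0
            y[i] = rhs - sum(lower[i][m] * y[m] for m in range(i))
        # Back substitution: solve U*x = y.
        x = [0.0] * n
        for i in reversed(range(n)):
            s = sum(upper[i][m] * x[m] for m in range(i + 1, n))
            x[i] = (y[i] - s) / upper[i][i]
        for i in range(n):
            inverse[i][col] = x[i]
    return inverse


if __name__ == "__main__":
    for row in invert([[4.0, 3.0], [6.0, 3.0]]):
        print(row)
```

Once L and U are known, the column solves in `invert` do not depend on one another, whereas the elimination loop in `lu_decompose` proceeds pivot by pivot. How the paper distributes either stage across MapReduce jobs is not stated in the abstract.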

Original language | English |
---|---|
Title of host publication | HPDC 2014 - Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing |
Publisher | Association for Computing Machinery |
Pages | 177-189 |
Number of pages | 13 |
ISBN (Print) | 9781450327480 |
DOIs | https://doi.org/10.1145/2600212.2600220 |
Publication status | Published - 1 Jan 2014 |
Event | 23rd ACM Symposium on High-Performance Parallel and Distributed Computing, HPDC 2014 - Vancouver, BC, Canada. Duration: 23 Jun 2014 → 27 Jun 2014 |

### Other

Other | 23rd ACM Symposium on High-Performance Parallel and Distributed Computing, HPDC 2014 |
---|---|
Country | Canada |
City | Vancouver, BC |
Period | 23/6/14 → 27/6/14 |

### Keywords

- Analytics
- Hadoop
- Linear algebra
- MapReduce
- Matrix inversion

### ASJC Scopus subject areas

- Software

### Cite this

Xiang, J., Meng, H., & Aboulnaga, A. (2014). Scalable matrix inversion using MapReduce. In *HPDC 2014 - Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing* (pp. 177-189). Association for Computing Machinery. https://doi.org/10.1145/2600212.2600220

**Scalable Matrix inversion using MapReduce.** / Xiang, Jingen; Meng, Huangdong; Aboulnaga, Ashraf.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

*HPDC 2014 - Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing.* Association for Computing Machinery, pp. 177-189, 23rd ACM Symposium on High-Performance Parallel and Distributed Computing, HPDC 2014, Vancouver, BC, Canada, 23/6/14. https://doi.org/10.1145/2600212.2600220

TY - GEN

T1 - Scalable Matrix inversion using MapReduce

AU - Xiang, Jingen

AU - Meng, Huangdong

AU - Aboulnaga, Ashraf

PY - 2014/1/1

Y1 - 2014/1/1

N2 - Matrix operations are a fundamental building block of many computational tasks in fields as diverse as scientific computing, machine learning, and data mining. Matrix inversion is an important matrix operation, but it is difficult to implement in today's popular parallel dataflow programming systems, such as MapReduce. The reason is that each element in the inverse of a matrix depends on multiple elements in the input matrix, so the computation is not easily partitionable. In this paper, we present a scalable and efficient technique for matrix inversion in MapReduce. Our technique relies on computing the LU decomposition of the input matrix and using that decomposition to compute the required matrix inverse. We present a technique for computing the LU decomposition and the matrix inverse using a pipeline of MapReduce jobs. We also present optimizations of this technique in the context of Hadoop. To the best of our knowledge, our technique is the first matrix inversion technique using MapReduce. We show experimentally that our technique has good scalability, enabling us to invert a 10^5 × 10^5 matrix in 5 hours on Amazon EC2. We also show that our technique outperforms ScaLAPACK, a state-of-the-art linear algebra package that uses MPI.

KW - Analytics

KW - Hadoop

KW - Linear algebra

KW - MapReduce

KW - Matrix inversion

UR - http://www.scopus.com/inward/record.url?scp=84904409818&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84904409818&partnerID=8YFLogxK

U2 - 10.1145/2600212.2600220

DO - 10.1145/2600212.2600220

M3 - Conference contribution

AN - SCOPUS:84904409818

SN - 9781450327480

SP - 177

EP - 189

BT - HPDC 2014 - Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing

PB - Association for Computing Machinery

ER -