Flying memcache: Lessons learned from different acceleration strategies

Dimitris Deyannis, Lazaros Koromilas, Giorgos Vasiliadis, Elias Athanasopoulos, Sotiris Ioannidis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Distributed, always-in-memory key-value stores are employed by large and demanding services, such as Facebook and Amazon. It is apparent that generic implementations of such caches cannot meet the needs of every application, so further research on optimizing or speeding up cache operations is required. In this paper, we present an incremental optimization strategy for accelerating the most popular key-value store, namely memcached. First, we accelerate the computational unit by utilizing commodity GPUs, which offer a significant performance increase on the CPU-bound part of memcached, but only a moderate increase under intensive I/O. We then improve I/O performance by replacing TCP with a fast user-space UDP implementation. Putting it all together, GPUs for computation instead of CPUs and UDP for communication instead of TCP, we experimentally achieve 20 Gbps line-rate, significantly outperforming the original implementation of memcached.
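The I/O side of the approach above builds on memcached's standard UDP protocol, which prepends an 8-byte frame header to each datagram carrying the usual ASCII commands. The sketch below (a minimal Python illustration of that standard framing, not the paper's user-space UDP implementation, whose internals are not described here) frames a single-datagram `get` request:

```python
import struct

def frame_udp_request(request_id: int, command: bytes) -> bytes:
    """Prepend memcached's 8-byte UDP frame header to an ASCII command.

    Header layout (network byte order, per memcached's protocol.txt):
    request id (2 bytes), sequence number (2), total datagrams (2),
    reserved (2, must be zero). A request that fits in one datagram
    uses sequence number 0 and datagram count 1.
    """
    header = struct.pack("!HHHH", request_id, 0, 1, 0)
    return header + command

# Frame a plain-text "get" for a (hypothetical) key "user:42".
frame = frame_udp_request(0x1234, b"get user:42\r\n")
```

A client would then send `frame` to the server's UDP port (11211 by default) with an ordinary `sendto()`; the server echoes the request ID back in its reply so responses can be matched to requests.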

Original language: English
Title of host publication: Proceedings - IEEE 26th International Symposium
Publisher: IEEE Computer Society
Pages: 25-32
Number of pages: 8
ISBN (Electronic): 9781479969043
DOIs: 10.1109/SBAC-PAD.2014.17
Publication status: Published - 1 Dec 2014
Externally published: Yes
Event: 26th International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2014 - Paris, France
Duration: 22 Oct 2014 - 24 Oct 2014


Fingerprint

  • Program processors
  • Data storage equipment
  • Communication
  • Graphics processing unit

ASJC Scopus subject areas

  • Hardware and Architecture
  • Software

Cite this

Deyannis, D., Koromilas, L., Vasiliadis, G., Athanasopoulos, E., & Ioannidis, S. (2014). Flying memcache: Lessons learned from different acceleration strategies. In Proceedings - IEEE 26th International Symposium (pp. 25-32). [6970643] IEEE Computer Society. https://doi.org/10.1109/SBAC-PAD.2014.17

@inproceedings{b78654242f0e4ea8a9c34cc566c98ded,
    title = "Flying memcache: Lessons learned from different acceleration strategies",
    abstract = "Distributed, always-in-memory key-value stores are employed by large and demanding services, such as Facebook and Amazon. It is apparent that generic implementations of such caches cannot meet the needs of every application, so further research on optimizing or speeding up cache operations is required. In this paper, we present an incremental optimization strategy for accelerating the most popular key-value store, namely memcached. First, we accelerate the computational unit by utilizing commodity GPUs, which offer a significant performance increase on the CPU-bound part of memcached, but only a moderate increase under intensive I/O. We then improve I/O performance by replacing TCP with a fast user-space UDP implementation. Putting it all together, GPUs for computation instead of CPUs and UDP for communication instead of TCP, we experimentally achieve 20 Gbps line-rate, significantly outperforming the original implementation of memcached.",
    author = "Dimitris Deyannis and Lazaros Koromilas and Giorgos Vasiliadis and Elias Athanasopoulos and Sotiris Ioannidis",
    year = "2014",
    month = "12",
    day = "1",
    doi = "10.1109/SBAC-PAD.2014.17",
    language = "English",
    pages = "25--32",
    booktitle = "Proceedings - IEEE 26th International Symposium",
    publisher = "IEEE Computer Society",
}

TY - GEN

T1 - Flying memcache

T2 - Lessons learned from different acceleration strategies

AU - Deyannis, Dimitris

AU - Koromilas, Lazaros

AU - Vasiliadis, Giorgos

AU - Athanasopoulos, Elias

AU - Ioannidis, Sotiris

PY - 2014/12/1

Y1 - 2014/12/1

N2 - Distributed, always-in-memory key-value stores are employed by large and demanding services, such as Facebook and Amazon. It is apparent that generic implementations of such caches cannot meet the needs of every application, so further research on optimizing or speeding up cache operations is required. In this paper, we present an incremental optimization strategy for accelerating the most popular key-value store, namely memcached. First, we accelerate the computational unit by utilizing commodity GPUs, which offer a significant performance increase on the CPU-bound part of memcached, but only a moderate increase under intensive I/O. We then improve I/O performance by replacing TCP with a fast user-space UDP implementation. Putting it all together, GPUs for computation instead of CPUs and UDP for communication instead of TCP, we experimentally achieve 20 Gbps line-rate, significantly outperforming the original implementation of memcached.

AB - Distributed, always-in-memory key-value stores are employed by large and demanding services, such as Facebook and Amazon. It is apparent that generic implementations of such caches cannot meet the needs of every application, so further research on optimizing or speeding up cache operations is required. In this paper, we present an incremental optimization strategy for accelerating the most popular key-value store, namely memcached. First, we accelerate the computational unit by utilizing commodity GPUs, which offer a significant performance increase on the CPU-bound part of memcached, but only a moderate increase under intensive I/O. We then improve I/O performance by replacing TCP with a fast user-space UDP implementation. Putting it all together, GPUs for computation instead of CPUs and UDP for communication instead of TCP, we experimentally achieve 20 Gbps line-rate, significantly outperforming the original implementation of memcached.

UR - http://www.scopus.com/inward/record.url?scp=84919442587&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84919442587&partnerID=8YFLogxK

U2 - 10.1109/SBAC-PAD.2014.17

DO - 10.1109/SBAC-PAD.2014.17

M3 - Conference contribution

AN - SCOPUS:84919442587

SP - 25

EP - 32

BT - Proceedings - IEEE 26th International Symposium

PB - IEEE Computer Society

ER -