Towards transparent hardening of distributed systems

Diogo Behrens, Christof Fetzer, Flavio P. Junqueira, Marco Serafini

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

In distributed systems, errors such as data corruption or arbitrary changes to the flow of programs might cause processes to propagate incorrect state across the system. To prevent error propagation in such systems, an efficient and effective technique is to harden processes against Arbitrary State Corruption (ASC) faults through local detection, without replication. For distributed systems designed from scratch, dealing with state corruption can be made fully transparent, but requires that developers follow a few concrete design patterns. In this paper, we discuss the problem of hardening existing code bases of distributed systems transparently. Existing systems have not been designed with ASC hardening in mind, so they do not necessarily follow required design patterns. For such systems, we focus here on both performance and number of changes to the existing code base. Using memcached as an example, we identify and discuss three areas of improvement: reducing the memory overhead, improving access to state variables, and supporting multi-threading. Our initial evaluation of memcached shows that our ASC-hardened version obtains a throughput that is roughly 76% of the throughput of stock memcached with 128-byte and 1k-byte messages.

Original language: English
Title of host publication: Proceedings of the 9th Workshop on Hot Topics in Dependable Systems, HotDep 2013
Publisher: Association for Computing Machinery
DOIs: https://doi.org/10.1145/2524224.2524230
Publication status: Published - 1 Jan 2013
Event: 9th Workshop on Hot Topics in Dependable Systems, HotDep 2013 - Farmington, PA, United States
Duration: 3 Nov 2013 → 3 Nov 2013

Other

Other: 9th Workshop on Hot Topics in Dependable Systems, HotDep 2013
Country: United States
City: Farmington, PA
Period: 3/11/13 → 3/11/13

Keywords

  • data corruption
  • distributed systems
  • fault-tolerance

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Hardware and Architecture

Cite this

Behrens, D., Fetzer, C., Junqueira, F. P., & Serafini, M. (2013). Towards transparent hardening of distributed systems. In Proceedings of the 9th Workshop on Hot Topics in Dependable Systems, HotDep 2013. Association for Computing Machinery. https://doi.org/10.1145/2524224.2524230

Towards transparent hardening of distributed systems. / Behrens, Diogo; Fetzer, Christof; Junqueira, Flavio P.; Serafini, Marco.

Proceedings of the 9th Workshop on Hot Topics in Dependable Systems, HotDep 2013. Association for Computing Machinery, 2013.

Behrens, D, Fetzer, C, Junqueira, FP & Serafini, M 2013, Towards transparent hardening of distributed systems. in Proceedings of the 9th Workshop on Hot Topics in Dependable Systems, HotDep 2013. Association for Computing Machinery, 9th Workshop on Hot Topics in Dependable Systems, HotDep 2013, Farmington, PA, United States, 3/11/13. https://doi.org/10.1145/2524224.2524230
Behrens D, Fetzer C, Junqueira FP, Serafini M. Towards transparent hardening of distributed systems. In Proceedings of the 9th Workshop on Hot Topics in Dependable Systems, HotDep 2013. Association for Computing Machinery. 2013. https://doi.org/10.1145/2524224.2524230
Behrens, Diogo ; Fetzer, Christof ; Junqueira, Flavio P. ; Serafini, Marco. / Towards transparent hardening of distributed systems. Proceedings of the 9th Workshop on Hot Topics in Dependable Systems, HotDep 2013. Association for Computing Machinery, 2013.
@inproceedings{a295c14e73c04d45b667d6ceff35470d,
title = "Towards transparent hardening of distributed systems",
abstract = "In distributed systems, errors such as data corruption or arbitrary changes to the flow of programs might cause processes to propagate incorrect state across the system. To prevent error propagation in such systems, an efficient and effective technique is to harden processes against Arbitrary State Corruption (ASC) faults through local detection, without replication. For distributed systems designed from scratch, dealing with state corruption can be made fully transparent, but requires that developers follow a few concrete design patterns. In this paper, we discuss the problem of hardening existing code bases of distributed systems transparently. Existing systems have not been designed with ASC hardening in mind, so they do not necessarily follow required design patterns. For such systems, we focus here on both performance and number of changes to the existing code base. Using memcached as an example, we identify and discuss three areas of improvement: reducing the memory overhead, improving access to state variables, and supporting multi-threading. Our initial evaluation of memcached shows that our ASC-hardened version obtains a throughput that is roughly 76{\%} of the throughput of stock memcached with 128-byte and 1k-byte messages.",
keywords = "data corruption, distributed systems, fault-tolerance",
author = "Diogo Behrens and Christof Fetzer and Junqueira, {Flavio P.} and Marco Serafini",
year = "2013",
month = "1",
day = "1",
doi = "10.1145/2524224.2524230",
language = "English",
booktitle = "Proceedings of the 9th Workshop on Hot Topics in Dependable Systems, HotDep 2013",
publisher = "Association for Computing Machinery",
}

TY - GEN
T1 - Towards transparent hardening of distributed systems
AU - Behrens, Diogo
AU - Fetzer, Christof
AU - Junqueira, Flavio P.
AU - Serafini, Marco
PY - 2013/1/1
Y1 - 2013/1/1

AB - In distributed systems, errors such as data corruption or arbitrary changes to the flow of programs might cause processes to propagate incorrect state across the system. To prevent error propagation in such systems, an efficient and effective technique is to harden processes against Arbitrary State Corruption (ASC) faults through local detection, without replication. For distributed systems designed from scratch, dealing with state corruption can be made fully transparent, but requires that developers follow a few concrete design patterns. In this paper, we discuss the problem of hardening existing code bases of distributed systems transparently. Existing systems have not been designed with ASC hardening in mind, so they do not necessarily follow required design patterns. For such systems, we focus here on both performance and number of changes to the existing code base. Using memcached as an example, we identify and discuss three areas of improvement: reducing the memory overhead, improving access to state variables, and supporting multi-threading. Our initial evaluation of memcached shows that our ASC-hardened version obtains a throughput that is roughly 76% of the throughput of stock memcached with 128-byte and 1k-byte messages.

KW - data corruption
KW - distributed systems
KW - fault-tolerance
UR - http://www.scopus.com/inward/record.url?scp=84897369407&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84897369407&partnerID=8YFLogxK
U2 - 10.1145/2524224.2524230
DO - 10.1145/2524224.2524230
M3 - Conference contribution
AN - SCOPUS:84897369407
BT - Proceedings of the 9th Workshop on Hot Topics in Dependable Systems, HotDep 2013
PB - Association for Computing Machinery
ER -