⚠️ Profile page is outdated

This person has left the Institute of Distributed Systems and is no longer working at Ulm University. The profile page will not be updated anymore but is still available for archiving.

Contact

Office Hours

Please send me an email if you need an appointment.

Gerhard Habiger

Gerhard Habiger received his MSc in Computer Science from Ulm University, Germany, in 2015.
Since then he has been working as a research and teaching assistant in the Institute of Distributed Systems at Ulm University, while pursuing his PhD.


His research focuses on advancing state machine replication (SMR) systems and deterministic scheduling, and on optimizing the performance of these systems with techniques such as reinforcement learning.

Research Interests

  • Distributed Systems
    • Parallelized state machine replication
    • Deterministic schedulers
    • Self-optimization of parallelized SMR
      • Scheduler reconfiguration
      • Resource efficiency
  • Reinforcement Learning for online optimization problems

Publications

2021

16.
C. Berger, P. Eichhammer, H. P. Reiser, J. Domaschka, F. J. Hauck and G. Habiger, "A survey on resilience in IoT: Taxonomy, classification and discussion of resilience mechanisms", ACM Comp. Surv., vol. 54, no. 7, Jun. 2021.
DOI:10.1145/3462513
15.
J. Köstler, H. P. Reiser, G. Habiger and F. J. Hauck, "SmartStream: Towards Byzantine Resilient Data Streaming" in 36th Ann. ACM Symp. on Appl. Comp. (SAC), New York, NY, USA: ACM, Mar. 2021, pp. 213–222.
DOI:10.1145/3412841.3441904


Abstract:
Data streaming platforms connect heterogeneous services through the publish-subscribe paradigm. Currently available platforms provide protection against crash faults, but are not resistant against Byzantine faults like arbitrary hardware faults and intrusions. State machine replication can provide this protection, but the higher resource requirements and the more elaborate communication primitives usually result in a higher overall complexity and a non-negligible performance degradation. This is especially true for data streaming if the default textbook approach of integrating the service into a replicated state machine is followed without further adaptations. The standard state management with state logs and snapshots and without any partitioning scheme limits both performance and scalability to the point that those systems become unusable in practice. That is why we propose SmartStream, a topic-based Byzantine fault-tolerant data streaming platform that harmonizes the competing concepts of both systems and leverages the specific characteristics of data streaming, namely the append-only semantics of the application state and its partitionable structure. We show its effectiveness in a prototype implementation and evaluate its performance. The evaluation results show a moderate drop in system throughput when compared to state-of-the-art data streaming platforms like Apache Kafka, but reasonable overall performance considering the stronger resilience guarantees.
14.
J. Köstler, H. P. Reiser, G. Habiger and F. J. Hauck, "SmartStream: Towards Efficient Byzantine Resilient Data Streaming through Speculation and Sharding", SIGAPP Appl. Comput. Rev., vol. 21, no. 3, pp. 19–32, Oct. 2021. ACM.
DOI:10.1145/3493499.3493501


Abstract:
Data streaming platforms connect heterogeneous services through the publish-subscribe paradigm. Currently available platforms provide protection against crash faults, but are not resistant against Byzantine faults like arbitrary hardware faults and intrusions. State machine replication can provide this protection, but the higher resource requirements and the more elaborate communication primitives usually result in a higher overall complexity and a non-negligible performance degradation. As data streaming operates on highly-partitionable append-only state, some of these performance losses can be counteracted by applying speculative execution and sharding. We show the effectiveness of these concepts in a prototype implementation, which only results in a reasonable drop in system throughput and latency during average system utilization, when compared to state-of-the-art data streaming platforms like Apache Kafka, while providing stronger resilience guarantees.

2020

12.
G. Habiger, F. J. Hauck, H. P. Reiser and J. Köstler, "Self-optimising application-agnostic multithreading for replicated state machines" in Proc. of the 39th Int. Symp. on Rel. Distr. Sys. (SRDS), 2020.
DOI:10.1109/SRDS51746.2020.00024

2019

11.
J. Domaschka, C. Berger, H. P. Reiser, P. Eichhammer, F. Griesinger, J. Pietron, M. Tichy, F. J. Hauck and G. Habiger, "SORRIR: a resilient self-organizing middleware for IoT applications" in Proc. of 6th Int. Worksh. on Middlew. and App. for the Internet of Things (M4IoT), Davis, CA, Dec. 2019, pp. 13-16.
DOI:10.1145/3366610.3368098
10.
G. Habiger and F. J. Hauck, "Systems support for efficient state-machine replication" in Tagungsband des FB-SYS Herbsttreffens 2019, Osnabrück, GI, 2019.
DOI:10.18420/fbsys2019-04
9.
P. Eichhammer, C. Berger, H. P. Reiser, J. Domaschka, F. J. Hauck, G. Habiger, F. Griesinger and J. Pietron, "Towards a robust, self-organizing IoT platform for secure and dependable service execution" in Tagungsband des FB-SYS Herbsttreffens 2019, Osnabrück, GI, 2019.
DOI:10.18420/fbsys2019-03

2018

8.
G. Habiger, F. J. Hauck, J. Köstler and H. P. Reiser, "Resource-Efficient State-Machine Replication with Multithreading and Vertical Scaling" in Proc. of the 14th Eur. Dep. Comp. Conf. (EDCC), Iaşi, Romania, IEEE, Sep. 2018.
DOI:10.1109/EDCC.2018.00024


Abstract:
State-machine replication (SMR) enables transparent and delayless masking of node faults. It can tolerate crash faults and malicious misbehavior, but usually comes with high resource costs, not only by requiring multiple active replicas, but also by providing the replicas with enough resources for the expected peak load. This paper presents a vertical resource-scaling solution for SMR systems in virtualized environments, which can dynamically adapt the number of available cores to current load. In similar approaches, benefits of CPU core scaling are usually small due to the inherent sequential execution of SMR systems in order to achieve determinism. In our approach, we utilize sophisticated deterministic multithreading to avoid this bottleneck and experimentally demonstrate that core scaling then allows SMR systems to effectively tailor resources to service load, dramatically reducing service provider costs.

2017

7.
B. Erb, D. Meißner, G. Habiger, J. Pietron and F. Kargl, "Consistent Retrospective Snapshots in Distributed Event-sourced Systems" in Proc. of the Int. Conf. on Netw. Sys. (NetSys), Göttingen, Mar. 2017.
DOI:10.1109/NetSys.2017.7903947


Abstract:
An increasing number of distributed, event-based systems adopt an architectural style called event sourcing, in which entities keep their entire history in an event log. Event sourcing enables data lineage and allows entities to rebuild any previous state. Restoring previous application states is a straightforward task in event-sourced systems with a global and totally ordered event log. However, the extraction of causally consistent snapshots from distributed, individual event logs is rendered non-trivial due to causal relationships between communicating entities. High dynamicity of entities increases the complexity of such reconstructions even more. We present approaches for retrospective and global state extraction of event-sourced applications based on distributed event logs. We provide an overview on historical approaches towards distributed debugging and breakpointing, which are closely related to event log-based state reconstruction. We then introduce and evaluate our approach for non-local state extraction from distributed event logs, which is specifically adapted for dynamic and asynchronous event-sourced systems.

2016

6.
F. J. Hauck, G. Habiger and J. Domaschka, "UDS: a novel and flexible scheduling algorithm for deterministic multithreading" in Proc. of the 35th Int. Symp. on Reliable Distrib. Sys. (SRDS), Budapest, Hungary, 2016-09-26, Sep. 2016.
DOI:10.1109/SRDS.2016.030
5.
B. Erb, G. Habiger and F. J. Hauck, "On the Potential of Event Sourcing for Retroactive Actor-based Programming" in Proc. of the 1st Workshop on Progr. Models and Lang. for Distrib. Comp., Rome, Italy, 2016-07-17, Jul. 2016.
DOI:10.1145/2957319.2957378


Abstract:
The actor model is an established programming model for distributed applications. Combining event sourcing with the actor model allows the reconstruction of previous states of an actor. When this event sourcing approach for actors is enhanced with additional causality information, novel types of actor-based, retroactive computations are possible. A globally consistent state of all actors can be reconstructed retrospectively. Even retroactive changes of actor behavior, state, or messaging are possible, with partial recomputations and projections of changes in the past. We believe that this approach may provide beneficial features to actor-based systems, including retroactive bugfixing of applications, decoupled asynchronous global state reconstruction for recovery, simulations, and exploration of distributed applications and algorithms.
4.
G. Habiger, "Implementation of asynchronous request handling in BFT SMaRt", Institute of Distributed Systems, 2016.

Abstract:
Current research efforts of our institute include a project on deterministic scheduling of multithreaded applications for State Machine Replication (SMR) systems with Byzantine Fault Tolerance (BFT). One part of this project aims to integrate our own work on deterministic scheduling with the BFT SMaRt library. Currently, BFT SMaRt only supports synchronous request-response patterns, whereas our planned SMR platform needs these patterns to be asynchronous. The goals of this project are (i) to analyze the existing BFT SMaRt codebase, (ii) to implement the necessary interfaces for asynchronous request handling and (iii) to integrate these changes into the existing BFT SMaRt libraries.
3.
G. Habiger, F. J. Hauck, J. Köstler and H. P. Reiser, "Vertikale Skalierung für aktiv replizierte Dienste in Cloud-Infrastrukturen", 2016.
File (PDF): https://www.uni-ulm.de/fileadmin/website_uni_ulm/iui.inst.200/files/publikationen/Habiger16.pdf

2015

2.
G. Habiger, "Distributed Versioning and Snapshot Mechanisms on Event-Sourced Graphs", Master's thesis VS-M13-2015, Institute of Distributed Systems, Ulm University, Oct. 2015.

Abstract:
Two interesting approaches to tackle many of today's problems in large scale data processing and live query resolution on big graph datasets have emerged in recent years. Firstly, after Google's presentation of its graph computing platform Pregel in 2010, an influx of more or less similar platforms could be observed. These platforms all share the goal of providing highly performant data mining and analysis capabilities to users, enabling a wide variety of today's technologies like ranking web pages in the web graph of the WWW or analysing user interactions in social networks. Secondly, the old concept of message logging for failure recovery was rediscovered and combined with event based computing in the early 2000s and is now known as event sourcing. This approach to system design keeps persistent logs of every single change of all entities in a computation, providing highly interesting options like state restoration by replaying old events, retroactive event modifications, phenomenal debugging capabilities and many more. A recently published paper suggests the merging of those two approaches to create a hybrid event-sourced graph computing platform. This platform would show unique characteristics compared to other known solutions. For example, computations on temporal data can yield information about the evolution of a graph and not only its current state. Furthermore, for backups or to enable offline analysis on large compute clusters, snapshot extraction – i.e. reproducing any consistent global state the graph has ever been in – from the event logs produced by event-sourced graph computations is possible. This thesis provides one of the first major works related to this proposed hybrid platform and provides background knowledge related to these aforementioned topics.
It presents a thorough overview of the current state of the art in graph computing platforms and causality tracking in distributed systems and finally develops an efficient mechanism for extracting arbitrary, consistent global snapshots from a distributed event log produced by an event-sourced graph computation.

2012

1.
G. Habiger, "Security and Privacy of Implantable Medical Devices", Bachelor's thesis, Institute of Media Informatics, Ulm University, May 2012.

Abstract:
The high demand and growing market for Implantable Medical Devices (IMDs) shows a widespread need for invisible and unobtrusive treatment of medical conditions such as diabetes or cardiac arrhythmia. The advancements of technology in this field make devices increasingly inter-connected, allowing them to communicate wirelessly with sensors, medical telemetry systems or device programmers. However, the increased complexity and the fact that many medical devices nowadays can be programmed and controlled via wireless links bring with them a plethora of vulnerabilities. Adversaries capable of imitating authorized device programmers could gain control over IMDs, leading to serious injury or even death of their users. Other attacks could target a patient’s private medical data. This thesis strives to give an overview of the current state of research and recent developments in the field of IMD security and privacy. It will discuss known vulnerabilities and possible defensive measures and evaluate the current risks involved with using a modern IMD. Based on these discussions, design concerns for IMD manufacturers are then summarized.

Teaching

Exercises for Lectures

  • Fault Tolerant Distributed Systems - FTDS (English) [SuSe 2021]
  • Multimedia Communication - MMK (German) [WiSe 20/21]
  • Fault Tolerant Distributed Systems - FTDS (English) [SuSe 2020]
  • Multimedia Communication - MMK (German) [WiSe 19/20]
  • Fault Tolerant Distributed Systems - FTDS (English) [SuSe 2019]
  • Multimedia Communication - MMK (German) [WiSe 18/19]
  • Fault Tolerant Distributed Systems - FTDS (English) [SuSe 2018]
  • Multimedia Communication - MMK (German) [WiSe 17/18]
  • Fault Tolerant Distributed Systems - FTDS (English) [SuSe 2017]
  • Architectures for Distributed Internet Services - AvID (German) [SuSe 2017]
  • Multimedia Communication - MMK (German) [WiSe 16/17]
  • Architectures for Distributed Internet Services - AvID (German) [SuSe 2016]

Seminars

  • Research Trends in Distributed Systems - RTDS [WiSe 21/22]
  • Proseminar Effective Java - KTT [WiSe 21/22]
  • Research Trends in Distributed Systems - RTDS [SuSe 2021]
  • Research Trends in Distributed Systems - RTDS [WiSe 20/21]
  • Research Trends in Distributed Systems - RTDS [SuSe 2020]
  • Research Trends in Distributed Systems - RTDS [WiSe 19/20]
  • Research Trends in Distributed Systems - RTDS [SuSe 2019]
  • Research Trends in Distributed Systems - RTDS [WiSe 18/19]
  • Research Trends in Distributed Systems - RTDS [SuSe 2018]
  • Research Trends in Distributed Systems - RTDS [WiSe 17/18]
  • Research Trends in Distributed Systems - RTDS [SuSe 2017]
  • Research Trends in Distributed Systems - RTDS [WiSe 16/17]
  • Research Trends in Distributed Systems - RTDS [SuSe 2016]
  • Proseminar Effective Java - KTT (German) [SuSe 2016]

Lab Projects

  • Development of Middleware Systems - MWE [SuSe 2018]
  • Development of Middleware Systems - MWE [WiSe 17/18]
  • Development of Middleware Systems - MWE [SuSe 2017]
  • Development of Middleware Systems - MWE [WiSe 16/17]
  • Development of Middleware Systems - MWE [SuSe 2016]

 

Theses and Projects

M. Kempfle, "Consensus replacement in a modular state-machine replication framework: trial and evaluation," Master's thesis VS-2021-11M, G. Habiger (supervisor), F. J. Hauck (examiner), Inst. of Distr. Sys., Ulm Univ., 2021 – completed.
U. Eser, "Benchmarking of BFT-SMaRt with YCSB," Project VS-2021-22P, G. Habiger (supervisor), F. J. Hauck (examiner), Inst. of Distr. Sys., Ulm Univ., 2021 – completed.
The YCSB is an open source benchmarking specification and framework for evaluating the performance of database-like software. Since its release in 2010, it has evolved into a de facto standard for benchmarking commercial products like Redis, HBase, Cassandra and many others. Not only in industry, but also in the scientific community, many researchers are using the YCSB to evaluate and compare their scientific findings and software artifacts against other published solutions. This project should create a YCSB client implementation and workloads for benchmarking the platform for replicated state machines built within our institute in recent years. State-machine replication is a technique for providing high levels of fault tolerance. In research projects we extended the existing BFT-SMaRt framework for our use. In the future we would like to use the results of this project to evaluate performance changes when extending the framework further. Students with previous knowledge in these areas are preferred, but the necessary skills can also be acquired during the project. At the end of the project, a thorough comparison of the newly YCSB-enabled software artifacts should be conducted.
M. Benz, "Modular State Machine Replication," Master's thesis, G. Habiger (supervisor), F. J. Hauck (examiner), Inst. of Distr. Sys., Ulm Univ., 2019 – completed.
M. Kempfle, "Integration of etcd4j and BFT-SMaRt Parallel," Project, G. Habiger (supervisor), F. J. Hauck (examiner), Inst. of Distr. Sys., Ulm Univ., 2019 – completed.
In our recent research, teaching, and project work, we re-implemented etcd – a popular and well-known fault-tolerant key-value store – in Java, resulting in a multithreaded version that is easier to integrate into our research prototypes. Additionally, recent projects have looked at the State Machine Replication framework BFT-SMaRt, especially our own parallelized version of it, and worked on a way to enable snapshotting functionality. This project aims at integrating these two prototypes – etcd4J and BFT-SMaRt Parallel – into one working project, to yield a fully working, state machine replicated and fault-tolerant version of etcd4J. Further work includes testing and benchmarking this solution. The project can be modified to fit 8 or 16 ECTS.
T. Nguyen, "Parallelizing a Java Re-implementation of etcd," Bachelor's thesis, G. Habiger (supervisor), F. J. Hauck (examiner), Inst. of Distr. Sys., Ulm Univ., 2018 – completed.
A recently completed student project re-implemented the distributed key-value store etcd in Java. So that this implementation can be reused in future research on fault-tolerant systems, this project is to parallelize the Java implementation. Intelligent locking in the underlying data structure should achieve the highest possible degree of parallelism while preserving the correctness of the system in all cases. Afterwards, measurements should show the performance changes compared to the sequential variant.
M. Benz, "Enabling Snapshotting in Multithreaded BFT-SMaRt," Project, G. Habiger (supervisor), F. J. Hauck (examiner), Inst. of Distr. Sys., Ulm Univ., 2018 – completed.
BFT-SMaRt is a Java library for the easy development of applications that, through State Machine Replication, run in a fail-safe manner and are even robust against arbitrary (Byzantine) faults. The goal of our research is to speed up State Machine Replication, for which we have extended BFT-SMaRt with multithreading components over the past months. One problem is that snapshotting, which is strictly necessary for fault tolerance, becomes considerably more difficult and currently has to remain disabled for our optimizations. Building on previous projects, this project should explore ways to re-enable snapshotting in combination with multithreading in BFT-SMaRt, and provide implementations and measurements of the approaches found.
O. Finnendahl, "Enabling Snapshotting in Multithreaded BFT-SMaRt," G. Habiger (supervisor), F. J. Hauck (examiner), Inst. of Distr. Sys., Ulm Univ., 2018 – completed.
Coordination services such as ZooKeeper or etcd are typically used to manage large distributed systems. To ensure maximum availability while still providing strong consistency guarantees, such coordination services are replicated using State Machine Replication. One problem with this replication approach is its mostly poor utilization of today's multicore systems. Our research deals with methods to speed up SMR-replicated software and to allow multithreading. Among other things, comparative measurements with etcd are planned for this purpose. Since etcd itself is written entirely in Go, while our research currently focuses on Java, this project should re-implement in Java the externally visible API of etcd and the functionality it requires. Almost all difficulties arising from distribution (network communication and failures, consensus, etc.) can be neglected. Primarily, the functionality that a non-distributed, single-host installation of etcd would provide is required.
M. Jäckle and C. Vogel, "Provisioning, Monitoring and Snapshotting of BFT-SMaRt," G. Habiger (supervisor), F. J. Hauck (examiner), Inst. of Distr. Sys., Ulm Univ., 2017 – completed.
This project deals with the implementation of a platform to support the further development of the BFT-SMaRt Java library. In addition, a currently disabled feature, called checkpointing, should be examined closely and maybe reimplemented in the parallel version of BFT-SMaRt which uses UDS. The first part resulted in a feature-rich platform that encompasses automatic deployment and provisioning as well as live-monitoring capabilities for application related metrics. Parallel checkpointing is not working yet, but was researched extensively and some base work was done to facilitate future developers entering the project. In detail, an extended documentation for BFT-SMaRt was created and several approaches were discussed.
P. Butz, "Implementation, Deployment and Evaluation of UDS," G. Habiger (supervisor), F. J. Hauck (examiner), Inst. of Distr. Sys., Ulm Univ., 2017 – completed.
The increasing worldwide spread of computers and mobile devices, combined with better international networks, places high demands on service providers. A huge number of parallel accesses to services offered on the Internet requires high throughput. At the same time, the availability of services is key, especially in areas like financial and cloud services. Data centers meet these requirements by replicating critical services and data among numerous computers. However, this distribution means that hardware failures are not the exception but become the rule. The State Machine Replication (SMR) approach is an attempt to allow the recovery of crashed servers. Additionally, manipulations of servers and software failures are still a problem of such systems. Byzantine fault tolerant systems build on state machine replication and address this issue by allowing clients to validate the correctness of service responses. However, SMR requires the client requests to arrive in the same order on every server, so the order has to be decided by consensus. Furthermore, SMR requires deterministic processing, so that the states among all machines are equal, which is usually ensured by sequential request processing. This seems inefficient, especially considering the multi-core and multi-CPU hardware of today's server systems. Enabling parallel request processing while fulfilling the demands of SMR requires a deterministic scheduler. These are complex and more resource-intensive than general schedulers. The aim of this thesis is to implement such a scheduler and evaluate its performance, in order to compare the scheduling overhead with the parallelism gained. The results show that the overhead of deterministic scheduling is a huge factor, which only allows a performance improvement up to a certain point, depending on the cost of computations within critical sections.
A. Knittel, "Implementation of asynchronous request handling in BFT SMaRt," G. Habiger (supervisor), F. J. Hauck (examiner), Inst. of Distr. Sys., Ulm Univ., 2016 – completed.
Current research efforts of our institute include a project on deterministic scheduling of multithreaded applications for State Machine Replication (SMR) systems with Byzantine Fault Tolerance (BFT). One part of this project aims to integrate our own work on deterministic scheduling with the BFT SMaRt library. Currently, BFT SMaRt only supports synchronous request-response patterns, whereas our planned SMR platform needs these patterns to be asynchronous. The goals of this project are (i) to analyze the existing BFT SMaRt codebase, (ii) to implement the necessary interfaces for asynchronous request handling and (iii) to integrate these changes into the existing BFT SMaRt libraries.