Continued from B6: Distributed Run-Time Monitoring and Control of Data Analysis Workflows
Description
Data centers make a large and growing contribution to global energy consumption. Factors such as manufacturing, construction, disassembly, and the emissions associated with green energy are frequently neglected in current estimates. The total climate footprint can vary dramatically based on energy sources, hardware, applications, utilization, and life-cycle management. B6 aims at holistic management of the end-to-end energy profiles and climate footprints of ML-based data analysis workflows (DAWs), as well as of system-internal configuration and tuning knobs.

Scientists
- Philipp Ortner
- Ilin Tolovski
Publications
2023
Mahling, Fabian; Rößler, Paul; Bodner, Thomas; Rabl, Tilmann
BabelMR: A Polyglot Framework for Serverless MapReduce Journal Article
In: Workshop on Serverless Data Analytics, 2023.
@article{mahling2023babelmr,
title = {BabelMR: A Polyglot Framework for Serverless MapReduce},
author = {Fabian Mahling and Paul Rößler and Thomas Bodner and Tilmann Rabl},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Workshop on Serverless Data Analytics},
abstract = {The MapReduce programming model and its open-source implementation Hadoop have democratized large-scale data processing by providing ease-of-use and scalability. Subsequently, systems such as Spark have dramatically improved efficiency. However, for a large number of users and applications, using these frameworks remains challenging, because they typically restrict users to specific programming languages or require cluster management expertise. In this paper, we present BabelMR, a data processing framework that provides the MapReduce programming model to arbitrary containerized applications to be executed on serverless cloud infrastructure. Users provide application logic in Map and Reduce functions that read and write their inputs and outputs to the ephemeral filesystem of a serverless function container. BabelMR orchestrates the data-parallel programs across stages of concurrent cloud function executions and efficiently integrates with serverless storage systems and columnar storage formats. Our evaluation shows that BabelMR reduces the entry hurdle to analyzing data in a distributed serverless environment in terms of development effort. BabelMR’s I/O and data shuffle building blocks outperform handwritten Python and C# code, and BabelMR is competitive with state-of-the-art serverless MapReduce systems.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Ilic, Ivan; Tolovski, Ilin; Rabl, Tilmann
RMG Sort: Radix-Partitioning-Based Multi-GPU Sorting Proceedings Article
In: Koenig-Ries, B. (Ed.): Datenbanksysteme für Business, Technologie und Web (BTW 2023), 2023.
@inproceedings{rmgsort-ilic,
title = {RMG Sort: Radix-Partitioning-Based Multi-GPU Sorting},
author = {Ivan Ilic and Ilin Tolovski and Tilmann Rabl},
editor = {B. Koenig-Ries},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
booktitle = {Datenbanksysteme für Business, Technologie und Web (BTW 2023)},
abstract = {In recent years, graphics processing units (GPUs) have emerged as database accelerators due to their massive parallelism and high-bandwidth memory. Sorting is a core database operation with many applications, such as output ordering, index creation, grouping, and sort-merge joins. Many single-GPU sorting algorithms have been shown to outperform highly parallel CPU algorithms. Today’s systems include multiple GPUs with direct high-bandwidth peer-to-peer (P2P) interconnects. However, previous multi-GPU sorting algorithms do not efficiently harness the P2P transfer capability of modern interconnects, such as NVLink and NVSwitch. In this paper, we propose RMG sort, a novel radix partitioning-based multi-GPU sorting algorithm. We present a most-significant-bit partitioning strategy that efficiently utilizes high-speed P2P interconnects while reducing inter-GPU communication. Independent of the number of GPUs, we exchange radix partitions between the GPUs in one all-to-all P2P key swap and achieve nearly perfect load balancing. We evaluate RMG sort on two modern multi-GPU systems. Our experiments show that RMG sort scales well with the input size and the number of GPUs, outperforming a parallel CPU-based sort by up to 20×. Compared to two state-of-the-art, merge-based, multi-GPU sorting algorithms, we achieve speedups of up to 1.3× and 1.8× across both systems. Excluding the CPU-GPU data transfer times and on eight GPUs, RMG sort outperforms the two merge-based multi-GPU sorting algorithms by up to 2.7× and 9.2×.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Yue, Wang; Benson, Lawrence; Rabl, Tilmann
Desis: Efficient Window Aggregation in Decentralized Networks Proceedings Article
In: 26th International Conference on Extending Database Technology (EDBT '23), 2023.
@inproceedings{streamprocessing,
title = {Desis: Efficient Window Aggregation in Decentralized Networks},
author = {Wang Yue and Lawrence Benson and Tilmann Rabl},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
booktitle = {26th International Conference on Extending Database Technology (EDBT '23)},
abstract = {Stream processing is widely applied in industry as well as in research to process unbounded data streams. In many use cases, specific data streams are processed by multiple continuous queries. Current systems group events of an unbounded data stream into bounded windows to produce results of individual queries in a timely fashion. For multiple concurrent queries, multiple concurrent and usually overlapping windows are generated. To reduce redundant computations and share partial results, state-of-the-art solutions divide windows into slices and then share the results of those slices. However, this is only applicable for queries with the same aggregation function and window measure, as in the case of overlaps for sliding windows. For multiple queries on the same stream with different aggregation functions and window measures, partial results cannot be shared. Furthermore, data streams are produced by devices distributed across large decentralized networks. Current systems cannot process queries on decentralized data streams efficiently. All queries in a decentralized network are either computed centrally or processed individually without exploiting partial results across queries. We present Desis, a stream processing system that can efficiently process multiple stream aggregation queries. We propose an aggregation engine that can share partial results between multiple queries with different window types, measures, and aggregation functions. In decentralized networks, Desis moves computation to data sources and shares overlapping computation as early as possible between queries. Desis outperforms existing solutions by orders of magnitude in throughput when processing multiple queries and can scale to millions of queries. In a decentralized setup, Desis can save up to 99% of network traffic and scale performance linearly.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
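The slice-based sharing idea in the abstract can be sketched as a toy, single-node illustration (the names, the sum aggregate, and the restriction that window size and slide are multiples of the slice size are all assumptions of the sketch): events are pre-aggregated into fixed-size slices once, and multiple window queries are then answered from those shared slice aggregates instead of re-scanning the events.

```python
# Pre-aggregate events into slices once; answer many window queries from the
# shared slice sums. Assumes window size and slide are multiples of slice_size.

def build_slices(events, slice_size):
    """Pre-aggregate (timestamp, value) events into per-slice sums."""
    slices = {}
    for ts, value in events:
        idx = ts // slice_size
        slices[idx] = slices.get(idx, 0) + value
    return slices

def sliding_window_sums(slices, slice_size, window_size, slide, end_ts):
    """Answer a sliding-window sum query from the shared slice aggregates."""
    results = []
    for start in range(0, end_ts - window_size + 1, slide):
        first = start // slice_size
        last = (start + window_size) // slice_size
        results.append(sum(slices.get(i, 0) for i in range(first, last)))
    return results
```

Two queries with different window sizes can both be served from the same `slices` dictionary, which is the sharing effect the paper generalizes to differing window types, measures, and aggregation functions.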
2022
Benson, Lawrence; Papke, Leon; Rabl, Tilmann
PerMA-Bench: Benchmarking Persistent Memory Access Proceedings Article
In: Proceedings of the VLDB Endowment, vol. 15, no. 11, pp. 2463-2476, 2022.
@inproceedings{benson2022PerMA,
title = {PerMA-Bench: Benchmarking Persistent Memory Access},
author = {Lawrence Benson and Leon Papke and Tilmann Rabl},
url = {https://www.vldb.org/pvldb/vol15/p2463-benson.pdf},
year = {2022},
date = {2022-07-01},
urldate = {2022-07-01},
booktitle = {Proceedings of the VLDB Endowment},
volume = {15},
number = {11},
pages = {2463-2476},
abstract = {Persistent memory's (PMem) byte-addressability and persistence at DRAM-like speed with SSD-like capacity have the potential to cause a major performance shift in database storage systems. With the availability of Intel Optane DC Persistent Memory, initial benchmarks evaluate the performance of real PMem hardware. However, these results apply to only a single server and it is not yet clear how workloads compare across different PMem servers. In this paper, we propose PerMA-Bench, a configurable benchmark framework that allows users to evaluate the bandwidth, latency, and operations per second for customizable database-related PMem access. Based on PerMA-Bench, we perform an extensive evaluation of PMem performance across four different server configurations, containing both first- and second-generation Optane, with additional parameters such as DIMM power budget and number of DIMMs per server. We validate our results with existing systems and show the impact of low-level design choices. We conduct a price-performance comparison that shows that, while there are large differences across Optane DIMMs, PMem is generally competitive with DRAM. We discuss our findings and identify eight general and implementation-specific aspects that influence PMem performance and should be considered in future work to improve PMem-aware designs.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Gévay, Gábor E.; Rabl, Tilmann; Breß, Sebastian; Madai-Tahy, Loránd; Quiané-Ruiz, Jorge-Arnulfo; Markl, Volker
Imperative or Functional Control Flow Handling: Why not the Best of Both Worlds? Journal Article
In: ACM SIGMOD Record, vol. 51, no. 1, pp. 1-8, 2022.
@article{noauthororeditorc,
title = {Imperative or Functional Control Flow Handling: Why not the Best of Both Worlds?},
author = {Gábor E. Gévay and Tilmann Rabl and Sebastian Breß and Loránd Madai-Tahy and Jorge-Arnulfo Quiané-Ruiz and Volker Markl},
year = {2022},
date = {2022-03-01},
urldate = {2022-03-01},
journal = {ACM SIGMOD Record},
volume = {51},
number = {1},
pages = {1-8},
abstract = {Modern data analysis tasks often involve control flow statements, such as the iterations in PageRank and K-means. To achieve scalability, developers usually implement these tasks in distributed dataflow systems, such as Spark and Flink. Designers of such systems have to choose between providing imperative or functional control flow constructs to users. Imperative constructs are easier to use, but functional constructs are easier to compile to an efficient dataflow job. We propose Mitos, a system where control flow is both easy to use and efficient. Mitos relies on an intermediate representation based on the static single assignment form. This allows us to abstract away from specific control flow constructs and treat any imperative control flow uniformly both when building the dataflow job and when coordinating the distributed execution.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
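The imperative-versus-functional trade-off discussed in the abstract can be made concrete with a toy example: the same loop written imperatively and with a functional `iterate` combinator (the combinator here is hypothetical, not Mitos's actual construct). The functional form is easier for a dataflow compiler to translate because the loop body and iteration count are explicit values rather than implicit control flow.

```python
# Toy contrast between the two control-flow styles. "iterate" stands in for a
# functional loop construct: step function and iteration count are explicit
# values, which is what a dataflow compiler can analyze directly.

def iterate(step, state, n):
    """Functional-style loop: apply step to state n times."""
    for _ in range(n):
        state = step(state)
    return state

def doubled_imperatively(n):
    """The same computation written as an ordinary imperative loop."""
    x = 1
    for _ in range(n):
        x = x * 2
    return x
```

Mitos's contribution is to accept the imperative style from the user and normalize it (via static single assignment form) into something as analyzable as the functional style.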
Damme, Patrick; Birkenbach, Marius; Bitsakos, Constantinos; Boehm, Matthias; Bonnet, Philippe; Ciorba, Florina; Dokter, Mark; Dowgiallo, Pawl; Eleliemy, Ahmed; Faerber, Christian; Goumas, Georgios; Habich, Dirk; Hedam, Niclas; Hofer, Marlies; Huang, Wenjun; Innerebner, Kevin; Karakostas, Vasileios; Kern, Roman; Kosar, Tomaž; Krause, Alexander; Krems, Daniel; Laber, Andreas; Lehner, Wolfgang; Mier, Eric; Paradies, Marcus; Peischl, Bernhard; Poerwawinata, Gabrielle; Psomadakis, Stratos; Rabl, Tilmann; Ratuszniak, Piotr; Silva, Pedro; Skuppin, Nikolai; Starzacher, Andreas; Steinwender, Benjamin; Tolovski, Ilin; Tözün, Pinar; Ulatowski, Wojciech; Wang, Yuanyuan; Wrosz, Izajasz; Zamuda, Aleš; Zhang, Ce; Zhu, Xiao Xiang
DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines Proceedings Article
In: Conference on Innovative Data Systems Research, 2022.
@inproceedings{33e1f3af607e45ad821be2e22f3113bc,
title = {DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines},
author = {Patrick Damme and Marius Birkenbach and Constantinos Bitsakos and Matthias Boehm and Philippe Bonnet and Florina Ciorba and Mark Dokter and Pawl Dowgiallo and Ahmed Eleliemy and Christian Faerber and Georgios Goumas and Dirk Habich and Niclas Hedam and Marlies Hofer and Wenjun Huang and Kevin Innerebner and Vasileios Karakostas and Roman Kern and Tomaž Kosar and Alexander Krause and Daniel Krems and Andreas Laber and Wolfgang Lehner and Eric Mier and Marcus Paradies and Bernhard Peischl and Gabrielle Poerwawinata and Stratos Psomadakis and Tilmann Rabl and Piotr Ratuszniak and Pedro Silva and Nikolai Skuppin and Andreas Starzacher and Benjamin Steinwender and Ilin Tolovski and Pinar Tözün and Wojciech Ulatowski and Yuanyuan Wang and Izajasz Wrosz and Aleš Zamuda and Ce Zhang and Xiao Xiang Zhu},
year = {2022},
date = {2022-01-09},
booktitle = {Conference on Innovative Data Systems Research},
abstract = {Integrated data analysis (IDA) pipelines—that combine data management (DM) and query processing, high-performance computing (HPC), and machine learning (ML) training and scoring—are becoming increasingly common in practice. Interestingly, systems of these areas share many compilation and runtime techniques, and the used—increasingly heterogeneous—hardware infrastructure converges as well. Yet, the programming paradigms, cluster resource management, data formats and representations, as well as execution strategies differ substantially. DAPHNE is an open and extensible system infrastructure for such IDA pipelines, including language abstractions, compilation and runtime techniques, multi-level scheduling, hardware (HW) accelerators, and computational storage for increasing productivity and eliminating unnecessary overheads. In this paper, we make a case for IDA pipelines, describe the overall DAPHNE system architecture, its key components, and the design of a vectorized execution engine for computational storage, HW accelerators, as well as local and distributed operations. Preliminary experiments that compare DAPHNE with MonetDB, Pandas, DuckDB, and TensorFlow show promising results.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Ihde, Nina; Marten, Paula; Eleliemy, Ahmed; Poerwawinata, Gabrielle; Silva, Pedro; Tolovski, Ilin; Ciorba, Florina M.; Rabl, Tilmann
A Survey of Big Data, High Performance Computing, and Machine Learning Benchmarks Proceedings Article
In: Nambiar, Raghunath; Poess, Meikel (Ed.): Performance Evaluation and Benchmarking, pp. 98–118, Springer International Publishing, Cham, 2022, ISBN: 978-3-030-94437-7.
@inproceedings{10.1007/978-3-030-94437-7_7,
title = {A Survey of Big Data, High Performance Computing, and Machine Learning Benchmarks},
author = {Nina Ihde and Paula Marten and Ahmed Eleliemy and Gabrielle Poerwawinata and Pedro Silva and Ilin Tolovski and Florina M. Ciorba and Tilmann Rabl},
editor = {Raghunath Nambiar and Meikel Poess},
isbn = {978-3-030-94437-7},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {Performance Evaluation and Benchmarking},
pages = {98--118},
publisher = {Springer International Publishing},
address = {Cham},
abstract = {In recent years, there has been a convergence of Big Data (BD), High Performance Computing (HPC), and Machine Learning (ML) systems. This convergence is due to the increasing complexity of long data analysis pipelines on separated software stacks. With the increasing complexity of data analytics pipelines comes a need to evaluate their systems, in order to make informed decisions about technology selection, sizing and scoping of hardware. While there are many benchmarks for each of these domains, there is no convergence of these efforts. As a first step, it is also necessary to understand how the individual benchmark domains relate.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Strassenburg, Nils; Tolovski, Ilin; Rabl, Tilmann
Efficiently Managing Deep Learning Models in a Distributed Environment Proceedings Article
In: 25th International Conference on Extending Database Technology (EDBT '22), 2022.
@inproceedings{strassenburg_2022_mmlib,
title = {Efficiently Managing Deep Learning Models in a Distributed Environment},
author = {Nils Strassenburg and Ilin Tolovski and Tilmann Rabl},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {25th International Conference on Extending Database Technology (EDBT '22)},
abstract = {Deep learning has revolutionized many domains relevant in research and industry, including computer vision and natural language processing by significantly outperforming previous state-of-the-art approaches. This is why deep learning models are part of many essential software applications. To guarantee their reliable and consistent performance even in changing environments, they need to be regularly adjusted, improved, and retrained but also documented, deployed, and monitored. An essential part of this set of processes, referred to as model management, is to save and recover models. To enable debugging, many applications require an exact model representation. In this paper, we investigate if, and to what extent, we can outperform a baseline approach capable of saving and recovering models, while focusing on storage consumption, time-to-save, and time-to-recover. We present our Python library MMlib, offering three approaches: a baseline approach that saves complete model snapshots, a parameter update approach that saves the updated model data, and a model provenance approach that saves the model’s provenance instead of the model itself. We evaluate all approaches in four distributed environments on different model architectures, model relations, and data sets. Our evaluation shows that both the model provenance and parameter update approach outperform the baseline by up to 15.8% and 51.7% in time-to-save and by up to 70.0% and 95.6% in storage consumption, respectively.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
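The parameter-update approach from the abstract can be sketched with plain dictionaries standing in for model state (the function names are illustrative; the real library operates on actual deep learning models): only parameters that changed since the previous version are stored, and any version is recovered by replaying the diffs on top of a base snapshot.

```python
# Versioned model storage as diffs: store a full base snapshot once, then
# only the changed parameters per version; recovery replays the diffs.

def diff_params(previous, updated):
    """Return only the parameters that differ from the previous version."""
    return {name: val for name, val in updated.items()
            if previous.get(name) != val}

def recover(base, updates):
    """Rebuild a version from a base snapshot plus an ordered list of diffs."""
    params = dict(base)
    for update in updates:
        params.update(update)
    return params
```

When successive versions (e.g. after fine-tuning) touch only a fraction of the parameters, the diffs are much smaller than full snapshots, which is the source of the storage and time-to-save savings reported in the evaluation.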
Benson, Lawrence; Rabl, Tilmann
Darwin: Scale-In Stream Processing Proceedings Article
In: 12th Annual Conference on Innovative Data Systems Research (CIDR ’22), 2022.
@inproceedings{benson_darwin_2022,
title = {Darwin: Scale-In Stream Processing},
author = {Lawrence Benson and Tilmann Rabl},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {12th Annual Conference on Innovative Data Systems Research (CIDR ’22)},
abstract = {Companies increasingly rely on stream processing engines (SPEs) to quickly analyze data and monitor infrastructure. These systems enable continuous querying of data at high rates. Current production-level systems, such as Apache Flink and Spark, rely on clusters of servers to scale out processing capacity. Yet, these scale-out systems are resource inefficient and cannot fully utilize the hardware. As a solution, hardware-optimized, single-server, scale-up SPEs were developed. To get the best performance, they neglect essential features for industry adoption, such as larger-than-memory state and recovery. This requires users to choose between high performance or system availability. While some streaming workloads can afford to lose or reprocess large amounts of data, others cannot, forcing them to accept lower performance. Users also face a large performance drop once their workloads slightly exceed a single server, forcing them to use scale-out SPEs. To acknowledge that real-world stream processing setups have drastically varying performance and availability requirements, we propose scale-in processing. Scale-in processing is a new paradigm that adapts to various application demands by achieving high hardware utilization on a wide range of single- and multi-node hardware setups, reducing overall infrastructure requirements. In contrast to scaling-up or -out, it focuses on fully utilizing the given hardware instead of demanding more or ever-larger servers. We present Darwin, our scale-in SPE prototype that tailors its execution towards arbitrary target environments by compiling stream processing queries while providing recoverable larger-than-memory state management. Early results show that Darwin achieves an order of magnitude speed-up over current scale-out systems and matches processing rates of scale-up systems.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Del Monte, Bonaventura; Zeuch, Steffen; Rabl, Tilmann; Markl, Volker
Rethinking Stateful Stream Processing with RDMA Proceedings Article
In: ACM SIGMOD International Conference on Management of Data (SIGMOD ’22), 2022.
@inproceedings{delmonte2022rethinking,
title = {Rethinking Stateful Stream Processing with RDMA},
author = {Bonaventura Del Monte and Steffen Zeuch and Tilmann Rabl and Volker Markl},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {ACM SIGMOD International Conference on Management of Data (SIGMOD ’22)},
abstract = {Remote Direct Memory Access (RDMA) hardware has bridged the gap between network and main memory speed and thus invalidated the common assumption that network is often the bottleneck in distributed data processing systems. However, high-speed networks do not provide "plug-and-play" performance (e.g., using IP-over-InfiniBand) and require a careful co-design of system and application logic. As a result, system designers need to rethink the architecture of their data management systems to benefit from RDMA acceleration. In this paper, we focus on the acceleration of stream processing engines, which is challenged by real-time constraints and state consistency guarantees. To this end, we propose Slash, a novel stream processing engine that uses high-speed networks and RDMA to efficiently execute distributed streaming computations. Slash embraces a processing model suited for RDMA acceleration and scales out by omitting the expensive data re-partitioning demands of scale-out SPEs. While scale-out SPEs rely on data re-partitioning to execute a query over many nodes, Slash uses RDMA to share mutable state among nodes. Overall, Slash achieves a throughput improvement of up to two orders of magnitude over existing systems deployed on an InfiniBand network. Furthermore, it is up to a factor of 22 faster than a self-developed solution that relies on RDMA-based data repartitioning to scale out query processing.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
