The efficient implementation of complex data analysis workflows (DAWs) in various scientific disciplines requires deep knowledge of a large stack: an abstract DAW description, compilation of a logical plan, mapping onto the currently available infrastructure, and appropriate configuration of execution engines. Components and configurations developed for one computational infrastructure are often unsuitable for another, leading either to undesirable platform lock-in or to a considerable loss of efficiency.
The goal of subproject B1 is therefore to improve portability. To this end, we
- compare DAW requirements with declarative descriptions of the available infrastructure,
- profile both DAWs and infrastructure as needed, and
- then map the DAWs onto the infrastructure using novel scheduling and load balancing (SLB) techniques to automatically optimize efficiency.
Ultimately, we aim to allow scientists to focus on the domain-specific challenges in their DAWs, while our new components automatically select the available computing infrastructure and use it efficiently.
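The three steps listed above can be illustrated with a small sketch. It is not the SLB technique developed in B1; it is a plain greedy list-scheduling heuristic, chosen only to make the idea concrete. All names (`Task`, `Node`, `feasible_nodes`, `schedule`) and all numbers are hypothetical. The sketch also assumes that tasks are independent, so it ignores the dependencies a real DAW would have, and that the `work` and `speed` values come from some earlier profiling step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task requirements, as they might be declared in a DAW."""
    name: str
    cpus: int
    memory_gb: int
    work: float  # abstract work units, assumed to come from profiling


@dataclass(frozen=True)
class Node:
    """Hypothetical declarative description of one machine."""
    name: str
    cpus: int
    memory_gb: int
    speed: float  # work units per second, assumed to come from benchmarking


def feasible_nodes(task, nodes):
    """Compare task requirements with the declared node capabilities."""
    return [n for n in nodes
            if n.cpus >= task.cpus and n.memory_gb >= task.memory_gb]


def schedule(tasks, nodes):
    """Greedily place each task on the feasible node that finishes it earliest."""
    busy_until = {n.name: 0.0 for n in nodes}
    placement = {}
    # Placing the largest tasks first is a common list-scheduling heuristic.
    for task in sorted(tasks, key=lambda t: t.work, reverse=True):
        candidates = feasible_nodes(task, nodes)
        if not candidates:
            raise ValueError(f"no node satisfies the requirements of {task.name}")
        best = min(candidates,
                   key=lambda n: busy_until[n.name] + task.work / n.speed)
        busy_until[best.name] += task.work / best.speed
        placement[task.name] = best.name
    # The makespan is the time at which the last node becomes idle.
    return placement, max(busy_until.values())


nodes = [Node("small", cpus=4, memory_gb=16, speed=1.0),
         Node("large", cpus=16, memory_gb=64, speed=2.0)]
tasks = [Task("align", cpus=8, memory_gb=32, work=40.0),
         Task("filter", cpus=2, memory_gb=4, work=10.0),
         Task("plot", cpus=1, memory_gb=2, work=4.0)]

placement, makespan = schedule(tasks, nodes)
print(placement, makespan)
```

In this example only the `large` node satisfies the requirements of `align`, so the two smaller tasks are placed on `small` to avoid queuing behind it. Moving the same DAW to a different infrastructure would only require a different list of `Node` descriptions, which is the portability argument in miniature.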
In: 34th International Conference on Scientific and Statistical Database Management (SSDBM 2022), to appear, ACM, 2022.
Collaborative Cluster Configuration for Distributed Data-Parallel Processing: A Research Overview Journal Article
In: Datenbank-Spektrum, 2022.
Get Your Memory Right: The Crispy Resource Allocation Assistant for Large-Scale Data Processing Miscellaneous
Reshi: Recommending Resources for Scientific Workflow Tasks on Heterogeneous Infrastructures Inproceedings
In: 41st International Performance Computing and Communications Conference 2022, IEEE, 2022.
C3O: Collaborative Cluster Configuration Optimization for Distributed Data Processing in Public Clouds Journal Article
In: 2021 IEEE International Conference on Cloud Engineering (IC2E), pp. 43-52, 2021.
In: 2021 IEEE International Conference on Big Data (Big Data), pp. 3113-3118, 2021.
In: 2021 IEEE International Conference on Big Data (Big Data), pp. 3141-3146, 2021.
In: 2021 IEEE International Conference on Big Data (Big Data), pp. 65-75, 2021.
In: 2020 IEEE International Conference on Big Data (Big Data), pp. 2851-2856, 2020.
A Consolidated View on Specification Languages for Data Analysis Workflows Proceeding Forthcoming
In: Automated Software Re-Engineering (ISoLA2022 · ASRE) (accepted), Forthcoming.