B4: Exploiting Software-Defined Networks for Efficient Data Management in Next-Generation Data Analysis Workflows

Description

Running a data analysis workflow (DAW) on a computational infrastructure other than the one it was developed for often incurs severe performance penalties. One reason is that DAWs are typically designed for a specific infrastructure, which leads to hard-coded decisions about file locations, file movement, and the means of network-based data exchange between tasks. This subproject will investigate the use of software-defined networks (SDNs) to bring the requirements of the DAW and the capabilities of the underlying physical infrastructure closer together with respect to data access. It thus aims to improve the portability and adaptability of DAW execution engines by adapting the underlying infrastructure. Technically, it will develop a lightweight declarative specification language for annotating DAWs with their communication and computation demands, which connects to the work of A2 in the related field of data access patterns. It will furthermore cooperate with A2 on annotations for specifying data access properties and with B1 on the interplay of file placement and scheduling. The subproject will be led by Prof. Reinefeld, an expert in distributed management of large scientific data sets and high-performance computing, and Prof. Scheuermann, an expert in network protocols and communication systems.

PIs

Contact

Joel Witzke, PhD Student, Mail: witzke at zib dot de

Publications

2022

Jonathan Bader; Joel Witzke; Soeren Becker; Ansgar Lößer; Fabian Lehmann; Leon Doehler; Anh Duc Vu; Odej Kao

Towards Advanced Monitoring for Scientific Workflows (Inproceedings)

In: 2022 IEEE International Conference on Big Data (IEEE BigData 2022), IEEE, 2022.

Masoud Gholami; Florian Schintke

IOSIG: Declarative I/O-Stream Properties Using Pragmas (Journal Article)

In: Datenbank-Spektrum, vol. 22, no. 2, pp. 109–119, 2022.
