November 25th-26th: Workshop on reducing carbon emissions in large-scale computational data analysis

On Tuesday November 25th, in lieu of our normal Lecture Series talks, we will begin our Workshop on reducing carbon emissions in large-scale computational data analysis! The workshop will continue on Wednesday with more speakers and opportunities for discussion.

The workshop was planned in collaboration with the Quantitative Biology Center (QBiC), the bioinformatics core facility of the University of Tübingen. We have invited 14 speakers from around Europe and the UK to discuss current research into the environmental impacts of large-scale scientific computing. More information, including the schedule, can be found here.

Introducing TCO2: Total CO2 Cost of Ownership

TCO2: Total CO2 Cost of Ownership is a tool for analyzing the carbon footprint of database server replacements. It was developed by the Data Engineering Systems group at the Hasso Plattner Institute, including Tilmann Rabl (PI) and Ilin Tolovski (doctoral researcher) from FONDA subproject B6. Ilin, Tilmann, and Marcel Weisgut presented TCO2 at the 51st International Conference on Very Large Data Bases (VLDB) on September 3rd. The tool provides a break-even analysis of server replacement, taking into consideration, among other factors, the embodied carbon of new hardware, the carbon intensity of a nation’s power supply, and the type of workload being run. Explore these relationships for yourself here!

Abstract: 

Data centers produce a significant and increasing amount of CO2 emissions. In the past, these have been predominantly due to energy generation for powering data centers. With the transition to energy sources with lower carbon production, the embodied carbon (i.e., CO2 and other greenhouse gas emissions during production, transport, and end-of-life) plays an increasing role when planning server lifecycles. While replacing an old server with newer hardware will typically reduce the power consumption of individual tasks, due to better efficiency of modern CPUs, offsetting the embodied carbon of new hardware can take months to tens of years, depending on the grid carbon intensity.
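The core trade-off described in the abstract can be sketched in a few lines: a newer, more efficient server saves operational carbon each year, but it starts with an embodied-carbon "debt" that those savings must pay off. The following is a minimal illustration of that break-even calculation, not the actual TCO2 tool; all function names and the example figures are assumptions chosen for clarity.

```python
def break_even_years(embodied_kgco2, old_power_w, new_power_w,
                     grid_intensity_kgco2_per_kwh, utilization=1.0):
    """Years until the new server's operational carbon savings offset
    its embodied carbon. Returns infinity if it saves no power."""
    saved_kw = (old_power_w - new_power_w) / 1000 * utilization
    if saved_kw <= 0:
        return float('inf')
    # Annual savings: kW saved x hours per year x grid carbon intensity
    annual_savings_kgco2 = saved_kw * 8760 * grid_intensity_kgco2_per_kwh
    return embodied_kgco2 / annual_savings_kgco2

# Illustrative (not measured) numbers: 1200 kgCO2 embodied in the new
# server, old server drawing 400 W, new one 250 W, and a grid carbon
# intensity of 0.4 kgCO2/kWh.
years = break_even_years(1200, 400, 250, 0.4)
print(f"Break-even after {years:.1f} years")  # about 2.3 years
```

Note how the grid carbon intensity sits in the denominator: on a low-carbon grid the same hardware upgrade saves much less operational carbon per year, so the embodied carbon can take far longer to offset, which is exactly the months-to-decades range the abstract describes.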

Read more here!

FONDA PhD Defense: Jonathan Bader on “Task Resource Prediction for Efficient Execution of Scientific Workflows”

Jonathan Bader defended his doctoral dissertation “Task Resource Prediction for Efficient Execution of Scientific Workflows” with distinction on June 4th, 2025. He is a member of the group “Distributed and Operating Systems” at TU Berlin, where he worked on FONDA subproject B1. His work focuses on predicting which tasks in a workflow are most resource intensive in order to dynamically adjust resource allocation and scheduling.

As part of this research, he introduced Lotaru and Sizey, two novel methods for predicting task run-time and memory requirements, respectively. Lotaru allows researchers to create a sensible baseline resource allocation profile for a workflow based on the task requirements and target infrastructure. Sizey continuously predicts the amount of memory each task requires and adjusts the memory allocation during runtime to minimize over-allocation while also preventing failures. Both outperform previous methods and improve the efficiency of workflow execution.

Congratulations Jonathan!

PI-Lecture Series part 2

Our second set of PI-Lectures will take place today at 15:00 at the Einstein Center Digital Future. The following PIs will be presenting their research areas:

  • Patrick Hostert – Satellite Remote Sensing
  • Matthias Boehm – System Infrastructure for Data-centric ML Pipelines
  • Tilmann Rabl – Carbon-efficient Data Systems
  • Odej Kao – LLMOps for Reliability and Availability of Massive AI Infrastructures