November 25th-26th: Workshop on reducing carbon emissions in large-scale computational data analysis

On Tuesday, November 25th, in lieu of our normal Lecture Series talks, we will begin our Workshop on reducing carbon emissions in large-scale computational data analysis! The workshop will continue on Wednesday with more speakers and opportunities for discussion.

The workshop was planned in collaboration with the Quantitative Biology Center (QBiC), the bioinformatics core facility of the University of Tübingen. We have invited 14 speakers from around Europe and the UK to discuss current research into the environmental impacts of large-scale scientific computing. More information, including the schedule, can be found here.

Snakemake Tutorial with Johannes Köster

FONDA has invited Professor Johannes Köster to give a tutorial on Snakemake, a widely used, Python-based workflow management system. Snakemake allows users to create scalable, human-readable, reproducible workflows for scientific data analysis.

Professor Köster is the leader of the Bioinformatics and Computational Oncology group at the Institute for AI in Medicine at the University of Duisburg-Essen, where his work focuses on reproducibility and bioinformatics workflows. He is the author and lead developer of Snakemake.

The full-day tutorial will take place on July 2, 2025, starting at 9 am. Please contact Tobias Price if you are interested in attending.

FONDA Integrated Research Training Group Courses

The Integrated Research Training Group (IRTG) has been very active in the past month, providing courses and workshops for FONDA’s doctoral researchers. On April 28th and 29th, Seqera, the company that supports the open-source workflow engine “Nextflow”, provided a tutorial on using Nextflow to write and evaluate bioinformatics workflows. On May 7th, the IRTG members participated in a training course on Good Scientific Practice, geared especially towards the computational sciences.

Future courses include a tutorial on GitLab with scientists from the German Aerospace Center on May 20th and 21st and a course on gender bias awareness on June 17th, along with courses on the workflow simulator WfCommons and the Python-based workflow execution engine “Snakemake”.

PI-Lecture Series Part 8

The final installment of FONDA’s PI-Lecture Series will take place on March 17th from 15:00-17:30 in Adlershof (Humboldt-Kabinett, Rudower Chaussee 25). The following PIs will give talks on their ongoing research:

  • Henning Meyerhenke – Workflow Scheduling and (Other) Graph Algorithms for Parallel & Distributed Systems
  • Thomas Kosch – TBA
  • Ulf Leser – Knowledge Management in Bioinformatics
  • Björn Scheuermann – Modern Web Transport Protocols (online)

We have had a lot of excellent talks over the last few months. The purpose of this lecture series was to introduce all of our new FONDA members to the research areas of the PIs. Judging by the quality of the questions and conversations, it has been very successful!

I’m looking forward to more conversations about science with everyone at our upcoming spring retreat.

PI-Lecture Series Part 5

Our fifth set of PI-Lectures will be held on Monday, February 24th, starting at 15:00 in Adlershof (Rudower Chaussee 25, 12489 Berlin). We will meet in the Humboldt-Kabinett for talks on ongoing research by the following PIs:

  • Nicole Schweikardt – Logic in Computer Science
  • Matthias Weidlich – On Events and Processes
  • Claudia Draxl – From science to data and back
  • Knut Reinert – Hierarchical Interleaved Bloom Filter: Enabling ultrafast, approximate sequence queries

FONDA PhD student Mario Sänger successfully defends his PhD thesis on “Representation Learning for Biomedical Text Mining”

Mario Sänger, a member of the group “Human-computer interaction for Scientific Software”, successfully defended his PhD thesis on November 25, 2024. His work focuses on using representation learning to extract meaningful connections between biomedical entities, such as genes, diseases, proteins, and pharmaceuticals, from a corpus of PubMed abstracts as well as from biomedical knowledge bases. In addition to demonstrating the feasibility of this corpus-wide approach, he benchmarked and tested existing pre-trained language models (PLMs) for sentence-level relation prediction. His results show that additional context from biomedical knowledge bases does not improve the most robust, carefully tuned PLMs.

In FONDA, he collaborated with Prof. Dr. Thomas Kosch, exploring the use of ChatGPT as a tool to support users in designing and implementing scientific workflows.

Congratulations Mario, and all the best!