A6: Data Analysis Workflows for Interactive Scientific Exploration


Data analysis workflows (DAWs) for scientific discovery are often exploratory. The process of specifying a DAW is itself exploratory as well: the current DAW specification is repeatedly adapted based on the results of previous executions or on refined requirements. The interdisciplinary subproject A6 will investigate how to support this exploratory specification process systematically by developing a specification model for exploratory DAWs, a mapping of that model to distributed DAW infrastructures, and abstractions for interactive exploratory DAWs that connect exploration spaces with states of DAW executions. It focuses on DAWs for genome analysis, which are often long and complex and whose development involves numerous design choices and time-consuming trial-and-error phases. It will collaborate in particular with subproject A1 on analyzing traces of DAW executions and with B3 on the problem of mapping logged events back to the abstract tasks that produced them. The subproject addresses the phases of DAW modification, specification, and deployment. The subproject will be carried out jointly by Dr. Kehr, an expert in large-scale genome analysis methods, and Prof. Weidlich, an expert in workflow management and mining.




Nourhan Elfaramawy

Interactive Workflows for Exploratory Data Analysis (conference paper)

In: Bao, Zhifeng; Sellis, Timos (Eds.): Proceedings of the VLDB 2022 PhD Workshop co-located with the 48th International Conference on Very Large Databases (VLDB 2022), Sydney, Australia, September 5, 2022, CEUR-WS.org, 2022.

Thomas Krannich; W Timothy J White; Sebastian Niehus; Guillaume Holley; Bjarni V Halldórsson; Birte Kehr

Population-scale detection of non-reference sequence variants using colored de Bruijn graphs (journal article)

In: Bioinformatics, vol. 38, no. 3, pp. 604-611, 2021, ISSN: 1367-4803.

