Date: | 2021, June 1 |
Time: | 10:30 a.m. |
Place: | Online |
Author: | Günther, Stephan |
Title: | Implementing the eGo^n Data Processing Pipeline Using Apache Airflow – An Update After more than Half a Year of Development |
In this follow-up to last year's presentation, which focussed on how to use Apache Airflow to implement complex workflows in general, we'll take an in-depth look at a practical example: the current state of the implementation of the eGo^n data processing pipeline. The talk presents the milestones reached and the insights gained after more than half a year of development.
We'll see the impact of generating the workflow visualisation directly from its implementation, look at the performance implications of specifying the workflow as a directed acyclic graph (DAG) and note some of the benefits of using an API designed for writing workflows. We'll also go beyond Apache Airflow by looking at additional features implemented specifically for the eGo^n data processing pipeline and how these features, while not part of Apache Airflow, would have been hard or even impossible to implement without it.
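To make the point about DAGs and performance concrete, here is a minimal sketch that uses only the Python standard library (`graphlib`, Python 3.9+), not Apache Airflow itself. The task names and the helper `parallel_batches` are hypothetical and are not taken from the eGo^n pipeline. The sketch shows how declaring dependencies explicitly lets a scheduler work out which tasks are independent and could therefore run concurrently.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks it depends on.
dependencies = {
    "create_schema": set(),
    "download_osm": {"create_schema"},
    "download_census": {"create_schema"},
    "import_osm": {"download_osm"},
    "import_census": {"download_census"},
    "build_demand": {"import_osm", "import_census"},
}


def parallel_batches(graph):
    """Group tasks into batches whose dependencies are all satisfied.

    Tasks within one batch do not depend on each other, so a scheduler
    would be free to run them at the same time.
    """
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = parallel_batches(dependencies)
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

In this sketch the two download tasks end up in the same batch, as do the two import tasks, so the six tasks need only four sequential steps. A workflow engine can exploit the same information to run independent tasks in parallel.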