Session: Automating Data Pipelines With Apache Airflow

There are many open source tools to help with the different steps typically needed to extract insights from your data, and as you scale and grow your use of data, keeping on top of those steps can be difficult. Apache Airflow is an open source orchestration tool that lets you programmatically create workflows in Python to run, schedule, monitor, and manage data engineering pipelines – no more manually managing those cron jobs! In this session, we will look at the architecture of Apache Airflow, walk you through creating your first workflow, and show how a growing number of provider libraries can help you work with other open source tools and services. This session is intended for beginners and anyone wanting to learn more about this open source project.
