Skip to main content

Command Palette

Search for a command to run...

How To Poll an Airflow Job (i.e. DAG Run)

Published
1 min read
J

On multiple occasions in my career I've been tasked with creating production data samples and setting up environments to use this data. Seeing a pattern here, and having spent a lot of time working on perfecting this, I built a business out of this (https://www.redactics.com) and later found my co-founders to take the original concept, generalize it, and extend the original concept into full-fledged data privacy product with its initial focus around bringing safe production data into test, development (local and cloud-based), and demo environments.

Ever wanted to actually know when an Airflow DAG Run has completed? Perhaps your use case involves this completed work being some sort of workflow dependency, or perhaps it is used in a CI/CD pipeline. I'm sure there are a myriad of possible scenarios here beyond ours at Redactics, which is using Airflow to clone a database for a dry-run of a database migration, but you be the judge!

Here is the repo

In this repo you can find a sample Dockerfile, a sample Kubernetes job that provides some context as to how this poller script can be used, as well as the script itself. There isn't much to it, but I hope it saves some developers a moment or three of their time should they need to recreate this functionality!