Data lineage support and compatibility reference
All Astro Deployments use the OpenLineage Airflow library (openlineage-airflow
) to gather lineage data. The OpenLineage Airflow library is installed on Astro Runtime by default and includes built-in support for several Apache Airflow operators. These operators use tools called extractors to emit lineage data and they don't require additional configuration. Tasks that run with supported operators appear as nodes in your data lineage graphs and show connections to any input and output datasets.
If you’re using an unsupported operator, create an issue in the OpenLineage GitHub repository or write your own custom extractor.
This functionality is early access. If you have questions or feedback, contact Astronomer support.
Supported Airflow operators
The following operators are supported in Astro lineage:
PostgresOperator
BigQueryOperator
SnowflakeOperator
GreatExpectationsOperator
MySqlOperator
RedshiftDataOperator
RedshiftSQLOperator
SQLCheckOperator
SQLValueCheckOperator
SQLThresholdCheckOperator
SQLIntervalCheckOperator
SQLColumnCheckOperator
BigQueryColumnCheckOperator
SQLTableCheckOperator
BigQueryTableCheckOperator
The GreatExpectationsOperator
additionally emits data quality information to the Quality tab in the Lineage view of the Cloud UI. For more information, see Data lineage on Astro.
Partially supported Airflow operators
The following operators are partially supported by the Airflow integration with OpenLineage:
PythonOperator
BashOperator
Airflow tasks that are run with partially supported operators:
- Emit source code to the lineage backend.
- Emit task run data to the lineage backend.
- Appear in the graph view of the Lineage tab in the Cloud UI as nodes.
- Do not emit lineage data about input or output datasets.
Unsupported operators
Airflow tasks that run with unsupported operators send information about the task duration, status, and parent DAG to the lineage backend. However, information about the task's input or output datasets isn't sent to the backend. A task running with an unsupported operator appears as a single node in the lineage graph.
Other known limitations
Lineage on Astro is in active development. Keep in mind the following limitations when using lineage functionality:
- Source code emitted by partially supported operators doesn't appear in the lineage UI.
- Airflow operators emit lineage data about failed task runs only for Deployments on Astro Runtime v5.0+.