Configure an external secrets backend on Astro
Apache Airflow variables and connections often contain sensitive information about your external systems that should be kept secret in a secure, centralized location that complies with your organization's security requirements.
While secret values of Airflow variables and connections are encrypted in the Airflow metadata database of every Deployment, Astronomer recommends integrating with a secrets backend tool.
Benefits
Integrating a secrets backend tool with Astro allows you to:
- Store Airflow variables and connections in a centralized location alongside secrets from other tools and systems used by your organization, including Kubernetes secrets, SSL certificates, and more.
- Comply with internal security postures and policies that protect your organization.
- Recover in the case of an incident.
- Automatically pull Airflow variables and connections that are already stored in your secrets backend when you create a new Deployment instead of having to set them manually in the Airflow UI.
Astro integrates with the following secrets backend tools:
- Hashicorp Vault
- AWS Systems Manager Parameter Store
- AWS Secrets Manager
- Google Cloud Secret Manager
- Azure Key Vault
Secrets backend integrations are configured individually with each Astro Deployment.
If you enable a secrets backend on Astro, you can continue to define Airflow variables and connections either as environment variables or in the Airflow UI as needed. If set via the Airflow UI, variables and connections are stored as encrypted values in Airflow's metadata database.
Airflow checks for the value of an Airflow variable or connection in the following order:
- Secrets backend
- Environment variable
- The Airflow UI
Setting Airflow connections via secrets requires knowledge of how to generate Airflow connection URIs. If you plan to store Airflow connections on your secrets backend, read the Apache Airflow documentation for guidance on how to generate a connection URI.
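For example, the following sketch (using placeholder values, not real credentials) builds a URI for a hypothetical Postgres connection with Airflow's Connection class:
from airflow.models.connection import Connection
# Hypothetical connection details; replace every value with your own.
conn = Connection(
    conn_type="postgres",
    login="example-user",
    password="example-password",
    host="db.example.com",
    port=5432,
    schema="example_db",
)
# Prints something like postgres://example-user:example-password@db.example.com:5432/example_db,
# which is the value you would store in your secrets backend for this connection.
print(conn.get_uri())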
Setup
- AWS Secrets Manager
- Hashicorp Vault
- Google Cloud Secret Manager
- Azure Key Vault
- AWS Parameter Store
This topic provides setup steps for configuring AWS Secrets Manager as a secrets backend on Astro.
Prerequisites
- A Deployment.
- The Astro CLI.
- An Astro project with apache-airflow-providers-amazon version 5.1.0 or later. See Add Python and OS-level packages.
- An IAM role with the SecretsManagerReadWrite policy that your Astro cluster can assume. See AWS IAM roles.
Step 1: Add Airflow secrets to Secrets Manager
In AWS Secrets Manager, create a secret for each Airflow variable and connection that you want to store. You can use real or test values.
- When setting the secret type, choose Other type of secret and select the Plaintext option.
- If creating a connection URI or a non-dict variable as a secret, remove the brackets and quotations that are pre-populated in the plaintext field.
- The secret name is assigned after providing the plaintext value and clicking Next.
Secret names must correspond with the connections_prefix and variables_prefix set in Step 2. Specifically:
- If you use "variables_prefix": "airflow/variables", you must set Airflow variable names as airflow/variables/<variable-key>. The <variable-key> is how you will retrieve that variable's value in a DAG. For example: my_var = Variable.get("<variable-key>")
- If you use "connections_prefix": "airflow/connections", you must set Airflow connection names as airflow/connections/<connection-id>. The <connection-id> is how you will retrieve that connection's URI in a DAG. For example: conn = BaseHook.get_connection(conn_id="<connection-id>")
Be sure not to include a leading / at the beginning of your variable or connection name.
For more information on adding secrets to Secrets Manager, see AWS documentation.
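If you prefer to create these secrets programmatically rather than in the AWS console, the following sketch uses the boto3 Python client with hypothetical secret names and values; it assumes boto3 is installed and that your local AWS credentials are allowed to write to Secrets Manager:
import boto3
client = boto3.client("secretsmanager", region_name="<your-aws-region>")
# Hypothetical Airflow variable stored under the airflow/variables prefix.
client.create_secret(
    Name="airflow/variables/my-test-variable",
    SecretString="my-test-value",
)
# Hypothetical Airflow connection stored as a URI under the airflow/connections prefix.
client.create_secret(
    Name="airflow/connections/my-postgres-conn",
    SecretString="postgres://example-user:example-password@db.example.com:5432/example_db",
)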
Step 2: Configure your Astro project
Add the following lines to your Dockerfile:
ENV AIRFLOW__SECRETS__BACKEND=airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend
ENV AIRFLOW__SECRETS__BACKEND_KWARGS={"connections_prefix": "airflow/connections", "variables_prefix": "airflow/variables", "role_arn": $SECRETS_BACKEND_ARN, "region_name": $SECRETS_BACKEND_REGION}
Add the following lines to your .env file:
SECRETS_BACKEND_ARN=<your-role-arn>
SECRETS_BACKEND_REGION=<your-aws-region>
Step 3: Deploy to Astro
Run the following command to deploy the contents of your .env file to your Deployment as environment variables:
astro deployment variable create --deployment-id <your-deployment-id> --load --env .env
Run the following command to deploy your Astro project and implement your Dockerfile changes:
astro deploy
The Dockerfile contains the core configuration for your secrets backend. Because your AWS IAM role and region are set as Astro environment variables, you can now configure multiple roles and environments to use the same secrets backend without needing to redeploy changes to your Dockerfile.
To further customize the Airflow and AWS Secrets Manager integration, see the full list of available kwargs.
This topic provides steps for using Hashicorp Vault as a secrets backend for both local development and on Astro. To do this, you will:
- Create an AppRole in Vault which grants Astro minimal required permissions.
- Write a test Airflow variable or connection as a secret to your Vault server.
- Configure your Astro project to pull the secret from Vault.
- Test the backend in a local environment.
- Deploy your changes to Astro.
Prerequisites
- A Deployment on Astro.
- The Astro CLI.
- A Hashicorp Vault server.
- An Astro project.
- The Vault CLI.
- Your Vault server's URL. If you're using a local server, this should be http://127.0.0.1:8200/.
If you do not already have a Vault server deployed but would like to test this feature, Astronomer recommends that you either:
- Sign up for a Vault trial on Hashicorp Cloud Platform (HCP) or
- Deploy a local Vault server. See Starting the server in Hashicorp documentation.
Step 1: Create a Policy and AppRole in Vault
To use Vault as a secrets backend, Astronomer recommends configuring a Vault AppRole with a policy that grants only the minimum necessary permissions for Astro. To do this:
Create a Vault policy with the following permissions:
path "secret/data/variables/*" {
capabilities = ["read", "list"]
}
path "secret/data/connections/*" {
capabilities = ["read", "list"]
}
Create a Vault AppRole and attach the policy you just created to it.
Retrieve the role-id and secret-id for your AppRole by running the following commands:
vault read auth/approle/role/<your-approle>/role-id
vault write -f auth/approle/role/<your-approle>/secret-id
Save these values for Step 3.
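If you'd rather script this setup than run the Vault CLI commands by hand, the following sketch shows an equivalent flow with the hvac Python client. The policy and AppRole names are hypothetical, and the sketch assumes hvac is installed and that your client token is allowed to manage policies and auth methods:
import hvac
client = hvac.Client(url="http://127.0.0.1:8200", token="<your-admin-token>")
# Policy granting read access to Airflow variables and connections (hypothetical name).
policy = '''
path "secret/data/variables/*" {
  capabilities = ["read", "list"]
}
path "secret/data/connections/*" {
  capabilities = ["read", "list"]
}
'''
client.sys.create_or_update_policy(name="astro-secrets", policy=policy)
# Enable the AppRole auth method (this call errors if it's already enabled), then create the AppRole.
client.sys.enable_auth_method(method_type="approle")
client.auth.approle.create_or_update_approle(role_name="astro", token_policies=["astro-secrets"])
# Retrieve the role-id and secret-id to use in Step 3.
role_id = client.auth.approle.read_role_id(role_name="astro")["data"]["role_id"]
secret_id = client.auth.approle.generate_secret_id(role_name="astro")["data"]["secret_id"]
print(role_id, secret_id)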
Step 2: Create an Airflow variable or connection in Vault
To start, create an Airflow variable or connection in Vault that you want to store as a secret. It can be either a real or test value. You will use this secret to test your backend's functionality in Step 4.
To store an Airflow variable in Vault as a secret, run the following Vault CLI command with your own values:
vault kv put secret/variables/<your-variable-key> value=<your-variable-value>
To store a connection in Vault as a secret, run the following Vault CLI command with your own values:
vault kv put secret/connections/<your-connection-id> conn_uri=<connection-type>://<connection-login>:<connection-password>@<connection-host>:5432
To confirm that your secret was written to Vault successfully, run:
# For variables
$ vault kv get secret/variables/<your-variable-key>
# For connections
$ vault kv get secret/connections/<your-connection-id>
Step 3: Set up Vault locally
In your Astro project, add the Hashicorp Airflow provider by including the following line in your requirements.txt file:
apache-airflow-providers-hashicorp
Then, add the following environment variables to your Dockerfile:
# Make sure to replace `<your-approle-id>` and `<your-approle-secret>` with your own values.
ENV AIRFLOW__SECRETS__BACKEND=airflow.providers.hashicorp.secrets.vault.VaultBackend
ENV AIRFLOW__SECRETS__BACKEND_KWARGS={"connections_path": "connections", "variables_path": "variables", "config_path": null, "url": "http://host.docker.internal:8200", "auth_type": "approle", "role_id":"<your-approle-id>", "secret_id":"<your-approle-secret>"}
This tells Airflow to look for variable and connection information at the secret/variables/* and secret/connections/* paths in your Vault server. In the next step, you'll test this configuration in a local Airflow environment.
If you want to deploy your project to a hosted Git repository before deploying to Astro, be sure to save <your-approle-id> and <your-approle-secret> securely. Astronomer recommends adding them to your project's .env file and specifying this file in .gitignore.
When you deploy to Astro in Step 5, you can set these values as secrets in the Cloud UI.
By default, Airflow uses "kv_engine_version": 2. If your secrets are written to a KV version 1 engine, set "kv_engine_version": 1 in AIRFLOW__SECRETS__BACKEND_KWARGS to match how you write and read your secrets.
For more information on the Airflow provider for Hashicorp Vault and how to further customize your integration, see the Apache Airflow documentation.
Step 4: Run an example DAG to test Vault locally
To test Vault, write a simple DAG which calls your test secret and add this DAG to your project's dags directory. For example, you can use the following DAG to print the value of a variable to your task logs:
from airflow import DAG
from airflow.hooks.base import BaseHook
from airflow.models import Variable
from airflow.operators.python import PythonOperator
from datetime import datetime
def print_var():
my_var = Variable.get("<your-variable-key>")
print(f'My variable is: {my_var}')
conn = BaseHook.get_connection(conn_id="<your-connection-key>")
print(conn.get_uri())
with DAG('example_secrets_dag', start_date=datetime(2022, 1, 1), schedule_interval=None) as dag:
test_task = PythonOperator(
task_id='test-task',
python_callable=print_var,
)
Once you've added this DAG to your project:
- Run astro dev restart to push your changes to your local Airflow environment.
- In the Airflow UI (http://localhost:8080/), trigger your new DAG.
- Click on test-task > View Logs. If you ran the example DAG above, you should see the contents of your secret in the task logs:
{logging_mixin.py:109} INFO - My variable is: my-test-variable
Once you confirm that the setup was successful, you can delete this example DAG.
Step 5: Deploy on Astro
Once you've confirmed that the integration with Vault works locally, you can complete a similar setup with a Deployment on Astro.
- In the Cloud UI, add the same environment variables found in your Dockerfile to your Deployment environment variables. Specify AIRFLOW__SECRETS__BACKEND_KWARGS as Secret to ensure that your Vault credentials are stored securely.
- In your Astro project, delete the environment variables from your Dockerfile.
- Deploy your changes to Astro.
Now, any Airflow variable or connection that you write to your Vault server can be successfully accessed and pulled by any DAG in your Deployment on Astro.
In this section, you'll learn how to use AWS Systems Manager (SSM) Parameter Store as a secrets backend on Astro.
Prerequisites
- A Deployment.
- The Astro CLI.
- An Astro project with version 5.1.0+ of apache-airflow-providers-amazon. See Add Python and OS-level packages.
- An IAM role with access to the Parameter Store that your Astro cluster can assume. See AWS IAM roles.
Step 1: Create Airflow secrets directories in Parameter Store
Create directories in Parameter Store for the Airflow variables and connections that you want to store as secrets.
Variables and connections should be stored in /airflow/variables and /airflow/connections, respectively. For example, if you're setting a secret variable with the key my_secret, it should be stored in the /airflow/variables/ directory. If you modify the directory paths, make sure you change the values for variables_prefix and connections_prefix in Step 2.
For instructions, see the AWS Systems Manager Console, the AWS CLI, or the Tools for Windows PowerShell documentation.
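As a scripted alternative, the following sketch uses the boto3 Python client to store a hypothetical variable and connection under the expected paths; it assumes boto3 is installed and that your local AWS credentials are allowed to write to Parameter Store:
import boto3
client = boto3.client("ssm", region_name="<your-aws-region>")
# Hypothetical Airflow variable stored under /airflow/variables.
client.put_parameter(
    Name="/airflow/variables/my_secret",
    Value="my-test-value",
    Type="SecureString",
)
# Hypothetical Airflow connection stored as a URI under /airflow/connections.
client.put_parameter(
    Name="/airflow/connections/my_postgres_conn",
    Value="postgres://example-user:example-password@db.example.com:5432/example_db",
    Type="SecureString",
)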
Step 2: Configure your Astro project
Add the following lines to your Dockerfile:
ENV AIRFLOW__SECRETS__BACKEND=airflow.providers.amazon.aws.secrets.systems_manager.SystemsManagerParameterStoreBackend
ENV AIRFLOW__SECRETS__BACKEND_KWARGS={"connections_prefix": "/airflow/connections", "variables_prefix": "/airflow/variables", "role_arn": $PARAMETER_STORE_ARN, "region_name": $PARAMETER_STORE_REGION}
Add the following lines to your .env file:
PARAMETER_STORE_ARN=<your-role-arn>
PARAMETER_STORE_REGION=<your-aws-region>
Step 3: Deploy to Astro
Run the following command to deploy the contents of your .env file to your Deployment as environment variables:
astro deployment variable create --deployment-id <your-deployment-id> --load --env .env
Run the following command to deploy your Astro project and implement your Dockerfile changes:
astro deploy
The Dockerfile contains the core configuration for your secrets backend. Because your AWS IAM role and region are set as Astro environment variables, you can now configure multiple roles and environments to use the same secrets backend without needing to redeploy changes to your Dockerfile.
To further customize the Airflow and AWS SSM Parameter Store integration, see the full list of available kwargs.
This topic provides setup steps for configuring Google Cloud Secret Manager as a secrets backend on Astro.
Prerequisites
- A Deployment.
- The Astro CLI.
- An Astro project.
- The Google Cloud SDK (gcloud CLI).
- A Google Cloud environment with Secret Manager configured.
- A service account with the Secret Manager Secret Accessor role on Google Cloud.
- A JSON service account key for the service account.
Step 1: Create an Airflow variable or connection in Google Cloud Secret Manager
To start, create an Airflow variable or connection in Google Cloud Secret Manager that you want to store as a secret. You can do so via the Cloud Console or the gcloud CLI.
You will use this secret to test your backend's functionality in Step 3, so it can be either a real or test value.
Secrets must be formatted such that:
- Airflow variables are set as airflow-variables-<variable-key>.
- Airflow connections are set as airflow-connections-<connection-id>.
For example, to add an Airflow variable with a key my-secret-variable, you would run the following gcloud CLI command:
gcloud secrets create airflow-variables-my-secret-variable \
    --replication-policy="automatic"
For more information on creating secrets in Google Cloud Secret Manager, read the Google Cloud documentation.
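If you'd rather create the secret programmatically, the following sketch uses the google-cloud-secret-manager Python client with hypothetical values; it assumes that library is installed, that application default credentials are configured, and that <your-project-id> is replaced with your Google Cloud project ID. Note that the secret's value is stored by adding a version:
from google.cloud import secretmanager
client = secretmanager.SecretManagerServiceClient()
parent = "projects/<your-project-id>"
# Create the secret container, named with the airflow-variables prefix.
secret = client.create_secret(
    request={
        "parent": parent,
        "secret_id": "airflow-variables-my-secret-variable",
        "secret": {"replication": {"automatic": {}}},
    }
)
# Add the variable's value as the first version of the secret.
client.add_secret_version(
    request={"parent": secret.name, "payload": {"data": b"my-test-value"}}
)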
Step 2: Set up Secret Manager locally
To test Google Secret Manager locally, configure it as a secrets backend in your Astro project.
First, install the Airflow provider for Google by adding the following to your project's requirements.txt file:
apache-airflow-providers-google
Then, add the following environment variables to your project's Dockerfile:
ENV AIRFLOW__SECRETS__BACKEND=airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend
ENV AIRFLOW__SECRETS__BACKEND_KWARGS={"connections_prefix": "airflow-connections", "variables_prefix": "airflow-variables", "gcp_keyfile_dict": <your-key-file>}
Make sure to paste your entire JSON service account key in place of <your-key-file>. In the next step, you'll test that this configuration is valid locally.
If you want to deploy your project to a hosted Git repository before deploying to Astro, be sure to save <your-key-file> securely. Astronomer recommends adding it to your project's .env file and specifying this file in .gitignore. When you deploy to Astro, you should set this value as a secret via the Cloud UI.
Step 3: Run an example DAG to test Secret Manager locally
Write a test DAG which calls the secret you created in Step 1 and add this DAG to your project's dags directory. For example, you can use the following DAG to print the value of a variable to your task logs:
from airflow import DAG
from airflow.hooks.base import BaseHook
from airflow.models import Variable
from airflow.operators.python import PythonOperator
from datetime import datetime
def print_var():
my_var = Variable.get("<your-variable-key>")
print(f'My variable is: {my_var}')
conn = BaseHook.get_connection(conn_id="<your-connection-key>")
print(conn.get_uri())
with DAG('example_secrets_dag', start_date=datetime(2022, 1, 1), schedule_interval=None) as dag:
test_task = PythonOperator(
task_id='test-task',
python_callable=print_var
)
To test your changes:
- Run astro dev restart to push your changes to your local Airflow environment.
- In the Airflow UI (http://localhost:8080/), trigger your new DAG.
- Click on test-task > View Logs. If you ran the example DAG above, you should see the contents of your secret in the task logs:
{logging_mixin.py:109} INFO - My variable is: my-test-variable
Once you confirm that the setup was successful, you can delete this DAG.
Step 4: Deploy to Astro
Once you've confirmed that the integration with Google Cloud Secret Manager works locally, you can complete a similar setup with a Deployment on Astro.
- In the Cloud UI, add the same environment variables found in your Dockerfile to your Deployment environment variables. Specify AIRFLOW__SECRETS__BACKEND_KWARGS as Secret to ensure that your credentials are stored securely.
- In your Astro project, delete the environment variables from your Dockerfile.
- Deploy your changes to Astro.
You should now see your secret information being pulled from Secret Manager on Astro. From here, you can store any Airflow variables or connections as secrets on Secret Manager and use them in your project.
This topic provides setup steps for configuring Azure Key Vault as a secrets backend on Astro.
Prerequisites
- A Deployment.
- The Astro CLI.
- An Astro project.
- An existing Azure Key Vault linked to a resource group.
- Your Key Vault URL. To find this, go to your Key Vault overview page > Vault URI.
If you do not already have Key Vault configured, read Microsoft Azure documentation.
Step 1: Register Astro as an app on Azure
Follow the Microsoft Azure documentation to register a new application for Astro.
At a minimum, you need to add a client secret that Astro can use to authenticate to Key Vault.
Note the value of the application's client ID and secret for Step 3.
Step 2: Create an access policy
Follow the Microsoft documentation to create a new access policy for the application that you just registered. The settings you need to configure for your policy are:
- Configure from template: Select Key, Secret, & Certificate Management.
- Select principal: Select the name of the application that you registered in Step 1.
Step 3: Set up Key Vault locally
In your Astro project, add the following line to your requirements.txt file:
apache-airflow-providers-microsoft-azure
In your Dockerfile, add the following environment variables with your own values:
ENV AZURE_CLIENT_ID="<your-client-id>" # Found on App Registration page > 'Application (Client) ID'
ENV AZURE_TENANT_ID="<your-tenant-id>" # Found on App Registration page > 'Directory (tenant) ID'
ENV AZURE_CLIENT_SECRET="<your-client-secret>" # Found on App Registration Page > Certificates and Secrets > Client Secrets > 'Value'
ENV AIRFLOW__SECRETS__BACKEND=airflow.providers.microsoft.azure.secrets.azure_key_vault.AzureKeyVaultBackend
ENV AIRFLOW__SECRETS__BACKEND_KWARGS={"connections_prefix": "airflow-connections", "variables_prefix": "airflow-variables", "vault_url": "<your-vault-url>"}
This tells Airflow to look up secrets in your Key Vault using the airflow-variables prefix for variables and the airflow-connections prefix for connections. In the next step, you'll run an example DAG to test this configuration locally.
By default, this setup requires that you prefix any secret names in Key Vault with airflow-connections or airflow-variables. If you don't want to use prefixes in your Key Vault secret names, set the values for sep, "connections_prefix", and "variables_prefix" to "" within AIRFLOW__SECRETS__BACKEND_KWARGS.
If you want to deploy your project to a hosted Git repository before deploying to Astro, be sure to save <your-client-id>, <your-tenant-id>, and <your-client-secret> in a secure manner. When you deploy to Astro, you should set these values as secrets via the Cloud UI.
Step 4: Test Key Vault locally
To test your Key Vault setup on Astro locally, create a new secret in Key Vault containing either a variable or a connection.
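For example, the following sketch creates a test variable secret with the azure-keyvault-secrets Python library; it assumes that library and azure-identity are installed, that you're logged in locally with credentials allowed to set secrets in the vault, and that the vault URL placeholder is replaced with your own:
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
client = SecretClient(
    vault_url="https://<your-key-vault-name>.vault.azure.net/",
    credential=DefaultAzureCredential(),
)
# Stored as airflow-variables-my-test-variable; a DAG retrieves it with Variable.get("my-test-variable").
client.set_secret("airflow-variables-my-test-variable", "my-test-value")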
Once you create a test secret, write a simple DAG which calls the secret and add this DAG to your project's dags directory. For example, you can use the following DAG to print the value of a variable to your task logs:
from airflow import DAG
from airflow.hooks.base import BaseHook
from airflow.models import Variable
from airflow.operators.python import PythonOperator
from datetime import datetime
def print_var():
my_var = Variable.get("<your-variable-key>")
print(f'My variable is: {my_var}')
conn = BaseHook.get_connection(conn_id="<your-connection-key>")
print(conn.get_uri())
with DAG('example_secrets_dag', start_date=datetime(2022, 1, 1), schedule_interval=None) as dag:
test_task = PythonOperator(
task_id='test-task',
python_callable=print_var
)
To test your changes:
- Run astro dev restart to push your changes to your local Airflow environment.
- In the Airflow UI (http://localhost:8080/), trigger your new DAG.
- Click on test-task > View Logs. If you ran the example DAG above, you should see the contents of your secret in the task logs:
{logging_mixin.py:109} INFO - My variable is: my-test-variable
Once you confirm that the setup was successful, you can delete this DAG.
Step 5: Push changes to Astro
Once you've confirmed that your secrets are being imported correctly to your local environment, you're ready to configure the same feature in a Deployment on Astro.
- In the Cloud UI, add the same environment variables found in your Dockerfile to your Deployment environment variables. Specify the AZURE_CLIENT_ID, AZURE_TENANT_ID, and AZURE_CLIENT_SECRET variables as Secret to ensure that your credentials are stored securely.
- In your Astro project, delete the environment variables from your Dockerfile.
- Deploy your changes to Astro.
From here, you can store any Airflow variables or connections as secrets on Key Vault and use them in your project.