Configure an external secrets backend on Astro
Apache Airflow variables and connections often contain sensitive information about your external systems and should be kept in a secure, centralized location that complies with your organization's security requirements.
While secret values of Airflow variables and connections are encrypted in the Airflow metadata database of every Deployment, Astronomer recommends integrating with a secrets backend tool. This guide explains how to configure connections to various secrets backend tools on Astro.
If you only need a local connection to your cloud for testing purposes, consider mounting your user credentials to a local Airflow environment. While this implementation is not recommended for deployed environments, it lets you quickly test pipelines with data hosted in your cloud. See Authenticate to cloud services.
Benefits
Integrating a secrets backend tool with Astro allows you to:
- Store Airflow variables and connections in a centralized location alongside secrets from other tools and systems used by your organization, including Kubernetes secrets, SSL certificates, and more.
- Comply with internal security postures and policies that protect your organization.
- Recover your Airflow variables and connections in the case of an incident.
- Automatically pull Airflow variables and connections that are already stored in your secrets backend when you create a new Deployment instead of having to set them manually in the Airflow UI.
Astro integrates with the following secrets backend tools:
- Hashicorp Vault
- AWS Systems Manager Parameter Store
- AWS Secrets Manager
- Google Cloud Secret Manager
- Azure Key Vault
Secrets backend integrations are configured individually with each Astro Deployment.
If you enable a secrets backend on Astro, you can continue to define Airflow variables and connections either as environment variables or in the Airflow UI. If you set Airflow variables and connections in the Airflow UI, they are stored as encrypted values in the Airflow metadata database.
Airflow checks for the value of an Airflow variable or connection in the following order:
- Secrets backend
- Environment variable
- The Airflow UI
Using secrets to set Airflow connections requires knowledge of how to generate Airflow connection URIs. If you plan to store Airflow connections on your secrets backend, see URI format for guidance on how to generate a connection URI.
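If you're not sure what a connection's URI should look like, one option is to build the connection object in Python and let Airflow serialize it for you. A minimal sketch, with example placeholder values:

# Sketch: generate an Airflow connection URI from its individual fields.
# All field values below are example placeholders.
from airflow.models.connection import Connection

conn = Connection(
    conn_id="my_postgres",  # hypothetical connection ID
    conn_type="postgres",
    login="my-login",
    password="my-password",
    host="my-host.example.com",
    port=5432,
    schema="my-db",
)

# Prints a URI such as postgres://my-login:my-password@my-host.example.com:5432/my-db
print(conn.get_uri())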
Setup
- AWS Secrets Manager
- Hashicorp Vault
- Google Cloud Secret Manager
- Azure Key Vault
- AWS Parameter Store
This topic provides setup steps for configuring AWS Secrets Manager as a secrets backend on Astro.
For more information about Airflow and AWS connections, see Amazon Web Services Connection.
Prerequisites
- A Deployment.
- The Astro CLI.
- An Astro project with `apache-airflow-providers-amazon` version 5.1.0 or later. See Add Python and OS-level packages.
- An IAM role with the `SecretsManagerReadWrite` policy that your Astro cluster can assume. See AWS IAM roles.
Add Airflow secrets to Secrets Manager
Create secrets in AWS Secrets Manager for the Airflow variables and connections that you want to store. You can use real or test values.
- When setting the secret type, choose **Other type of secret** and select the **Plaintext** option.
- If creating a connection URI or a non-dict variable as a secret, remove the brackets and quotation marks that are pre-populated in the plaintext field.
- The secret name is assigned after providing the plaintext value and clicking **Next**.
Secret names must correspond with the `connections_prefix` and `variables_prefix` set below in step 2. Specifically:

- If you use `"variables_prefix": "airflow/variables"`, you must set Airflow variable names as `airflow/variables/<variable-key>`. The `<variable-key>` is how you will retrieve that variable's value in a DAG. For example: `my_var = Variable.get("<variable-key>")`.
- If you use `"connections_prefix": "airflow/connections"`, you must set Airflow connections as `airflow/connections/<connection-id>`. The `<connection-id>` is how you will retrieve that connection's URI in a DAG. For example: `conn = BaseHook.get_connection(conn_id="<connection-id>")`.
- Do not include a leading `/` at the beginning of your variable or connection name.
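If you prefer to create these secrets programmatically, the following is a minimal sketch using boto3; the secret names, values, and region are example placeholders:

# Sketch: create Secrets Manager secrets for an Airflow variable and connection.
# Names follow the airflow/variables and airflow/connections prefixes from step 2.
import boto3

client = boto3.client("secretsmanager", region_name="us-east-1")  # example region

# An Airflow variable, retrievable in a DAG with Variable.get("my-var")
client.create_secret(Name="airflow/variables/my-var", SecretString="my-value")

# An Airflow connection, stored as a connection URI
client.create_secret(
    Name="airflow/connections/my-postgres",
    SecretString="postgres://my-login:my-password@my-host.example.com:5432/my-db",
)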
For more information on adding secrets to Secrets Manager, see AWS documentation.
Set up Secrets Manager locally
Add the following environment variables to your Astro project's `.env` file:
AIRFLOW__SECRETS__BACKEND=airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend
AIRFLOW__SECRETS__BACKEND_KWARGS={"connections_prefix": "airflow/connections", "variables_prefix": "airflow/variables"}
AWS_DEFAULT_REGION=<region>
AWS_ACCESS_KEY_ID=<access-key> # The user must have permission to access Secrets Manager
AWS_SECRET_ACCESS_KEY=<secret-key>
After you configure an Airflow connection to AWS, you can run a DAG locally to check that your variables are accessible using `Variable.get("<your-variable-key>")`.
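For example, the following minimal sketch of a test DAG prints a variable and a connection host resolved through the backend; replace the placeholder key and connection ID with secrets you created above:

# Sketch: a throwaway DAG that checks the secrets backend locally.
from airflow.decorators import dag, task
from airflow.hooks.base import BaseHook
from airflow.models import Variable
import pendulum

@dag(start_date=pendulum.datetime(2023, 1, 1), schedule=None, catchup=False)
def check_secrets_backend():
    @task
    def print_secrets():
        # Both lookups hit the secrets backend first. Only use test values here,
        # since this task prints the resolved variable to the logs.
        print(Variable.get("<your-variable-key>"))
        print(BaseHook.get_connection(conn_id="<your-connection-id>").host)

    print_secrets()

check_secrets_backend()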
Deploy environment variables to Astro
Run the following commands to export your secrets backend configurations as environment variables to Astro.
$ astro deployment variable create --deployment-id <your-deployment-id> AIRFLOW__SECRETS__BACKEND=airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend
$ astro deployment variable create --deployment-id <your-deployment-id> AIRFLOW__SECRETS__BACKEND_KWARGS='{"connections_prefix": "airflow/connections", "variables_prefix": "airflow/variables", "role_arn": "<your-role-arn>", "region_name": "<your-region>"}' --secret

Optional. Remove the environment variables from your `.env` file or store your `.env` file in a safe location to protect your credentials.

If you delete the `.env` file, the Secrets Manager backend won't work locally.

Open the Airflow UI for your Deployment and create an Amazon Web Services connection without credentials. When you use this connection in a DAG, Airflow automatically falls back to the credentials in your configured environment variables.
To further customize the Airflow and AWS Secrets Manager integration, see the full list of available kwargs.
This topic provides steps for using Hashicorp Vault as a secrets backend for both local development and on Astro. To do this, you will:
- Create an AppRole in Vault which grants Astro minimal required permissions.
- Write a test Airflow variable or connection as a secret to your Vault server.
- Configure your Astro project to pull the secret from Vault.
- Test the backend in a local environment.
- Deploy your changes to Astro.
Prerequisites
- A Deployment on Astro.
- The Astro CLI.
- A local or hosted Vault server. See Starting the Server or Create a Vault Cluster on HCP.
- An Astro project.
- The Vault CLI.
- Your Vault server's URL. If you're using a local server, this is `http://127.0.0.1:8200/`.
If you do not already have a Vault server deployed but would like to test this feature, Astronomer recommends that you either:
- Sign up for a Vault trial on Hashicorp Cloud Platform (HCP) or
- Deploy a local Vault server. See Starting the server in Hashicorp documentation.
Create a Policy and AppRole in Vault
To use Vault as a secrets backend, Astronomer recommends configuring a Vault AppRole with a policy that grants only the minimum necessary permissions for Astro. To do this:
Run the following command to create a Vault policy that Astro can use to access a Vault server:

vault policy write astro_policy - <<EOF
path "secret/*" {
  capabilities = ["create", "read", "update", "patch", "delete", "list"]
}
EOF

Run the following command to create a Vault AppRole:
vault auth enable approle
vault write auth/approle/role/astro_role \
role_id=astro_role \
secret_id_ttl=0 \
secret_id_num_uses=0 \
token_num_uses=0 \
token_ttl=24h \
token_max_ttl=24h \
token_policies=astro_policy

Run the following command to retrieve the `secret-id` for your AppRole:

vault write -f auth/approle/role/<your-approle>/secret-id
Save this value. You'll use this later to complete the setup.
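Optional. Before configuring Airflow, you can confirm that the role_id and secret-id authenticate successfully. A minimal sketch using the hvac Python client, assuming hvac is installed and you're running a local Vault server:

# Sketch: confirm the AppRole credentials work against your Vault server.
import hvac

client = hvac.Client(url="http://127.0.0.1:8200")  # your Vault server URL
client.auth.approle.login(role_id="astro_role", secret_id="<your-approle-secret>")
print(client.is_authenticated())  # True if the AppRole login succeeded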
Create an Airflow variable or connection in Vault
To start, create an Airflow variable or connection in Vault that you want to store as a secret. It can be either a real or test value. You will use this secret to test your backend's functionality in Step 4.
To store an Airflow variable in Vault as a secret, run the following Vault CLI command with your own values:
vault secrets enable -path=secret -version=2 kv
vault kv put secret/variables/<your-variable-key> value=<your-variable-value>
To store a connection in Vault as a secret, run the following Vault CLI command with your own values:
vault secrets enable -path=secret -version=2 kv
vault kv put secret/connections/<your-connection-id> conn_uri=<connection-type>://<connection-login>:<connection-password>@<connection-host>:5432
To confirm that your secret was written to Vault successfully, run:
# For variables
$ vault kv get secret/variables/<your-variable-key>
# For connections
$ vault kv get secret/connections/<your-connection-id>
Set up Vault locally
In your Astro project, add the Hashicorp Airflow provider by adding the following line to your `requirements.txt` file:
apache-airflow-providers-hashicorp
Then, add the following environment variables to your `.env` file:
AIRFLOW__SECRETS__BACKEND=airflow.providers.hashicorp.secrets.vault.VaultBackend
AIRFLOW__SECRETS__BACKEND_KWARGS={"connections_path": "connections", "variables_path": "variables", "config_path": null, "url": "http://host.docker.internal:8200", "auth_type": "approle", "role_id":"astro_role", "secret_id":"<your-approle-secret>"}
If you run Vault on Hashicorp Cloud Platform (HCP):

- Replace `http://host.docker.internal:8200` with `https://<your-cluster>.hashicorp.cloud:8200`.
- Add `"namespace": "admin"` as an argument after `url`.
This tells Airflow to look for variable and connection information at the `secret/variables/*` and `secret/connections/*` paths in your Vault server. You can now run a DAG locally to check that your variables are accessible using `Variable.get("<your-variable-key>")`.
For more information on the Airflow provider for Hashicorp Vault and how to further customize your integration, see the Apache Airflow documentation.
Deploy to Astro
Run the following commands to export your environment variables to Astro:
$ astro deployment variable create --deployment-id <your-deployment-id> AIRFLOW__SECRETS__BACKEND=airflow.providers.hashicorp.secrets.vault.VaultBackend
$ astro deployment variable create --deployment-id <your-deployment-id> AIRFLOW__SECRETS__BACKEND_KWARGS='{"connections_path": "connections", "variables_path": "variables", "config_path": null, "url": "http://host.docker.internal:8200", "auth_type": "approle", "role_id":"astro_role", "secret_id":"<your-approle-secret>"}' --secret

Run the following command to push your updated `requirements.txt` file to Astro:

astro deploy --deployment-id <your-deployment-id>
Optional. Remove the environment variables from your `.env` file or store your `.env` file in a safe location to protect the credentials in `AIRFLOW__SECRETS__BACKEND_KWARGS`.
Now, any Airflow variable or connection that you write to your Vault server can be accessed by any DAG in your Deployment on Astro.
In this section, you'll learn how to use AWS Systems Manager (SSM) Parameter Store as a secrets backend on Astro.
Prerequisites
- A Deployment.
- The Astro CLI.
- An Astro project with version 5.1.0 or later of `apache-airflow-providers-amazon`. See Add Python and OS-level packages.
- An IAM role with access to Parameter Store that your Astro cluster can assume. See AWS IAM roles.
Create Airflow secrets directories in Parameter Store
Create directories for Airflow variables and connections in Parameter Store that you want to store as secrets.
Variables and connections should be stored in `/airflow/variables` and `/airflow/connections`, respectively. For example, a secret variable with the key `my_secret` should be stored as `/airflow/variables/my_secret`. If you modify the directory paths, make sure you change the values for `variables_prefix` and `connections_prefix` in step 2.
For instructions, see the AWS Systems Manager Console, the AWS CLI, or the Tools for Windows PowerShell documentation.
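If you prefer to create the parameters programmatically, the following is a minimal sketch using boto3; the parameter names, values, and region are example placeholders:

# Sketch: store an Airflow variable and connection in SSM Parameter Store.
import boto3

client = boto3.client("ssm", region_name="us-east-1")  # example region

# An Airflow variable, retrievable in a DAG with Variable.get("my_secret")
client.put_parameter(
    Name="/airflow/variables/my_secret",
    Value="my-value",
    Type="SecureString",
)

# An Airflow connection, stored as a connection URI
client.put_parameter(
    Name="/airflow/connections/my-postgres",
    Value="postgres://my-login:my-password@my-host.example.com:5432/my-db",
    Type="SecureString",
)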
Set up Parameter Store locally
Add the following environment variables to your Astro project's `.env` file:
AIRFLOW__SECRETS__BACKEND=airflow.providers.amazon.aws.secrets.systems_manager.SystemsManagerParameterStoreBackend
AIRFLOW__SECRETS__BACKEND_KWARGS={"connections_prefix": "airflow/connections", "variables_prefix": "airflow/variables", "role_arn": "<your-role-arn>", "region_name": "<your-region>"}
You can now run a DAG locally to check that your variables are accessible using `Variable.get("<your-variable-key>")`.
Deploy environment variables to Astro
Run the following commands to export your secrets backend configurations as environment variables to Astro.
$ astro deployment variable create --deployment-id <your-deployment-id> AIRFLOW__SECRETS__BACKEND=airflow.providers.amazon.aws.secrets.systems_manager.SystemsManagerParameterStoreBackend
$ astro deployment variable create --deployment-id <your-deployment-id> AIRFLOW__SECRETS__BACKEND_KWARGS='{"connections_prefix": "airflow/connections", "variables_prefix": "airflow/variables", "role_arn": "<your-role-arn>", "region_name": "<your-region>"}' --secret

Optional. Remove the environment variables from your `.env` file or store your `.env` file in a safe location to protect the credentials in `AIRFLOW__SECRETS__BACKEND_KWARGS`.
This topic provides setup steps for configuring Google Cloud Secret Manager as a secrets backend on Astro.
Prerequisites
- A Deployment.
- The Astro CLI.
- An Astro project.
- Cloud SDK.
- A Google Cloud environment with Secret Manager configured.
- A service account with the Secret Manager Secret Accessor role on Google Cloud.
- Optional: A JSON service account key for the service account. This is required to provide access to a secrets backend from a local machine, or when you're not using Workload Identity.
Create an Airflow variable or connection in Google Cloud Secret Manager
To start, create an Airflow variable or connection in Google Cloud Secret Manager that you want to store as a secret. You can use the Cloud Console or the gcloud CLI.
Secrets must be formatted such that:
- Airflow variables are set as `airflow-variables-<variable-key>`.
- Airflow connections are set as `airflow-connections-<connection-id>`.
For example, to add an Airflow variable with a key `my-secret-variable`, you would run the following gcloud CLI command:

gcloud secrets create airflow-variables-my-secret-variable \
    --replication-policy="automatic"
For more information on creating secrets in Google Cloud Secret Manager, read the Google Cloud documentation.
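If you prefer to create the secret programmatically, a minimal sketch using the google-cloud-secret-manager Python client follows; the project ID and value are example placeholders:

# Sketch: create a secret for an Airflow variable and add its value.
from google.cloud import secretmanager

client = secretmanager.SecretManagerServiceClient()
project_id = "my-gcp-project"  # example project ID

# Create the secret container using the airflow-variables- prefix.
secret = client.create_secret(
    request={
        "parent": f"projects/{project_id}",
        "secret_id": "airflow-variables-my-secret-variable",
        "secret": {"replication": {"automatic": {}}},
    }
)

# Store the variable's value as the first secret version.
client.add_secret_version(
    request={"parent": secret.name, "payload": {"data": b"my-value"}}
)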
Set up GCP Secret Manager locally
Copy the complete JSON service account key for the service account that will be used to access Secret Manager.
Add the following environment variables to your Astro project's `.env` file, replacing `<your-service-account-key>` with the key you copied in step 1:

AIRFLOW__SECRETS__BACKEND=airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend
AIRFLOW__SECRETS__BACKEND_KWARGS={"connections_prefix": "airflow-connections", "variables_prefix": "airflow-variables", "gcp_keyfile_dict": <your-service-account-key>}

Optional. Run a DAG locally that calls `Variable.get("<your-variable-key>")` to confirm that your variables are accessible.
Configure Secret Manager on Astro using Workload Identity (Recommended)
Set up Workload Identity for your Airflow Deployment. See Connect Astro to GCP data sources.
Run the following commands to set the secrets backend for your Astro Deployment:
$ astro deployment variable create --deployment-id <your-deployment-id> AIRFLOW__SECRETS__BACKEND=airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend
$ astro deployment variable create --deployment-id <your-deployment-id> AIRFLOW__SECRETS__BACKEND_KWARGS='{"connections_prefix": "airflow-connections", "variables_prefix": "airflow-variables", "project_id": "<your-secret-manager-project-id>"}'

Optional. Remove the environment variables from your `.env` file or store your `.env` file in a safe location to protect the credentials in `AIRFLOW__SECRETS__BACKEND_KWARGS`.
To keep your secrets secure, the variables in your `.env` file are only available in your local environment and are not visible in the Cloud UI. See Set Environment Variables Locally.
Configure Secret Manager on Astro using a service account JSON key file
Set up the Secret Manager locally. See Set up GCP Secret Manager locally.
Run the following command to set the `SECRET_VAR_SERVICE_ACCOUNT` environment variable on your Astro Deployment:

astro deployment variable create --deployment-id <your-deployment-id> SECRET_VAR_SERVICE_ACCOUNT="<your-service-account-key>" --secret
Optional. Remove the environment variables from your `.env` file or store your `.env` file in a safe location to protect the credentials in `AIRFLOW__SECRETS__BACKEND_KWARGS`.
This topic provides setup steps for configuring Azure Key Vault as a secrets backend on Astro.
Prerequisites
- A Deployment.
- The Astro CLI.
- An Astro project.
- An existing Azure Key Vault linked to a resource group.
- Your Key Vault URL. To find this, go to your Key Vault overview page > Vault URI.
If you do not already have Key Vault configured, read Microsoft Azure documentation.
Register Astro as an app on Azure
Follow the Microsoft Azure documentation to register a new application for Astro.
At a minimum, you need to add a secret that Astro can use to authenticate to Key Vault.
Note the value of the application's client ID and secret for Step 3.
Create an access policy
Follow the Microsoft documentation to create a new access policy for the application that you just registered. The settings you need to configure for your policy are:
- Configure from template: Select **Key, Secret, & Certificate Management**.
- Select principal: Select the name of the application that you registered in Step 1.
Set up Key Vault locally
In your Astro project, add the following line to your `requirements.txt` file:
apache-airflow-providers-microsoft-azure
Add the following environment variables to your `.env` file:
AZURE_CLIENT_ID="<your-client-id>" # Found on App Registration page > 'Application (Client) ID'
AZURE_TENANT_ID="<your-tenant-id>" # Found on App Registration page > 'Directory (tenant) ID'
AZURE_CLIENT_SECRET="<your-client-secret>" # Found on App Registration Page > Certificates and Secrets > Client Secrets > 'Value'
AIRFLOW__SECRETS__BACKEND=airflow.providers.microsoft.azure.secrets.key_vault.AzureKeyVaultBackend
AIRFLOW__SECRETS__BACKEND_KWARGS={"connections_prefix": "airflow-connections", "variables_prefix": "airflow-variables", "vault_url": "<your-vault-url>"}
This tells Airflow to look for variable and connection information in your Azure Key Vault using the `airflow-variables-` and `airflow-connections-` prefixes. You can now run a DAG locally to check that your variables are accessible using `Variable.get("<your-variable-key>")`.
By default, this setup requires that you prefix any secret names in Key Vault with `airflow-connections` or `airflow-variables`. If you don't want to use prefixes in your Key Vault secret names, set the values for `sep`, `connections_prefix`, and `variables_prefix` to `""` within `AIRFLOW__SECRETS__BACKEND_KWARGS`, as shown in the example below.
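For example, a `.env` configuration without prefixes might look like the following sketch, which reuses the backend setting above with empty prefixes and separator; the vault URL is a placeholder:

AIRFLOW__SECRETS__BACKEND=airflow.providers.microsoft.azure.secrets.key_vault.AzureKeyVaultBackend
AIRFLOW__SECRETS__BACKEND_KWARGS={"connections_prefix": "", "variables_prefix": "", "sep": "", "vault_url": "<your-vault-url>"}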
Deploy to Astro
Run the following command to export your environment variables to Astro.
astro deployment variable create --deployment-id <your-deployment-id> --load --env .env
In the Cloud UI, mark `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, `AZURE_CLIENT_SECRET`, and `AIRFLOW__SECRETS__BACKEND_KWARGS` as **Secret**. See Set environment variables in the Cloud UI.

Run the following command to push your updated `requirements.txt` file to Astro:

astro deploy --deployment-id <your-deployment-id>
Optional. Remove the environment variables from your `.env` file, or store your `.env` file so that your credentials are hidden, for example with GitHub secrets.