Set up Astro alerts
Astro alerts provide an additional level of observability to Airflow's notification systems. You can configure an alert to notify you in Slack, PagerDuty, or through email when a DAG run completes or fails, or when a task's duration exceeds a specified time.
Unlike Airflow callbacks and SLAs, Astro alerts require no changes to DAG code. Follow this guide to set up your Slack, PagerDuty, or email to receive alerts from Astro and then configure your Deployment to send alerts.
To configure Airflow notifications, see Airflow email notifications and Manage Airflow DAG notifications.
Alert types
Each Astro alert has a communication channel and a trigger type. The communication channel determines the format and destination of an alert, and the trigger type defines what causes the alert to trigger.
Trigger types
You can trigger an alert to a communication channel using one of the following trigger types:
- DAG failure: The alert triggers whenever the specified DAG fails.
- DAG success: The alert triggers whenever the specified DAG completes successfully.
- Task duration: The alert triggers when a specified task takes longer than expected to complete.
- Absolute Time: The alert triggers when a given DAG does not have a successful DAG run within a defined time window.
You can only set a task duration trigger for an individual task. Alerting on task group duration is not supported.
Communication channels
You can send Astro alerts to the following communication channels:
- Slack
- PagerDuty
- Email
- DAG Trigger
The DAG Trigger communication channel works differently from other communication channel types. Instead of sending a pre-formatted alert message, Astro makes a generic request through the Airflow REST API to trigger a DAG on Astro. You can configure the triggered DAG to complete any action, such as sending a message to your own incident management system or writing data about an incident to a table.
Prerequisites
- An Astro project.
- An Astro Deployment. Your Deployment must run Astro Runtime 7.1.0 or later to configure Astro alerts, and it must also have OpenLineage enabled.
- A Slack workspace, PagerDuty service, or email address.
Astro alerts require OpenLineage. By default, every Astro Deployment has OpenLineage enabled. If you disabled OpenLineage in your Deployment, you need to re-enable it to use Astro alerts. See Disable OpenLineage to learn how to disable and re-enable OpenLineage.
Step 1: Configure your communication channel
Follow the setup instructions for the communication channel that you want to use: Slack, PagerDuty, email, or DAG Trigger.
To set up alerts in Slack, you need to create a Slack app in your Slack workspace. After you've created your app, you can generate a webhook URL in Slack where Astro will send Astro alerts.
- Go to Slack API: Applications to create a new app in your organization's Slack workspace.
- Click From scratch when prompted to choose how you want to create your app.
- Enter a name for your app, like astro-alerts, choose the Slack workspace where you want Astro to send your alerts, and then click Create App.
  If you do not have permission to install apps into your Slack workspace, you can still create the app, but you will need to request that an administrator from your team completes the installation.
- Select Incoming webhooks.
- On the Incoming webhooks page, click the toggle to turn on Activate Incoming Webhooks. See Sending messages using Incoming Webhooks.
- In the Webhook URLs for your Workspace section, click Add new Webhook to Workspace.
  If you do not have permission to install apps in your Slack workspace, click Request to Add New Webhook to send a request to your organization administrator.
- Choose the channel where you want to send your Astro alerts and click Allow.
- After your webhook is created, copy the webhook URL from the new entry in the Webhook URLs for your Workspace table.
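Before you paste the webhook URL into Astro, you might want to confirm that it delivers messages to the right channel. The following is a minimal sketch, not part of Astro, that posts a plain text message to a Slack incoming webhook using only the Python standard library. The function names and the message text are illustrative assumptions, and the URL is whatever you copied in the last step.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying a plain text message for a Slack incoming webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_test_message(webhook_url: str) -> int:
    """Send one test message and return the HTTP status code of the response."""
    request = build_slack_request(webhook_url, "Test message for the astro-alerts webhook")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Example (not run automatically). Replace the placeholder with your own webhook URL:
# send_test_message("<your-webhook-url>")
```

If the message appears in the channel you selected, the webhook URL is ready to use in Astro.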
To set up an alert integration with PagerDuty, you need access to your organization's PagerDuty Service. PagerDuty uses the Events API v2 to create a new integration that connects your Service with Astro.
- Open your PagerDuty service and click the Integrations tab.
- Click Add an integration.
- Select Events API v2 as the Integration Type.
- On your Integrations page, open your new integration and enter an Integration Name.
- Copy the Integration Key for your new Astro alert integration.
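To check that the Integration Key works before you use it in Astro, you could send a test event to the PagerDuty Events API v2 yourself. The sketch below is an illustration under stated assumptions: the helper names, summary text, and source label are made up for this example, and the event only resembles, not reproduces, what Astro sends. Sending it opens a real incident on your service, so use a low severity and resolve the incident afterwards.

```python
import json
import urllib.request

# Public endpoint of the PagerDuty Events API v2.
PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"

# Severity values accepted by the Events API v2.
ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def build_test_event(integration_key: str, severity: str = "info") -> dict:
    """Build a trigger event. The summary and source values are hypothetical."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"severity must be one of {sorted(ALLOWED_SEVERITIES)}")
    return {
        "routing_key": integration_key,  # the Integration Key you copied
        "event_action": "trigger",
        "payload": {
            "summary": "Test event for the Astro alert integration",
            "source": "astro-alerts-setup-check",
            "severity": severity,
        },
    }


def send_test_event(integration_key: str) -> int:
    """POST the test event and return the HTTP status code of the response."""
    body = json.dumps(build_test_event(integration_key)).encode("utf-8")
    request = urllib.request.Request(
        PAGERDUTY_EVENTS_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Example (not run automatically):
# send_test_event("<your-integration-key>")
```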
No external configuration is required for the email integration. Astronomer recommends allowlisting astronomer.io with your email provider to ensure that no alerts go to your spam folder. Alerts are sent from no-reply@astronomer.io.
The DAG Trigger communication channel is in Private Preview. Reach out to your customer success manager to enable this feature.
The DAG Trigger communication channel works differently from other communication channel types. Instead of sending a pre-formatted alert message, Astro makes a generic request through the Airflow REST API to trigger a DAG on Astro. You can configure the triggered DAG to complete any action, such as sending a message to your own incident management system or writing data about an incident to a table.
- Create a DAG that you want to run when the alert is triggered. For example, you can use the following DAG to run arbitrary Python code when the alert is triggered:
import datetime
from typing import Any

from airflow.models.dag import DAG
from airflow.operators.python import PythonOperator

with DAG(
    dag_id="register_incident",
    start_date=datetime.datetime(2023, 1, 1),
    schedule=None,
):

    def _register_incident(params: dict[str, Any]):
        # Here you can run arbitrary Python code. Example DAG run conf payload:
        # {
        #   "dagName": "fail_dag",
        #   "alertType": "PIPELINE_FAILURE",
        #   "alertId": "d75e7517-88cc-4bab-b40f-660dd79df216",
        #   "message": "[Astro Alerts] Pipeline failure detected on DAG fail_dag. \\nStart time: 2023-11-17 17:32:54 UTC. \\nFailed at: 2023-11-17 17:40:10 UTC. \\nAlert notification time: 2023-11-17 17:40:10 UTC. \\nClick link to investigate in Astro UI: https://cloud.astronomer.io/clkya6zgv000401k8zafabcde/dags/clncyz42l6957401bvfuxn8zyxw/fail_dag/c6fbe201-a3f1-39ad-9c5c-817cbf99d123?utm_source=alert\"\\n"
        # }
        # Example:
        failed_dag = params["dagName"]
        print(f"Register an incident in my system for DAG {failed_dag}.")

    PythonOperator(task_id="register_incident", python_callable=_register_incident)
- Deploy the DAG to any Deployment in the Workspace where you want to create the alert. The DAG that triggers the alert and the DAG that the alert runs can be in different Deployments, but they must be deployed in the same Workspace.
- Create a Deployment API token for the Deployment where you deployed the DAG that the alert will run. Copy the token to use in the next step.
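Before you create the alert, you might want to confirm that the DAG and the Deployment API token work together by triggering the DAG manually with a sample payload. The sketch below builds a request against the Airflow stable REST API endpoint for creating a DAG run (`POST /api/v1/dags/{dag_id}/dagRuns`), which is an assumption that your Deployment runs Airflow 2. The Deployment URL, helper name, and sample values are placeholders for illustration, and the exact request that Astro sends may differ.

```python
import json
import urllib.request


def build_trigger_request(
    deployment_url: str, dag_id: str, api_token: str, conf: dict
) -> urllib.request.Request:
    """Build a request that creates a DAG run with `conf` as the DAG run conf."""
    url = f"{deployment_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns"
    body = json.dumps({"conf": conf}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sample conf that uses the same keys as the example payload in the DAG above.
# The message text is made up for this manual test.
sample_conf = {
    "dagName": "fail_dag",
    "alertType": "PIPELINE_FAILURE",
    "alertId": "d75e7517-88cc-4bab-b40f-660dd79df216",
    "message": "Manual test of the register_incident DAG.",
}

# Example (not run automatically). Replace the placeholders with your own values:
# request = build_trigger_request(
#     "<your-deployment-url>", "register_incident", "<your-api-token>", sample_conf
# )
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
```

If the request succeeds, a new run of register_incident appears in the Airflow UI, and the task log shows the DAG name taken from the payload.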
Step 2: Create your Workspace alert in the Astro UI
In the Astro UI, you can enable alerts from the Workspace Settings page.
- In the Astro UI, click Alerts.
- Click Add Alert.
- Enter your Alert Name and choose the alert type: DAG Success, DAG Failure, Task Duration, or Absolute Time.
- Choose the Communication Channels where you want to send your alert.
- Add your communication channel information.
  - Slack: Paste the Webhook URL from your Slack workspace app. If you need to find a URL for an app you've already created, go to your Slack Apps page, select your app, and then choose the Incoming Webhooks page.
  - PagerDuty: Paste the Integration Key from your PagerDuty Integration and select the Severity of the alert.
  - Email: Enter the email addresses that should receive the alert.
  - DAG Trigger: Select the Deployment where your DAG is deployed, then select the DAG. Enter the Deployment API token that you created in Step 1.
- Add DAGs or tasks that your alert applies to.
  - DAG failure: Click + DAG to choose the Deployment and the DAG that you want to send an alert about if it fails.
  - DAG success: Click + DAG and choose the Deployment and the DAG that you want to send an alert about when it completes.
  - Task duration: Click + Task and choose the Deployment, DAG, and task name. Enter the Duration that the task can run before Astro sends an alert to your communication channels.
  - Absolute Time: Click + DAG and choose the Deployment and the DAG that you want the alert to assess. Then, select the Days of Week that the alert should observe, the Verification Time when it should look for a DAG success, and the Lookback Period for how far back from the Verification Time it should look for a successful DAG run.
    For example, if an alert has a Verification Time of 3:00 PM and a Lookback Period of 60 minutes, it will trigger whenever the given DAG does not produce a successful DAG run from 2:00 to 3:00 PM. Astro applies the times you specify based on the time zone of your current web browser session, then translates them to UTC in your Airflow environment.
- (Optional) Repeat the previous step for additional DAGs or tasks that you want to alert on.
- Click Create alert.
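The Absolute Time example in this step, with a Verification Time of 3:00 PM and a Lookback Period of 60 minutes, can be worked through in a few lines of Python. This is only an illustration of the arithmetic and the UTC conversion, not Astro's implementation. The UTC-05:00 browser time zone, the function names, and the choice to treat both ends of the window as inclusive are assumptions.

```python
from datetime import datetime, timedelta, timezone


def lookback_window_utc(verification_time: datetime, lookback_minutes: int):
    """Return the (start, end) of the lookback window, both converted to UTC."""
    if verification_time.tzinfo is None:
        raise ValueError("verification_time must be timezone-aware")
    end = verification_time.astimezone(timezone.utc)
    start = end - timedelta(minutes=lookback_minutes)
    return start, end


def should_alert(successful_run_times, window) -> bool:
    """Return True if no successful run falls inside the window.

    Run times are expected to be timezone-aware datetimes.
    """
    start, end = window
    return not any(start <= run_time <= end for run_time in successful_run_times)


# Assumed browser session time zone for this example: UTC-05:00.
browser_tz = timezone(timedelta(hours=-5))
verification_time = datetime(2023, 11, 17, 15, 0, tzinfo=browser_tz)  # 3:00 PM local

window = lookback_window_utc(verification_time, lookback_minutes=60)
print(window[0].isoformat(), window[1].isoformat())
# prints: 2023-11-17T19:00:00+00:00 2023-11-17T20:00:00+00:00
```

With these assumptions, the 2:00 to 3:00 PM local window corresponds to 19:00 to 20:00 UTC, and `should_alert` returns True only when the list of successful run times has no entry in that range.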
Step 3: (Optional) Test your DAG failure alert
Astro alerts work whether your DAG run is manual or scheduled, so you can test your configured Astro alerts by failing your DAG manually.
- In the Astro UI, click DAGs.
- Choose the DAG that has your alert configured.
- Trigger a DAG run.
- Select Mark as and choose Failed to trigger an alert for a DAG failure.
- Check your Slack, PagerDuty, or email alerts for your DAG failure alert. The alert includes information about the DAG, Workspace, Deployment, and data lineage associated with the failure, as well as direct links to the Astro UI.