Test and troubleshoot locally
As you develop data pipelines on Astro, Astronomer recommends running and testing your DAGs locally before deploying your project to a Deployment on Astro. This document provides information about testing and troubleshooting DAGs in a local Apache Airflow environment with the Astro CLI.
For information about creating an Astro project, see Create an Astro project. For information about adding DAGs to your Astro project and applying changes, see Develop a project.
Run a DAG with astro run
Use the astro run command to run a DAG from the command line. When you run the command, the CLI compiles your DAG and runs it in a single Airflow worker container based on your Astro project configurations, including your
Dockerfile, DAG utility files, Python requirements, and environment variables. You can review task logs and whether a task succeeded or failed in your terminal without opening the Airflow UI. You can only run one DAG at a time. Running DAGs without a scheduler or webserver can help reduce the time required to develop and test data pipelines.
To run a DAG located in your local /dags directory, run:
astro run <dag-id>
All the tasks in your DAG run sequentially. Any errors produced by your code while parsing or running your DAG appear in the command line. For more information about this command, see the CLI command reference.
Test DAGs with the Astro CLI
To enhance the development experience for data pipelines, Astro enables users to run DAG unit tests with two different Astro CLI commands:
astro dev parse
astro dev pytest
To quickly parse your DAGs, you can run:
astro dev parse
This command parses your DAGs to ensure that they don't contain any basic syntax or import errors and that they can successfully render in the Airflow UI.
astro dev parse is a more convenient but less customizable version of
astro dev pytest. If you don't have any specific test files that you want to run on your DAGs, Astronomer recommends using
astro dev parse as your primary testing tool. For more information about this command, see the CLI command reference.
Run tests with pytest
To perform unit tests on your Astro project, you can run:
astro dev pytest
This command runs all tests in your project's
tests directory with pytest, a testing framework for Python. With pytest, you can test custom Python code and operators locally without having to start a local Airflow environment.
By default, the
tests directory in your Astro project includes a default DAG integrity test called
test_dag_integrity.py. This test checks that:
- All Airflow tasks have required arguments.
- DAG IDs are unique across the Astro project.
- DAGs have no cycles.
- There are no general import or syntax errors.
astro dev pytest runs this default test alongside any other custom tests that you add to the
tests directory. For more information about this command, see the CLI command reference.
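As a minimal sketch of a custom test that astro dev pytest could collect from the tests directory, the following example tests a small helper function. The file name tests/test_table_names.py and the helper normalize_table_name are hypothetical and exist only for illustration. In a real project, the helper would usually be imported from your own module instead of being defined in the test file.

```python
# tests/test_table_names.py (hypothetical file name)
import re


def normalize_table_name(raw_name: str) -> str:
    """Hypothetical helper: lowercase a name and replace runs of other characters with "_"."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", raw_name.strip())
    return cleaned.strip("_").lower()


def test_normalize_table_name_replaces_separators():
    # pytest collects functions whose names start with "test_".
    assert normalize_table_name(" Daily Sales-2023 ") == "daily_sales_2023"


def test_normalize_table_name_is_idempotent():
    # Normalizing an already normalized name should not change it.
    once = normalize_table_name("Raw.Events")
    assert normalize_table_name(once) == once
```

Because this test doesn't import Airflow, it runs quickly and fails only when the helper's behavior changes.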
View Airflow logs
You can use the Astro CLI to view logs for Airflow tasks and components from your local Airflow environment. This is useful if you want to troubleshoot a specific task instance, or if your environment suddenly stops working after a code change.
See View logs.
Run Airflow CLI commands
To run Apache Airflow CLI commands locally, run the following:
astro dev run <airflow-cli-command>
For example, the Airflow CLI command for listing connections is
airflow connections list. To run this command with the Astro CLI, you would run
astro dev run connections list instead.
astro dev run with the Astro CLI is the equivalent of running
docker exec in local containers and then running an Airflow CLI command within those containers.
You can only use
astro dev run in a local Airflow environment. To automate Airflow actions on Astro, you can use the Airflow REST API. For example, you can make a request to the
dagRuns endpoint to trigger a DAG run programmatically, which is equivalent to running
airflow dags trigger in the Airflow CLI.
Make requests to the Airflow REST API locally
Make requests to the Airflow REST API in a local Airflow environment with HTTP basic access authentication. This can be useful for testing and troubleshooting API calls before executing them in a Deployment on Astro.
To make local requests with cURL or Python, you only need the username and password for your local user. Both of these values are
admin by default. They are the same credentials that are listed when you run
astro dev start with the Astro CLI and required by the Airflow UI in a local environment.
To make requests to the Airflow REST API in a Deployment on Astro, see Airflow API.
curl -X GET localhost:8080/api/v1/<endpoint> --user "admin:admin"
import requests

response = requests.get(
    "http://localhost:8080/api/v1/<endpoint>",
    auth=("admin", "admin"),
)
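If you prefer not to install an extra HTTP library, a local request can also be built with only the Python standard library. The following is a sketch under a few assumptions: the helper name build_local_api_request is hypothetical, the base URL and credentials are the local defaults described above, and dags is used only as an example endpoint.

```python
import base64
import urllib.request


def build_local_api_request(
    endpoint: str,
    username: str = "admin",
    password: str = "admin",
    base_url: str = "http://localhost:8080/api/v1",
) -> urllib.request.Request:
    """Build a GET request for the local Airflow REST API that uses HTTP basic auth."""
    # HTTP basic auth sends "username:password" encoded as base64.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(f"{base_url}/{endpoint.lstrip('/')}", method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Example usage (requires a local environment started with `astro dev start`):
#   with urllib.request.urlopen(build_local_api_request("dags"), timeout=10) as response:
#       print(response.status)
```

Building the request separately from sending it lets you inspect the URL and headers before the call reaches your local webserver.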
Troubleshoot KubernetesPodOperator issues
View local Kubernetes logs to troubleshoot issues with Pods that are created by the operator. See Test and Troubleshoot the KubernetesPodOperator Locally.
Hard reset your local environment
In most cases, restarting your local project is sufficient for testing and making changes to your project. However, it is sometimes necessary to kill your Docker containers and metadata database for testing purposes. To do so, run the following command:
astro dev kill
This command forces your running containers to stop and deletes all data associated with your local Postgres metadata database, including Airflow connections, logs, and task history.
Troubleshoot dependency errors
When dependency errors occur, the error message that is returned often doesn't contain enough information to help you resolve the error. To retrieve additional error information, you can review individual operating system or Python package dependencies inside your local Docker containers.
For example, if your packages.txt file contains the libsasl2-dev package and you receive build errors after running astro dev start, you can enter the container and install the package manually to review additional information about the errors.
1. Open the requirements.txt and packages.txt files for your project and remove the references to the packages that are returning error messages.
2. Run the following command to build your Astro project into a Docker image and start a local Docker container for each Airflow component:
astro dev start
3. Run the following command to open a bash terminal in your scheduler container:
astro dev bash --scheduler
4. In the bash terminal for your container, run the following command to install a package and review any error messages that are returned:
apt-get install <package-name>
For example, to install the GNU Compiler Collection (GCC) compiler, you would run:
apt-get install gcc
5. Open the requirements.txt and packages.txt files for your project and add the package references you removed in step 1 one by one until you find the package that is the source of the error.
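The "add packages back one by one" approach can be sketched as a small loop. This is only an illustration of the logic, not an Astro CLI feature: find_breaking_package is a hypothetical helper, and build_succeeds is a hypothetical callback that stands in for the manual work of editing your package files, restarting the project, and checking whether the build fails.

```python
from typing import Callable, List, Optional, Sequence


def find_breaking_package(
    packages: Sequence[str],
    build_succeeds: Callable[[List[str]], bool],
) -> Optional[str]:
    """Re-add packages one at a time and return the first one whose addition breaks the build."""
    restored: List[str] = []
    for package in packages:
        candidate = restored + [package]
        if not build_succeeds(candidate):
            # The build worked before this package was added, so it is the likely culprit.
            return package
        restored = candidate
    return None  # every package could be restored without a failure
```

Adding packages back in their original order keeps each build as close as possible to the previous one, so a failure points at a single package.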
Override the CLI Docker Compose file
The Astro CLI uses a default set of Docker Compose configurations to define and run local Airflow components. For advanced testing cases, you might need to override these default configurations. For example:
- Add extra containers to mimic services that your Airflow environment needs to interact with locally, such as an SFTP server.
- Change the volumes mounted to any of your local containers.
The Astro CLI does not support overrides to environment variables that are required globally. For the list of environment variables that Astro enforces, see Global environment variables. To learn more about environment variables, read Environment variables.
- Reference the Astro CLI's default Docker Compose file (
composeyml.yml) and determine one or more configurations to override.
- Add a docker-compose.override.yml file to your Astro project.
- Specify your new configuration values in the docker-compose.override.yml file, using the same format as in composeyml.yml.
For example, you can add another volume mount for a directory named custom_dependencies by declaring it in your docker-compose.override.yml file.
Make sure to specify
version: "3.1" and follow the format of the source code file linked above.
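As a rough sketch of what the volume mount example might look like, the following Python snippet renders the text of an override file. Several details are assumptions made for illustration and should be checked against composeyml.yml: the scheduler service name, the read-only (ro) mount mode, and both directory paths passed to the hypothetical render_override helper.

```python
OVERRIDE_TEMPLATE = """\
version: "3.1"
services:
  scheduler:
    volumes:
      - {host_dir}:{container_dir}:ro
"""


def render_override(host_dir: str, container_dir: str) -> str:
    """Return override file text that mounts host_dir read-only in the scheduler service."""
    return OVERRIDE_TEMPLATE.format(host_dir=host_dir, container_dir=container_dir)


# Both paths are hypothetical placeholders; replace them with your own directories.
print(
    render_override(
        "/home/user/my_astro_project/custom_dependencies",
        "/usr/local/airflow/custom_dependencies",
    )
)
```

You could save the printed text as docker-compose.override.yml in your Astro project and then restart your local environment to apply it.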
To confirm that your override was applied in your local Airflow environment, run the following command to list the files in your scheduler container:
astro dev bash --scheduler "ls -al"
Troubleshoot common issues
Use the information provided here to resolve common issues with running an Astro project in a local environment.
New DAGs aren't visible in the Airflow UI
Make sure that no DAGs have duplicate
dag_ids. When two DAGs use the same
dag_id, the newest DAG won't appear in the Airflow UI and you won't receive an error message.
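Because duplicate IDs fail silently, a quick static scan can help. The following sketch uses Python's ast module to collect literal dag_id keyword arguments from the files in a directory. It is a heuristic with hypothetical helper names: it doesn't detect IDs that are passed positionally, generated dynamically, or derived from a decorated function's name.

```python
import ast
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def collect_dag_ids(dags_dir: str) -> Dict[str, List[str]]:
    """Map each literal dag_id keyword found under dags_dir to the files that declare it."""
    found: Dict[str, List[str]] = defaultdict(list)
    for path in sorted(Path(dags_dir).rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except SyntaxError:
            continue  # files with syntax errors are skipped by this scan
        for node in ast.walk(tree):
            if not isinstance(node, ast.Call):
                continue
            for keyword in node.keywords:
                value = keyword.value
                if (
                    keyword.arg == "dag_id"
                    and isinstance(value, ast.Constant)
                    and isinstance(value.value, str)
                ):
                    found[value.value].append(str(path))
    return found


def find_duplicate_dag_ids(dags_dir: str) -> Dict[str, List[str]]:
    """Return only the dag_ids that are declared more than once."""
    return {
        dag_id: files
        for dag_id, files in collect_dag_ids(dags_dir).items()
        if len(files) > 1
    }
```

For example, calling find_duplicate_dag_ids("dags") from the root of your project returns an empty dictionary when every literal dag_id is unique.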
By default, the Airflow scheduler scans the
dags directory of your Astro project for new files every 300 seconds (5 minutes). For this reason, it might take a few minutes for new DAGs to appear in the Airflow UI. Changes to existing DAGs appear immediately.
To have the scheduler check for new DAGs more frequently, you can set the AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL environment variable to less than 300 seconds. If you have fewer than 200 DAGs in a Deployment, it's safe to set this value to 30 (30 seconds). See Environment variables.
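The variable name follows Airflow's AIRFLOW__{SECTION}__{KEY} naming pattern for configuration overrides. The following sketch shows how the name is derived and formats a NAME=value line that you could paste into an environment file. The helper names are hypothetical.

```python
def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name that overrides an Airflow [section] key setting."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


def env_file_line(section: str, key: str, value: object) -> str:
    """Format a NAME=value line suitable for an environment file."""
    return f"{airflow_env_var(section, key)}={value}"


# Prints: AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL=30
print(env_file_line("scheduler", "dag_dir_list_interval", 30))
```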
In Astro Runtime 7.0 and later, the Airflow UI Code page includes a Parsed at value which shows when a DAG was last parsed. This value can help you determine when a DAG was last rendered in the Airflow UI. To view the Parsed at value in the Airflow UI, click DAGs, select a DAG, and then click Code. The Parsed at value appears at the top of the DAG code pane.
DAGs are running slowly
If your Astro project contains many DAGs or tasks, then you might experience performance issues in your local Airflow environment.
To improve the performance of your environment, you can:
- Adjust CPU and memory resource allocation in your Docker configuration. Be aware that increasing Docker resource allocation might decrease the performance of your computer.
- Modify Airflow-level environment variables, including concurrency and parallelism. See Scaling out Airflow.
Generating DAGs dynamically can also decrease the performance of your local Airflow environment, though it's a common authoring pattern for advanced use cases. For more information, see Dynamically Generating DAGs in Airflow. If your DAGs continue to run slowly and you can't scale Docker or Airflow any further, Astronomer recommends pushing your project to a Deployment on Astro that's dedicated to testing.
If you don't have enough Docker resources allocated to your local Airflow environment, you might see tasks fail and exit with this error:
Task exited with return code Negsignal.SIGKILL
If you see this error, increase the CPU and memory allocated to Docker. If you're using Docker Desktop, you can do this by opening Docker Desktop and going to Preferences > Resources > Advanced. See Change Docker Desktop preferences on Mac.
Astro project won't load after astro dev start
If you're running the Astro CLI on a Mac computer that's built with the Apple M1 chip, your Astro project might take more than 5 minutes to start after running
astro dev start. This is a current limitation of Astro Runtime and the Astro CLI.
If your project won't load, it might also be because your webserver or scheduler is unhealthy. In this case, you might need to debug your containers.
1. After running astro dev start, retrieve a list of running containers by running astro dev ps.
2. If the webserver and scheduler containers exist but are unhealthy, check their logs by running:
astro dev logs --webserver
astro dev logs --scheduler
3. Optional. Run the following command to prune volumes and free disk space:
docker system prune --volumes
These logs should help you understand why your webserver or scheduler is unhealthy. Possible reasons why these containers might be unhealthy include:
- Not enough Docker resources.
- A failed Airflow or Astro Runtime version upgrade.
- Misconfigured Dockerfile or Docker override file.
- Misconfigured Airflow settings.
Ports are not available
By default, the Astro CLI uses port
8080 for the Airflow webserver and port
5432 for the Airflow metadata database in a local Airflow environment. If these ports are already in use on your local computer, an error message similar to the following appears:
Error: error building, (re)creating or starting project containers: Error response from daemon: Ports are not available: exposing port TCP 0.0.0.0:5432 → 0.0.0.0:0: listen tcp 0.0.0.0:5432: bind: address already in use
To resolve a port availability error, you have the following options:
- Stop all running Docker containers and restart your local environment.
- Change the default ports for these components.
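Before choosing an option, it can help to confirm which of the default ports is actually busy. The following sketch uses a plain TCP connection attempt from the Python standard library. The helper names are hypothetical, and the check only detects services that accept connections on 127.0.0.1, so treat a negative result as a hint and not as proof that a port is free.

```python
import socket
from typing import Iterable, List


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return probe.connect_ex((host, port)) == 0


def busy_ports(ports: Iterable[int] = (8080, 5432)) -> List[int]:
    """Return the subset of the given ports that are already in use."""
    return [port for port in ports if port_in_use(port)]
```

Calling busy_ports() with no arguments checks the default webserver and metadata database ports described above.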
Stop all running Docker containers
1. Run docker ps to identify the Docker containers running on your computer.
2. Copy the values in the CONTAINER ID column.
3. Select one of the following options:
- Run docker stop <container_id> to stop a specific Docker container. Replace <container_id> with one of the values you copied in step 2.
- Run docker stop $(docker ps -q) to stop all running Docker containers.
Change the default port assignment
If port 8080 or 5432 is in use on your machine by another service, the Airflow webserver or metadata database won't be able to start. To run these components on different ports, run the following commands in your Astro project:
astro config set webserver.port <available-port>
astro config set postgres.port <available-port>
For example, to use 8081 for your webserver port and 5435 for your database port, you would run the following commands:
astro config set webserver.port 8081
astro config set postgres.port 5435