Get started with Airflow using the Astro CLI
One of the Astro CLI's main features is its ability to run Airflow on your local computer. After you install the Astro CLI and Docker Desktop, follow these steps to quickly build a project and run Airflow locally.
Prerequisites
- The Astro CLI
- Docker Desktop (v18.09 or higher)
Step 1: Create an Astro project
An Astro project contains the set of files necessary to run Airflow, including dedicated folders for your DAG files, plugins, and dependencies. All new Astro projects contain two example DAGs. This set of files builds a Docker image that you can both run on your local machine with Airflow and deploy to Astro.
To create an Astro project, open an empty directory and run the following command:
astro dev init
This command generates the following files in the directory:
.
├── .env # Local environment variables
├── dags # Where your DAGs go
│   ├── example-dag-basic.py # Example DAG that showcases a simple ETL data pipeline
│   └── example-dag-advanced.py # Example DAG that showcases more advanced Airflow features, such as the TaskFlow API
├── Dockerfile # For the Astro Runtime Docker image, environment variables, and overrides
├── include # For any other files you'd like to include
├── plugins # For any custom or community Airflow plugins
│   └── example-plugin.py
├── tests # For any DAG unit test files to be run with pytest
│   └── test_dag_integrity.py # Test that checks for basic errors in your DAGs
├── airflow_settings.yaml # For your Airflow connections, variables, and pools (local only)
├── packages.txt # For OS-level packages
└── requirements.txt # For Python packages
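For a sense of what the tests directory is for, the following is a minimal sketch of the kind of check test_dag_integrity.py performs. This is an illustration, not the generated file's exact contents: it loads every DAG in the project with pytest and fails if any DAG cannot be imported.

from airflow.models import DagBag

def test_dags_import_without_errors():
    # Parse all DAG files in the project's dags folder
    dag_bag = DagBag(dag_folder="dags", include_examples=False)
    # Any DAG file that fails to parse is recorded in import_errors
    assert not dag_bag.import_errors, f"DAG import failures: {dag_bag.import_errors}"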
Step 2: Run Airflow locally
Running your project locally allows you to test your DAGs before you deploy them to a production environment. While this step is not required for deploying and running your code on Astro, Astronomer recommends always using the Astro CLI to test locally before deploying.
To start running your project in a local Airflow environment, run the following command from your project directory:
astro dev start
This command builds your project and spins up four Docker containers on your machine, one for each of the following Airflow components:
- Postgres: Airflow's metadata database
- Webserver: The Airflow component responsible for rendering the Airflow UI
- Scheduler: The Airflow component responsible for monitoring and triggering tasks
- Triggerer: The Airflow component responsible for running Triggers and signaling tasks to resume when their conditions have been met. The triggerer is used exclusively for tasks that are run with deferrable operators
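For example, a task built on a deferrable operator releases its worker slot while it waits and runs in the triggerer instead. The following is a minimal sketch using the DateTimeSensorAsync sensor that ships with Airflow; the DAG and task names are arbitrary, and the snippet assumes a recent Airflow 2.x version.

from datetime import datetime

from airflow import DAG
from airflow.sensors.date_time import DateTimeSensorAsync

with DAG(
    dag_id="deferrable_example",  # arbitrary example name
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
):
    # While this sensor waits for the target time, it is handled by
    # the triggerer and does not occupy a worker slot
    DateTimeSensorAsync(
        task_id="wait_five_minutes",
        target_time="{{ macros.datetime.utcnow() + macros.timedelta(minutes=5) }}",
    )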
After your project builds successfully, open the Airflow UI in your web browser at http://localhost:8080/. The DAGs from your project's dags directory appear in the Airflow UI, including example-dag-basic, which was generated with your Astro project. To provide a basic demonstration of an ETL pipeline, this DAG creates an example JSON string, calculates a value based on the string, and prints the result of the calculation to the Airflow logs.
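As a rough illustration of what that flow looks like with the TaskFlow API, here is a simplified sketch. It is not the actual contents of example-dag-basic.py, and all names are arbitrary.

import json
from datetime import datetime

from airflow.decorators import dag, task

@dag(start_date=datetime(2023, 1, 1), schedule=None, catchup=False)
def etl_sketch():  # arbitrary DAG name for illustration
    @task
    def extract() -> dict:
        # Create an example JSON string and parse it
        return json.loads('{"a": 1, "b": 2, "c": 3}')

    @task
    def transform(data: dict) -> int:
        # Calculate a value based on the extracted data
        return sum(data.values())

    @task
    def load(total: int) -> None:
        # Print the result so it appears in the Airflow task logs
        print(f"Total value: {total}")

    load(transform(extract()))

etl_sketch()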
By default, the Astro CLI uses port 8080 for the Airflow webserver and port 5432 for the Airflow metadata database. If either of these ports is already in use on your local computer, an error message might appear. To resolve the error, see Test and troubleshoot locally.
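If you can't free up the ports, you can also configure the CLI to use different host ports before starting your project. For example, the following commands move the webserver and metadata database to other ports (the port values here are arbitrary examples):
astro config set webserver.port 8081
astro config set postgres.port 5433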
Step 3: Develop locally with the CLI
The astro dev command includes a number of other subcommands that are useful during local development.
astro dev restart
You must restart your environment to apply changes from any of the following files in your Astro project:
- packages.txt
- Dockerfile
- requirements.txt
- airflow_settings.yaml
To restart your local Airflow environment, run:
astro dev restart
This command rebuilds your image and restarts the Docker containers running on your local machine with the new image. Alternatively, you can run astro dev stop to stop your Docker containers without restarting your environment, then run astro dev start when you want to restart.
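For example, if you add a package such as pandas to requirements.txt (an arbitrary example), the new dependency is installed only after astro dev restart rebuilds your image.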
astro dev stop
Run the following command to pause all Docker containers running your local Airflow environment.
astro dev stop
Unlike astro dev kill, this command does not prune mounted volumes or delete the data associated with your local Postgres database. If you run this command, your Airflow connections and task history are preserved.
You can use this command regularly with astro dev start and astro dev restart during local development.
astro dev kill
When you want to force-stop all four Docker containers for your local Airflow environment, use the following command.
astro dev kill
astro dev kill also deletes all data associated with your local Postgres database, including Airflow connections, logs, and task history.
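A common pattern is to run astro dev kill followed by astro dev start when you want to rebuild your local environment from a clean slate with an empty metadata database.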
Next steps
After you finish getting started with the CLI, you can configure the CLI to debug your Airflow environment locally, authenticate to cloud services to test your DAGs with data stored in the cloud, or learn more about developing DAGs with Astro.