Skip to main content

Resources Required for Astro on AWS

Overview

Unless otherwise specified, new Clusters on Astro are created with a set of default AWS resources that our team has deemed appropriate for most use cases.

Read the following document for a reference of our default resources as well as supported Cluster configurations, including AWS Region and Node Instance Type.

Default Cluster Values

ResourceDescriptionQuantity / Default Size
EKS ClusterAn EKS cluster is required to run the Astro Data Plane, which hosts the resources and data required to execute Airflow tasks.1x
EC2 InstancesEC2 instances (nodes) power the system and Airflow components (Webserver, Scheduler, Workers). EC2 instances auto-scale for additional Airflow Deployments.2x m5.xlarge
RDS for PostgreSQL InstanceThe RDS instance is the primary database of the Astro Data Plane. It hosts a metadata database for each Airflow Deployment hosted on the EKS cluster.1x db.r5.large
Elastic IPsElastic IPs are required for connectivity with the Control Plane, and other public services.2x
SubnetsSubnets are provisioned in 2 different Availability Zones (AZs) for redundancy, with 1 public and 1 private subnet per AZ. Public subnets are required for the NAT and Internet gateways, while private subnets are required for EC2 nodes.2x /26 (public) and 1x /20 + 1x /21 (private)
Internet GatewayRequired for connectivity with the Control Plane and other public services.1x
NAT GatewaysNAT Gateways translate outbound traffic from private subnets to public subnets.2x
RoutesRoutes are necessary to direct network traffic from the subnets and gateways.2x
Route TablesHome for the routes.2x
VPCVirtual network for launching and hosting AWS resources.1x /19
S3S3 bucket for storage of Airflow task logs.1x
Max Node CountThe maximum number of EC2 nodes that your Astro Cluster can support at any given time. Once this limit is reached, your Cluster cannot auto-scale and worker pods may fail to schedule.20

Supported Cluster Configurations

Depending on the needs of your team, you may be interested in modifying certain configurations of a new or existing Cluster on Astro. This section provides a reference for which configuration options are supported during the install process.

To create a new Cluster on Astro with a specified configuration, see Install on AWS or Create a Cluster. For instructions on how to make a change to an existing Cluster, see Modify a Cluster.

AWS Region

Astro supports the following AWS regions:

  • af-south-1 - Africa (Cape Town)
  • ap-east-1 - Asia Pacific (Hong Kong)
  • ap-northeast-1 - Asia Pacific (Tokyo)
  • ap-northeast-2 - Asia Pacific (Seoul)
  • ap-northeast-3 - Asia Pacific (Osaka)
  • ap-southeast-1 - Asia Pacific (Singapore)
  • ap-southeast-2 - Asia Pacific (Sydney)
  • ap-south-1 - Asia Pacific (Mumbai)
  • ca-central-1 - Canada (Central)
  • me-south-1 - Middle East (Bahrain)
  • sa-east-1 - South America (São Paulo)
  • us-east-1 - US East (N. Virginia)
  • us-east-2 - US East (Ohio)
  • us-west-1 - US West (N. California)
  • us-west-2 - US West (Oregon)

Modifying the region of an existing Cluster on Astro is not supported. If you're interested in an AWS region that is not on this list, reach out to Astronomer Support.

RDS Instance Type

Every Astro Cluster on AWS is created with and requires an RDS instance. RDS serves as a primary relational database for the Data Plane and powers the metadata database of each Astro Deployment within a single Cluster. During the Cluster creation process, you'll be asked to specify an RDS instance type according to your use case and expected workload, but it can be modified at any time.

Astro supports a variety of AWS RDS instance types. Instance types comprise of varying combinations of CPU, memory, storage, and networking capacity. For detailed information on each instance type, reference AWS documentation. If you're interested in an RDS instance type that is not on this list, reach out to Astronomer Support.

db.r5

  • db.r5.large (default)
  • db.r5.xlarge
  • db.r5.2xlarge
  • db.r5.4xlarge
  • db.r5.8xlarge
  • db.r5.12xlarge
  • db.r5.16xlarge
  • db.r5.24xlarge

db.m5

  • db.m5.large
  • db.m5.xlarge
  • db.m5.2xlarge
  • db.m5.4xlarge
  • db.m5.8xlarge
  • db.m5.12xlarge
  • db.m5.16xlarge
  • db.m5.24xlarge

Node Instance Type

Astro supports a variety of AWS EC2 instance types. Instance types comprise of varying combinations of CPU, memory, storage, and networking capacity. While the resources allocated to system and Airflow components are managed by Astronomer, the node instance type you select for your Cluster powers the workers of all Deployments within that Cluster.

For detailed information on each instance type, refer to AWS documentation. If you're interested in a node type that is not on this list, reach out to Astronomer Support. Not all instance types are supported in all AWS regions.

c6i

  • c6i.xlarge
  • c6i.2xlarge
  • c6i.4xlarge
  • c6i.8xlarge
  • c6i.12xlarge
  • c6i.16xlarge
  • c6i.24xlarge
  • c6i.32xlarge
  • c6i.metal

m5

  • m5.xlarge (default)
  • m5.2xlarge
  • m5.4xlarge
  • m5.8xlarge
  • m5.12xlarge
  • m5.16xlarge
  • m5.24xlarge
  • m5.metal

m5d

  • m5d.xlarge
  • m5d.2xlarge
  • m5d.4xlarge
  • m5d.8xlarge
  • m5d.12xlarge
  • m5d.16xlarge
  • m5d.24xlarge
  • m5d.metal

m6i

  • m6i.xlarge
  • m6i.2xlarge
  • m6i.4xlarge
  • m6i.8xlarge
  • m6i.12xlarge
  • m6i.16xlarge
  • m6i.24xlarge
  • m6i.32xlarge
  • m6i.metal

r6i

  • r6i.xlarge
  • r6i.2xlarge
  • r6i.4xlarge
  • r6i.8xlarge
  • r6i.12xlarge
  • r6i.16xlarge
  • r6i.24xlarge
  • r6i.32xlarge
  • r6i.metal

t2

  • t2.xlarge

t3

  • t3.2xlarge
info

A single Cluster on Astro cannot currently be configured with more than one node instance type. In the first half of 2022, we expect to introduce support for Worker Queues, which will allow you to run workers of varying node types and sizes within a single Deployment. If this is something that your team is interested in, reach out to us. We'd love to hear from you.

Deployment Worker Size Limits

In addition to setting a node instance type for each Cluster, you can configure a unique worker size for each Deployment within a Cluster. Worker size can be specified at any time in the Worker Resources field in the Deployment view of the Cloud UI. You can select any worker size up to 400 AU (40 CPUs, 150 GiB memory) as long as the worker size is supported by the node instance type selected for the Cluster. If you attempt to provision a worker size that isn't supported by your Cluster's instance type, you will see an error in the Cloud UI.

This table lists the maximum worker size that is supported on Astro for each node instance type. These numbers may vary slightly over time.

Node Instance TypeMaximum AUCPUMemory
m5.xlarge272.7 CPUs10.1 GiB MEM
m5.2xlarge676.7 CPUs25.1 GiB MEM
m5.4xlarge14714.7 CPUs55.1 GiB MEM
m5.8xlarge30730.7 CPUs115 GiB MEM
m5.12xlarge467*46.7 CPUs175 GiB MEM
m5.16xlarge627*62.7 CPUs235 GiB MEM
m5.24xlarge947*94.7 CPUs355 Gib MEM
m5.metal947*94.7 CPUs355 Gib MEM
m5d.xlarge272.7 CPUs10.1 GiB MEM
m5d.2xlarge676.7 CPUs25.1 GiB MEM
m5d.4xlarge14714.7 CPUs55.1 GiB MEM
m5d.8xlarge30730.7 CPUs115 GiB MEM
m5d.12xlarge467*46.7 CPUs175 GiB MEM
m5d.16xlarge627*62.7 CPUs235 GiB MEM
m5d.24xlarge947*94.7 CPUs355 Gib MEM
m5d.metal947*94.7 CPUs355 Gib MEM
m6i.xlarge272.7 CPUs10.1 GiB MEM
m61.2xlarge676.7 CPUs25.1 GiB MEM
m6i.4xlarge14714.7 CPUs55.1 GiB MEM
m6i.8xlarge30730.7 CPUs115 GiB MEM
m6i.12xlarge467*46.7 CPUs175 GiB MEM
m6i.16xlarge627*62.7 CPUs235 GiB MEM
m6i.24xlarge947*94.7 CPUs355 Gib MEM
m6i.metal947*94.7 CPUs355 Gib MEM
r6i.xlarge272.7 CPUs10.1 GiB MEM
r61.2xlarge676.7 CPUs25.1 GiB MEM
r6i.4xlarge14714.7 CPUs55.1 GiB MEM
r6i.8xlarge30730.7 CPUs115 GiB MEM
r6i.12xlarge467*46.7 CPUs175 GiB MEM
r6i.16xlarge627*62.7 CPUs235 GiB MEM
r6i.24xlarge947*94.7 CPUs355 Gib MEM
r6i.metal947*94.7 CPUs355 Gib MEM
c6i.xlarge272.7 CPUs10.1 GiB MEM
c61.2xlarge676.7 CPUs25.1 GiB MEM
c6i.4xlarge14714.7 CPUs55.1 GiB MEM
c6i.8xlarge30730.7 CPUs115 GiB MEM
c6i.12xlarge467*46.7 CPUs175 GiB MEM
c6i.16xlarge627*62.7 CPUs235 GiB MEM
c6i.24xlarge947*94.7 CPUs355 Gib MEM
c6i.metal947*94.7 CPUs355 Gib MEM
c6i.xlarge272.7 CPUs10.1 GiB MEM
t3.xlarge272.7 CPUs10.1 GiB MEM
t3.2xlarge676.7 CPUs25.1 GiB MEM

The maximum supported worker size on Astro is currently 400 AU, which means that an Astro Cluster may not make full use of the CPU and memory capacity of some node instance types on this list. If your team is interested in using an instance type that supports a worker size limit higher than 400 AU, reach out to Astronomer Support. For more information on configuring worker size on Astro, see Configure a Deployment.

info

The size limits defined here currently also apply to Scheduler Resources, which determines the CPU and memory allocated to the Airflow Scheduler(s) of each Deployment. The maximum Scheduler size on Astro is 30 AU, which means there are some node instance types for which that maximum size is not supported.

For more information about the Scheduler, see Configure a Deployment.

tip

With the exception of m5d nodes, all suppported node types have a maximum of 20GB of storage per node for system use only. If you need locally attached storage for task execution, Astronomer recommends modifying your cluster to run m5d nodes, which Astronomer provisions with NVMe SSD volumes.

Astronomer plans to support optional ephemeral storage for all node instance types in the first half of 2022.

If you need to pass significant data between Airflow tasks, Astronomer recommends using an XCom backend such as AWS S3 or Google Cloud Storage (GCS). For more information and best practices, see the Airflow Guide on Passing Data Between Airflow Tasks.