Connect Astro to AWS data sources

Use the information provided here to learn how to securely connect your Astro data plane to your existing AWS instance. A connection to AWS allows Astro to access data stored on your AWS instance and is a necessary step for running pipelines in a production environment.

Connection options

The connection option that you choose is determined by the requirements of your organization and your existing infrastructure. You can choose a straightforward implementation, or a more complex implementation that provides enhanced data security. Astronomer recommends that you review all of the available connection options before selecting one for your organization.

Publicly accessible endpoints allow you to quickly connect Astro to AWS. To configure these endpoints, you can use one of the following methods:

When you use publicly accessible endpoints to connect Astro and AWS, traffic moves directly between your Astro data plane and the AWS API endpoint. Data in this traffic never reaches the control plane, which is managed by Astronomer.

Authorization options

Authorization is the process of verifying a user or service's permissions before allowing them access to organizational applications and resources. Astro clusters must be authorized to access external resources from your cloud. The authorization option that you choose is determined by the requirements of your organization and your existing infrastructure. Astronomer recommends that you review all of the available authorization options before selecting one for your organization.

To grant an Astro cluster access to a service that is running in an AWS account not managed by Astronomer, use AWS IAM roles. IAM roles on AWS are often used to manage the level of access that a specific user, object, or group of users has to a resource, such as an Amazon S3 bucket, Redshift instance, or secrets backend.

  1. In the Cloud UI, click Clusters and then copy the value displayed in the Cluster ID column for the Astro cluster that needs access to AWS service resources.

  2. Create an IAM role in the AWS account that contains your AWS service. See Creating a role to delegate permissions to an AWS service.

  3. In the AWS Management Console, go to the Identity and Access Management (IAM) dashboard.

  4. Click Roles and in the Role name column, select the role you created in step 2.

  5. Click the Trust relationships tab.

  6. Click Edit trust policy and update the arn value:

        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "AWS": [
                            "arn:aws:iam::<dataplane-AWS-account-ID>:role/AirflowLogsS3-<cluster-ID>"
                        ]
                    },
                    "Action": "sts:AssumeRole"
                }
            ]
        }

    To locate your <dataplane-AWS-account-ID> and <cluster-ID>, click Clusters in the Cloud UI. The <dataplane-AWS-account-ID> is located in the Account ID column and the <cluster-ID> is located in the ID column.

    The Astro cluster data plane account includes the AirflowLogsS3-<cluster-ID> role. When you configure an Airflow connection to AWS for a Deployment, specify this role.

  7. Click Update policy.

  8. In the Airflow UI or as an environment variable on Astro, create an Airflow connection to AWS for each Deployment that requires the resources you connected. See Managing connections to Apache Airflow.

  9. Optional. Repeat these steps for each Astro cluster that requires access to external data services on AWS.
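
If you repeat step 6 for several clusters, it can help to generate the trust policy rather than edit it by hand. The following is a minimal sketch, not an Astronomer-provided tool: the function name `build_trust_policy`, the `role_prefix` parameter, and the example account and cluster IDs are all hypothetical. The default role prefix follows the AirflowLogsS3 role name described in step 6, so confirm the exact role name in your data plane account before using the output.

```python
import json
import re


def build_trust_policy(dataplane_account_id: str, cluster_id: str,
                       role_prefix: str = "AirflowLogsS3") -> dict:
    """Build a trust policy that allows the cluster's data plane role to assume an IAM role.

    role_prefix is an assumption taken from the role name in step 6; verify it
    against the role that exists in your Astro data plane account.
    """
    # AWS account IDs are always 12-digit numbers.
    if not re.fullmatch(r"[0-9]{12}", dataplane_account_id):
        raise ValueError("dataplane_account_id must be a 12-digit AWS account ID")
    if not cluster_id:
        raise ValueError("cluster_id must not be empty")

    principal_arn = (
        f"arn:aws:iam::{dataplane_account_id}:role/{role_prefix}-{cluster_id}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": [principal_arn]},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values; copy the real ones from the Clusters page in the Cloud UI.
    policy = build_trust_policy("123456789012", "example-cluster-id")
    print(json.dumps(policy, indent=4))
```

Paste the printed JSON into the Edit trust policy editor in step 6. Serializing with `json.dumps` guarantees valid JSON, which avoids syntax errors such as trailing commas that the IAM console rejects.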