Terraform Data Source: Usage and Real-World Example with AWS
What is Terraform?
Terraform is an open-source Infrastructure as Code (IaC) tool developed by HashiCorp that allows you to define and provision data center infrastructure using a high-level configuration language called HashiCorp Configuration Language (HCL) or optionally JSON. Terraform enables you to create, update, and version your infrastructure safely and efficiently. It can manage both existing service providers and custom in-house solutions.
What is a Data Source in Terraform?
A data source in Terraform lets you fetch and use data about objects that are not managed by the current Terraform configuration. This can include existing cloud resources, configuration data, or information from external APIs. Data sources are read-only: they do not create or modify any resources, and only provide values that the rest of your configuration can reference.
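As a minimal sketch of the general shape, assuming the AWS provider is already configured: a data source is declared with a `data` block and read through `data.<TYPE>.<NAME>.<ATTRIBUTE>`. The local names `current` and `available` and the output names are illustrative choices, not anything required by Terraform.

```hcl
# Look up the AWS account that the configured credentials belong to.
data "aws_caller_identity" "current" {}

# List the Availability Zones currently available in the provider's region.
data "aws_availability_zones" "available" {
  state = "available"
}

# Data source attributes are referenced as data.<TYPE>.<NAME>.<ATTRIBUTE>.
output "account_id" {
  value = data.aws_caller_identity.current.account_id
}

output "az_names" {
  value = data.aws_availability_zones.available.names
}
```

Neither `data` block creates or changes anything in the account. Both only read values during planning.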
When to Use Data Sources
Data sources are useful when you need to:
- Fetch information about existing resources: When you need details about resources that are created outside of Terraform or by another Terraform configuration.
- Reference dynamic data: When the data changes frequently or is managed outside of Terraform, such as fetching the latest AMI ID for EC2 instances.
- Retrieve configuration information: For instance, fetching details from an external API or a different part of your infrastructure.
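The first and third cases above might look like the following sketch. It assumes a VPC tagged `Name = "shared-vpc"` and an SSM parameter named `/example/app/db_endpoint` already exist outside this configuration. Both names, and the security group name `app-sg`, are hypothetical placeholders.

```hcl
# Existing resource: a VPC created outside this configuration,
# located by its Name tag (hypothetical value).
data "aws_vpc" "shared" {
  tags = {
    Name = "shared-vpc"
  }
}

# Configuration information: a value stored in SSM Parameter Store
# (hypothetical parameter name).
data "aws_ssm_parameter" "db_endpoint" {
  name = "/example/app/db_endpoint"
}

# A managed resource can then be attached to the existing VPC.
resource "aws_security_group" "app" {
  name   = "app-sg"
  vpc_id = data.aws_vpc.shared.id
}

# The provider marks the parameter value as sensitive,
# so the output must be marked sensitive as well.
output "db_endpoint" {
  value     = data.aws_ssm_parameter.db_endpoint.value
  sensitive = true
}
```

If no VPC matches the tag, or more than one does, the `aws_vpc` lookup fails at plan time, which surfaces the mismatch before any resource is created.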
Example: Using Data Source in Terraform with AWS
Use Case: Fetching the Latest Amazon Linux 2 AMI
Imagine you need to launch an EC2 instance using the latest Amazon Linux 2 AMI. Instead of hardcoding the AMI ID, which can become outdated, you can use a data source to fetch the latest AMI dynamically.
Here’s how you can do it:
- Define the Data Source: Use the `aws_ami` data source to find the latest Amazon Linux 2 AMI.
- Use the Data Source: Reference the data source in your EC2 instance resource definition.
# Configure the AWS provider
provider "aws" {
  region = "us-west-2"
}

# Data source to fetch the latest Amazon Linux 2 AMI
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"] # only consider images published by Amazon

  # Match the Amazon Linux 2 HVM x86_64 image name pattern
  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

# Resource to create an EC2 instance
resource "aws_instance" "web" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t2.micro"

  tags = {
    Name = "ExampleInstance"
  }
}

# Output the instance ID
output "instance_id" {
  value = aws_instance.web.id
}
Explanation
- Provider Block: Configures the AWS provider with the desired region (`us-west-2` in this example).
- Data Source Block: Uses the `aws_ami` data source to find the most recent Amazon Linux 2 AMI. It applies filters to match the AMI name pattern and virtualization type, and it specifies the owner ID to ensure the AMI is from Amazon.
- Resource Block: Defines an `aws_instance` resource using the AMI ID retrieved from the data source.
- Output Block: Outputs the ID of the created EC2 instance.
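To check which image the data source actually resolved, a few extra outputs can help. This is a small sketch that assumes the `data "aws_ami" "amazon_linux"` block from the example is present in the same configuration. The output names are illustrative.

```hcl
# ID of the AMI selected by the data source
output "ami_id" {
  value = data.aws_ami.amazon_linux.id
}

# Full image name, useful for confirming the name filter matched as intended
output "ami_name" {
  value = data.aws_ami.amazon_linux.name
}

# Creation timestamp of the selected image
output "ami_creation_date" {
  value = data.aws_ami.amazon_linux.creation_date
}
```

Running `terraform plan` resolves the data source and shows these outputs among the planned changes, so the selected AMI can be reviewed before applying.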
[Figure: How to get the EC2 instance AMI in the AWS console]
By using a data source, the AMI ID is looked up each time Terraform plans or applies the configuration, so it picks up the latest matching Amazon Linux 2 AMI without any manual update to the AMI ID.
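One consequence worth keeping in mind: because the lookup is repeated on every plan, a newly published AMI changes the value passed to `ami`, and Terraform will then propose replacing an instance that already exists. If that is not wanted, one option is to ignore later AMI changes on the instance. The variant below is a sketch of that choice, an assumption about your workflow rather than part of the original example. It replaces the `aws_instance.web` block shown earlier and is not meant to be added alongside it.

```hcl
resource "aws_instance" "web" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t2.micro"

  tags = {
    Name = "ExampleInstance"
  }

  lifecycle {
    # Keep the AMI the instance was created with, even if the
    # data source later resolves to a newer image.
    ignore_changes = [ami]
  }
}
```

With this in place, new instances still start from the latest AMI at creation time, while existing ones are left alone until they are deliberately replaced.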
Benefits of Using Data Sources
- Dynamic Configuration: Automatically fetches the latest data, reducing the need for manual updates.
- Consistency: Ensures consistency by fetching data from a single source of truth.
- Modularity: Enhances modularity by separating data fetching logic from resource definitions.
This example demonstrates how data sources in Terraform can be used to dynamically fetch and utilize information from AWS, making your infrastructure configurations more robust and maintainable.