Data source vs resource difference in Terraform - Performance Comparison
We want to understand how the time to run Terraform changes when using data sources versus resources.
Specifically, how does Terraform's work grow as we add more data sources or resources?
Analyze the time complexity of this Terraform snippet.
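# Look up the newest Amazon Linux 2 AMI that matches the name pattern.
data "aws_ami" "example" {
  most_recent = true
  owners      = ["amazon"] # restrict the search to Amazon-owned images

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-ebs"]
  }
}

# Create one EC2 instance from the AMI found above.
resource "aws_instance" "example" {
  ami           = data.aws_ami.example.id
  instance_type = "t2.micro"
}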
This code fetches an existing AMI using a data source and then creates an EC2 instance resource using that AMI.
Look at what Terraform does repeatedly when applying this configuration.
- Primary operation: an API call to read the AMI details (data source) and an API call to create the EC2 instance (resource).
- How many times: each data source triggers at least one read API call every time Terraform plans or applies; each resource triggers at least one create API call the first time it is applied, and a refresh read on later runs.
As you add more data sources and resources, the number of API calls grows.
| Input Size (n) | Approx. API Calls/Operations |
|---|---|
| 10 data sources + 10 resources | ~20 API calls (10 reads + 10 creates) |
| 100 data sources + 100 resources | ~200 API calls (100 reads + 100 creates) |
| 1000 data sources + 1000 resources | ~2000 API calls (1000 reads + 1000 creates) |
Pattern observation: The total API calls grow roughly linearly with the number of data sources and resources.
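To make the table concrete, here is a minimal sketch of a configuration where n grows with the size of an input set. The variable `ami_name_patterns` and the `by_pattern` labels are hypothetical names chosen for illustration, and the `owners` value assumes Amazon-published images. Each element of the set yields one data source read and one instance creation, so n patterns cost roughly 2n API operations.

```hcl
# Hypothetical input: one AMI name pattern per instance we want.
variable "ami_name_patterns" {
  type = set(string)
  default = [
    "amzn2-ami-hvm-*-x86_64-ebs",
    "amzn2-ami-hvm-*-x86_64-gp2",
  ]
}

# One data source instance (one read) per pattern.
data "aws_ami" "by_pattern" {
  for_each    = var.ami_name_patterns
  most_recent = true
  owners      = ["amazon"] # assumption: Amazon-published images

  filter {
    name   = "name"
    values = [each.value]
  }
}

# One EC2 instance (one create) per AMI found above.
resource "aws_instance" "by_pattern" {
  for_each      = data.aws_ami.by_pattern
  ami           = each.value.id
  instance_type = "t2.micro"
}
```

Adding a pattern to the set adds one read and one create, which is the linear growth the table describes.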
Time Complexity: O(n)
This means the time to apply grows in direct proportion to how many data sources and resources you have.
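As a rough model, not a measurement, assume every operation takes about the same time t and that Terraform runs up to p operations concurrently (the `-parallelism` setting, 10 by default). Ignoring dependency ordering and provider rate limits, the wall-clock time for n data sources plus n resources is then approximately:

```latex
T(n) \approx \left\lceil \frac{2n}{p} \right\rceil \cdot t
% Worked example with an assumed t = 2 s per operation:
% n = 100, p = 10  =>  T \approx \lceil 200 / 10 \rceil \cdot 2\,\text{s} = 40\,\text{s}
```

Raising p lowers the constant factor, but T(n) still grows linearly in n.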
[X] Wrong: "Data sources are free and do not add to execution time."
[OK] Correct: Data sources make API calls to read existing information on every run, including plans that change nothing, so they add to the total work Terraform does.
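One way to limit that overhead, sketched here under the assumption that every instance can use the same AMI, is to look the value up once and reference it from many resources. The variable `instance_count` and the `shared` and `worker` labels are hypothetical names. The data source is read once per run no matter how many instances exist, so reads stay constant while creates grow with n.

```hcl
# Hypothetical input: how many identical instances to create.
variable "instance_count" {
  type    = number
  default = 10
}

# A single lookup, read once per run.
data "aws_ami" "shared" {
  most_recent = true
  owners      = ["amazon"] # assumption: Amazon-published images

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-ebs"]
  }
}

# n instances all reuse the one AMI ID: 1 read + n creates.
resource "aws_instance" "worker" {
  count         = var.instance_count
  ami           = data.aws_ami.shared.id
  instance_type = "t2.micro"
}
```

With the default of 10, this costs roughly 11 operations instead of the 20 needed when each instance has its own data source.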
Understanding how Terraform scales with data sources and resources helps you design efficient infrastructure code and shows you think about real-world costs.
"What if we replaced many data sources with a single resource that creates and manages all those items? How would the time complexity change?"