Data source dependencies in Terraform - Time & Space Complexity
When Terraform uses data sources, it fetches information from existing infrastructure. Understanding how the time to fetch this data grows helps us plan for bigger setups.
We want to know: how does the number of data sources affect the total time Terraform takes to get all the data?
Analyze the time complexity of the following operation sequence.
```hcl
# One data source instance is created per subnet ID,
# and each instance makes its own API call.
data "aws_subnet" "example" {
  count = var.subnet_count
  id    = var.subnet_ids[count.index]
}

# Collect the CIDR block of every fetched subnet.
output "subnet_cidrs" {
  value = [for s in data.aws_subnet.example : s.cidr_block]
}
```
This code fetches details for multiple subnets using their IDs, then outputs their CIDR blocks.
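The snippet references two input variables that aren't shown. A minimal sketch of how they might be declared (the names match the snippet; the defaults are illustrative placeholders, not real subnet IDs):

```hcl
variable "subnet_count" {
  type    = number
  default = 3
}

variable "subnet_ids" {
  type = list(string)
  # Placeholder IDs for illustration; replace with real subnet IDs.
  default = ["subnet-0aaa", "subnet-0bbb", "subnet-0ccc"]
}
```

In practice `subnet_count` would usually be derived as `length(var.subnet_ids)` rather than set independently, so the two values can't drift apart.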
Identify the operations that repeat: API calls, resource provisioning, and data transfers.
- Primary operation: each `aws_subnet` data source instance makes one API call to fetch subnet details.
- How many times: the API call repeats once for each subnet ID in `var.subnet_ids`.
As the number of subnet IDs increases, the number of API calls grows directly with it.
| Input Size (n) | Approx. API Calls/Operations |
|---|---|
| 10 | 10 API calls |
| 100 | 100 API calls |
| 1000 | 1000 API calls |
Pattern observation: The number of API calls grows linearly with the number of subnets.
Time Complexity: O(n)
This means the time to fetch all subnet data grows directly in proportion to how many subnets you ask for.
[X] Wrong: "Fetching multiple data sources happens all at once, so time stays the same no matter how many."
[OK] Correct: Each data source triggers a separate API call, so more data sources mean more calls and more time.
Knowing how data source calls add up helps you design Terraform configurations that scale well and avoid surprises in deployment time.
"What if we combined multiple subnet IDs into one data source call using a filter? How would the time complexity change?"
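As a sketch of that follow-up question: the AWS provider offers a plural `aws_subnets` data source that accepts filters, so the ID-discovery step can collapse into a single `DescribeSubnets` call (plus pagination) instead of one call per subnet. Note, however, that `aws_subnets` exposes only subnet IDs, so fetching per-subnet attributes such as `cidr_block` still requires one `aws_subnet` lookup per ID. The `var.vpc_id` input here is an assumed variable for illustration:

```hcl
# One filtered call returns every matching subnet ID:
# the discovery step becomes O(1) calls (ignoring pagination).
data "aws_subnets" "by_vpc" {
  filter {
    name   = "vpc-id"
    values = [var.vpc_id] # assumed input variable
  }
}

# Per-subnet attributes still require one lookup per ID,
# so this part of the fetch remains O(n).
data "aws_subnet" "detail" {
  for_each = toset(data.aws_subnets.by_vpc.ids)
  id       = each.value
}

output "subnet_cidrs" {
  value = [for s in data.aws_subnet.detail : s.cidr_block]
}
```

So a filter changes *which* step is linear: discovering the IDs becomes roughly constant, while reading each subnet's attributes is still O(n).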