Why data sources query existing infrastructure in Terraform - Performance Analysis

Time Complexity: Why data sources query existing infrastructure

O(n)

Understanding Time Complexity

When Terraform evaluates a data source, it asks the cloud provider's API for information about resources that already exist.

We want to know how the time to get this information changes as we ask about more resources.

Scenario Under Consideration

Analyze the time complexity of the following operation sequence.

data "aws_instance" "example" {
  filter {
    name   = "tag:Name"
    values = ["web-server"]
  }
}

output "instance_id" {
  value = data.aws_instance.example.id
}

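This code asks AWS to find an existing instance whose Name tag is "web-server" and exposes its ID as an output. The singular aws_instance data source expects exactly one match and fails if the filter matches several instances.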

Identify Repeating Operations

Identify the API calls, resource provisioning steps, and data transfers that repeat.

Primary operation: Querying the cloud provider's API to find matching resources.
How many times: Once per data source instance each time Terraform plans or applies. A block without count or for_each is a single instance.
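
The "once per instance" point can be sketched with for_each. This is a minimal illustration, not part of the original configuration: the variable server_names and the data source label by_name are hypothetical names chosen for the example.

```hcl
# Hypothetical set of Name tags to look up (an assumption for this sketch).
variable "server_names" {
  type    = set(string)
  default = ["web-server", "api-server", "worker"]
}

# A single block, but for_each creates one data source instance per name,
# so Terraform performs a separate lookup for each element of the set.
data "aws_instance" "by_name" {
  for_each = var.server_names

  filter {
    name   = "tag:Name"
    values = [each.value]
  }
}

# Map each name to the ID of the instance found for it.
output "instance_ids" {
  value = { for name, inst in data.aws_instance.by_name : name => inst.id }
}
```

With three names in the set, this one block accounts for three lookups, so n is better counted in data source instances than in blocks.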

How Execution Grows With Input

As you add more data sources querying different resources, the number of API calls grows.

Number of Data Sources (n)	Approx. API Calls
10	10 API calls
100	100 API calls
1000	1000 API calls

Pattern observation: Each new data source adds one more API call, so the total grows directly with the number of data sources.

Final Time Complexity

Time Complexity: O(n)

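This means the number of API calls grows linearly with how many data sources you use. Terraform can read independent data sources concurrently (limited by the -parallelism setting, 10 by default), so wall-clock time may grow more slowly than the call count, but the total work is still linear.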
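
When many instances share a tag, one way to keep n small is a plural data source that returns all matches from a single data source instead of one lookup per instance. The sketch below assumes the AWS provider's aws_instances data source fits the use case; the label web is a hypothetical name, and the provider may still page through results internally.

```hcl
# One data source that returns every instance tagged Name = "web-server".
data "aws_instances" "web" {
  filter {
    name   = "tag:Name"
    values = ["web-server"]
  }
}

# ids is a list containing the ID of each matching instance.
output "web_instance_ids" {
  value = data.aws_instances.web.ids
}
```

The trade-off is that the plural form exposes only a few list attributes (such as IDs and IP addresses), so per-instance details still need individual lookups.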

Common Mistake

[X] Wrong: "Data sources run once and then don't add to execution time no matter how many are used."

[OK] Correct: Each data source makes its own call to the cloud, so more data sources mean more calls and more time.
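
The cost comes from the number of data source instances, not from how often their results are referenced. A minimal sketch of that distinction, where the label duplicate and the second output are hypothetical additions for illustration:

```hcl
data "aws_instance" "example" {
  filter {
    name   = "tag:Name"
    values = ["web-server"]
  }
}

# Both outputs reference the same data source instance, so they
# share the single read of data.aws_instance.example.
output "instance_id" {
  value = data.aws_instance.example.id
}

output "instance_private_ip" {
  value = data.aws_instance.example.private_ip
}

# A second block with the same filter is a separate data source
# instance and is read separately, even though it finds the same instance.
data "aws_instance" "duplicate" {
  filter {
    name   = "tag:Name"
    values = ["web-server"]
  }
}
```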

Interview Connect

Understanding how querying existing resources scales helps you design efficient infrastructure code and shows you think about real-world cloud costs and delays.

Self-Check

"What if multiple data sources queried the same resource? How would the time complexity change?"