
Data source vs resource difference in Terraform - Trade-offs & Expert Analysis

Overview - Data source vs resource difference
What is it?
In Terraform, a resource is something you create and manage, like a server or a database. A data source is a way to look up or read information about something that already exists outside your Terraform setup. Resources change your cloud environment, while data sources only read information without making changes.
Why it matters
Understanding the difference helps you avoid mistakes like trying to create something that already exists or accidentally changing infrastructure you only wanted to observe. Without this, managing cloud infrastructure can become confusing and error-prone, leading to downtime or unexpected costs.
Where it fits
Before learning this, you should know basic Terraform concepts like providers and configuration files. After this, you can learn about Terraform modules and state management to organize and reuse your infrastructure code better.
Mental Model
Core Idea
Resources build or change infrastructure; data sources only read existing information without changing anything.
Think of it like...
Think of resources as planting a tree in your garden, and data sources as looking at a tree that someone else planted to learn about it.
Terraform Configuration
┌───────────────┐       ┌─────────────────┐
│   Resource    │──────▶│ Creates/Changes │
│ (e.g., server)│       │ Infrastructure  │
└───────────────┘       └─────────────────┘

┌───────────────┐       ┌─────────────────┐
│ Data Source   │──────▶│ Reads Existing  │
│ (e.g., info)  │       │ Infrastructure  │
└───────────────┘       └─────────────────┘
Build-Up - 7 Steps
1. Foundation: What is a Terraform Resource
Concept: Resources are the building blocks that Terraform uses to create or change infrastructure.
In Terraform, a resource block defines something you want to create or manage, like a virtual machine or a storage bucket. When you run Terraform, it looks at these resource blocks and makes sure the real infrastructure matches what you described.
Result
Terraform creates or updates the infrastructure to match the resource definitions.
Understanding resources is key because they are how Terraform controls your cloud environment.
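As a minimal sketch of a resource block, assuming the AWS provider: the region and the bucket name below are hypothetical placeholders, not values from this lesson.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "us-east-1" # assumed region
}

# A resource block: Terraform creates this bucket, records it in state,
# and updates or deletes it when the configuration changes.
resource "aws_s3_bucket" "logs" {
  bucket = "example-logs-bucket" # hypothetical; bucket names must be globally unique
}
```

Removing the block from the configuration and applying again would plan the bucket for deletion, which is what "managed" means in practice.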
2. Foundation: What is a Terraform Data Source
Concept: Data sources let you read information about existing infrastructure without changing it.
A data source block in Terraform fetches details about something that already exists, like an existing network or a database you didn't create with Terraform. This lets you use that information in your Terraform code without managing that resource directly.
Result
Terraform reads and makes available information about existing infrastructure.
Knowing data sources helps you integrate Terraform with infrastructure created outside Terraform or by other teams.
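A data source block could look like the following sketch. It assumes the AWS provider is already configured and that a VPC tagged with the hypothetical name "shared-vpc" exists.

```hcl
# Looks up an existing VPC by its Name tag. Nothing is created or modified.
data "aws_vpc" "shared" {
  tags = {
    Name = "shared-vpc" # hypothetical tag value
  }
}

# Attributes of the lookup are referenced with the data. prefix.
output "shared_vpc_cidr" {
  value = data.aws_vpc.shared.cidr_block
}
```

Removing this block later does not delete the VPC, because Terraform never managed it.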
3. Intermediate: How Resources and Data Sources Work Together
Before reading on: do you think resources and data sources can both create infrastructure? Commit to your answer.
Concept: Resources create or change infrastructure; data sources only read existing infrastructure to use in your configuration.
You can use data sources to get information needed to configure resources. For example, you might use a data source to find an existing network ID, then use that ID in a resource block to create a server inside that network.
Result
Terraform uses data source information to configure resources correctly without duplicating existing infrastructure.
Understanding this relationship prevents duplication and ensures Terraform configurations are accurate and safe.
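The network example above can be sketched roughly as follows, assuming the AWS provider. The subnet tag and the AMI ID are placeholders chosen for illustration.

```hcl
# Read an existing subnet that another team manages.
data "aws_subnet" "app" {
  filter {
    name   = "tag:Name"
    values = ["app-subnet"] # hypothetical tag value
  }
}

# Create a server inside that subnet. Only the instance is managed here.
resource "aws_instance" "web" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.micro"
  subnet_id     = data.aws_subnet.app.id
}
```

The reference to data.aws_subnet.app.id also tells Terraform to read the subnet before it plans the instance.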
4. Intermediate: State Management Differences
Before reading on: does Terraform track data sources in its state file the same way as resources? Commit to your answer.
Concept: Terraform records both in its state file, but only resources are managed through it; data source results are a snapshot that is re-read on every run.
When Terraform applies changes, it records resource details in the state file so it knows which real objects it manages and can update or delete them later. Data source results are also written to state (they appear under data.<type>.<name> addresses), but Terraform queries the provider again on each plan or apply and never creates, updates, or destroys the object a data source points to.
Result
The state file contains managed resources plus the latest data source results; only the resources have a lifecycle.
Knowing this helps you understand why a change in a data source's result can appear as a planned change to the resources that reference it, while the object behind the data source itself is never touched.
5. Intermediate: Common Use Cases for Data Sources
Concept: Data sources are useful for referencing existing infrastructure or dynamic values outside Terraform control.
Examples include getting the latest machine image ID, reading existing network details, or fetching user information from a cloud provider. This lets you build flexible configurations that adapt to your environment.
Result
Terraform configurations become more dynamic and reusable.
Recognizing when to use data sources improves your Terraform code's flexibility and safety.
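Two of these use cases can be sketched as below, assuming the AWS provider. The owner ID is Canonical's publicly documented AWS account; treat the image name pattern as an assumption to check against the images available in your region.

```hcl
# Latest Ubuntu 22.04 image published by Canonical.
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"] # assumed pattern
  }
}

# Details about the credentials Terraform is running with.
data "aws_caller_identity" "current" {}

output "ubuntu_ami_id" {
  value = data.aws_ami.ubuntu.id
}

output "account_id" {
  value = data.aws_caller_identity.current.account_id
}
```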
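6. Advanced: Avoiding Conflicts Between Resources and Data Sources
Before reading on: can defining a resource and a data source for the same infrastructure cause problems? Commit to your answer.
Concept: Defining both a resource and a data source for the same infrastructure is redundant and can cause errors.
If you declare a resource for something that already exists, Terraform tries to create it, and the provider typically either rejects the request as a duplicate or creates a second copy. If you declare a data source for something the same configuration creates, the lookup can run before the object exists and fail, and it adds nothing you could not get from the resource's own attributes. Best practice is to treat each item as either managed (a resource) or observed (a data source) within one configuration, not both.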
Result
Terraform runs smoothly without conflicts or accidental changes.
Understanding this prevents costly mistakes like accidental resource deletion or duplication.
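If the goal is to take over management of something that already exists, one option is an import block (available in Terraform 1.5 and later) rather than a data source next to a resource. The sketch below assumes the AWS provider and a hypothetical bucket name.

```hcl
# Adopt the existing bucket into state instead of trying to create it.
import {
  to = aws_s3_bucket.assets
  id = "example-assets-bucket" # hypothetical existing bucket
}

resource "aws_s3_bucket" "assets" {
  bucket = "example-assets-bucket"
}

# Read attributes straight from the resource; no data source is needed.
output "assets_bucket_arn" {
  value = aws_s3_bucket.assets.arn
}
```

After the first successful apply the import block has done its job and can be removed.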
7. Expert: Data Sources in Complex Module Architectures
Before reading on: do you think data sources can be used inside Terraform modules to share information? Commit to your answer.
Concept: Data sources can be used inside modules to fetch external information, enabling modules to be more reusable and environment-aware.
In large Terraform projects, modules often need to know about existing infrastructure outside their scope. Using data sources inside modules allows them to adapt without hardcoding values, making modules flexible and composable.
Result
Modules become more portable and easier to maintain across different environments.
Knowing how to use data sources inside modules unlocks advanced Terraform design patterns for scalable infrastructure.
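One possible shape for such a module is sketched below, assuming the AWS provider. The module path, variable name, and tag value are hypothetical; the two parts would live in separate files, as the comments indicate.

```hcl
# --- modules/app/main.tf (hypothetical module) ---

variable "vpc_name" {
  type        = string
  description = "Name tag of an existing VPC to deploy into"
}

# The module discovers the VPC and its subnets instead of hardcoding IDs.
data "aws_vpc" "selected" {
  tags = {
    Name = var.vpc_name
  }
}

data "aws_subnets" "selected" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.selected.id]
  }
}

output "subnet_ids" {
  value = data.aws_subnets.selected.ids
}

# --- root main.tf (caller) ---

module "app" {
  source   = "./modules/app"
  vpc_name = "shared-vpc" # hypothetical; differs per environment
}
```

Passing only the name keeps the caller simple, at the cost of coupling the module to a tagging convention. Passing the VPC ID in as a variable is the looser alternative.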
Under the Hood
Terraform parses resource blocks and plans actions to create, update, or delete infrastructure. It records these managed resources in a state file to track their real-world status. Data sources are normally read during the plan phase by querying the provider's API; if one of their arguments depends on a value that is not known until apply, the read is deferred to the apply phase. Their results are recorded in state as a snapshot and refreshed on every run, but reading them never modifies infrastructure.
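The timing of a data source read depends on whether its arguments are known at plan time. In the sketch below (AWS provider assumed, names hypothetical), the VPC ID does not exist until the VPC is created, so on the first run Terraform cannot read the route tables during plan and defers the read to apply.

```hcl
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# vpc_id is unknown until aws_vpc.main exists, so the first plan shows
# this data source as "known after apply" and reads it during apply.
data "aws_route_tables" "main" {
  vpc_id = aws_vpc.main.id
}

output "route_table_ids" {
  value = data.aws_route_tables.main.ids
}
```

On later runs the VPC ID is already in state, so the same data source is read during plan.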
Why designed this way?
Separating resources and data sources allows Terraform to clearly distinguish between what it manages and what it only observes. This design prevents accidental changes to existing infrastructure and supports integration with resources created outside Terraform. It also keeps lifecycle management simple, because Terraform only ever creates, updates, or deletes what it controls.
Terraform Workflow
┌───────────────┐       ┌───────────────┐       ┌───────────────┐
│ Configuration │──────▶│ Plan Phase    │──────▶│ Apply Phase   │
│ (Resources &  │       │ - Evaluate    │       │ - Create/     │
│  Data Sources)│       │   Data Sources│       │   Update      │
└───────────────┘       └───────────────┘       │ - Delete      │
                                                └───────────────┘

State File
┌───────────────┐
│ Manages only  │
│ Resources;    │
│ data results  │
│ are re-read   │
│ each run      │
└───────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Do data sources create infrastructure when applied? Commit to yes or no.
Common Belief: Data sources create or modify infrastructure just like resources.
Reality: Data sources only read existing infrastructure information and never create or change anything.
Why it matters: Believing data sources create infrastructure leads to plans that show no changes when you expected something to be built, or that fail because the object being looked up does not exist.
Quick: Once a data source has been read, does Terraform keep reusing the value saved in its state file? Commit to yes or no.
Common Belief: Terraform stores data source results in state and manages them like resources, so the stored value is what later runs use.
Reality: Terraform does record data source results in state, but it re-reads every data source on each plan or apply and replaces the stored values with fresh ones. It never manages the underlying object.
Why it matters: Assuming the stored value is reused hides the fact that a change in external infrastructure can surface as a planned change to any resource that references the data source.
Quick: Can you safely define both a resource and a data source for the same infrastructure? Commit to yes or no.
Common Belief: It's fine to define a resource and a data source for the same infrastructure to manage and read it simultaneously.
Reality: It is redundant at best and can break the plan: the data source may be read before the resource exists, and a resource block for something that already exists fails or creates a duplicate unless the object is imported first.
Why it matters: Mismanaging this can cause failed deployments, duplicated infrastructure, and instability in production.
Quick: Are data sources only useful for cloud infrastructure? Commit to yes or no.
Common Belief: Data sources are only for cloud resources like servers or databases.
Reality: Data sources can fetch information from many providers, including DNS, version control, or configuration management systems.
Why it matters: Limiting data sources to cloud resources restricts their powerful use in integrating diverse systems.
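As an illustration of the last point, the sketch below reads DNS records instead of cloud infrastructure. It assumes the hashicorp/dns provider; the hostname is a placeholder.

```hcl
terraform {
  required_providers {
    dns = {
      source = "hashicorp/dns"
    }
  }
}

# Resolve the A records for a hostname. No cloud account is involved.
data "dns_a_record_set" "site" {
  host = "example.com" # placeholder hostname
}

output "site_addresses" {
  value = data.dns_a_record_set.site.addrs
}
```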
Expert Zone
1. Data sources can introduce non-determinism if the external data changes between Terraform runs, affecting plan stability.
2. Using data sources inside modules requires careful input/output design to avoid tight coupling and maintain module reusability.
3. Terraform reads each data source once per plan or apply, so changes made outside Terraform after that read are not visible until the next run.
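One way to limit the non-determinism described in the first point is to tell Terraform to ignore later changes to the looked-up value. The sketch below assumes the AWS provider; the variable and its use as an image name pattern are hypothetical.

```hcl
variable "ami_name_pattern" {
  type        = string
  description = "Name pattern of the image to look up (hypothetical input)"
}

data "aws_ami" "latest" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = [var.ami_name_pattern]
  }
}

resource "aws_instance" "app" {
  ami           = data.aws_ami.latest.id
  instance_type = "t3.micro"

  lifecycle {
    # A newer image appearing upstream no longer forces a replacement
    # of the running instance on the next plan.
    ignore_changes = [ami]
  }
}
```

The trade-off is that the instance stays on its original image until it is replaced deliberately.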
When NOT to use
Avoid using data sources when you need to manage lifecycle or enforce configuration; use resources instead. For infrastructure that must be created or updated, resources are mandatory. Also, if you need guaranteed immutability or versioning, data sources alone are insufficient.
Production Patterns
In production, teams use data sources to reference shared infrastructure like networks or IAM roles managed by other teams. Modules often use data sources to adapt to environment-specific settings dynamically. Careful separation of resources and data sources prevents accidental overwrites and supports multi-team collaboration.
Connections
Immutable Infrastructure
Resources enforce immutable infrastructure by managing lifecycle; data sources support this by referencing stable existing components.
Understanding data sources helps maintain immutability by avoiding direct changes to shared infrastructure.
API Querying
Data sources perform live API queries to fetch current state, similar to how monitoring tools query APIs for status.
Knowing data sources as API queries clarifies why they never change infrastructure and why their values reflect what the provider returned on the latest run.
Database Views
Data sources are like database views that provide read-only access to data without modifying it.
Seeing data sources as read-only views helps understand their role in safely integrating existing infrastructure.
Common Pitfalls
#1 Declaring a resource for infrastructure that already exists when you only need to read it.
Wrong approach:
  resource "aws_vpc" "example" {
    cidr_block = "10.0.0.0/16"
  }

  data "aws_vpc" "example" {
    filter {
      name   = "cidr"
      values = ["10.0.0.0/16"]
    }
  }
Correct approach:
  # Use the data source's attributes without declaring a resource
  data "aws_vpc" "example" {
    filter {
      name   = "cidr"
      values = ["10.0.0.0/16"]
    }
  }
Root cause: Confusing data sources with resources leads to attempts to create infrastructure that already exists.
#2 Using data source output as a resource ID without ensuring the resource exists.
Wrong approach:
  resource "aws_instance" "example" {
    # The plan fails if the data source finds no matching subnet
    subnet_id = data.aws_subnet.example.id
  }
Correct approach:
  resource "aws_instance" "example" {
    # Pass in a subnet that is known to exist, or create it as a resource
    subnet_id = var.subnet_id
  }
Root cause: Assuming a data source always finds a match causes the plan to fail when it does not.
#3 Defining both resource and data source for the same infrastructure causing conflicts.
Wrong approach:
  resource "aws_s3_bucket" "example" {
    bucket = "my-bucket"
  }

  data "aws_s3_bucket" "example" {
    bucket = "my-bucket"
  }
Correct approach:
  # Use the resource's attributes directly; avoid a data source for the same bucket
  resource "aws_s3_bucket" "example" {
    bucket = "my-bucket"
  }
Root cause: Mixing management and read-only references for the same resource confuses Terraform's state and plan.
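For pitfall #2, another option is to keep the lookup but validate it. The sketch below assumes Terraform 1.2 or later (for postcondition blocks) and an AWS provider version in which the plural aws_subnets data source returns an empty list when nothing matches; the tag value and AMI ID are placeholders.

```hcl
data "aws_subnets" "candidates" {
  filter {
    name   = "tag:Name"
    values = ["app-subnet"] # hypothetical tag value
  }

  lifecycle {
    # Stop with a clear message instead of failing later on an empty list.
    postcondition {
      condition     = length(self.ids) > 0
      error_message = "No subnet tagged app-subnet was found."
    }
  }
}

resource "aws_instance" "example" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.micro"
  subnet_id     = data.aws_subnets.candidates.ids[0]
}
```

The run still stops when no subnet matches, but the error now names the missing prerequisite rather than an index out of range.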
Key Takeaways
Resources in Terraform create and manage infrastructure, while data sources only read existing information without making changes.
Terraform manages the lifecycle of resources through its state file; data source results are recorded there too, but they are re-read on every run and never managed.
Using data sources helps integrate existing infrastructure safely and makes configurations more flexible and dynamic.
Avoid defining both a resource and a data source for the same infrastructure in one configuration; reference the resource's attributes directly, or import the existing object if you want to manage it.
Understanding the difference between resources and data sources is essential for writing safe, maintainable, and effective Terraform code.