Compiler-designConceptBeginner · 3 min read

What is DAG in Compiler: Definition and Usage Explained

In a compiler, a DAG (Directed Acyclic Graph) is a data structure used to represent expressions without repeating common subexpressions. It helps the compiler optimize code by identifying and reusing repeated calculations, improving efficiency.

⚙️

How It Works

A DAG in a compiler is like a flowchart that shows how different parts of a calculation connect, but without any loops. Imagine you want to bake a cake and some steps, like mixing ingredients, are done multiple times. Instead of repeating those steps, you do them once and reuse the result. Similarly, a DAG helps the compiler spot repeated calculations and avoid doing them again.

Each node in the DAG represents an operation or a value, and edges show how these connect. Because it has no cycles, the compiler can safely follow the graph from inputs to outputs without getting stuck. This structure makes it easier to optimize code by sharing results and reducing unnecessary work.

💻

Example

This example shows how a DAG can represent the expression a + b + a + b by sharing the common subexpression a + b.

python

class Node:
    def __init__(self, op, left=None, right=None):
        self.op = op
        self.left = left
        self.right = right

# Create nodes for variables
node_a = Node('a')
node_b = Node('b')

# Create a node for a + b
node_a_plus_b = Node('+', node_a, node_b)

# Reuse node_a_plus_b for the full expression (a + b) + (a + b)
root = Node('+', node_a_plus_b, node_a_plus_b)

# Function to print the DAG structure

def print_node(node, indent=0, visited=None):
    if visited is None:
        visited = set()
    if id(node) in visited:
        print('  ' * indent + f'(Already visited {node.op})')
        return
    visited.add(id(node))
    if node.left is None and node.right is None:
        print('  ' * indent + f'Var: {node.op}')
    else:
        print('  ' * indent + f'Op: {node.op}')
        print_node(node.left, indent + 1, visited)
        print_node(node.right, indent + 1, visited)

print_node(root)

Output

Op: + Op: + Var: a Var: b (Already visited +)

🎯

When to Use

DAGs are used in compilers mainly during the optimization phase to detect and eliminate repeated calculations, called common subexpression elimination. This reduces the amount of work the computer does, making programs run faster.

For example, if a program calculates (x + y) * (x + y), the compiler can use a DAG to compute x + y once and reuse it, instead of doing it twice. DAGs also help in instruction scheduling and register allocation, improving overall code efficiency.

✅

Key Points

A DAG is a graph with directed edges and no cycles, used to represent expressions in compilers.
It helps identify repeated calculations to avoid redundant work.
Each node represents an operation or value, and edges show dependencies.
DAGs improve compiler optimizations like common subexpression elimination.
They ensure safe and efficient code generation by avoiding loops in expression representation.

✅

Key Takeaways

A DAG in a compiler represents expressions without repeating common parts to optimize code.

It helps the compiler avoid recalculating the same expression multiple times.

DAGs are essential for optimizations like common subexpression elimination.

They use nodes for operations and edges for dependencies without cycles.

Using DAGs leads to faster and more efficient compiled programs.