How to Convert NFA to DFA in Compilers - Step by Step Guide
epsilon-closure and input symbols.Examples
How to Think About It
Algorithm
Code
def epsilon_closure(states, transitions): stack = list(states) closure = set(states) while stack: state = stack.pop() for nxt in transitions.get((state, ''), []): if nxt not in closure: closure.add(nxt) stack.append(nxt) return closure def nfa_to_dfa(nfa_states, nfa_start, nfa_accept, nfa_transitions, alphabet): start = frozenset(epsilon_closure({nfa_start}, nfa_transitions)) dfa_states = {start} unmarked = [start] dfa_transitions = {} dfa_accept = set() while unmarked: current = unmarked.pop() for symbol in alphabet: if symbol == '': continue next_states = set() for state in current: next_states.update(nfa_transitions.get((state, symbol), [])) next_closure = frozenset(epsilon_closure(next_states, nfa_transitions)) if not next_closure: continue dfa_transitions[(current, symbol)] = next_closure if next_closure not in dfa_states: dfa_states.add(next_closure) unmarked.append(next_closure) for state_set in dfa_states: if state_set & nfa_accept: dfa_accept.add(state_set) print('DFA States:', dfa_states) print('DFA Accept States:', dfa_accept) print('DFA Transitions:') for (src, sym), dst in dfa_transitions.items(): print(f'{src} -{sym}-> {dst}') # Example usage nfa_states = {'q0', 'q1'} nfa_start = 'q0' nfa_accept = {'q1'} nfa_transitions = { ('q0', 'a'): {'q0', 'q1'}, ('q1', 'b'): {'q1'}, ('q0', ''): set() } alphabet = {'a', 'b'} nfa_to_dfa(nfa_states, nfa_start, nfa_accept, nfa_transitions, alphabet)
Dry Run
Let's trace the NFA with states {q0, q1} and transitions q0 -a-> q0,q1; q1 -b-> q1 through the subset construction.
Start State
Start with epsilon-closure({q0}) = {q0} as DFA start state.
Process input 'a'
From {q0}, on 'a' go to {q0, q1}.
Process input 'b'
From {q0, q1}, on 'b' go to {q1}.
| DFA State | Input | Next DFA State |
|---|---|---|
| {q0} | a | {q0, q1} |
| {q0, q1} | a | {q0, q1} |
| {q0, q1} | b | {q1} |
| {q1} | b | {q1} |
Why This Works
Step 1: Grouping NFA States
Each DFA state represents a set of NFA states combined to handle nondeterminism.
Step 2: Using Epsilon-Closure
Epsilon-closure finds all states reachable without input, ensuring completeness.
Step 3: Defining Transitions
Transitions in DFA are defined by moving from one set of NFA states to another on input symbols.
Alternative Approaches
def nfa_to_dfa_explicit(nfa, start, accept, alphabet): from collections import deque def epsilon_closure(states): stack = list(states) closure = set(states) while stack: state = stack.pop() for nxt in nfa.get((state, ''), []): if nxt not in closure: closure.add(nxt) stack.append(nxt) return closure start_set = frozenset(epsilon_closure({start})) dfa_states = {start_set: 'S0'} queue = deque([start_set]) dfa_transitions = {} count = 1 while queue: current = queue.popleft() for symbol in alphabet: if symbol == '': continue next_states = set() for state in current: next_states.update(nfa.get((state, symbol), [])) next_closure = frozenset(epsilon_closure(next_states)) if not next_closure: continue if next_closure not in dfa_states: dfa_states[next_closure] = f'S{count}' count += 1 queue.append(next_closure) dfa_transitions[(dfa_states[current], symbol)] = dfa_states[next_closure] print('DFA States:', list(dfa_states.values())) print('DFA Transitions:') for (src, sym), dst in dfa_transitions.items(): print(f'{src} -{sym}-> {dst}') # Note: This method names DFA states explicitly for clarity.
def nfa_to_dfa_minimized(nfa_states, nfa_start, nfa_accept, nfa_transitions, alphabet): # First convert NFA to DFA using subset construction (as above) # Then apply table-filling algorithm to minimize DFA states # This code focuses on conversion; minimization is a separate step pass
Complexity: O(2^n * |Σ|) time, O(2^n) space
Time Complexity
The subset construction can generate up to 2^n DFA states from n NFA states, and for each state, transitions for each input symbol are computed.
Space Complexity
Storing all subsets of NFA states requires space proportional to 2^n, which can be large for big NFAs.
Which Approach is Fastest?
Direct subset construction is standard; naming states explicitly improves clarity but adds overhead; minimization reduces states but adds extra steps.
| Approach | Time | Space | Best For |
|---|---|---|---|
| Standard Subset Construction | O(2^n * |Σ|) | O(2^n) | General conversion |
| Explicit State Naming | O(2^n * |Σ|) | O(2^n) | Readability and debugging |
| Conversion + Minimization | O(2^n * |Σ|) + extra | O(2^n) | Optimized DFA size |