Top Down vs Bottom Up Parsing: Key Differences and When to Use Each
bottom up parsing builds the parse tree from the input tokens up to the start symbol. Top down parsers predict structure using grammar rules, whereas bottom up parsers reduce input to grammar rules by recognizing patterns.Quick Comparison
This table summarizes the main differences between top down and bottom up parsing approaches.
| Factor | Top Down Parsing | Bottom Up Parsing |
|---|---|---|
| Parsing Direction | From start symbol to input tokens | From input tokens to start symbol |
| Parse Tree Construction | Builds parse tree from root to leaves | Builds parse tree from leaves to root |
| Grammar Types Supported | Works well with LL grammars | Works well with LR grammars |
| Handling Left Recursion | Cannot handle left recursion | Can handle left recursion |
| Common Algorithms | Recursive descent, predictive parsing | Shift-reduce, LR parsing |
| Error Detection | Detects errors early | Detects errors late |
Key Differences
Top down parsing begins with the start symbol of the grammar and tries to rewrite it to match the input string by applying production rules. It predicts which rule to use next and recursively breaks down the input. This approach is intuitive and easy to implement but struggles with left-recursive grammars and requires the grammar to be LL(k) compatible.
In contrast, bottom up parsing starts with the input tokens and attempts to combine them into higher-level constructs by reversing the production rules. It uses a stack to hold symbols and applies reductions when a pattern matches a grammar rule. Bottom up parsers can handle a wider range of grammars, including left recursion, and are more powerful but also more complex to implement.
Another key difference is error handling: top down parsers detect errors as soon as they find no matching rule, while bottom up parsers may detect errors later during reductions. Overall, top down parsing is simpler and better for hand-written parsers, while bottom up parsing is preferred for automatic parser generation and complex languages.
Code Comparison
Here is a simple example of a top down parser using recursive descent to parse expressions with addition and multiplication.
def parse_expression(tokens): def parse_term(): nonlocal pos if pos < len(tokens) and tokens[pos].isdigit(): val = int(tokens[pos]) pos += 1 return val raise SyntaxError('Expected number') def parse_factor(): nonlocal pos val = parse_term() while pos < len(tokens) and tokens[pos] == '*': pos += 1 val *= parse_term() return val pos = 0 val = parse_factor() while pos < len(tokens) and tokens[pos] == '+': pos += 1 val += parse_factor() if pos != len(tokens): raise SyntaxError('Unexpected token') return val # Example usage expr = '2 + 3 * 4'.split() result = parse_expression(expr) print(result)
Bottom Up Equivalent
This example shows a simple bottom up parser using a shift-reduce approach to parse the same expression grammar.
def bottom_up_parse(tokens): stack = [] pos = 0 def reduce_stack(): if len(stack) >= 3 and isinstance(stack[-3], int) and stack[-2] == '*' and isinstance(stack[-1], int): right = stack.pop() stack.pop() # remove '*' left = stack.pop() stack.append(left * right) return True if len(stack) >= 3 and isinstance(stack[-3], int) and stack[-2] == '+' and isinstance(stack[-1], int): right = stack.pop() stack.pop() # remove '+' left = stack.pop() stack.append(left + right) return True return False while pos < len(tokens): token = tokens[pos] if token.isdigit(): stack.append(int(token)) pos += 1 else: stack.append(token) pos += 1 while reduce_stack(): pass while reduce_stack(): pass if len(stack) == 1 and isinstance(stack[0], int): return stack[0] raise SyntaxError('Invalid expression') # Example usage expr = '2 + 3 * 4'.split() result = bottom_up_parse(expr) print(result)
When to Use Which
Choose top down parsing when you want a simple, easy-to-understand parser for grammars without left recursion, especially if you write the parser by hand. It is ideal for small languages or when you need quick error detection.
Choose bottom up parsing when working with complex or ambiguous grammars that include left recursion, or when using parser generators like Yacc or Bison. Bottom up parsers handle a wider range of grammars and are better for production compilers.