Bottom Up Parsing: Definition, How It Works, and Examples
shift and reduce actions.How It Works
Bottom up parsing works like solving a puzzle from the small pieces to the big picture. Imagine you have a sentence made of words, and you want to understand its structure. Instead of starting from the sentence's main idea, bottom up parsing starts from the words (the smallest parts) and tries to combine them into bigger parts until it forms the whole sentence.
In technical terms, the parser reads the input from left to right and uses a stack to keep track of symbols. It performs two main actions: shift, which means reading the next input symbol and pushing it onto the stack, and reduce, which means replacing a group of symbols on the stack with a higher-level symbol according to grammar rules. This process continues until the stack contains only the start symbol, meaning the input is successfully parsed.
Example
This example shows a simple bottom up parser simulation for the expression id + id using a grammar with rules for identifiers and addition.
grammar = {
'E': ['E + E', 'id']
}
input_tokens = ['id', '+', 'id']
stack = []
# Simple function to check if top of stack matches a grammar rule's right side
def can_reduce(stack):
for lhs, rhs_list in grammar.items():
for rhs in rhs_list:
rhs_tokens = rhs.split()
if len(stack) >= len(rhs_tokens) and stack[-len(rhs_tokens):] == rhs_tokens:
return lhs, len(rhs_tokens)
return None, 0
print('Parsing steps:')
for token in input_tokens:
stack.append(token) # shift
print(f'Shift: stack = {stack}')
while True:
lhs, length = can_reduce(stack)
if lhs:
# reduce
for _ in range(length):
stack.pop()
stack.append(lhs)
print(f'Reduce: stack = {stack}')
else:
break
if stack == ['E']:
print('Input successfully parsed as E')
else:
print('Parsing failed')When to Use
Bottom up parsing is used when you want a parser that can handle a wide range of programming languages and complex grammar rules. It is especially useful in compiler design because it can detect syntax errors as soon as possible and can parse languages that have ambiguous or complicated structures.
Real-world use cases include compilers for languages like C, C++, and Java, where bottom up parsers such as LR parsers are common. It is also used in tools that analyze or transform code, like interpreters and static analyzers.
Key Points
- Bottom up parsing builds the parse tree from the leaves (input symbols) up to the root (start symbol).
- It uses a stack and two main actions: shift (read input) and reduce (apply grammar rules).
- It can handle a larger class of grammars than top down parsing.
- Common bottom up parsers include LR, SLR, and LALR parsers.