0
0
Compiler-designConceptBeginner · 3 min read

What is LR Parser: Explanation, Example, and Use Cases

An LR parser is a type of bottom-up parser used in compilers to analyze the syntax of programming languages. It reads input from left to right and produces a rightmost derivation in reverse, efficiently handling a wide range of grammars.
⚙️

How It Works

An LR parser works like a detective piecing together clues from left to right to understand a sentence's structure. It uses a stack to keep track of what it has seen and a set of rules to decide when to combine parts into bigger chunks.

Imagine reading a sentence word by word and trying to build its meaning by grouping words into phrases. The LR parser shifts words onto a stack until it recognizes a pattern that matches a grammar rule, then it reduces those words into a single unit. This process repeats until the whole sentence is understood or an error is found.

This method is powerful because it can handle many complex grammar rules that simpler parsers cannot, making it very useful in real-world programming languages.

💻

Example

This example shows a simple LR parser simulation in Python that parses arithmetic expressions with addition and multiplication.

python
class LRParser:
    def __init__(self):
        self.stack = [0]  # stack holds states
        self.input = []

        # Action table: state -> input -> action
        # s = shift, r = reduce, acc = accept
        self.action = {
            0: {'num': 's5', '+': None, '*': None, '(': 's4', ')': None, '$': None},
            1: {'num': None, '+': 's6', '*': None, '(': None, ')': None, '$': 'acc'},
            2: {'num': None, '+': 'r2', '*': 's7', '(': None, ')': 'r2', '$': 'r2'},
            3: {'num': None, '+': 'r4', '*': 'r4', '(': None, ')': 'r4', '$': 'r4'},
            4: {'num': 's5', '+': None, '*': None, '(': 's4', ')': None, '$': None},
            5: {'num': None, '+': 'r6', '*': 'r6', '(': None, ')': 'r6', '$': 'r6'},
            6: {'num': 's5', '+': None, '*': None, '(': 's4', ')': None, '$': None},
            7: {'num': 's5', '+': None, '*': None, '(': 's4', ')': None, '$': None},
            8: {'num': None, '+': 's6', '*': None, '(': None, ')': 's11', '$': None},
            9: {'num': None, '+': 'r1', '*': 's7', '(': None, ')': 'r1', '$': 'r1'},
            10: {'num': None, '+': 'r3', '*': 'r3', '(': None, ')': 'r3', '$': 'r3'},
            11: {'num': None, '+': 'r5', '*': 'r5', '(': None, ')': 'r5', '$': 'r5'},
        }

        # Goto table: state -> non-terminal -> state
        self.goto = {
            0: {'E': 1, 'T': 2, 'F': 3},
            4: {'E': 8, 'T': 2, 'F': 3},
            6: {'T': 9, 'F': 3},
            7: {'F': 10},
        }

        # Production rules: number -> (LHS, RHS length)
        self.productions = {
            1: ('E', 3),  # E -> E + T
            2: ('E', 1),  # E -> T
            3: ('T', 3),  # T -> T * F
            4: ('T', 1),  # T -> F
            5: ('F', 3),  # F -> ( E )
            6: ('F', 1),  # F -> num
        }

    def parse(self, tokens):
        self.input = tokens + ['$']
        index = 0

        while True:
            state = self.stack[-1]
            token = self.input[index]
            action = self.action.get(state, {}).get(token)

            if action is None:
                return False  # error

            if action == 'acc':
                return True  # accept

            if action.startswith('s'):
                # shift
                self.stack.append(token)
                self.stack.append(int(action[1:]))
                index += 1

            elif action.startswith('r'):
                # reduce
                prod_num = int(action[1:])
                lhs, rhs_len = self.productions[prod_num]
                pop_len = rhs_len * 2
                for _ in range(pop_len):
                    self.stack.pop()
                state = self.stack[-1]
                self.stack.append(lhs)
                self.stack.append(self.goto[state][lhs])

parser = LRParser()
tokens = ['num', '+', 'num', '*', 'num']
result = parser.parse(tokens)
print('Accepted' if result else 'Rejected')
Output
Accepted
🎯

When to Use

Use an LR parser when you need a reliable and efficient way to check if a program's code follows the grammar rules of a language. It is especially useful in compiler design for languages with complex syntax because it can handle a wide variety of grammar types.

Real-world compilers for languages like C, C++, and Java often use LR parsers or their variants because they can detect errors early and provide clear feedback. If you are building a new programming language or a tool that processes code, an LR parser is a strong choice for the syntax analysis step.

Key Points

  • LR parsers read input left to right and build a rightmost derivation in reverse.
  • They use a stack and tables to decide when to shift (read) or reduce (combine) tokens.
  • LR parsers can handle a large class of grammars, including many programming languages.
  • They are commonly used in compiler syntax analysis for their power and efficiency.

Key Takeaways

An LR parser efficiently analyzes programming language syntax by reading left to right and using a stack-based approach.
It decides actions using tables to shift tokens or reduce them based on grammar rules.
LR parsers are powerful and handle complex grammars, making them ideal for compiler design.
They help detect syntax errors early and provide clear structure for code analysis.