Compiler Design · Concept · Beginner · 3 min read

What is Pattern in Lexical Analysis: Definition and Examples

In lexical analysis, a pattern is a rule or description that defines the structure of a token, such as keywords, identifiers, or numbers. It helps the lexer recognize and group characters from source code into meaningful units called tokens.
⚙️

How It Works

Think of lexical analysis as reading a sentence and breaking it into words. A pattern is like a rule that tells you what a valid word looks like. For example, a pattern for an identifier might say: "start with a letter, followed by letters or digits."
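That identifier rule translates almost word-for-word into a regular expression. A minimal sketch (the pattern below is one common convention; languages vary in exactly which characters they allow):

```python
import re

# "Start with a letter (or underscore), followed by letters or digits."
identifier = re.compile(r'[a-zA-Z_][a-zA-Z0-9_]*')

print(bool(identifier.fullmatch("var1")))   # True: letter, then letters/digits
print(bool(identifier.fullmatch("1var")))   # False: starts with a digit
```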

The lexer uses these patterns to scan the source code from left to right. When it finds characters that match a pattern, it groups them into a token. This process is similar to how you recognize words in a sentence by their shape and letters.

💻

Example

This example shows simple patterns for tokens like identifiers and numbers using regular expressions. The code matches input strings to these patterns and prints the token type.

python
import re

def lexical_analysis(text):
    # Each token type maps to a pattern. Patterns are tried in
    # dictionary order, and the first one that matches wins.
    patterns = {
        'IDENTIFIER': r'[a-zA-Z_][a-zA-Z0-9_]*',
        'NUMBER': r'\d+(\.\d+)?',
        'PLUS': r'\+',
        'WHITESPACE': r'\s+'
    }
    # Compile each pattern once, rather than on every scan step.
    compiled = {name: re.compile(p) for name, p in patterns.items()}

    pos = 0
    while pos < len(text):
        match = None
        for token_type, regex in compiled.items():
            match = regex.match(text, pos)
            if match:
                if token_type != 'WHITESPACE':  # skip whitespace
                    print(f"Token: {token_type}, Value: '{match.group()}'")
                pos = match.end()
                break
        if not match:
            print(f"Unknown character: {text[pos]}")
            pos += 1

# Test the lexer
lexical_analysis("var1 + 42")
Output
Token: IDENTIFIER, Value: 'var1'
Token: PLUS, Value: '+'
Token: NUMBER, Value: '42'
🎯

When to Use

Patterns in lexical analysis are used when building compilers or interpreters to convert raw source code into tokens. They help identify keywords, operators, numbers, and identifiers clearly and efficiently.

In real life, any tool that reads structured text—like code editors, syntax highlighters, or data parsers—uses patterns to understand and process input correctly.
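A syntax highlighter is a good illustration: it only needs to classify tokens, not compile them. The sketch below uses a hypothetical, deliberately tiny keyword set to show the idea; a real highlighter would cover the full language:

```python
import re

# Illustrative subset of keywords -- a real tool would list them all.
KEYWORDS = {"if", "else", "while", "return"}
WORD = re.compile(r'[a-zA-Z_][a-zA-Z0-9_]*')

def classify(line):
    """Label each word in a line as a KEYWORD or an IDENTIFIER."""
    tokens = []
    for match in WORD.finditer(line):
        word = match.group()
        kind = "KEYWORD" if word in KEYWORDS else "IDENTIFIER"
        tokens.append((kind, word))
    return tokens

print(classify("if total return x"))
# [('KEYWORD', 'if'), ('IDENTIFIER', 'total'), ('KEYWORD', 'return'), ('IDENTIFIER', 'x')]
```

The same pattern recognizes both kinds of word; only the lookup into the keyword set decides which token type to emit.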

Key Points

  • A pattern defines the shape or structure of a token in source code.
  • Lexical analyzers use patterns to split code into meaningful tokens.
  • Patterns are often expressed using regular expressions.
  • They are essential for compilers, interpreters, and text processing tools.

Key Takeaways

A pattern describes how to recognize a token in source code during lexical analysis.
Lexers use patterns to group characters into tokens like identifiers, numbers, and operators.
Patterns are commonly written as regular expressions for easy matching.
Using patterns helps tools understand and process programming languages efficiently.