Tokens, patterns, and lexemes in Compiler Design - Time & Space Complexity
When analyzing tokens, patterns, and lexemes, we want to understand how the time to recognize these elements grows as the input size increases.
We ask: How does the work of scanning text for tokens change when the input gets longer?
Analyze the time complexity of the following lexical analysis snippet.
```
for each character in input:
    if character matches pattern:
        form lexeme
        create token
    else:
        move to next character
```
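The pseudocode above can be sketched as a minimal Python scanner. This is a simplified illustration, not a production lexer: it assumes a single pattern (identifier-like runs of alphanumeric characters) and the function name `scan` is hypothetical.

```python
def scan(text):
    """Single-pass scanner: one pattern check per character, O(n) overall."""
    tokens = []
    lexeme = ""
    for ch in text:                      # one iteration per input character
        if ch.isalnum():                 # the "pattern": identifier characters
            lexeme += ch                 # extend the current lexeme
        else:
            if lexeme:                   # pattern ended: emit the token
                tokens.append(("ID", lexeme))
                lexeme = ""
    if lexeme:                           # flush a trailing lexeme at end of input
        tokens.append(("ID", lexeme))
    return tokens

print(scan("x1 = y2"))  # → [('ID', 'x1'), ('ID', 'y2')]
```

Each character is visited exactly once, which is the source of the linear growth analyzed below.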
This code scans each character of the input to find lexemes matching token patterns.
Look for repeated actions in the code.
- Primary operation: Checking each character against token patterns.
- How many times: Once for every character in the input.
As the input text gets longer, the number of characters to check grows directly with its length.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 checks |
| 100 | About 100 checks |
| 1000 | About 1000 checks |
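The table can be verified with a short instrumented loop. This is a sketch for counting purposes only; the helper name `count_checks` is hypothetical.

```python
def count_checks(text):
    """Counts one pattern check per character of input."""
    checks = 0
    for ch in text:          # one pattern check per character
        checks += 1          # the match outcome doesn't change the count
        _ = ch.isalnum()
    return checks

for n in (10, 100, 1000):
    print(n, count_checks("a" * n))   # checks == n for every input size
```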
Pattern observation: The work grows linearly with input size.
Time Complexity: O(n)
This means the time to process tokens grows directly in proportion to the input length.
[X] Wrong: "Checking tokens takes the same time no matter how long the input is."
[OK] Correct: Each character must be examined, so longer inputs always take more time.
Understanding how scanning input grows with size helps you explain how compilers handle code efficiently.
What if the code used nested loops to check multiple patterns for each character? How would the time complexity change?
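One way to sketch that scenario in Python, assuming a hypothetical list of k single-character patterns tried in order (real lexers typically compile all patterns into a single automaton instead):

```python
import re

# Hypothetical pattern list for illustration; k = len(PATTERNS).
PATTERNS = [("NUMBER", re.compile(r"\d")),
            ("LETTER", re.compile(r"[a-zA-Z]")),
            ("OP", re.compile(r"[+\-*/=]"))]

def classify(text):
    """Tries every pattern for every character: O(n * k) checks."""
    tokens = []
    for ch in text:                      # outer loop: n characters
        for name, pat in PATTERNS:       # inner loop: up to k patterns
            if pat.fullmatch(ch):
                tokens.append((name, ch))
                break                    # stop at the first matching pattern
    return tokens

print(classify("a+1"))  # → [('LETTER', 'a'), ('OP', '+'), ('NUMBER', '1')]
```

With n characters and k patterns, the worst case performs n × k checks. If k is a fixed constant, this is still O(n) overall, but the per-character cost now scales with the number of patterns.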