Python Program to Compress a String Using Run-Length Encoding
def compress_string(s): compressed = ''; count = 1; for i in range(1, len(s)+1): if i < len(s) and s[i] == s[i-1]: count += 1 else: compressed += s[i-1] + str(count); count = 1; return compressed.Examples
How to Think About It
Algorithm
Code
def compress_string(s): if not s: return "" compressed = "" count = 1 for i in range(1, len(s)): if s[i] == s[i - 1]: count += 1 else: compressed += s[i - 1] + str(count) count = 1 compressed += s[-1] + str(count) return compressed print(compress_string("aaabbc")) print(compress_string("abcd")) print(compress_string(""))
Dry Run
Let's trace the string 'aaabbc' through the code
Initialize variables
compressed = '', count = 1
Compare characters at index 1 and 0
s[1] = 'a', s[0] = 'a', same -> count = 2
Compare characters at index 2 and 1
s[2] = 'a', s[1] = 'a', same -> count = 3
Compare characters at index 3 and 2
s[3] = 'b', s[2] = 'a', different -> compressed = 'a3', count = 1
Compare characters at index 4 and 3
s[4] = 'b', s[3] = 'b', same -> count = 2
Compare characters at index 5 and 4
s[5] = 'c', s[4] = 'b', different -> compressed = 'a3b2', count = 1
End of string reached
Add last char and count: compressed = 'a3b2c1'
| Index | Current Char | Previous Char | Count | Compressed String |
|---|---|---|---|---|
| 1 | a | a | 2 | |
| 2 | a | a | 3 | |
| 3 | b | a | 1 | a3 |
| 4 | b | b | 2 | a3 |
| 5 | c | b | 1 | a3b2 |
Why This Works
Step 1: Counting consecutive characters
The code counts how many times each character repeats in a row using a counter variable.
Step 2: Building the compressed string
When a new character appears, the previous character and its count are added to the result string.
Step 3: Handling the last character
After the loop ends, the last character and its count are added to ensure all characters are included.
Alternative Approaches
from itertools import groupby def compress_string_itertools(s): return ''.join(char + str(len(list(group))) for char, group in groupby(s)) print(compress_string_itertools('aaabbc'))
import re def compress_string_regex(s): return ''.join(f'{m.group(0)[0]}{len(m.group(0))}' for m in re.finditer(r'(.)\1*', s)) print(compress_string_regex('aaabbc'))
Complexity: O(n) time, O(n) space
Time Complexity
The program loops through the string once, so it takes linear time proportional to the string length.
Space Complexity
The compressed string can be at most twice the length of the input (character plus count), so space is linear.
Which Approach is Fastest?
The manual loop and itertools.groupby both run in O(n) time; itertools is concise but may have slight overhead.
| Approach | Time | Space | Best For |
|---|---|---|---|
| Manual loop | O(n) | O(n) | Clear logic and beginner-friendly |
| itertools.groupby | O(n) | O(n) | Concise and Pythonic |
| Regular expressions | O(n) | O(n) | When comfortable with regex |