PythonProgramBeginner · 2 min read

Python Program to Compress a String Using Run-Length Encoding

You can compress a string in Python using run-length encoding by iterating through the string, counting consecutive characters, and storing each character with its count as in

def compress_string(s): compressed = ''; count = 1; for i in range(1, len(s)+1): if i < len(s) and s[i] == s[i-1]: count += 1 else: compressed += s[i-1] + str(count); count = 1; return compressed

📋

Examples

Inputaaabbc

Outputa3b2c1

Inputabcd

Outputa1b1c1d1

Input

Output

🧠

How to Think About It

To compress a string, look at each character and count how many times it repeats in a row. When the character changes, write down the character and the count. Repeat this until the end of the string.

📐

Algorithm

Get the input string.

Initialize an empty result string and a count variable set to 1.

Loop through the string from the second character to the end.

If the current character is the same as the previous one, increase the count.

If it is different, add the previous character and count to the result, then reset count to 1.

After the loop, add the last character and its count to the result.

Return the compressed string.

💻

Code

python

def compress_string(s):
    if not s:
        return ""
    compressed = ""
    count = 1
    for i in range(1, len(s)):
        if s[i] == s[i - 1]:
            count += 1
        else:
            compressed += s[i - 1] + str(count)
            count = 1
    compressed += s[-1] + str(count)
    return compressed

print(compress_string("aaabbc"))
print(compress_string("abcd"))
print(compress_string(""))

Output

a3b2c1 a1b1c1d1

🔍

Dry Run

Let's trace the string 'aaabbc' through the code

Initialize variables

compressed = '', count = 1

Compare characters at index 1 and 0

s[1] = 'a', s[0] = 'a', same -> count = 2

Compare characters at index 2 and 1

s[2] = 'a', s[1] = 'a', same -> count = 3

Compare characters at index 3 and 2

s[3] = 'b', s[2] = 'a', different -> compressed = 'a3', count = 1

Compare characters at index 4 and 3

s[4] = 'b', s[3] = 'b', same -> count = 2

Compare characters at index 5 and 4

s[5] = 'c', s[4] = 'b', different -> compressed = 'a3b2', count = 1

End of string reached

Add last char and count: compressed = 'a3b2c1'

Index	Current Char	Previous Char	Count	Compressed String
1	a	a	2
2	a	a	3
3	b	a	1	a3
4	b	b	2	a3
5	c	b	1	a3b2

💡

Why This Works

Step 1: Counting consecutive characters

The code counts how many times each character repeats in a row using a counter variable.

Step 2: Building the compressed string

When a new character appears, the previous character and its count are added to the result string.

Step 3: Handling the last character

After the loop ends, the last character and its count are added to ensure all characters are included.

🔄

Alternative Approaches

Using itertools.groupby

python

from itertools import groupby

def compress_string_itertools(s):
    return ''.join(char + str(len(list(group))) for char, group in groupby(s))

print(compress_string_itertools('aaabbc'))

This method is shorter and uses Python's built-in grouping but may be less clear for beginners.

Using regular expressions

python

import re

def compress_string_regex(s):
    return ''.join(f'{m.group(0)[0]}{len(m.group(0))}' for m in re.finditer(r'(.)\1*', s))

print(compress_string_regex('aaabbc'))

This uses regex to find repeated characters but requires understanding of regular expressions.

⚡

Complexity: O(n) time, O(n) space

Time Complexity

The program loops through the string once, so it takes linear time proportional to the string length.

Space Complexity

The compressed string can be at most twice the length of the input (character plus count), so space is linear.

Which Approach is Fastest?

The manual loop and itertools.groupby both run in O(n) time; itertools is concise but may have slight overhead.

Approach	Time	Space	Best For
Manual loop	O(n)	O(n)	Clear logic and beginner-friendly
itertools.groupby	O(n)	O(n)	Concise and Pythonic
Regular expressions	O(n)	O(n)	When comfortable with regex

💡

Use run-length encoding to compress strings by counting repeated characters in a row.

⚠️

Forgetting to add the last character and its count after the loop ends.