How to Use itertools.groupby in Python: Simple Guide
Use itertools.groupby to group consecutive items in an iterable by a key function. It returns an iterator of (key, group) pairs in which each group is itself an iterator, so sort the input by the same key first if you want all items with equal keys to end up in one group.
Syntax
The itertools.groupby function groups consecutive elements in an iterable based on a key function.
It returns an iterator of pairs: each pair has a key and a group iterator of items matching that key.
Basic syntax:
groupby(iterable, key=None)
iterable: The data to group.
key: A function to compute the grouping key for each element. Defaults to identity (element itself).
    import itertools

    # Sample input so the snippet runs; any iterable works
    iterable = [1, 1, 2, 3, 3]

    # groupby syntax
    groups = itertools.groupby(iterable, key=None)

    for key, group in groups:
        # key is the grouping key
        # group is an iterator of grouped items
        pass
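To make the default key=None concrete, here is a minimal sketch that counts runs of identical characters. The sample string and the name run_lengths are illustrative assumptions, not part of the original guide.

```python
from itertools import groupby

text = "aaabccdd"  # hypothetical sample input

# With no key function, each element is its own key,
# so every run of identical characters forms one group.
run_lengths = [(char, len(list(run))) for char, run in groupby(text)]

print(run_lengths)
# [('a', 3), ('b', 1), ('c', 2), ('d', 2)]
```

Each group is turned into a list only to count it; the list is discarded immediately afterwards.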
Example
This example groups a sorted list of fruits by their first letter.
It shows how groupby returns keys and groups of items.
    import itertools

    fruits = ['apple', 'apricot', 'banana', 'blueberry', 'cherry', 'clementine']

    # Sort by first letter to group correctly
    fruits.sort(key=lambda x: x[0])

    groups = itertools.groupby(fruits, key=lambda x: x[0])

    for letter, group in groups:
        print(f"{letter}: {list(group)}")
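A common follow-up is to keep the groups around rather than print them. The sketch below is one possible way to do that, assuming a helper named first_letter (hypothetical) that is shared between sorting and grouping so the two cannot drift apart.

```python
from itertools import groupby


def first_letter(word):
    """Grouping key: the first character of the word."""
    return word[0]


# Deliberately unsorted sample data
fruits = ['cherry', 'apple', 'banana', 'apricot', 'clementine', 'blueberry']

# sorted() returns a new list and leaves fruits untouched. Using the same
# key function for sorting and grouping keeps the two in agreement.
by_letter = {
    letter: list(group)
    for letter, group in groupby(sorted(fruits, key=first_letter), key=first_letter)
}

print(by_letter)
# {'a': ['apple', 'apricot'], 'b': ['banana', 'blueberry'], 'c': ['cherry', 'clementine']}
```

Because Python's sort is stable, items within each group keep their original relative order.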
Common Pitfalls
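1. Sort the input by the same key function you pass to groupby. groupby only starts a new group when the key changes, so it groups consecutive matching items, not all matching items.
2. Each group is an iterator that shares the underlying iterable with groupby. Consume it (or copy it with list(group)) before moving to the next group; once groupby advances, the previous group yields nothing.
3. On unsorted data the same key can appear several times in the output, once for every separate run of that key.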
    import itertools

    data = ['apple', 'banana', 'apricot', 'blueberry']

    # Wrong: data not sorted by first letter
    groups = itertools.groupby(data, key=lambda x: x[0])
    for key, group in groups:
        print(f"{key}: {list(group)}")
    # a: ['apple']
    # b: ['banana']
    # a: ['apricot']
    # b: ['blueberry']

    # Correct: sort data first
    print('\nAfter sorting:')
    data.sort(key=lambda x: x[0])
    groups = itertools.groupby(data, key=lambda x: x[0])
    for key, group in groups:
        print(f"{key}: {list(group)}")
    # a: ['apple', 'apricot']
    # b: ['banana', 'blueberry']
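The second pitfall, the group iterator going stale, is easier to see in isolation. The following is a small sketch of that behavior; the variable names (stale, fresh, safe) and the first_letter helper are illustrative assumptions.

```python
from itertools import groupby


def first_letter(word):
    return word[0]


data = ['apple', 'apricot', 'banana', 'blueberry']  # already sorted by first letter

pairs = groupby(data, key=first_letter)
key_a, group_a = next(pairs)
key_b, group_b = next(pairs)  # advancing groupby invalidates group_a

stale = list(group_a)  # the earlier group no longer yields anything
fresh = list(group_b)  # the current group is still readable

print(stale)  # []
print(fresh)  # ['banana', 'blueberry']

# Safer pattern: copy each group into a list while it is current.
safe = [(key, list(group)) for key, group in groupby(data, key=first_letter)]
print(safe)
# [('a', ['apple', 'apricot']), ('b', ['banana', 'blueberry'])]
```

Copying costs memory proportional to the group size, so it is worth doing only when the items are needed after the loop has moved on.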
Quick Reference
- Input: Iterable sorted by the key function.
- Output: Iterator of (key, group) pairs.
- Group: An iterator of items with the same key.
- Key function: Defaults to identity if not provided.
- Use case: Group consecutive items sharing a property.
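Sometimes the consecutive-only behavior is exactly what is wanted, and sorting would destroy the information. As a rough sketch, assuming a hypothetical list of readings and an arbitrary threshold of 10, groupby can split a sequence into runs above and below the threshold while preserving order:

```python
from itertools import groupby

THRESHOLD = 10  # arbitrary cut-off chosen for illustration


def is_high(value):
    """Grouping key: True when the reading exceeds the threshold."""
    return value > THRESHOLD


# Hypothetical readings in the order they were taken (not sorted on purpose)
readings = [3, 5, 12, 15, 4, 11]

runs = [(high, list(group)) for high, group in groupby(readings, key=is_high)]

print(runs)
# [(False, [3, 5]), (True, [12, 15]), (False, [4]), (True, [11])]
```

The keys repeat here by design: each pair describes one uninterrupted stretch of the sequence.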