Python Program to Find Duplicate in List
duplicates = set(x for x in lst if lst.count(x) > 1).Examples
How to Think About It
Algorithm
Code
def find_duplicates(lst): duplicates = set() seen = set() for item in lst: if item in seen: duplicates.add(item) else: seen.add(item) return duplicates # Example usage my_list = [1, 2, 2, 3, 4, 4, 4] print(find_duplicates(my_list))
Dry Run
Let's trace the list [1, 2, 2, 3, 4, 4, 4] through the code.
Start with empty sets
duplicates = set(), seen = set()
Check item 1
1 not in seen, add 1 to seen -> seen = {1}
Check item 2
2 not in seen, add 2 to seen -> seen = {1, 2}
Check next item 2
2 in seen, add 2 to duplicates -> duplicates = {2}
Check item 3
3 not in seen, add 3 to seen -> seen = {1, 2, 3}
Check item 4
4 not in seen, add 4 to seen -> seen = {1, 2, 3, 4}
Check next item 4
4 in seen, add 4 to duplicates -> duplicates = {2, 4}
Check next item 4
4 in seen, add 4 to duplicates (already there) -> duplicates = {2, 4}
| Item | Seen Set | Duplicates Set |
|---|---|---|
| 1 | {1} | {} |
| 2 | {1, 2} | {} |
| 2 | {1, 2} | {2} |
| 3 | {1, 2, 3} | {2} |
| 4 | {1, 2, 3, 4} | {2} |
| 4 | {1, 2, 3, 4} | {2, 4} |
| 4 | {1, 2, 3, 4} | {2, 4} |
Why This Works
Step 1: Track seen items
We use a seen set to remember which items we have already checked.
Step 2: Identify duplicates
If an item is already in seen, it means it appeared before, so we add it to duplicates.
Step 3: Return duplicates
At the end, the duplicates set contains all items that appeared more than once.
Alternative Approaches
def find_duplicates(lst): return set(x for x in lst if lst.count(x) > 1) print(find_duplicates([1, 2, 2, 3, 4, 4, 4]))
from collections import Counter def find_duplicates(lst): counts = Counter(lst) return {item for item, count in counts.items() if count > 1} print(find_duplicates([1, 2, 2, 3, 4, 4, 4]))
Complexity: O(n) time, O(n) space
Time Complexity
The main method loops through the list once, so it takes O(n) time where n is the list size.
Space Complexity
It uses extra sets to store seen items and duplicates, so space is O(n) in the worst case.
Which Approach is Fastest?
Using collections.Counter or tracking seen items with sets is faster than counting each item repeatedly.
| Approach | Time | Space | Best For |
|---|---|---|---|
| Tracking seen items with sets | O(n) | O(n) | Large lists, efficient |
| Using list count method | O(n²) | O(n) | Small lists, simple code |
| Using collections.Counter | O(n) | O(n) | Clean and efficient counting |