str.contains() checks whether each text value in a column contains a certain pattern or word. It returns True or False for every row, so we can quickly check or filter data based on text.
str.contains() for pattern matching in Pandas
DataFrame['column_name'].str.contains(pattern, case=True, na=False, regex=True)
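pattern is the text or regular expression you want to find.
case=True means matching is case sensitive. Use case=False to ignore case.
na sets the result for missing values. Use na=False to treat them as non-matches.
regex=True treats pattern as a regular expression. Use regex=False to match it as plain text.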
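The effect of the case parameter can be sketched on a small Series. The names below are made-up sample data, not taken from any real dataset.

```python
import pandas as pd

# Hypothetical sample data with one missing value
names = pd.Series(["John Smith", "johnny", "Alice", None])

# Default: case sensitive, so only "John Smith" matches
case_sensitive = names.str.contains("John", na=False)

# case=False: "johnny" now matches too
case_insensitive = names.str.contains("John", case=False, na=False)

print(case_sensitive.tolist())    # [True, False, False, False]
print(case_insensitive.tolist())  # [True, True, False, False]
```

In both calls na=False turns the missing entry into False instead of leaving it missing.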
df['Name'].str.contains('John')
df['Email'].str.contains('@gmail.com', case=False, regex=False)
df['Description'].str.contains('organic|natural', regex=True)
df['Phone'].str.contains(r'^\d{3}', regex=True)
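Because regex=True is the default, characters such as '.' have a special meaning inside the pattern. The sketch below uses invented email addresses and phone numbers to show the difference between matching a pattern as a regular expression and matching it as plain text.

```python
import re

import pandas as pd

# Hypothetical sample data
emails = pd.Series(["ann@gmail.com", "bob@gmailxcom.org", "cy@yahoo.com"])

# As a regex, '.' matches any character, so "gmailxcom" also matches
as_regex = emails.str.contains("@gmail.com")

# Two ways to match the dot literally
as_literal = emails.str.contains("@gmail.com", regex=False)
as_escaped = emails.str.contains(re.escape("@gmail.com"))

print(as_regex.tolist())    # [True, True, False]
print(as_literal.tolist())  # [True, False, False]
print(as_escaped.tolist())  # [True, False, False]

# A raw string (r'...') keeps the backslash in \d intact
phones = pd.Series(["555-1234", "x555-1234"])
starts_with_digits = phones.str.contains(r"^\d{3}")
print(starts_with_digits.tolist())  # [True, False]
```

regex=False is usually the simpler choice when the pattern is a fixed piece of text rather than a real regular expression.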
This program creates a table of products with descriptions. It then finds all products whose description mentions 'organic' or 'natural', ignoring case. It prints the original and filtered tables.
import pandas as pd

# Create a sample DataFrame
products = pd.DataFrame({
    'Product': ['Apple Juice', 'Orange Juice', 'Organic Milk', 'Natural Honey', 'Regular Milk'],
    'Description': ['Fresh apple juice', 'Sweet orange juice', '100% organic milk', 'Pure natural honey', None]
})

print('Original DataFrame:')
print(products)

# Find rows where Description contains 'organic' or 'natural' ignoring case
pattern = 'organic|natural'
filtered = products[products['Description'].str.contains(pattern, case=False, na=False, regex=True)]

print('\nFiltered DataFrame (contains "organic" or "natural"):')
print(filtered)
Time complexity: O(n) where n is the number of rows, because it checks each row's text once. The work per row also grows with the length of the text.
Space complexity: O(n) for the boolean mask created during filtering.
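Common mistake: Forgetting na=False. For a missing value str.contains() can return a missing value instead of True or False, and a mask that contains missing values raises an error when it is used to filter rows.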
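The missing-value problem can be reproduced with a minimal sketch. The data is invented, and dtype=object is set explicitly as an assumption, because the default result for missing values may differ between pandas versions and string dtypes.

```python
import pandas as pd

# Hypothetical sample data; dtype=object is an assumption made so that
# missing values stay missing in the result.
desc = pd.Series(["100% organic milk", None, "Regular milk"], dtype=object)

mask_default = desc.str.contains("organic")           # missing stays missing
mask_filled = desc.str.contains("organic", na=False)  # missing becomes False

print(mask_default.isna().tolist())  # [False, True, False]
print(mask_filled.tolist())          # [True, False, False]

# A mask with a missing entry cannot be used for filtering
try:
    desc[mask_default]
except ValueError as err:
    print("Filtering failed:", err)

# The filled mask works as a normal boolean filter
print(desc[mask_filled].tolist())    # ['100% organic milk']
```

Passing na=True instead would keep the rows with missing text, which is occasionally what a cleaning step needs.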
Use str.contains() when you want to check if text includes a pattern. Use other methods like str.startswith() if you only want to check the start.
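A short comparison of the two methods, using made-up product names. str.startswith() has no case option, so this sketch lowercases the text first, which is one possible approach rather than the only one.

```python
import pandas as pd

# Hypothetical sample data
products = pd.Series(["Organic Milk", "Milk, organic", "Regular Milk"])

# Matches "organic" anywhere in the text
anywhere = products.str.contains("organic", case=False, na=False)

# Matches only when the text begins with "organic"
at_start = products.str.lower().str.startswith("organic", na=False)

print(anywhere.tolist())  # [True, True, False]
print(at_start.tolist())  # [True, False, False]
```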
str.contains() helps find text patterns in columns.
It supports case sensitivity and regular expressions.
Handle missing data with na=False when filtering, so missing values count as non-matches and do not cause errors.