
str.contains() for pattern matching in Pandas - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of str.contains() with case sensitivity
What is the output of the following code snippet?
Pandas
import pandas as pd

df = pd.DataFrame({'text': ['Apple', 'banana', 'Cherry', 'date']})
result = df['text'].str.contains('a')
print(result.tolist())
A. [False, True, False, True]
B. [True, True, True, True]
C. [True, True, False, True]
D. [True, False, False, True]
💡 Hint
Remember that str.contains() is case sensitive by default.
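To see the default case sensitivity in action without giving away the answer, here is a minimal sketch on a separate, made-up Series (the values 'Ant', 'bat', 'CAT' are illustrative and not part of the challenge). It compares the default behaviour with `case=False`.

```python
import pandas as pd

# Illustrative data only; not the challenge DataFrame
s = pd.Series(['Ant', 'bat', 'CAT'])

# Default (case=True): only a lowercase 'a' counts as a match
sensitive = s.str.contains('a').tolist()

# case=False: 'a' and 'A' both count as a match
insensitive = s.str.contains('a', case=False).tolist()

print(sensitive)    # [False, True, False]
print(insensitive)  # [True, True, True]
```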
Data Output
intermediate
Filter rows using str.contains() with regex
Given the DataFrame below, which rows remain after filtering with str.contains('^b.*a$', regex=True)?
Pandas
import pandas as pd

df = pd.DataFrame({'words': ['banana', 'beta', 'bat', 'boa', 'baba', 'cab']})
filtered = df[df['words'].str.contains('^b.*a$', regex=True)]
print(filtered['words'].tolist())
A. ['banana', 'beta', 'baba']
B. ['banana', 'beta', 'bat', 'baba']
C. ['banana', 'beta', 'baba', 'bat', 'boa']
D. ['banana', 'beta', 'boa', 'baba']
💡 Hint
The regex '^b.*a$' means strings starting with 'b' and ending with 'a'.
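Because str.contains() searches anywhere in the string, the `^` and `$` anchors are what tie the pattern to the start and end. The sketch below uses a different, made-up word list and a hypothetical pattern '^t.*a$' to contrast anchored and unanchored matching.

```python
import pandas as pd

# Illustrative data only; not the challenge DataFrame
df = pd.DataFrame({'words': ['tuna', 'tea', 'ten', 'data']})

# Anchored: the whole string must start with 't' and end with 'a'
anchored = df[df['words'].str.contains('^t.*a$', regex=True)]['words'].tolist()

# Unanchored: a 't' followed later by an 'a' anywhere in the string is enough
unanchored = df[df['words'].str.contains('t.*a', regex=True)]['words'].tolist()

print(anchored)    # ['tuna', 'tea']
print(unanchored)  # ['tuna', 'tea', 'data']
```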
🔧 Debug
advanced
Identify the error in str.contains() usage
What error does the following code raise?
Pandas
import pandas as pd

df = pd.DataFrame({'col': ['abc', 'def', 'ghi']})
result = df['col'].str.contains(123)
print(result)
A. TypeError: expected string or bytes-like object
B. ValueError: invalid regex pattern
C. AttributeError: 'int' object has no attribute 'contains'
D. No error, outputs a boolean Series
💡 Hint
Check the type of the pattern argument passed to str.contains().
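If a pattern arrives as a non-string value, one common way to sidestep the type question is to convert it explicitly before searching. This is a minimal sketch on made-up data; the name `pattern_value` is hypothetical, and `regex=False` is used here only so the converted text is treated as a literal substring.

```python
import pandas as pd

# Illustrative data only; not the challenge DataFrame
s = pd.Series(['id123', 'id456'])

# Hypothetical pattern that happens to be an int
pattern_value = 123

# Convert to str first, then match it as a plain substring
matches = s.str.contains(str(pattern_value), regex=False).tolist()

print(matches)  # [True, False]
```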
🚀 Application
advanced
Use str.contains() to find rows with multiple patterns
You want to filter rows where the 'text' column contains either 'cat' or 'dog'. Which code correctly does this?
Pandas
import pandas as pd

df = pd.DataFrame({'text': ['catfish', 'dogma', 'bird', 'catalog', 'frog']})
A. df[df['text'].str.contains('cat&dog')]
B. df[df['text'].str.contains('cat|dog')]
C. df[df['text'].str.contains(['cat', 'dog'])]
D. df[df['text'].str.contains('cat,dog')]
💡 Hint
Use regex OR operator '|' inside the pattern string.
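When the search terms live in a list, the same `|` idea can be applied by joining them into one pattern. The sketch below assumes a made-up `terms` list and uses `re.escape` so that any regex metacharacters in a term are matched literally; the data is illustrative, not the challenge DataFrame.

```python
import re
import pandas as pd

# Illustrative data only; not the challenge DataFrame
s = pd.Series(['red apple', 'green pear', 'blue plum'])

# Hypothetical list of terms to search for
terms = ['apple', 'plum']

# Escape each term, then join with the regex OR operator
pattern = '|'.join(re.escape(t) for t in terms)

matches = s.str.contains(pattern).tolist()

print(pattern)  # apple|plum
print(matches)  # [True, False, True]
```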
🧠 Conceptual
expert
Understanding na parameter in str.contains()
What is the output of the following code?
Pandas
import pandas as pd
import numpy as np

df = pd.DataFrame({'text': ['apple', np.nan, 'banana', np.nan]})
result = df['text'].str.contains('a', na=False)
print(result.tolist())
A. [True, True, True, True]
B. [True, True, False, False]
C. [True, False, True, False]
D. [True, False, True, True]
💡 Hint
The na parameter controls the output for missing values.
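As a rough illustration of that hint, the sketch below runs the same search twice on a made-up Series with one missing value, once with `na=False` and once with `na=True`. Only the entry for the missing value changes between the two results.

```python
import numpy as np
import pandas as pd

# Illustrative data only; not the challenge DataFrame
s = pd.Series(['x-ray', np.nan, 'yes'])

# Missing values are reported as False
na_false = s.str.contains('x', na=False).tolist()

# Missing values are reported as True
na_true = s.str.contains('x', na=True).tolist()

print(na_false)  # [True, False, False]
print(na_true)   # [True, True, False]
```

Passing `na=False` is a common choice when the result is used as a boolean mask for filtering rows.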