Challenge - 5 Problems
String Operations Mastery
❓ Predict Output
intermediate
Output of pandas string split and expand
What is the output of this code snippet that splits a column of full names into first and last names?
Pandas
import pandas as pd
df = pd.DataFrame({'Name': ['Alice Smith', 'Bob Jones', 'Carol Lee']})
df[['First', 'Last']] = df['Name'].str.split(' ', expand=True)
df
💡 Hint
Look at how the split with expand=True creates new columns in the DataFrame.
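Explanation
str.split(' ', expand=True) returns a DataFrame with one column per piece instead of a Series of lists. Assigning it to df[['First', 'Last']] adds two columns, so the output has the columns Name, First, and Last; the first row, for example, is 'Alice Smith', 'Alice', 'Smith'.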
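The split above works because every name has exactly two parts. As a minimal sketch of one way to handle longer names, the example below uses hypothetical data (the name 'Mary Ann Lee' is not from the challenge) and passes n=1 so that only the first space is used as a split point.

```python
import pandas as pd

# Hypothetical data: the second name has three parts.
df = pd.DataFrame({'Name': ['Alice Smith', 'Mary Ann Lee']})

# n=1 splits only at the first space, so exactly two columns come back
# and the assignment to ['First', 'Last'] lines up.
df[['First', 'Last']] = df['Name'].str.split(' ', n=1, expand=True)

print(df)
```

Here everything after the first space lands in Last, which may or may not be the right rule for a given dataset.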
❓ data_output
intermediate
Count of rows containing a substring
Given this DataFrame, how many rows contain the substring 'cat' in the 'Animal' column?
Pandas
import pandas as pd
df = pd.DataFrame({'Animal': ['cat', 'dog', 'caterpillar', 'bird', 'scatter']})
count = df['Animal'].str.contains('cat').sum()
count
💡 Hint
Check which strings have 'cat' anywhere inside them.
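Explanation
str.contains('cat') returns True wherever 'cat' appears anywhere in the string, and .sum() counts the True values. 'cat', 'caterpillar', and 'scatter' all contain 'cat', so the count is 3.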
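To make the "anywhere inside the string" behavior concrete, the sketch below compares three kinds of match on the same data. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'Animal': ['cat', 'dog', 'caterpillar', 'bird', 'scatter']})
animals = df['Animal']

# Substring match anywhere in the string (regex=False treats 'cat' literally).
anywhere = animals.str.contains('cat', regex=False).sum()

# Only strings that begin with 'cat'.
at_start = animals.str.startswith('cat').sum()

# Only strings that are exactly 'cat'.
exact = (animals == 'cat').sum()

print(anywhere, at_start, exact)  # 3 2 1
```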
🔧 Debug
advanced
Identify the error in string replacement
What error does this code raise when trying to replace 'dog' with 'cat' in the 'Animal' column?
Pandas
import pandas as pd
df = pd.DataFrame({'Animal': ['dog', 'dogfish', 'hotdog']})
df['Animal'].str.replace('dog')
💡 Hint
Check the required arguments for str.replace method.
Explanation
Series.str.replace needs both the pattern to replace and the replacement string. Only the pattern 'dog' is given here, so the call raises a TypeError.
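For comparison, here is a sketch of the call with both arguments supplied. The second variant, which anchors the pattern to the start of the string, is an added illustration and not part of the challenge.

```python
import pandas as pd

df = pd.DataFrame({'Animal': ['dog', 'dogfish', 'hotdog']})

# Literal replacement: every occurrence of 'dog' becomes 'cat'.
replaced = df['Animal'].str.replace('dog', 'cat', regex=False)
print(replaced.tolist())  # ['cat', 'catfish', 'hotcat']

# Regex replacement: '^dog' matches 'dog' only at the start of a string.
prefix_only = df['Animal'].str.replace(r'^dog', 'cat', regex=True)
print(prefix_only.tolist())  # ['cat', 'catfish', 'hotdog']
```

Passing regex explicitly avoids depending on the default, which has changed between pandas versions.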
❓ visualization
advanced
Visualizing string length distribution
Which option produces a histogram of the lengths of strings in the 'Words' column?
Pandas
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'Words': ['apple', 'banana', 'pear', 'kiwi', 'grape']})
lengths = df['Words'].str.len()
plt.hist(lengths)
plt.xlabel('Length of word')
plt.ylabel('Frequency')
plt.title('Histogram of word lengths')
plt.show()
💡 Hint
Histogram groups numeric values into bins and shows frequency.
Explanation
str.len() returns the length of each word (5, 6, 4, 4, 5), and plt.hist plots how often each length occurs.
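The frequencies the histogram displays can also be checked without plotting. The sketch below tabulates them with value_counts, which is one possible way to verify the picture and not something the challenge itself uses.

```python
import pandas as pd

df = pd.DataFrame({'Words': ['apple', 'banana', 'pear', 'kiwi', 'grape']})

# Length of each word: 5, 6, 4, 4, 5.
lengths = df['Words'].str.len()

# Count how many words have each length, ordered by length.
counts = lengths.value_counts().sort_index()
print(counts.to_dict())  # {4: 2, 5: 2, 6: 1}
```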
🧠 Conceptual
expert
Why string operations are crucial in data science
Which statement best explains why string operations matter in data science?
💡 Hint
Think about the role of text data in real-world datasets.
Explanation
Text data is common and often messy. String operations help clean, extract, and transform this data for meaningful analysis.
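As a small illustration of that kind of cleanup, the sketch below normalizes a few hypothetical, inconsistently formatted names. The data and variable names are made up for the example.

```python
import pandas as pd

# Hypothetical messy input: stray spaces and inconsistent capitalization.
raw = pd.Series(['  Alice SMITH ', 'bob jones', 'CAROL   lee'])

clean = (
    raw.str.strip()                         # drop leading/trailing whitespace
       .str.replace(r'\s+', ' ', regex=True)  # collapse runs of whitespace
       .str.title()                         # capitalize each word
)

print(clean.tolist())  # ['Alice Smith', 'Bob Jones', 'Carol Lee']
```

Chaining the .str methods keeps each cleaning step visible and easy to reorder.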