Challenge - 5 Problems
Data Validation Master
❓ Predict Output
intermediate
Check for missing values in a DataFrame
What is the output of this code that checks for missing values in the DataFrame?
Pandas
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': ['x', None, 'y', 'z']
})
result = df.isnull().sum()
💡 Hint
Use the isnull() method to find missing values and sum() to count them per column.
Column 'A' has one missing value and column 'B' has one. isnull() marks each missing value as True, and sum() counts the True values per column, so result is a Series with A = 1 and B = 1.
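As a quick way to check the explanation above, the sketch below reruns the same DataFrame and also adds up the per-column counts. The names missing_per_column and total_missing are illustrative and not part of the original challenge.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': ['x', None, 'y', 'z'],
})

# isnull() yields a boolean frame; sum() counts the True entries per column.
missing_per_column = df.isnull().sum()
print(missing_per_column)

# Summing the per-column counts gives the number of missing cells overall.
total_missing = int(missing_per_column.sum())
print(total_missing)
```

Counting per column first makes it easy to see which columns need attention before deciding whether to drop or fill the gaps.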
❓ Data Output
intermediate
Count unique values per column
What is the output of this code that counts unique values in each column of the DataFrame?
Pandas
import pandas as pd

df = pd.DataFrame({
    'Color': ['red', 'blue', 'red', 'green'],
    'Shape': ['circle', 'square', 'circle', 'triangle']
})
unique_counts = df.nunique()
💡 Hint
The nunique() method counts distinct values per column.
The 'Color' column has three unique values: red, blue, green. The 'Shape' column has three unique values: circle, square, triangle. The repeated 'red' and 'circle' are each counted once, so nunique() returns 3 for both columns.
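The following sketch reruns the challenge data to confirm the counts, and adds value_counts() on one column to show where the repeated value comes from. The name color_frequencies is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Color': ['red', 'blue', 'red', 'green'],
    'Shape': ['circle', 'square', 'circle', 'triangle'],
})

# nunique() counts distinct values in each column.
unique_counts = df.nunique()
print(unique_counts)

# value_counts() shows how often each distinct value occurs,
# which explains why four rows yield only three unique colors.
color_frequencies = df['Color'].value_counts()
print(color_frequencies)
```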
🔧 Debug
advanced
Identify the error in data type validation
What happens when this code checks whether all values in column 'Age' are integers: does it raise an error, and what is the value of all_int?
Pandas
import pandas as pd

df = pd.DataFrame({'Age': [25, 30, 'thirty-five', 40]})
all_int = df['Age'].apply(lambda x: isinstance(x, int)).all()
💡 Hint
Check the data types of each value in the 'Age' column.
No exception is raised. The third value is the string 'thirty-five', so isinstance(x, int) returns False for that value, and all() therefore returns False.
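A minimal sketch of how the same check can also locate the offending rows, using the challenge data. The pd.to_numeric step with errors='coerce' is an alternative added here for illustration and is not part of the original challenge; the variable names are likewise illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Age': [25, 30, 'thirty-five', 40]})

# Boolean mask: True where the stored value is a Python int.
is_int = df['Age'].apply(lambda x: isinstance(x, int))
all_int = bool(is_int.all())
print(all_int)

# Invert the mask to see which rows fail the check.
invalid_rows = df[~is_int]
print(invalid_rows)

# Alternative: coerce to numeric, turning unparseable values into NaN,
# then count how many values could not be converted.
numeric_age = pd.to_numeric(df['Age'], errors='coerce')
n_unparseable = int(numeric_age.isna().sum())
print(n_unparseable)
```

Keeping the mask around, rather than only the final True/False, makes it possible to report which rows need fixing.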
🚀 Application
advanced
Detect duplicate rows in a DataFrame
Which option correctly returns a DataFrame containing only the duplicate rows?
Pandas
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 2, 3, 4, 4, 4],
    'Value': ['a', 'b', 'b', 'c', 'd', 'd', 'd']
})
💡 Hint
Use duplicated() with keep=False to mark all duplicates as True.
duplicated(keep=False) marks every occurrence of a duplicated row as True, including the first, so df[df.duplicated(keep=False)] returns all rows that appear more than once.
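The sketch below applies the filter described above to the challenge data and contrasts it with the default keep='first', which flags only the later repeats. The names all_duplicates and repeats_only are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 2, 3, 4, 4, 4],
    'Value': ['a', 'b', 'b', 'c', 'd', 'd', 'd'],
})

# keep=False flags every row that has at least one identical twin.
all_duplicates = df[df.duplicated(keep=False)]
print(all_duplicates)

# The default (keep='first') leaves the first occurrence unflagged,
# so only the later repeats are returned.
repeats_only = df[df.duplicated()]
print(repeats_only)
```

Use keep=False when you want to inspect whole groups of duplicates, and the default when you only need the rows that would be dropped by drop_duplicates().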
🧠 Conceptual
expert
Understanding data validation with custom rules
You want to validate a DataFrame column 'Score' to ensure all values are between 0 and 100 inclusive. Which code snippet correctly returns True if all values meet this condition?
💡 Hint
Use the pandas between() method for inclusive range checks.
The between() method checks whether each value lies between two bounds, inclusive of both by default. Chaining .all(), as in df['Score'].between(0, 100).all(), confirms that every value meets the condition.
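The challenge does not supply sample data for this question, so the sketch below uses two made-up DataFrames (scores and bad_scores, both hypothetical) to show the check passing and then isolating out-of-range values.

```python
import pandas as pd

# Hypothetical data: every score lies within 0..100, bounds included.
scores = pd.DataFrame({'Score': [0, 55, 100, 87]})

# between() is inclusive of both bounds by default.
in_range = scores['Score'].between(0, 100)
all_valid = bool(in_range.all())
print(all_valid)

# Hypothetical data with two out-of-range values.
bad_scores = pd.DataFrame({'Score': [50, 101, -3]})

# Invert the mask to list the rows that violate the rule.
out_of_range = bad_scores[~bad_scores['Score'].between(0, 100)]
print(out_of_range)
```

Note that between() returns False for missing values, so a column containing NaN fails the .all() check.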