Challenge - 5 Problems
First Data Analysis Master
❓ Predict Output
Difficulty: intermediate
Output of basic data filtering
What is the output of this code that filters rows where age is greater than 30?
Data Analysis Python
import pandas as pd
data = {'name': ['Alice', 'Bob', 'Charlie', 'David'], 'age': [25, 35, 30, 40]}
df = pd.DataFrame(data)
filtered = df[df['age'] > 30]
print(filtered)
💡 Hint
Look for rows where the age value is strictly greater than 30.
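Explanation: The code keeps only the rows where the 'age' column is strictly greater than 30. Bob (35) and David (40) meet this condition. Charlie (30) does not, because 30 is not greater than 30, and Alice (25) is below the threshold. The printed DataFrame therefore contains the rows with index 1 and 3.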
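As a minimal sketch of what happens inside the brackets, the comparison can be evaluated on its own first. The variable name `mask` is introduced here only for illustration and does not appear in the challenge code.

```python
import pandas as pd

data = {'name': ['Alice', 'Bob', 'Charlie', 'David'],
        'age': [25, 35, 30, 40]}
df = pd.DataFrame(data)

# The comparison yields one boolean per row (hypothetical helper name: mask)
mask = df['age'] > 30
print(mask.tolist())               # [False, True, False, True]

# Indexing with the mask keeps only the rows marked True
filtered = df[mask]
print(filtered['name'].tolist())   # ['Bob', 'David']
print(filtered.index.tolist())     # [1, 3]
```

The original row labels (1 and 3) are preserved, which is why the printed DataFrame does not start at index 0.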
❓ Data Output
Difficulty: intermediate
Count unique values in a column
What is the output of this code that counts unique values in the 'city' column?
Data Analysis Python
import pandas as pd
data = {'name': ['Anna', 'Ben', 'Cara', 'Dan'], 'city': ['NY', 'LA', 'NY', 'SF']}
df = pd.DataFrame(data)
count = df['city'].nunique()
print(count)
💡 Hint
Count how many different cities appear in the list.
Explanation: The 'city' column contains NY, LA, NY, SF. NY appears twice but is counted once, so the distinct cities are NY, LA, and SF, and the code prints 3.
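One way to double-check the count, sketched here with the same data, is to list the distinct values and compare their number with `nunique()`. The name `distinct` is a hypothetical helper, not part of the challenge code.

```python
import pandas as pd

data = {'name': ['Anna', 'Ben', 'Cara', 'Dan'],
        'city': ['NY', 'LA', 'NY', 'SF']}
df = pd.DataFrame(data)

# Drop repeated cities, keeping the first occurrence of each
distinct = df['city'].drop_duplicates().tolist()
print(distinct)                # ['NY', 'LA', 'SF']
print(len(distinct))           # 3

# nunique() returns the same number directly
print(df['city'].nunique())    # 3
```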
❓ Visualization
Difficulty: advanced
Identify the correct bar chart output
Which option shows the correct bar chart output for the count of fruits in this data?
Data Analysis Python
import pandas as pd
import matplotlib.pyplot as plt
data = {'fruit': ['apple', 'banana', 'apple', 'orange', 'banana', 'banana']}
df = pd.DataFrame(data)
counts = df['fruit'].value_counts()
counts.plot(kind='bar')
plt.show()
💡 Hint
Count how many times each fruit appears in the list.
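Explanation: Banana appears 3 times, apple 2 times, and orange 1 time. Because value_counts() sorts by count in descending order, the bar chart shows three bars in the order banana, apple, orange, with heights 3, 2, and 1.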
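The bar labels and heights can be checked without drawing anything, since the chart is built directly from the `value_counts()` result. The following is a small sketch along those lines that leaves out the plotting calls.

```python
import pandas as pd

data = {'fruit': ['apple', 'banana', 'apple', 'orange', 'banana', 'banana']}
df = pd.DataFrame(data)

# value_counts() returns counts sorted from most to least frequent
counts = df['fruit'].value_counts()

# The index becomes the bar labels, the values become the bar heights
print(counts.index.tolist())   # ['banana', 'apple', 'orange']
print(counts.tolist())         # [3, 2, 1]
```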
🔧 Debug
Difficulty: advanced
Identify the error in data aggregation code
What error, if any, does this code produce when calculating the mean age grouped by city?
Data Analysis Python
import pandas as pd
data = {'name': ['Eve', 'Frank'], 'age': [28, 33], 'city': ['NY', 'LA']}
df = pd.DataFrame(data)
result = df.groupby('city')['age'].mean
print(result())
💡 Hint
Check if the mean function is called correctly.
Explanation: No error occurs. The line result = df.groupby('city')['age'].mean omits the parentheses, so result holds the bound method mean rather than the computed values. The next line, print(result()), calls that method, so the means are computed at that point and printed. Each city has a single row, so the output shows 33.0 for LA and 28.0 for NY.
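The difference between referencing a method and calling it can be made visible with a short sketch. The names `grouped` and `method` are hypothetical and only split the challenge code into smaller steps.

```python
import pandas as pd

data = {'name': ['Eve', 'Frank'], 'age': [28, 33], 'city': ['NY', 'LA']}
df = pd.DataFrame(data)

grouped = df.groupby('city')['age']

# Without parentheses this is only a reference to the method; nothing is computed
method = grouped.mean
print(callable(method))    # True

# Adding parentheses performs the aggregation
result = method()
print(result.to_dict())    # {'LA': 33.0, 'NY': 28.0}
```

Writing `grouped.mean()` directly is the more common form; storing the method first works but is easy to misread.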
🚀 Application
Difficulty: expert
Determine the number of rows after multiple filters
Given this DataFrame, how many rows remain after filtering for age >= 30 and city == 'NY'?
Data Analysis Python
import pandas as pd
data = {'name': ['Gina', 'Hank', 'Ivy', 'Jack'], 'age': [29, 31, 30, 35], 'city': ['NY', 'NY', 'LA', 'NY']}
df = pd.DataFrame(data)
filtered = df[(df['age'] >= 30) & (df['city'] == 'NY')]
print(len(filtered))
💡 Hint
Check each row if both conditions are true.
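Explanation: Both conditions must be true for a row to remain. Hank (31, NY) and Jack (35, NY) satisfy both. Gina (29, NY) fails the age condition and Ivy (30, LA) fails the city condition, so 2 rows remain and the code prints 2.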
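Evaluating each condition separately shows how the row-by-row check plays out. This is a sketch using hypothetical names (`age_ok`, `in_ny`, `both`) that are not in the challenge code.

```python
import pandas as pd

data = {'name': ['Gina', 'Hank', 'Ivy', 'Jack'],
        'age': [29, 31, 30, 35],
        'city': ['NY', 'NY', 'LA', 'NY']}
df = pd.DataFrame(data)

age_ok = df['age'] >= 30       # one boolean per row for the age condition
in_ny = df['city'] == 'NY'     # one boolean per row for the city condition
print(age_ok.tolist())         # [False, True, True, True]
print(in_ny.tolist())          # [True, True, False, True]

# & combines the two masks element by element
both = age_ok & in_ny
print(both.tolist())           # [False, True, False, True]
print(df[both]['name'].tolist())   # ['Hank', 'Jack']
print(len(df[both]))           # 2
```

In the one-line form, each comparison needs its own parentheses because `&` binds more tightly than `>=` and `==` in Python.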