Challenge - 5 Problems
Data Analysis Agent Master
❓ Predict Output
Difficulty: intermediate (time limit: 2:00)
Output of a simple data filtering step in an agent pipeline
Consider an agent pipeline that filters a dataset to include only rows where the value in column 'age' is greater than 30. What is the output DataFrame after this filtering?
Agentic AI
import pandas as pd

data = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie', 'David'], 'age': [25, 35, 30, 40]})
# Keep only rows where age is strictly greater than 30
filtered_data = data[data['age'] > 30]
print(filtered_data)
💡 Hint
Remember the condition is strictly greater than 30, so age 30 is excluded.
✅ Explanation
The filter selects rows where 'age' > 30, so only Bob (35) and David (40) remain.
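The filtering step can be checked piece by piece. The sketch below reuses the question's data and inspects the boolean mask and the surviving rows; the variable name `mask` is introduced here for illustration only.

```python
import pandas as pd

data = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie', 'David'],
                     'age': [25, 35, 30, 40]})

# The comparison yields one boolean per row; 30 > 30 is False, so Charlie is dropped.
mask = data['age'] > 30
print(mask.tolist())                   # [False, True, False, True]

filtered_data = data[mask]
print(filtered_data['name'].tolist())  # ['Bob', 'David']

# Boolean indexing keeps the original index labels rather than renumbering from 0.
print(filtered_data.index.tolist())    # [1, 3]
```

If a later pipeline step expects a 0-based index, calling `reset_index(drop=True)` on the filtered frame is one way to renumber the rows.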
❓ Data Output
Difficulty: intermediate (time limit: 2:00)
Result of aggregation in a data analysis pipeline
An agent pipeline groups data by 'department' and calculates the average salary. What is the resulting DataFrame?
Agentic AI
import pandas as pd

data = pd.DataFrame({'department': ['HR', 'IT', 'HR', 'IT', 'Finance'], 'salary': [50000, 60000, 55000, 65000, 70000]})
# Average salary per department, returned as a regular two-column DataFrame
grouped = data.groupby('department')['salary'].mean().reset_index()
print(grouped)
💡 Hint
Average salary means sum of salaries divided by count per department.
✅ Explanation
HR average: (50000+55000)/2=52500, IT average: (60000+65000)/2=62500, Finance: 70000 alone.
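The averages above can be confirmed with a short sketch on the same data. It also shows two details that matter when picking the right output: `groupby` sorts the group keys by default, and `mean()` returns floats. The name `grouped_unsorted` is introduced here for illustration only.

```python
import pandas as pd

data = pd.DataFrame({'department': ['HR', 'IT', 'HR', 'IT', 'Finance'],
                     'salary': [50000, 60000, 55000, 65000, 70000]})

grouped = data.groupby('department')['salary'].mean().reset_index()

# Group keys are sorted alphabetically by default, so Finance comes first.
print(grouped['department'].tolist())  # ['Finance', 'HR', 'IT']
print(grouped['salary'].tolist())      # [70000.0, 52500.0, 62500.0]

# With sort=False the groups keep their order of first appearance instead.
grouped_unsorted = data.groupby('department', sort=False)['salary'].mean().reset_index()
print(grouped_unsorted['department'].tolist())  # ['HR', 'IT', 'Finance']
```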
❓ Visualization
Difficulty: advanced (time limit: 3:00)
Identify the correct plot output from a data analysis agent pipeline
An agent pipeline creates a bar plot showing total sales per product category. Which option correctly describes the plot output?
Agentic AI
import pandas as pd
import matplotlib.pyplot as plt

data = pd.DataFrame({'category': ['A', 'B', 'A', 'C', 'B'], 'sales': [100, 200, 150, 300, 250]})
# Total sales per category, drawn as one bar per category
totals = data.groupby('category')['sales'].sum()
totals.plot(kind='bar')
plt.show()
💡 Hint
Sum sales per category: A=100+150=250, B=200+250=450, C=300.
✅ Explanation
The code groups sales by category and plots a bar chart of sums: A=250, B=450, C=300.
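The bar heights can be verified without rendering anything, since the plot simply draws the values of `totals`. The sketch below recomputes them from the question's data and leaves out the plotting calls so it runs in a headless environment.

```python
import pandas as pd

data = pd.DataFrame({'category': ['A', 'B', 'A', 'C', 'B'],
                     'sales': [100, 200, 150, 300, 250]})

# One entry per category; these are the values the bar chart would draw.
totals = data.groupby('category')['sales'].sum()
print(totals.index.tolist())  # ['A', 'B', 'C']
print(totals.tolist())        # [250, 450, 300]

# The tallest bar belongs to the category with the largest total.
print(totals.idxmax())        # B
```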
🔧 Debug
Difficulty: advanced (time limit: 2:00)
Identify the error in a data cleaning step of an agent pipeline
An agent pipeline tries to fill missing values in a DataFrame column 'score' with the mean score but raises an error. What is the error?
Agentic AI
import pandas as pd

data = pd.DataFrame({'score': [10, None, 30, None, 50]})
mean_score = data['score'].mean()
data['score'] = data['score'].fillna(mean_score())
print(data)
💡 Hint
Check how mean_score is used in fillna.
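✅ Explanation
mean_score already holds a float, not a function, so calling mean_score() raises a TypeError because the value is not callable. Passing mean_score itself to fillna fixes the step.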
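A minimal sketch of the failure and one possible fix, using the question's data. The `try`/`except` wrapper and the `error_message` variable are added here only to surface the exception without stopping the script.

```python
import pandas as pd

data = pd.DataFrame({'score': [10, None, 30, None, 50]})

# mean() skips missing values by default: (10 + 30 + 50) / 3 = 30.0
mean_score = data['score'].mean()

# Reproduce the bug: mean_score is a number, so calling it fails
# before fillna is ever reached.
try:
    data['score'].fillna(mean_score())
except TypeError as exc:
    error_message = str(exc)
    print(type(exc).__name__)  # TypeError

# Fix: pass the value itself, without parentheses.
data['score'] = data['score'].fillna(mean_score)
print(data['score'].tolist())  # [10.0, 30.0, 30.0, 30.0, 50.0]
```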
🚀 Application
Difficulty: expert (time limit: 3:00)
Determine the final output of a multi-step data analysis agent pipeline
An agent pipeline performs these steps on a DataFrame: 1) filters rows where 'value' > 10, 2) creates a new column 'value_squared' as square of 'value', 3) groups by 'category' and sums 'value_squared'. What is the final output DataFrame?
Agentic AI
import pandas as pd

data = pd.DataFrame({'category': ['X', 'Y', 'X', 'Y', 'Z'], 'value': [5, 15, 20, 8, 25]})
# Step 1: keep rows with value > 10
filtered = data[data['value'] > 10]
# Step 2: add the squared value as a new column
filtered['value_squared'] = filtered['value'] ** 2
# Step 3: sum the squares per category
grouped = filtered.groupby('category')['value_squared'].sum().reset_index()
print(grouped)
💡 Hint
Calculate squares only for values > 10, then sum per category.
✅ Explanation
Filtered rows: Y=15, X=20, Z=25; squares: 225, 400, 625; sums per category: X=400, Y=225, Z=625.
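The three steps can be traced with the sketch below. It differs from the question's code in one respect, which is an addition here rather than part of the original: the filtered frame is copied explicitly with `.copy()`, because assigning a new column to a filtered slice may emit a `SettingWithCopyWarning` in some pandas versions. The result is the same either way.

```python
import pandas as pd

data = pd.DataFrame({'category': ['X', 'Y', 'X', 'Y', 'Z'],
                     'value': [5, 15, 20, 8, 25]})

# Step 1: filter, then copy so the new column is added to an independent frame.
filtered = data.loc[data['value'] > 10].copy()
print(filtered['value'].tolist())          # [15, 20, 25]

# Step 2: square the surviving values.
filtered['value_squared'] = filtered['value'] ** 2
print(filtered['value_squared'].tolist())  # [225, 400, 625]

# Step 3: sum per category; group keys come back sorted (X, Y, Z).
grouped = filtered.groupby('category')['value_squared'].sum().reset_index()
print(grouped['category'].tolist())        # ['X', 'Y', 'Z']
print(grouped['value_squared'].tolist())   # [400, 225, 625]
```

Each category has exactly one row left after filtering, so every sum equals a single square. The original `data` frame is left without a `value_squared` column.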
