Challenge - 5 Problems


End-to-End Analysis Master


❓ Predict Output

intermediate


Understanding data flow in end-to-end analysis

What is the output of the following code that simulates a simple end-to-end data analysis pipeline?

Pandas

import pandas as pd

data = {'sales': [100, 200, 300, 400], 'cost': [50, 80, 120, 150]}
df = pd.DataFrame(data)
df['profit'] = df['sales'] - df['cost']
total_profit = df['profit'].sum()
print(total_profit)

A. 600

B. 850

C. 1000

D. 900


❓ Data Output

intermediate


Result of filtering and aggregation in analysis

Given the DataFrame below, what is the output after filtering rows where 'age' > 30 and calculating the mean of 'income'?

Pandas

import pandas as pd

data = {'name': ['Anna', 'Bob', 'Cara', 'Dan'], 'age': [25, 35, 45, 28], 'income': [50000, 60000, 70000, 55000]}
df = pd.DataFrame(data)
filtered = df[df['age'] > 30]
mean_income = filtered['income'].mean()
print(mean_income)

A. 70000.0

B. 60000.0

C. 65000.0

D. 57500.0


❓ Visualization

advanced


Identifying trends with visualization in end-to-end analysis

Which option best describes the plot produced by the code below, which shows the trend of monthly sales over time?

Pandas

import pandas as pd
import matplotlib.pyplot as plt

data = {'month': ['Jan', 'Feb', 'Mar', 'Apr'], 'sales': [200, 220, 250, 270]}
df = pd.DataFrame(data)
plt.plot(df['month'], df['sales'])
plt.title('Monthly Sales Trend')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.show()

A. Bar plot showing sales decreasing from Jan to Apr

B. Pie chart showing sales distribution by month

C. Scatter plot showing random sales values

D. Line plot showing sales increasing from Jan to Apr


🔧 Debug

advanced


Finding the error in a data cleaning step

What does the following code print or raise when it tries to remove rows with missing values?

Pandas

import pandas as pd

data = {'A': [1, 2, None], 'B': [4, None, 6]}
df = pd.DataFrame(data)
df_clean = df.dropna(inplace=True)
print(df_clean)

A. Prints None, because dropna with inplace=True returns None

B. AttributeError, because dropna is not a DataFrame method

C. TypeError, because inplace must be False

D. KeyError, because 'inplace' is not a valid argument
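
For reference, here is a minimal sketch of the two usual ways to call dropna, reusing the data from the question. The names df_inplace and result are illustrative and not part of the original code. By default dropna returns a new DataFrame and leaves the caller unchanged; with inplace=True it modifies the caller and returns None.

```python
import pandas as pd

data = {'A': [1, 2, None], 'B': [4, None, 6]}

# Style 1: keep the default inplace=False and use the returned DataFrame.
# The original df is left unchanged.
df = pd.DataFrame(data)
df_clean = df.dropna()

# Style 2: modify the DataFrame in place. The call returns None,
# so its result should not be assigned to the "clean" variable.
df_inplace = pd.DataFrame(data)
result = df_inplace.dropna(inplace=True)

print(len(df), len(df_clean), len(df_inplace), result)
```

Only the first row has no missing values, so both styles end up with a single row; the difference is where that cleaned row lives.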


🚀 Application

expert


Why end-to-end analysis is crucial for decision making

You have sales data with missing values and outliers. Which approach best represents an end-to-end analysis to prepare data for reliable insights?

A. Visualize raw data without cleaning to avoid bias

B. Clean missing data, remove outliers, analyze trends, and visualize results

C. Analyze trends first, then clean missing data and remove outliers

D. Remove outliers only, ignoring missing data and analysis
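
To make the stages named in these options concrete, here is a rough sketch of a clean, filter, analyze, visualize sequence. Everything in it is an assumption made for illustration: the sales figures are invented, dropping incomplete rows is only one way to handle missing values, the 1.5 * IQR rule is only one common outlier heuristic, and plot_trend is a hypothetical helper.

```python
import pandas as pd

# Hypothetical monthly sales with one missing value and one obvious outlier.
raw = pd.DataFrame({
    'month': ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun'],
    'sales': [200, 220, None, 270, 5000, 290],
})

# Step 1: clean missing data (here: drop rows with no sales figure).
cleaned = raw.dropna(subset=['sales'])

# Step 2: remove outliers using the 1.5 * IQR rule (an assumed choice).
q1 = cleaned['sales'].quantile(0.25)
q3 = cleaned['sales'].quantile(0.75)
iqr = q3 - q1
in_range = cleaned['sales'].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
cleaned = cleaned[in_range]

# Step 3: analyze the trend on the cleaned data.
month_over_month = cleaned['sales'].diff()
is_increasing = cleaned['sales'].is_monotonic_increasing
print(cleaned)
print(month_over_month)
print(is_increasing)


# Step 4: visualize the result. matplotlib is imported inside the helper
# so the cleaning and analysis steps above run without it.
def plot_trend(frame):
    import matplotlib.pyplot as plt
    plt.plot(frame['month'], frame['sales'], marker='o')
    plt.title('Monthly Sales Trend (cleaned)')
    plt.xlabel('Month')
    plt.ylabel('Sales')
    plt.show()
```

Calling plot_trend(cleaned) draws the cleaned series. Running the trend step on raw instead would let the 5000 value dominate the picture, which is why cleaning comes before analysis in this sketch.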
