Challenge - 5 Problems

Pandas Data Analysis Master

🧠 Conceptual

intermediate

Why is Pandas preferred for handling tabular data?

Pandas is widely used for data analysis. Which reason best explains why Pandas is preferred for handling tabular data?

A. It provides easy-to-use data structures like DataFrame and Series that allow fast data manipulation and analysis.

B. It is a tool for writing machine learning models without any data preprocessing.

C. It is a database management system optimized for large-scale data storage.

D. It is mainly designed for creating visualizations and charts from data.
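
For readers unfamiliar with the two structures named in the options, here is a minimal sketch of a DataFrame and a Series. The column names and values are made up for illustration and do not come from the challenge.

```python
import pandas as pd

# Hypothetical data, chosen only for illustration.
scores = pd.DataFrame({"name": ["Ann", "Ben", "Cara"], "score": [82, 91, 77]})

# Selecting a single column from a DataFrame returns a Series.
score_column = scores["score"]

# Operations apply to the whole column at once, with no explicit loop.
mean_score = score_column.mean()
scores["above_mean"] = score_column > mean_score

print(type(score_column).__name__)
print(scores)
```

A DataFrame is a table of labeled columns, and each column is a Series that shares the table's row index.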

❓ Predict Output

intermediate

Output of Pandas DataFrame filtering

What is the output of this code snippet?

Pandas

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})
filtered = df[df['A'] > 2]
print(filtered)

   A  B
0  1  5
1  2  6

   A  B
2  3  7
3  4  8

Empty DataFrame
Columns: [A, B]
Index: []
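
As background on the technique in this question, the sketch below shows boolean-mask filtering on a separate, hypothetical DataFrame (the `temps` table and its values are assumptions, not part of the challenge). It may help to look at the intermediate mask and at which index labels survive the filter.

```python
import pandas as pd

# Hypothetical data, unrelated to the challenge snippet.
temps = pd.DataFrame({"city": ["Oslo", "Cairo", "Lima"], "temp_c": [4, 31, 19]})

# The comparison produces a boolean Series aligned with the rows.
mask = temps["temp_c"] > 15

# Indexing with the mask keeps only rows where the mask is True.
# The surviving rows keep their original index labels.
warm = temps[mask]

print(mask.tolist())
print(warm)
```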

❓ data_output

advanced

Result of groupby and aggregation in Pandas

Given the DataFrame below, what is the result of grouping by 'Category' and calculating the sum of 'Value'?

Pandas

import pandas as pd

df = pd.DataFrame({'Category': ['A', 'B', 'A', 'B', 'C'], 'Value': [10, 20, 30, 40, 50]})
grouped = df.groupby('Category').sum()
print(grouped)

          Value
Category       
A            40
B            60
C            50

          Value
Category       
A            30
B            40
C            50

          Value
Category       
A            10
B            20
C            50

          Value
Category       
A            70
B            60
C            50
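
To illustrate how `groupby(...).sum()` works in general, here is a small sketch on a different, hypothetical table (the `orders` data is an assumption made for this example). Rows that share a key are collected together and their numeric values are added.

```python
import pandas as pd

# Hypothetical data, unrelated to the challenge snippet.
orders = pd.DataFrame({"region": ["East", "West", "East"], "amount": [5, 7, 11]})

# Group rows by the key column, then sum the remaining numeric column per group.
totals = orders.groupby("region").sum()

# The grouping key becomes the index of the result.
print(totals.index.name)
print(totals)
```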

🔧 Debug

advanced

Identify the error in this Pandas code

What error does this code produce?

Pandas

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 5, 6]})
result = df['Z'] + 1
print(result)

A. TypeError: unsupported operand type(s) for +: 'int' and 'str'

B. ValueError: operands could not be broadcast together

C. No error, prints a Series with values [2, 3, 4]

D. KeyError: 'Z'
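
When debugging code like this, one practical approach is to check whether a column label exists before using it. The sketch below is an illustration only: the column name `W` is hypothetical and does not appear in the challenge.

```python
import pandas as pd

frame = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 5, 6]})

# Selecting a label that is not among the columns raises an exception,
# so the lookup is wrapped in try/except to record what was raised.
try:
    frame["W"]
    error_name = None
except KeyError as exc:
    error_name = type(exc).__name__

# Two ways to guard against a missing column:
has_w = "W" in frame.columns   # membership test on the column labels
maybe_w = frame.get("W")       # DataFrame.get returns None when the label is absent

print(error_name, has_w, maybe_w)
```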

🚀 Application

expert

Pandas operation to combine and summarize data

You have two DataFrames: one with sales data and one with product info. Which Pandas operation correctly merges these and calculates total sales per product?

Pandas

import pandas as pd

sales = pd.DataFrame({'ProductID': [1, 2, 1, 3], 'Quantity': [5, 3, 2, 4]})
products = pd.DataFrame({'ProductID': [1, 2, 3], 'Name': ['Pen', 'Pencil', 'Eraser']})

merged = sales.merge(products)
total = merged.groupby('ProductID')['Quantity'].count()
print(total)

merged = pd.merge(sales, products, on='ProductID')
total = merged.groupby('Name')['Quantity'].sum()
print(total)

merged = pd.concat([sales, products], axis=1)
total = merged.groupby('ProductID')['Quantity'].sum()
print(total)

merged = sales.join(products, on='ProductID')
total = merged.groupby('ProductID')['Quantity'].mean()
print(total)
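
As a general illustration of combining a key-based merge with a per-group total, here is a sketch on separate, hypothetical tables (`visits` and `users` are assumptions made for this example, not the challenge data).

```python
import pandas as pd

# Hypothetical data, unrelated to the challenge snippet.
visits = pd.DataFrame({"user_id": [10, 11, 10], "minutes": [3, 8, 4]})
users = pd.DataFrame({"user_id": [10, 11], "name": ["Kai", "Lee"]})

# pd.merge matches rows by the shared key column, so each visit
# picks up the name of the user it belongs to.
joined = pd.merge(visits, users, on="user_id")

# Summing per name gives one total per user.
per_user = joined.groupby("name")["minutes"].sum()

print(per_user)
```

For comparison, `pd.concat(..., axis=1)` places frames side by side by row position rather than by key, and `DataFrame.join(other, on=...)` matches the named column against the index of `other`, so the choice of operation depends on how the two tables are keyed.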
