Challenge - 5 Problems
Transform Mastery
❓ Predict Output
intermediate
Output of group transform with mean subtraction
What is the output of the following code snippet?
Pandas
import pandas as pd

df = pd.DataFrame({'Team': ['A', 'A', 'B', 'B'], 'Points': [10, 20, 30, 40]})
# Subtract each team's mean from its own rows
df['AdjPoints'] = df.groupby('Team')['Points'].transform(lambda x: x - x.mean())
print(df)
💡 Hint
Think about what subtracting the group mean from each value does.
Explanation
The transform applies the lambda function to each group. For each group, it subtracts the mean of that group's Points from each Points value, centering the data around zero within each group.
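To make the centering concrete, here is a minimal sketch that reuses the data from the snippet and computes the same column in two steps. The names `group_mean` and `adj` are only illustrative and do not come from the challenge.

```python
import pandas as pd

df = pd.DataFrame({'Team': ['A', 'A', 'B', 'B'], 'Points': [10, 20, 30, 40]})

# Broadcast each team's mean back onto its rows: 15, 15, 35, 35
group_mean = df.groupby('Team')['Points'].transform('mean')

# Subtracting the broadcast mean centers each team around zero
df['AdjPoints'] = df['Points'] - group_mean

adj = df['AdjPoints'].tolist()
print(adj)  # [-5.0, 5.0, -5.0, 5.0]
```

The values are floats because the group mean is a float, and each team's adjusted points sum to zero.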
❓ data_output
intermediate
Number of rows after a group-wise rank transform
Given the code below, how many rows does the resulting DataFrame have?
Pandas
import pandas as pd

df = pd.DataFrame({'Category': ['X', 'X', 'Y', 'Y', 'Y'], 'Value': [1, 2, 3, 4, 5]})
# Rank each value within its own category
df['Rank'] = df.groupby('Category')['Value'].transform('rank')
print(len(df))
💡 Hint
transform keeps the original DataFrame shape.
Explanation
The transform method returns a Series with the same length as the original DataFrame, so the number of rows remains unchanged.
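As a quick check of the shape claim, this sketch uses the same data and compares the length of the transform result with that of an aggregation. The variable names `ranks` and `per_group_max` are illustrative, and the `max` aggregation is added here only for contrast.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['X', 'X', 'Y', 'Y', 'Y'], 'Value': [1, 2, 3, 4, 5]})

# transform returns one value per original row
ranks = df.groupby('Category')['Value'].transform('rank')

# an aggregation returns one value per group instead
per_group_max = df.groupby('Category')['Value'].max()

print(len(ranks), len(per_group_max))  # 5 2
print(ranks.tolist())                  # [1.0, 2.0, 1.0, 2.0, 3.0]
```

Because `ranks` has the same length and index as `df`, it can be assigned directly as a new column, and the DataFrame still has 5 rows.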
🔧 Debug
advanced
Identify the error in transform usage
What error does the following code raise?
Pandas
import pandas as pd

df = pd.DataFrame({'Group': ['G1', 'G1', 'G2'], 'Score': [10, 20, 30]})
# Divide each score by the total score of its group
df['Scaled'] = df.groupby('Group')['Score'].transform(lambda x: x / x.sum())
print(df)
💡 Hint
Check if the lambda function uses correct pandas methods.
Explanation
The code correctly divides each score by the sum of scores in its group. The transform returns a Series aligned with the original DataFrame, so no error occurs.
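A small sketch, using the same data, that confirms the scaling runs without error and that the shares within each group add up to 1. The helper name `group_totals` is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'Group': ['G1', 'G1', 'G2'], 'Score': [10, 20, 30]})

# Each score becomes its share of the group total
df['Scaled'] = df.groupby('Group')['Score'].transform(lambda x: x / x.sum())

# Sanity check: shares within each group should sum to (approximately) 1
group_totals = df.groupby('Group')['Scaled'].sum()
print(df)
print(group_totals)
```

G1 has a total of 30, so its rows become 10/30 and 20/30; G2 has a single row, which becomes 1.0.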
🚀 Application
advanced
Using transform to fill missing values with group mean
You have a DataFrame with missing values in the 'Sales' column. Which code snippet correctly fills missing 'Sales' values with the mean sales of their group in 'Region'?
Pandas
import pandas as pd
import numpy as np

df = pd.DataFrame({'Region': ['East', 'East', 'West', 'West'], 'Sales': [100, np.nan, 200, np.nan]})
💡 Hint
transform keeps the original index and shape; apply may change them.
Explanation
The correct option uses transform with a lambda that fills missing values with the group mean, preserving the DataFrame shape and index alignment. Each of the other options fails for a different reason: one tries to fillna with a Series that does not align properly, one uses apply, which may change the index, and one replaces all values with the group mean, not just the missing ones.
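The answer options are not reproduced here, so the following is only a sketch of the approach the explanation describes, not necessarily the exact text of the correct option. It also shows an equivalent form that passes an index-aligned Series to `fillna`; the names `filled` and `filled_alt` are illustrative.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'Region': ['East', 'East', 'West', 'West'],
                   'Sales': [100, np.nan, 200, np.nan]})

# Fill only the missing values, using the mean of the row's own region
filled = df.groupby('Region')['Sales'].transform(lambda x: x.fillna(x.mean()))

# Equivalent form: broadcast the group mean, then fill the gaps from it.
# This works because transform('mean') keeps the original index.
filled_alt = df['Sales'].fillna(df.groupby('Region')['Sales'].transform('mean'))

print(filled.tolist())  # [100.0, 100.0, 200.0, 200.0]
```

In this small dataset each region has only one known value, so the group mean equals that value. With more data per region, the existing values would stay as they are and only the gaps would take the mean.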
🧠 Conceptual
expert
Why use transform instead of apply for group-level operations?
Which statement best explains why transform is preferred over apply for group-level operations that return a Series with the same shape as the original DataFrame?
💡 Hint
Think about the shape and alignment of the output.
Explanation
transform returns a Series with the same length as the original DataFrame, aligned by index, which is useful for adding group-level calculations back to the original data. apply can return aggregated results or different shapes, making it less suitable for this use case.
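The difference in shape and alignment can be sketched with the data from the first challenge. Here `apply` is given an aggregating lambda, which is one common case in which its result no longer matches the original rows. The names `per_row` and `per_group` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Team': ['A', 'A', 'B', 'B'], 'Points': [10, 20, 30, 40]})
grouped = df.groupby('Team')['Points']

# transform: one value per original row, indexed like df (0, 1, 2, 3)
per_row = grouped.transform('mean')

# apply with an aggregating function: one value per group, indexed by Team
per_group = grouped.apply(lambda x: x.mean())

print(per_row.index.tolist())    # [0, 1, 2, 3]
print(per_group.index.tolist())  # ['A', 'B']

# Only the transform result can be assigned back as a column directly
df['TeamMean'] = per_row
```

Assigning `per_group` as a column would try to align the labels 'A' and 'B' with the row labels 0 to 3, which is why transform is the more convenient choice for this use case.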