Challenge - 5 Problems
❓ Predict Output
intermediate
Output of transform() with group mean
What is the output of the following code snippet, which uses transform() to calculate the group mean?
import pandas as pd

df = pd.DataFrame({'Team': ['A', 'A', 'B', 'B', 'B'],
                   'Points': [10, 20, 30, 40, 50]})
df['MeanPoints'] = df.groupby('Team')['Points'].transform('mean')
print(df)
💡 Hint
Remember, transform() returns a Series with the same size as the original, repeating the group statistic for each member.
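One way to check the behaviour the hint describes is to look at the transformed Series on its own. The sketch below reuses the Team/Points frame from the question; the variable name `group_means` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Team': ['A', 'A', 'B', 'B', 'B'],
                   'Points': [10, 20, 30, 40, 50]})

# transform('mean') computes one mean per team, then repeats it
# for every row that belongs to that team.
group_means = df.groupby('Team')['Points'].transform('mean')

print(group_means.tolist())           # [15.0, 15.0, 40.0, 40.0, 40.0]
print(len(group_means) == len(df))    # True
```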
The transform('mean') call calculates the mean of each group and assigns that value to every row in the group. For team A, the mean is (10 + 20) / 2 = 15. For team B, the mean is (30 + 40 + 50) / 3 = 40.
❓ Data Output
intermediate
Number of items after transform()
After applying transform() on a grouped DataFrame, how many rows does the resulting Series have compared to the original DataFrame?
import pandas as pd

df = pd.DataFrame({'Category': ['X', 'X', 'Y', 'Y', 'Y'],
                   'Value': [5, 10, 15, 20, 25]})
result = df.groupby('Category')['Value'].transform('max')
print(len(result))
💡 Hint
Think about whether transform() changes the number of rows or just the values.
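To make the row-count question concrete, it can help to compare transform() with an aggregation on the same groups. This is a minimal sketch on the question's data; `per_group` and `per_row` are illustrative names, not part of the quiz.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['X', 'X', 'Y', 'Y', 'Y'],
                   'Value': [5, 10, 15, 20, 25]})

grouped = df.groupby('Category')['Value']

per_group = grouped.agg('max')        # one row per category
per_row = grouped.transform('max')    # one row per original row

print(len(per_group))      # 2
print(len(per_row))        # 5
print(per_row.tolist())    # [10, 10, 25, 25, 25]
```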
transform() returns a Series with the same length as the original DataFrame, so the length remains 5.
🔧 Debug
advanced
Identify the error in transform() usage
What error does the following code raise?
import pandas as pd

df = pd.DataFrame({'Group': ['G1', 'G1', 'G2'], 'Score': [1, 2, 3]})
df['Result'] = df.groupby('Group')['Score'].transform(lambda x: x + x.shift())
print(df)
💡 Hint
Check what shift() does inside each group and how it affects the addition.
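Following the hint, the shift can be inspected on its own before looking at the sum. The sketch below uses the question's data; the `fill_value=0` variant at the end is an assumption about what one might want instead of NaN, not something the quiz asks for.

```python
import pandas as pd

df = pd.DataFrame({'Group': ['G1', 'G1', 'G2'], 'Score': [1, 2, 3]})
scores = df.groupby('Group')['Score']

# Each group is shifted independently, so every group's first row has no
# previous value to take.
shifted = scores.shift()
print(shifted.isna().tolist())    # [True, False, True]

# The addition from the question runs without raising.
with_nan = scores.transform(lambda x: x + x.shift())
print(with_nan.isna().tolist())   # [True, False, True]

# Hypothetical variant: treat the missing previous value as 0.
filled = scores.transform(lambda x: x + x.shift(fill_value=0))
print(filled.tolist())            # [1, 3, 3]
```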
shift() moves values down by one within each group, so the first row of each group becomes NaN. Adding NaN to a number gives NaN, but no error is raised.
🚀 Application
advanced
Using transform() to normalize data by group
You want to normalize the 'Sales' column within each 'Region' by subtracting the group mean and dividing by the group standard deviation. Which code snippet correctly does this using transform()?
import pandas as pd

df = pd.DataFrame({'Region': ['East', 'East', 'West', 'West', 'West'],
                   'Sales': [100, 150, 200, 250, 300]})
💡 Hint
Remember, transform() returns a Series aligned with the original DataFrame, while apply() may not.
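The answer options are not reproduced here, so the following is only a sketch of how the normalization described in the question could be written with transform(). The column name `SalesNorm` is hypothetical, and the sketch assumes the sample standard deviation that pandas' std() computes by default (ddof=1).

```python
import pandas as pd

df = pd.DataFrame({'Region': ['East', 'East', 'West', 'West', 'West'],
                   'Sales': [100, 150, 200, 250, 300]})

# Within each region: subtract the group mean, then divide by the
# group standard deviation (sample form, ddof=1).
df['SalesNorm'] = (
    df.groupby('Region')['Sales']
      .transform(lambda x: (x - x.mean()) / x.std())
)

print(df['SalesNorm'].round(4).tolist())
# approximately [-0.7071, 0.7071, -1.0, 0.0, 1.0]
```

Because the result has the same index as `df`, it can be assigned straight into a new column.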
The option that uses transform() with a lambda normalizes each group and returns a Series with the same size as the original DataFrame. Another option is mathematically correct but less flexible for complex functions. A third only divides by the mean, which does not normalize the data. The last uses apply(), whose result may not be aligned with the original DataFrame.
🧠 Conceptual
expert
Why use transform() instead of apply() for group-level operations?
Which statement best explains why transform() is preferred over apply() when you want to add a group-level statistic as a new column to the original DataFrame?
💡 Hint
Think about the shape and alignment of the output from both methods.
transform() returns a Series aligned with the original DataFrame's index and length, so it can be assigned directly as a new column. With an aggregating function, apply() returns one result per group, so the output has fewer rows and cannot be assigned to the original DataFrame without extra steps.
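To illustrate what "extra steps" can mean, the sketch below aggregates to one value per group and maps it back onto the rows, then compares that with the one-step transform() route. It reuses the Team/Points frame from the first problem; the column names `MeanViaMap` and `MeanViaTransform` are hypothetical, and map() is just one possible way to bring the per-group values back.

```python
import pandas as pd

df = pd.DataFrame({'Team': ['A', 'A', 'B', 'B', 'B'],
                   'Points': [10, 20, 30, 40, 50]})

# Route 1: aggregate to one value per team, then map it back onto the rows.
team_means = df.groupby('Team')['Points'].mean()   # indexed by Team, length 2
df['MeanViaMap'] = df['Team'].map(team_means)

# Route 2: transform() aggregates and broadcasts in a single step.
df['MeanViaTransform'] = df.groupby('Team')['Points'].transform('mean')

print(len(team_means))    # 2
print(df['MeanViaMap'].tolist() == df['MeanViaTransform'].tolist())   # True
```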