transform() for group-level operations in Pandas - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of group transform with mean subtraction
What is the output of the following code snippet?
Pandas
import pandas as pd

df = pd.DataFrame({'Team': ['A', 'A', 'B', 'B'], 'Points': [10, 20, 30, 40]})
df['AdjPoints'] = df.groupby('Team')['Points'].transform(lambda x: x - x.mean())
print(df)
A
  Team  Points  AdjPoints
0    A      10       -5.0
1    A      20        5.0
2    B      30       -5.0
3    B      40        5.0
B
  Team  Points  AdjPoints
0    A      10       10.0
1    A      20       20.0
2    B      30       30.0
3    B      40       40.0
C
  Team  Points  AdjPoints
0    A      10       15.0
1    A      20       15.0
2    B      30       35.0
3    B      40       35.0
D
  Team  Points  AdjPoints
0    A      10      -10.0
1    A      20       10.0
2    B      30      -30.0
3    B      40       30.0
💡 Hint
Think about what subtracting the group mean from each value does.
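To explore the hint without touching the quiz data, here is a minimal sketch on a separate, hypothetical DataFrame (the `scores` frame and its column names are made up for illustration). It checks one property of group-wise mean subtraction: the centered values sum to roughly zero within each group.

```python
import pandas as pd

# Hypothetical data, unrelated to the quiz snippet above
scores = pd.DataFrame({
    "Class": ["x", "x", "x", "y", "y"],
    "Score": [2.0, 4.0, 9.0, 1.0, 3.0],
})

# Subtract each group's own mean from every member of that group
scores["Centered"] = scores.groupby("Class")["Score"].transform(lambda s: s - s.mean())

# Within each group the centered values should cancel out
group_sums = scores.groupby("Class")["Centered"].sum()
print(scores)
print(group_sums)
```

The row count is unchanged because transform hands back one value per input row.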
Data Output
intermediate
Number of rows after a group-wise rank transform
Given the code below, how many rows does the resulting DataFrame have?
Pandas
import pandas as pd

df = pd.DataFrame({'Category': ['X', 'X', 'Y', 'Y', 'Y'], 'Value': [1, 2, 3, 4, 5]})
df['Rank'] = df.groupby('Category')['Value'].transform('rank')
print(len(df))
A. 5
B. 3
C. 2
D. 1
💡 Hint
transform returns a result with the same length and index as its input.
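As a rough illustration of that hint, the sketch below contrasts an aggregation with a transform on a hypothetical `orders` frame (names and values are invented for the example). The aggregation yields one value per group, while the transform yields one value per original row.

```python
import pandas as pd

# Hypothetical data, unrelated to the quiz snippet above
orders = pd.DataFrame({
    "Shop": ["p", "q", "p", "q", "p"],
    "Amount": [5, 7, 1, 3, 9],
})

# Aggregation: collapses each group to a single value, indexed by group key
per_group = orders.groupby("Shop")["Amount"].agg("max")

# Transform: broadcasts the group value back to every row of that group
per_row = orders.groupby("Shop")["Amount"].transform("max")

print(len(per_group), len(per_row))
```

Because `per_row` shares the index of `orders`, it can be assigned as a new column directly.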
🔧 Debug
advanced
Identify the error in transform usage
What error, if any, does the following code raise?
Pandas
import pandas as pd

df = pd.DataFrame({'Group': ['G1', 'G1', 'G2'], 'Score': [10, 20, 30]})
df['Scaled'] = df.groupby('Group')['Score'].transform(lambda x: x / x.sum())
print(df)
A. TypeError: unsupported operand type(s) for /: 'int' and 'method'
B. No error, outputs scaled scores per group
C. AttributeError: 'SeriesGroupBy' object has no attribute 'sum'
D. ValueError: Length of values does not match length of index
💡 Hint
Check if the lambda function uses correct pandas methods.
🚀 Application
advanced
Using transform to fill missing values with group mean
You have a DataFrame with missing values in the 'Sales' column. Which code snippet correctly fills each missing 'Sales' value with the mean 'Sales' of its 'Region' group?
Pandas
import pandas as pd
import numpy as np

df = pd.DataFrame({'Region': ['East', 'East', 'West', 'West'], 'Sales': [100, np.nan, 200, np.nan]})
A. df['Sales'] = df['Sales'].fillna(df.groupby('Region')['Sales'].mean())
B. df['Sales'] = df.groupby('Region')['Sales'].transform('mean')
C. df['Sales'] = df.groupby('Region')['Sales'].apply(lambda x: x.fillna(x.mean()))
D. df['Sales'] = df.groupby('Region')['Sales'].transform(lambda x: x.fillna(x.mean()))
💡 Hint
transform keeps the original index and shape; apply may change them.
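The difference in indexing can be inspected with a small sketch on a hypothetical `readings` frame, using a running total instead of a fill so it stays separate from the question above. The index that apply returns depends on the pandas version and the `group_keys` setting, so the sketch only prints it rather than assuming a particular result.

```python
import pandas as pd

# Hypothetical data, unrelated to the quiz snippet above
readings = pd.DataFrame({
    "Sensor": ["a", "b", "a", "b"],
    "Reading": [1, 10, 2, 20],
})

grouped = readings.groupby("Sensor")["Reading"]

# transform: the result is aligned to the original row index
via_transform = grouped.transform(lambda s: s.cumsum())

# apply: the resulting index may include the group key (version-dependent)
via_apply = grouped.apply(lambda s: s.cumsum())

print(via_transform.index)
print(via_apply.index)
```

Comparing the two printed indexes on your own installation shows whether the apply result can be assigned back as a column without realignment.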
🧠 Conceptual
expert
Why use transform instead of apply for group-level operations?
Which statement best explains why transform is preferred over apply for group-level operations that must return one value per row of the original DataFrame?
A. apply always returns a DataFrame, transform always returns a scalar.
B. apply is faster than transform but cannot handle group operations.
C. transform returns a Series aligned with the original DataFrame, preserving index and shape, while apply may return aggregated or differently shaped results.
D. transform can only be used with numeric data, apply works with all data types.
💡 Hint
Think about the shape and alignment of the output.
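One practical consequence of output alignment is that a transformed result can be compared row by row against the original column. The sketch below uses a hypothetical `emp` frame (invented names and values) to keep only the rows above their own group's mean.

```python
import pandas as pd

# Hypothetical data, unrelated to the quiz snippets above
emp = pd.DataFrame({
    "Dept": ["hr", "hr", "it", "it", "it"],
    "Salary": [40, 60, 50, 70, 90],
})

# One group mean per row, sharing the index of emp
group_mean = emp.groupby("Dept")["Salary"].transform("mean")

# Row-wise comparison works because both sides share the same index
above = emp[emp["Salary"] > group_mean]
print(above)
```

An aggregated result indexed by `Dept` could not be used in this comparison without first mapping it back onto the rows.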