Pandas | data | ~20 mins

GroupBy with transform for normalization in Pandas - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of group normalization using transform
What is the output of this code that normalizes values within each group by subtracting the group mean?
Pandas
import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B', 'B'],
    'value': [10, 20, 30, 40, 50]
})

df['norm'] = df.groupby('group')['value'].transform(lambda x: x - x.mean())
print(df)
A
  group  value  norm
0     A     10  -5.0
1     A     20   5.0
2     B     30 -10.0
3     B     40   0.0
4     B     50  10.0
B
  group  value  norm
0     A     10  10.0
1     A     20  20.0
2     B     30  30.0
3     B     40  40.0
4     B     50  50.0
C
  group  value  norm
0     A     10  -10.0
1     A     20  -20.0
2     B     30  -30.0
3     B     40  -40.0
4     B     50  -50.0
D
  group  value  norm
0     A     10   5.0
1     A     20  -5.0
2     B     30  10.0
3     B     40   0.0
4     B     50 -10.0
💡 Hint
Think about how subtracting the mean affects each value within its group.
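
To make the hint concrete, here is a minimal sketch of group centering on a separate toy frame. The data and the `centered` column name are hypothetical, chosen only for illustration. It uses the built-in `'mean'` transform instead of a lambda and checks that the centered values sum to zero within each group.

```python
import pandas as pd

# Hypothetical data, not taken from the problem above
df = pd.DataFrame({
    'group': ['P', 'P', 'Q', 'Q'],
    'value': [2, 4, 10, 30],
})

# transform('mean') broadcasts each group's mean back onto its own rows
group_mean = df.groupby('group')['value'].transform('mean')

# Subtracting the broadcast mean centers every group around zero
df['centered'] = df['value'] - group_mean

# After centering, the values in each group sum to zero
sums = df.groupby('group')['centered'].sum()
print(df)
print(sums)
```

Values below their group mean come out negative and values above it come out positive, which is the pattern to look for when comparing the answer options.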
data_output
intermediate
Number of normalized values above zero per group
After normalizing values within groups by subtracting the group mean, how many values in group 'B' are greater than zero?
Pandas
import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B', 'B'],
    'value': [5, 15, 10, 20, 30]
})

df['norm'] = df.groupby('group')['value'].transform(lambda x: x - x.mean())
count = df[(df['group'] == 'B') & (df['norm'] > 0)].shape[0]
print(count)
A. 0
B. 1
C. 2
D. 3
💡 Hint
Calculate the mean of group B and count values above it.
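
The same count can be computed for every group at once. The sketch below assumes a hypothetical frame (different from the one in the problem) and groups a boolean mask by the `group` column, so that summing the mask counts the rows above their group mean.

```python
import pandas as pd

# Hypothetical data, not taken from the problem above
df = pd.DataFrame({
    'group': ['P', 'P', 'P', 'Q', 'Q'],
    'value': [1, 2, 9, 4, 6],
})

# Center each value on its group mean
centered = df['value'] - df.groupby('group')['value'].transform('mean')

# True where a value lies above its group mean; summing booleans counts them
above = (centered > 0).groupby(df['group']).sum()
print(above)
```

Because `centered` keeps the original row index, the mask can be grouped by `df['group']` directly without a merge.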
🔧 Debug
advanced
Identify the error in group normalization code
What error does this code raise when trying to normalize values by group?
Pandas
import pandas as pd

df = pd.DataFrame({
    'group': ['X', 'X', 'Y'],
    'value': [1, 2, 3]
})

df['norm'] = df.groupby('group')['value'].transform(lambda x: x / x.mean())
print(df)
A. TypeError
B. KeyError
C. ZeroDivisionError
D. No error, outputs normalized values
💡 Hint
Check if dividing by mean is valid for these values.
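
A related edge case is a group whose mean is exactly zero. The sketch below uses hypothetical data and a hypothetical `ratio` column to show that pandas applies floating-point division rules here: the result contains `inf` or `-inf` (or `NaN` for 0/0) instead of raising an exception.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group 'Q' has a mean of exactly zero
df = pd.DataFrame({
    'group': ['P', 'P', 'Q', 'Q'],
    'value': [1, 3, -2, 2],
})

# Divide each value by its group mean
df['ratio'] = df.groupby('group')['value'].transform(lambda x: x / x.mean())

# Rows in 'Q' are infinite because their group mean is zero
print(df)
print(np.isinf(df['ratio']))
```

If zero means are possible in real data, it may be worth masking or replacing the infinite values after the transform.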
🚀 Application
advanced
Apply group normalization with standard deviation scaling
Which code snippet correctly normalizes 'score' within each 'team' by subtracting the mean and dividing by the standard deviation?
A. df['norm'] = df.groupby('team')['score'].apply(lambda x: (x - x.mean()) / x.std())
B. df['norm'] = (df['score'] - df.groupby('team')['score'].mean()) / df.groupby('team')['score'].std()
C. df['norm'] = df.groupby('team')['score'].transform(lambda x: (x - x.mean()) / x.std())
D. df['norm'] = df['score'] / df.groupby('team')['score'].transform('std')
💡 Hint
Remember transform keeps the original index and length.
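
To illustrate the hint, the sketch below builds a hypothetical `team`/`score` frame and computes a per-team z-score from the built-in `'mean'` and `'std'` transforms. It also prints the index of the plain aggregated mean, which is labelled by team rather than by row. Note that pandas' `std` uses the sample standard deviation (`ddof=1`) by default.

```python
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({
    'team': ['r', 'r', 'r', 's', 's'],
    'score': [1.0, 2.0, 3.0, 10.0, 14.0],
})

grouped = df.groupby('team')['score']

# Aggregation returns one value per team, indexed by team label
agg_mean = grouped.mean()
print(agg_mean.index.tolist())

# transform returns one value per row, indexed like the original frame
z = (df['score'] - grouped.transform('mean')) / grouped.transform('std')
print(z)
```

Since `z` shares the index of `df`, it can be assigned straight to a new column.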
🧠 Conceptual
expert
Why use transform instead of apply for group normalization?
What is the main reason to use groupby with transform instead of apply when normalizing data within groups?
A. Transform returns a Series aligned with the original DataFrame, preserving index and length, enabling direct assignment.
B. Apply is faster and more memory efficient than transform for group operations.
C. Transform can only be used with numeric data, while apply works with any data type.
D. Apply automatically sorts groups, while transform does not.
💡 Hint
Think about the shape and alignment of the output from each method.
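
The difference in shape can be inspected directly. The sketch below uses a hypothetical frame with interleaved groups and compares an aggregated mean with a transformed mean. The variable names are assumptions made for this example.

```python
import pandas as pd

# Hypothetical data with interleaved group labels
df = pd.DataFrame({
    'group': ['b', 'a', 'b', 'a'],
    'value': [1, 2, 3, 4],
})

grouped = df.groupby('group')['value']

# One row per group, indexed by the sorted group labels
per_group = grouped.mean()

# One row per original row, in the original row order
per_row = grouped.transform('mean')

print(len(per_group), len(per_row))
print(per_row.tolist())
```

Here `per_row` has the same length and row order as `df`, while `per_group` has only one entry per distinct group.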