Challenge - 5 Problems

GroupBy Transform Master

❓ Predict Output

intermediate

Output of group normalization using transform

What is the output of this code that normalizes values within each group by subtracting the group mean?

Pandas

import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B', 'B'],
    'value': [10, 20, 30, 40, 50]
})

df['norm'] = df.groupby('group')['value'].transform(lambda x: x - x.mean())
print(df)

A.
  group  value  norm
0     A     10  -5.0
1     A     20   5.0
2     B     30 -10.0
3     B     40   0.0
4     B     50  10.0

B.
  group  value  norm
0     A     10  10.0
1     A     20  20.0
2     B     30  30.0
3     B     40  40.0
4     B     50  50.0

C.
  group  value  norm
0     A     10 -10.0
1     A     20 -20.0
2     B     30 -30.0
3     B     40 -40.0
4     B     50 -50.0

D.
  group  value  norm
0     A     10   5.0
1     A     20  -5.0
2     B     30  10.0
3     B     40   0.0
4     B     50 -10.0
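
Group-mean centering can also be written with the built-in 'mean' aggregation instead of a Python lambda. The following is a minimal sketch on hypothetical data: the `demo` frame and the column names `key`, `x`, and `centered` are assumptions for illustration and are not part of the challenge.

```python
import pandas as pd

# Hypothetical data, deliberately different from the challenge above.
demo = pd.DataFrame({
    'key': ['p', 'p', 'q', 'q'],
    'x': [1.0, 3.0, 10.0, 20.0],
})

# transform('mean') broadcasts each group's mean back onto its rows,
# so the result has the same index and length as `demo`.
group_mean = demo.groupby('key')['x'].transform('mean')

# Subtracting the broadcast mean centers every group at zero.
demo['centered'] = demo['x'] - group_mean

# The lambda form used in the challenge produces the same values here.
via_lambda = demo.groupby('key')['x'].transform(lambda s: s - s.mean())
same = bool((demo['centered'] == via_lambda).all())
print(demo)
print(same)
```

Both forms return one value per original row, which is why the result can be assigned straight back as a new column.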

❓ Data Output

intermediate

Number of normalized values above zero per group

After normalizing values within groups by subtracting the group mean, how many values in group 'B' are greater than zero?

Pandas

import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B', 'B'],
    'value': [5, 15, 10, 20, 30]
})

df['norm'] = df.groupby('group')['value'].transform(lambda x: x - x.mean())
count = df[(df['group'] == 'B') & (df['norm'] > 0)].shape[0]
print(count)
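
The same kind of count can be taken for every group at once by grouping a boolean mask. This is a sketch on hypothetical data (the `demo` frame and its `key`, `x`, and `centered` columns are illustrative assumptions), not the challenge's own data.

```python
import pandas as pd

# Hypothetical data, deliberately different from the challenge above.
demo = pd.DataFrame({
    'key': ['p', 'p', 'q', 'q', 'q'],
    'x': [2, 4, 1, 1, 7],
})
demo['centered'] = demo.groupby('key')['x'].transform(lambda s: s - s.mean())

# A boolean Series sums to its number of True values, so grouping the
# mask by the key column yields one count of positive rows per group.
positives_per_group = (demo['centered'] > 0).groupby(demo['key']).sum()
print(positives_per_group)
```

Filtering with a combined mask and reading `.shape[0]`, as in the challenge code, answers the question for one group; the grouped sum answers it for all groups in a single pass.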

🔧 Debug

advanced

Identify the error in group normalization code

What error does this code raise when trying to normalize values by group?

Pandas

import pandas as pd

df = pd.DataFrame({
    'group': ['X', 'X', 'Y'],
    'value': [1, 2, 3]
})

df['norm'] = df.groupby('group')['value'].transform(lambda x: x / x.mean())
print(df)

A. TypeError

B. KeyError

C. ZeroDivisionError

D. No error, outputs normalized values
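
A related edge case is a group whose mean is exactly zero. pandas arithmetic follows NumPy floating-point semantics, so dividing a Series by a zero mean produces `inf` or `NaN` values rather than a Python exception. The sketch below uses hypothetical data, and the helper `safe_ratio` is an illustrative name, not a pandas function.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group 'p' has a mean of exactly zero.
demo = pd.DataFrame({
    'key': ['p', 'p', 'q', 'q'],
    'x': [-1, 1, 2, 4],
})

# Dividing by a zero mean yields -inf/inf for group 'p'; no exception is raised.
ratio = demo.groupby('key')['x'].transform(lambda s: s / s.mean())
has_inf = bool(np.isinf(ratio).any())

def safe_ratio(s):
    """Hypothetical guard: return NaN for a group whose mean is zero."""
    m = s.mean()
    return s / m if m != 0 else s * np.nan

guarded = demo.groupby('key')['x'].transform(safe_ratio)
print(ratio)
print(guarded)
```

Whether `NaN` is the right placeholder for a zero-mean group depends on the downstream analysis; the guard only makes the choice explicit.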

🚀 Application

advanced

Apply group normalization with standard deviation scaling

Which code snippet correctly normalizes 'score' within each 'team' by subtracting the mean and dividing by the standard deviation?

A. df['norm'] = df.groupby('team')['score'].apply(lambda x: (x - x.mean()) / x.std())

B. df['norm'] = (df['score'] - df.groupby('team')['score'].mean()) / df.groupby('team')['score'].std()

C. df['norm'] = df.groupby('team')['score'].transform(lambda x: (x - x.mean()) / x.std())

D. df['norm'] = df['score'] / df.groupby('team')['score'].transform('std')
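
For reference, per-group standardization can be sketched with the built-in 'mean' and 'std' aggregations, each broadcast back to the original rows. The data below is hypothetical and the column name `z` is an assumption for illustration.

```python
import pandas as pd

# Hypothetical data for illustration only.
demo = pd.DataFrame({
    'team': ['red', 'red', 'red', 'blue', 'blue'],
    'score': [1.0, 2.0, 3.0, 10.0, 20.0],
})

grouped = demo.groupby('team')['score']

# Both transforms return Series aligned with demo's index, so the
# arithmetic below is row-by-row within each team.
# Note: the grouped std uses the sample formula (ddof=1) by default,
# so a team with a single row would get NaN.
demo['z'] = (demo['score'] - grouped.transform('mean')) / grouped.transform('std')
print(demo)
```

The key property is that every operand has the same index as `demo`; an aggregated result indexed by team name would not line up with the row index during the subtraction.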

🧠 Conceptual

expert

Why use transform instead of apply for group normalization?

What is the main reason to use groupby with transform instead of apply when normalizing data within groups?

A. Transform returns a Series aligned with the original DataFrame, preserving index and length, enabling direct assignment.

B. Apply is faster and more memory efficient than transform for group operations.

C. Transform can only be used with numeric data, while apply works with any data type.

D. Apply automatically sorts groups, while transform does not.
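
The difference in output shape between an aggregation and a transform can be inspected directly. This is a small sketch on hypothetical data; the names `demo`, `per_group`, and `per_row` are illustrative assumptions.

```python
import pandas as pd

# Hypothetical data for illustration only.
demo = pd.DataFrame({
    'key': ['p', 'p', 'q'],
    'x': [1.0, 3.0, 10.0],
})
grouped = demo.groupby('key')['x']

# An aggregation returns one value per group, indexed by the group keys.
per_group = grouped.mean()

# A transform returns one value per original row, indexed like `demo`.
per_row = grouped.transform('mean')

# Because the index matches, the result can be assigned as a column.
demo['group_mean'] = per_row
print(per_group)
print(demo)
```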
