
Aggregation with agg() in Pandas - Step-by-Step Execution

Concept Flow - Aggregation with agg()
1. Start with a DataFrame
2. Call agg() with functions
3. Apply each function to the columns
4. Combine the results into a new DataFrame
5. Return the aggregated DataFrame
The agg() method applies one or more aggregation functions to DataFrame columns. Given a list of functions, it returns a new DataFrame with one row per function; given a single function, it returns a Series.
Execution Sample
import pandas as pd

df = pd.DataFrame({
  'A': [1, 2, 3],
  'B': [4, 5, 6]
})

result = df.agg(['sum', 'mean'])
This code creates a DataFrame and uses agg() to calculate sum and mean for each column.
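To see what the sample produces, the sketch below prints the result and reads one value out of it. The names `mean_of_b` and `totals` are illustrative and not part of the original sample.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# A list of function names yields a DataFrame: one row per function.
result = df.agg(['sum', 'mean'])
print(result)

# Rows are labelled by function name, so .loc can pick out a single value.
mean_of_b = result.loc['mean', 'B']
print(mean_of_b)

# A single function name (not wrapped in a list) yields a Series instead.
totals = df.agg('sum')
print(type(totals).__name__)
```

Wrapping a single name in a list, as in df.agg(['sum']), keeps the result a DataFrame, which can be convenient when later code expects row labels.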
Execution Table
Step | Action | Column | Function | Result | Intermediate Output
--- | --- | --- | --- | --- | ---
1 | Start with DataFrame | A, B | - | - | {'A': [1,2,3], 'B': [4,5,6]}
2 | Apply 'sum' to column A | A | sum | 6 | {'sum': {'A': 6}}
3 | Apply 'sum' to column B | B | sum | 15 | {'sum': {'A': 6, 'B': 15}}
4 | Apply 'mean' to column A | A | mean | 2.0 | {'sum': {'A': 6, 'B': 15}, 'mean': {'A': 2.0}}
5 | Apply 'mean' to column B | B | mean | 5.0 | {'sum': {'A': 6, 'B': 15}, 'mean': {'A': 2.0, 'B': 5.0}}
6 | Combine results into DataFrame | - | - | - | DataFrame with rows 'sum' and 'mean', columns 'A' and 'B'
7 | Return aggregated DataFrame | - | - | - | sum: A=6.0, B=15.0; mean: A=2.0, B=5.0 (each column holds both a sum and a mean, so its values are stored as floats)
💡 All specified aggregation functions applied to all columns; aggregation complete.
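The steps in the table can be mimicked by hand. The sketch below is a conceptual re-creation of the trace, assuming a nested dictionary as the intermediate store; it is not how pandas implements agg() internally, and the names `intermediate` and `manual` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Steps 2-5: apply each function to each column, filling a nested dict
# shaped like the "Intermediate Output" column of the table.
intermediate = {}
for func_name in ['sum', 'mean']:
    intermediate[func_name] = {}
    for column in df.columns:
        # getattr(series, 'sum')() is the same call as series.sum()
        intermediate[func_name][column] = getattr(df[column], func_name)()

# Step 6: the outer keys become columns first, so transpose to get
# function names as row labels and 'A', 'B' as columns.
manual = pd.DataFrame(intermediate).T

# Step 7: compare against what agg() itself returns.
pd.testing.assert_frame_equal(manual, df.agg(['sum', 'mean']), check_dtype=False)
print(manual)
```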
Variable Tracker
Variable | Start | After Step 2 | After Step 4 | Final
--- | --- | --- | --- | ---
df | {'A': [1,2,3], 'B': [4,5,6]} | {'A': [1,2,3], 'B': [4,5,6]} | {'A': [1,2,3], 'B': [4,5,6]} | {'A': [1,2,3], 'B': [4,5,6]}
result | None | {'sum': {'A': 6}} | {'sum': {'A': 6, 'B': 15}, 'mean': {'A': 2.0}} | DataFrame with sum and mean rows
Key Moments - 2 Insights
Why does agg() return a DataFrame with function names as row labels?
Because agg() applies each function to all columns and combines results, it labels rows by function names to show which result belongs to which aggregation (see execution_table rows 6 and 7).
Can agg() apply different functions to different columns at once?
Yes. This example applies the same functions to every column. To apply different functions per column, pass agg() a dictionary that maps column names to functions, for example df.agg({'A': 'sum', 'B': 'mean'}).
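A minimal sketch of the dictionary form, reusing the same DataFrame. The variable names `per_column` and `mixed` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# One function per column: the result is a Series indexed by column name.
per_column = df.agg({'A': 'sum', 'B': 'mean'})
print(per_column)

# Lists of functions per column: the result is a DataFrame, with NaN
# wherever a function was not requested for that column.
mixed = df.agg({'A': ['sum', 'max'], 'B': ['mean']})
print(mixed)
```

Columns left out of the dictionary are dropped from the result, so the dictionary form also doubles as a column selector.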
Visual Quiz - 3 Questions
Test your understanding
Look at step 3 of the execution_table. What is the sum of column B?
A. 15
B. 5.0
C. 6
D. 4
💡 Hint
Check the 'Result' column in execution_table row 3 for sum of column B.
At which step does the mean of column A get calculated?
A. Step 2
B. Step 4
C. Step 5
D. Step 6
💡 Hint
Look at the 'Function' column in execution_table to find when 'mean' is applied to column A.
If we add 'max' to the agg() functions, how would the execution_table change?
A. No change, max is ignored
B. Fewer rows because max replaces sum
C. More rows for steps applying 'max' to each column
D. The DataFrame would have fewer columns
💡 Hint
Adding a function means agg() applies it to all columns, adding steps for each.
Concept Snapshot
agg() method in pandas:
- Applies one or more aggregation functions to DataFrame columns
- Syntax: df.agg(['func1', 'func2']) or df.agg({'col1': 'func1', 'col2': 'func2'})
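- With a list of functions, returns a DataFrame with function names as row labels and the original columns as columns; with a single function, or a dictionary of single functions, returns a Series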
- Useful for quick summary statistics
- Supports built-in and custom aggregation functions
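As a sketch of the last point, a custom aggregation can be passed as a plain Python function. The function `value_range` below is a hypothetical example, not something pandas provides.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

def value_range(column):
    """Hypothetical custom aggregation: largest value minus smallest value."""
    return column.max() - column.min()

# Passing the function itself: agg() calls it once per column,
# handing it that column as a Series.
spread = df.agg(value_range)
print(spread)
```

Because a single function was passed, the result is a Series with one entry per column.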
Full Transcript
This visual execution trace shows how the pandas agg() method works on a DataFrame. Starting with a DataFrame that has columns A and B, agg() applies the functions 'sum' and 'mean' to each column. Step by step, it calculates the sum for A and B, then the mean for A and B, storing the results in an intermediate dictionary. Finally, it combines these results into a new DataFrame with rows labeled by function name and columns labeled by the original column names. The variables df and result are tracked through the steps. The key moments explain why agg() returns a DataFrame with function names as row labels and note that different functions can be applied to different columns. The quiz tests understanding of specific steps and of the effect of adding functions. The snapshot summarizes agg() usage and behavior.