
agg() for multiple aggregations in Data Analysis Python

Introduction

The pandas agg() method computes several summary statistics in a single call. It helps us understand data by showing different calculations side by side.

You want to find the average and maximum sales for each product.
You need to see the total and count of orders by customer.
You want to get multiple statistics like mean, min, and max for a dataset column.
You want to summarize data in one step instead of many separate calculations.
Syntax
Data Analysis Python
dataframe.agg({'column1': ['func1', 'func2'], 'column2': ['func3']})

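You pass a dictionary where keys are column names and values are a single function or a list of functions. To apply the same functions to one column (a Series), you can pass just a list.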

Functions can be strings like 'mean', 'sum', or your own functions.
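As a minimal sketch of mixing a string name with your own function, the example below defines a hypothetical helper called value_range and passes it next to 'mean'. The data and the helper name are made up for illustration and are not part of the original lesson.

```python
import pandas as pd

# Hypothetical data, made up for this illustration.
df = pd.DataFrame({'score': [70, 85, 90, 60]})


def value_range(series):
    """Custom aggregation: spread between the largest and smallest value."""
    return series.max() - series.min()


# Mix a string name with a custom function in the same list.
result = df.agg({'score': ['mean', value_range]})
print(result)
```

The row for the custom function is labelled with the function's name, so a descriptive name keeps the result readable.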

Examples
This calculates mean and max of 'age', and min and sum of 'salary'.
Data Analysis Python
df.agg({'age': ['mean', 'max'], 'salary': ['min', 'sum']})
This calculates min, max, and mean of the 'score' column only.
Data Analysis Python
df['score'].agg(['min', 'max', 'mean'])
This calculates max of 'height' and mean of 'weight'.
Data Analysis Python
df.agg({'height': 'max', 'weight': 'mean'})
Sample Program

This program creates a small table of products with sales and quantity. Then it uses agg() to find total and average sales, and smallest and largest quantity.

Data Analysis Python
import pandas as pd

data = {'product': ['A', 'A', 'B', 'B', 'C'],
        'sales': [100, 150, 200, 250, 300],
        'quantity': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)

result = df.agg({'sales': ['sum', 'mean'], 'quantity': ['min', 'max']})
print(result)
Output
       sales  quantity
sum   1000.0       NaN
mean   200.0       NaN
min      NaN       1.0
max      NaN       5.0
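The sample program summarizes the whole table. For the per-product summaries mentioned in the introduction, one common approach is to call groupby() before agg(). The sketch below assumes that combination, which the lesson itself does not show, and reuses the sample data.

```python
import pandas as pd

# Same data as the sample program.
df = pd.DataFrame({
    'product': ['A', 'A', 'B', 'B', 'C'],
    'sales': [100, 150, 200, 250, 300],
    'quantity': [1, 2, 3, 4, 5],
})

# Group rows by product first, then aggregate within each group.
per_product = df.groupby('product').agg({'sales': ['sum', 'mean'],
                                         'quantity': ['min', 'max']})
print(per_product)

# Columns are (column, function) pairs, so select one with a tuple.
print(per_product[('sales', 'mean')])
```

Here each row is one product, and the columns form a two-level index of column name and function name.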
Important Notes

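If you pass a dictionary with lists of functions, the result is a DataFrame with one row per function and one column per source column. Combinations you did not ask for show as NaN. Calling agg() with a list on a single column returns a Series indexed by function name.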
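To check what agg() returns for each calling style, a small sketch like the one below can help. The data is hypothetical and only mirrors the column names used in the examples.

```python
import pandas as pd

# Hypothetical data, made up for this illustration.
df = pd.DataFrame({'age': [25, 35, 45], 'salary': [3000, 4000, 5000]})

# Dictionary with lists of functions: a DataFrame, one row per function.
by_dict = df.agg({'age': ['mean', 'max'], 'salary': ['min', 'sum']})
print(type(by_dict).__name__)
print(by_dict.index.tolist())

# List of functions on a single column: a Series indexed by function name.
by_list = df['age'].agg(['min', 'max', 'mean'])
print(type(by_list).__name__)

# Dictionary with one function per column: a Series indexed by column name.
by_single = df.agg({'age': 'max', 'salary': 'mean'})
print(type(by_single).__name__)

# Combinations that were not requested (such as the sum of 'age') are missing.
print(by_dict.isna())
```

Knowing the shape in advance makes it easier to pick values out of the result, for example by_dict.loc['mean', 'age'].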

You can also use built-in functions like sum or your own custom functions.

Summary

agg() helps get many summary numbers in one step.

You give it a dictionary with columns and functions to apply.

It is a quick way to summarize data with several calculations at once.