
describe() for statistical summary in Pandas

Introduction

The describe() method returns summary statistics (count, mean, standard deviation, minimum, quartiles, and maximum) for the columns of a DataFrame. It lets you gauge the center and spread of your data without computing each statistic by hand.

Use describe() when:
You want to see the average, minimum, and maximum values of a dataset.
You need a quick summary of data before deeper analysis.
You want to check for missing values (through count) or unusual values (through min and max).
You want to compare basic statistics across different columns.
You want to understand the distribution of numeric data in a table.
Syntax
Pandas
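DataFrame.describe(percentiles=None, include=None, exclude=None)

By default, describe() summarizes only the numeric columns.

Use the include and exclude parameters to choose which data types are summarized.

Older pandas 1.x releases also accepted a datetime_is_numeric argument. It was removed in pandas 2.0, so it is omitted from the signature above.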
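As a rough illustration of how include and exclude change which columns are summarized, here is a minimal sketch. The table and its column names (age, city) are made up for this example and are not part of the lesson's data.

```python
import pandas as pd

# Hypothetical mixed-type table, used only for illustration.
df = pd.DataFrame({
    "age": [23, 45, 31],
    "city": ["Paris", "Oslo", "Paris"],
})

numeric_only = df.describe()                    # default: numeric columns only
non_numeric = df.describe(exclude=["number"])   # everything except numeric columns
everything = df.describe(include="all")         # every column, whatever its type

print(list(numeric_only.columns))  # ['age']
print(list(non_numeric.columns))   # ['city']
print(list(everything.columns))    # ['age', 'city']
```

With include="all", statistics that do not apply to a column (for example, the mean of city) show up as NaN.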

Examples
Shows summary statistics for all numeric columns in the DataFrame.
Pandas
df.describe()
Shows summary statistics for all columns, including non-numeric ones. Statistics that do not apply to a column appear as NaN.
Pandas
df.describe(include='all')
Shows summary statistics with the 10th and 90th percentiles in place of the default 25th and 75th.
Pandas
df.describe(percentiles=[0.1, 0.9])
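To see what the percentiles argument does to the result, the sketch below runs it on a made-up column (value, holding the integers 0 to 10) and inspects the row labels. The data is an assumption chosen so the percentiles are easy to check by eye.

```python
import pandas as pd

# Hypothetical data: the integers 0..10, so percentiles are easy to verify.
df = pd.DataFrame({"value": list(range(11))})

summary = df.describe(percentiles=[0.1, 0.9])

# Requested percentiles become row labels such as '10%' and '90%'.
# Depending on the pandas version, a '50%' (median) row may be added as well.
print(summary.index.tolist())

p10 = summary.loc["10%", "value"]
p90 = summary.loc["90%", "value"]
print(p10, p90)  # 1.0 9.0
```

Percentiles are given as fractions between 0 and 1, not as whole numbers such as 10 or 90.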
Sample Program

This code creates a small table with age, height, and weight columns. Then it uses describe() to get a quick statistical summary of these columns.

Pandas
import pandas as pd

data = {
    'age': [23, 45, 31, 35, 22],
    'height': [165, 180, 175, 170, 160],
    'weight': [65, 85, 70, 72, 60]
}
df = pd.DataFrame(data)

summary = df.describe()
print(summary)
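The result of describe() is itself a DataFrame, with one row per statistic and one column per summarized column, so individual values can be looked up with .loc. The sketch below rebuilds the sample table and reads a few values; the variable names (mean_age, median_height, weight_range) are only illustrative.

```python
import pandas as pd

# Same table as in the sample program above.
df = pd.DataFrame({
    "age": [23, 45, 31, 35, 22],
    "height": [165, 180, 175, 170, 160],
    "weight": [65, 85, 70, 72, 60],
})
summary = df.describe()

# Rows are statistics, columns are the original column names.
mean_age = summary.loc["mean", "age"]
median_height = summary.loc["50%", "height"]
weight_range = summary.loc["max", "weight"] - summary.loc["min", "weight"]

print(f"mean age: {mean_age:.1f}")            # mean age: 31.2
print(f"median height: {median_height:.1f}")  # median height: 170.0
print(f"weight range: {weight_range:.1f}")    # weight range: 25.0
```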
Important Notes

describe() excludes missing values (NaN) from every statistic, so count reports the number of non-missing entries in each column.
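A small sketch of how missing values affect the summary, using a made-up score column with two gaps. Comparing count with the number of rows is one simple way to see how many values are missing.

```python
import pandas as pd

# Hypothetical column with two missing entries out of five rows.
df = pd.DataFrame({"score": [10.0, float("nan"), 30.0, float("nan"), 50.0]})

summary = df.describe()

non_missing = summary.loc["count", "score"]  # counts only the non-missing values
mean_score = summary.loc["mean", "score"]    # mean of 10, 30 and 50
missing = len(df) - non_missing              # rows without a value

print(non_missing, mean_score, missing)  # 3.0 30.0 2.0
```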

For categorical and text columns, describe() shows count, unique (the number of distinct values), top (the most frequent value), and freq (how often the top value occurs).
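The following sketch shows those four statistics on a made-up color column. The data is hypothetical and chosen so that one value is clearly the most frequent.

```python
import pandas as pd

# Hypothetical text column: 'red' appears three times out of five.
df = pd.DataFrame({"color": ["red", "blue", "red", "green", "red"]})

summary = df["color"].describe()

print(summary["count"])   # 5 non-missing values
print(summary["unique"])  # 3 distinct values
print(summary["top"])     # red, the most frequent value
print(summary["freq"])    # 3 occurrences of the top value
```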

Summary

describe() gives a quick overview of your data's main statistics.

It helps spot data issues and understand data distribution fast.

You can customize it to include different data types or percentiles.