Category codes and labels in Pandas

Introduction

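Category codes and labels help us work with data that falls into a fixed set of groups. A categorical column stores each distinct label once and represents every row by a small integer code, which usually makes the data smaller in memory and faster to group, sort, and compare.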
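
The memory claim can be checked directly. The following is a minimal sketch, assuming a made-up Series of 30,000 color names with only three distinct values; the exact byte counts depend on the pandas version and string storage, so none are shown here.

```python
import pandas as pd

# Hypothetical data: 30,000 rows but only three distinct labels
colors_text = pd.Series(['red', 'blue', 'green'] * 10_000)
colors_cat = colors_text.astype('category')

# deep=True counts the bytes used by the strings themselves
text_bytes = colors_text.memory_usage(deep=True)
cat_bytes = colors_cat.memory_usage(deep=True)

print(f'text column:     {text_bytes} bytes')
print(f'category column: {cat_bytes} bytes')

# With so few categories, each row needs only a one-byte integer code
print(colors_cat.cat.codes.dtype)
```

The saving shrinks as the share of distinct values grows, so a column where almost every value is unique gains little from the conversion.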

Categories are useful in these situations:
When you have a column with repeated text values, such as colors or types.
When you want to save memory by storing categories as numbers instead of long text.
When you want to sort or compare values in a logical order (for example small, medium, large) rather than alphabetically.
When you want to see the numeric codes behind category labels for analysis.
When you want to rename or reorder category labels.
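
Sorting and comparing in a logical order needs an ordered categorical. The sketch below assumes a hypothetical set of size labels and uses pd.CategoricalDtype with ordered=True to declare the order explicitly.

```python
import pandas as pd

sizes = pd.Series(['medium', 'small', 'large', 'small'])

# Declare the logical order: small < medium < large
size_type = pd.CategoricalDtype(categories=['small', 'medium', 'large'], ordered=True)
sizes = sizes.astype(size_type)

# Sorting follows the declared order, not the alphabet
sorted_sizes = sizes.sort_values()
print(sorted_sizes.tolist())   # ['small', 'small', 'medium', 'large']

# Comparisons against a label also use the declared order
is_bigger = sizes > 'small'
print(is_bigger.tolist())      # [True, False, True, False]
```

Without ordered=True, comparisons such as > raise a TypeError, because pandas has no way to know which label ranks higher.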
Syntax
Pandas
# To get category codes:
df['column'].cat.codes
# To get category labels:
df['column'].cat.categories

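cat.codes returns a Series with one integer code per row. Each code is the position of that row's label in cat.categories, and missing values get the code -1.

cat.categories returns an Index of the category labels in order. By default the labels are sorted when the column is converted.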
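
To make the link between the two accessors concrete, here is a small sketch with made-up data that includes one missing value. It looks the labels back up from the codes, skipping the -1 used for missing entries.

```python
import pandas as pd

levels = pd.Series(['low', 'high', None, 'low'], dtype='category')

codes = levels.cat.codes
categories = levels.cat.categories

print(list(categories))   # ['high', 'low']
print(codes.tolist())     # [1, 0, -1, 1]

# A code is a position in categories; -1 marks a missing value, so drop it first
valid_codes = codes[codes >= 0].to_numpy()
labels = categories.take(valid_codes)
print(list(labels))       # ['low', 'high', 'low']
```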

Examples
This converts a text column to categories and shows the numeric codes for each color.
Pandas
import pandas as pd

colors = pd.Series(['red', 'blue', 'green', 'blue', 'red'])
colors = colors.astype('category')
print(colors.cat.codes)
This shows the list of unique category labels in the column.
Pandas
print(colors.cat.categories)
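This changes the order of the categories without changing the values. Assigning directly to cat.categories no longer works in pandas 2.0 and later, so use cat.reorder_categories to change the order, or cat.rename_categories to change the names.
Pandas
colors = colors.cat.reorder_categories(['green', 'blue', 'red'])
print(colors)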
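Renaming and reordering are easy to confuse, so here is a short sketch of the difference using the cat.rename_categories and cat.reorder_categories methods. The label 'crimson' is made up for illustration.

```python
import pandas as pd

colors = pd.Series(['red', 'blue', 'green', 'blue', 'red']).astype('category')
print(colors.cat.codes.tolist())        # [2, 0, 1, 0, 2]

# Renaming swaps a label but keeps every row's code
renamed = colors.cat.rename_categories({'red': 'crimson'})
print(renamed.tolist())                 # ['crimson', 'blue', 'green', 'blue', 'crimson']
print(renamed.cat.codes.tolist())       # [2, 0, 1, 0, 2]

# Reordering keeps every row's value but assigns new codes
reordered = colors.cat.reorder_categories(['green', 'blue', 'red'])
print(reordered.tolist())               # ['red', 'blue', 'green', 'blue', 'red']
print(reordered.cat.codes.tolist())     # [2, 1, 0, 1, 2]
```

Both methods return a new Series, so the result has to be assigned back if the change should stick.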
Sample Program

This program converts text data to categories, prints the numeric codes and the labels, and then changes the order of the labels.

Pandas
import pandas as pd

# Create a Series with repeated fruit names
fruits = pd.Series(['apple', 'banana', 'apple', 'orange', 'banana', 'apple'])

# Convert to category type
fruits = fruits.astype('category')

# Show the category codes (numbers)
print('Category codes:')
print(fruits.cat.codes)

# Show the category labels
print('\nCategory labels:')
print(fruits.cat.categories)

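# Change the order of the category labels (the values stay the same).
# Direct assignment to fruits.cat.categories was removed in pandas 2.0.
fruits = fruits.cat.reorder_categories(['banana', 'orange', 'apple'])

print('\nAfter reordering category labels:')
print(fruits)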
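The Syntax section uses the df['column'] form, so the same accessors also apply to a DataFrame column. A minimal sketch, assuming a hypothetical table with fruit and price columns, that stores the codes next to the labels for analysis:

```python
import pandas as pd

# Hypothetical table; the column names are illustrative only
df = pd.DataFrame({
    'fruit': ['apple', 'banana', 'apple', 'orange'],
    'price': [1.2, 0.5, 1.1, 0.8],
})

# Convert one column to category type and keep its codes in a new column
df['fruit'] = df['fruit'].astype('category')
df['fruit_code'] = df['fruit'].cat.codes

print(df)
print(df['fruit'].cat.categories)
```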
Important Notes

Category codes start at 0 and go up to the number of categories minus one. Missing values get the code -1.

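Renaming categories with cat.rename_categories changes the labels but keeps the codes. Reordering them with cat.reorder_categories keeps the values but changes which code belongs to which label.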

Using categories can save memory and speed up operations on repeated text data. The benefit is largest when the column has few distinct values compared to its number of rows.

Summary

Categories store repeated text as numbers for efficiency.

Use cat.codes to see numeric codes and cat.categories for labels.

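You can rename categories with cat.rename_categories and reorder them with cat.reorder_categories. Assigning to cat.categories is not supported in pandas 2.0 and later.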