Category codes and labels help us work with data that has fixed groups or categories. They make data smaller and faster to use.
Category codes and labels in Pandas
Introduction
When you have a column with repeated text values like colors or types.
When you want to save memory by storing categories as numbers instead of long text.
When you want to sort or compare categories easily.
When you want to see the numeric codes behind category labels for analysis.
When you want to change or rename category labels.
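As a rough sketch of the sorting and comparison use case, the example below marks a categorical column as ordered. The T-shirt size data and the variable names are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical example data: sizes with a natural order.
sizes = pd.Series(['medium', 'small', 'large', 'small'])

# Declare the category order explicitly and mark it as ordered.
size_type = pd.CategoricalDtype(
    categories=['small', 'medium', 'large'], ordered=True
)
sizes = sizes.astype(size_type)

# Sorting follows the declared order, not alphabetical order.
sorted_sizes = sizes.sort_values()
print(sorted_sizes.tolist())

# Comparisons against a label also use the declared order.
bigger_than_small = sizes > 'small'
print(bigger_than_small.tolist())
```

Without ordered=True the column can still be sorted, but comparisons such as > raise an error because the categories have no defined order.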
Syntax
Pandas
df['column'].cat.codes
# To get category labels:
df['column'].cat.categories
cat.codes returns the integer code for each value in the column. The column must already have the category dtype.
cat.categories shows the list of category labels in order.
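The relationship between the two attributes can be sketched as follows: each code is a position in cat.categories, and a missing value gets the code -1. The DataFrame and its contents are hypothetical.

```python
import pandas as pd

# Hypothetical column with one missing value.
df = pd.DataFrame({'column': ['red', 'blue', None, 'blue']})
df['column'] = df['column'].astype('category')

codes = df['column'].cat.codes          # one integer per row
labels = df['column'].cat.categories    # Index of unique labels

print(codes.tolist())
print(labels.tolist())

# Each non-missing code is a position in the labels Index,
# so indexing the labels with the codes recovers the values.
valid = codes >= 0
decoded = labels[codes[valid].to_numpy()]
print(decoded.tolist())
```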
Examples
This converts a text column to categories and shows the numeric codes for each color.
Pandas
import pandas as pd

colors = pd.Series(['red', 'blue', 'green', 'blue', 'red'])
colors = colors.astype('category')
print(colors.cat.codes)
This shows the list of unique category labels in the column.
Pandas
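print(colors.cat.categories)
This changes the order of the categories. The values stay the same; only the category order, and therefore the codes, change. Direct assignment to cat.categories was removed in pandas 2.0, and in older versions it renamed labels by position rather than reordering them. Use cat.reorder_categories to reorder and cat.rename_categories to rename.
Pandas
colors = colors.cat.reorder_categories(['green', 'blue', 'red'])
print(colors)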
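Renaming and reordering are easy to confuse, so here is a minimal sketch that contrasts the two using the cat.rename_categories and cat.reorder_categories accessor methods. The label 'crimson' is an invented name used only for illustration.

```python
import pandas as pd

colors = pd.Series(['red', 'blue', 'green', 'blue', 'red']).astype('category')
# Default categories are sorted: ['blue', 'green', 'red']

# Rename: the label changes, the codes do not.
renamed = colors.cat.rename_categories({'red': 'crimson'})

# Reorder: values keep their labels, codes are renumbered.
reordered = colors.cat.reorder_categories(['red', 'green', 'blue'])

print(colors.cat.codes.tolist())
print(renamed.cat.codes.tolist())
print(reordered.cat.codes.tolist())
print(reordered.tolist())
```

Both methods return a new Series, so the result has to be assigned to a variable to be kept.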
Sample Program
This program shows how to convert text data to categories, get numeric codes, see labels, and reorder the categories.
Pandas
import pandas as pd

# Create a Series with repeated fruit names
fruits = pd.Series(['apple', 'banana', 'apple', 'orange', 'banana', 'apple'])

# Convert to category type
fruits = fruits.astype('category')

# Show the category codes (numbers)
print('Category codes:')
print(fruits.cat.codes)

# Show the category labels
print('\nCategory labels:')
print(fruits.cat.categories)

# Change the category order (values stay the same, codes are renumbered)
fruits = fruits.cat.reorder_categories(['banana', 'orange', 'apple'])
print('\nAfter changing category order:')
print(fruits)
Important Notes
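Category codes start at 0 and go up to the number of categories minus one. Missing values get the code -1.
Renaming categories with cat.rename_categories changes only the labels. Reordering them with cat.reorder_categories changes the codes but not the values.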
Using categories can save memory and speed up operations on repeated text data.
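The memory saving can be checked with a quick measurement. This is only a sketch: the city names and the repeat count are made-up test data, and the exact byte counts depend on the pandas version and platform, so none are shown.

```python
import pandas as pd

# Hypothetical data: three city names repeated many times.
cities = pd.Series(['Amsterdam', 'Barcelona', 'Copenhagen'] * 10_000)

# deep=True includes the memory used by the strings themselves.
as_text = cities.memory_usage(deep=True)
as_category = cities.astype('category').memory_usage(deep=True)

print(f'text:     {as_text} bytes')
print(f'category: {as_category} bytes')
```

The saving comes from storing each label once plus one small integer per row, so it shrinks when most values in the column are unique.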
Summary
Categories store repeated text as numbers for efficiency.
Use cat.codes to see numeric codes and cat.categories for labels.
Rename categories with cat.rename_categories and reorder them with cat.reorder_categories. Assigning to cat.categories no longer works in pandas 2.0 and later.