
Category codes and labels in Pandas - Step-by-Step Execution

Concept Flow - Category codes and labels
Create categorical data
Assign categories with labels
Access category codes
Use codes for analysis or storage
Map codes back to labels if needed
This flow shows how to create categorical data, assign labels, access their integer codes, and use them for efficient analysis.
Execution Sample
Pandas
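import pandas as pd
# Plain Series of repeated string values
colors = pd.Series(['red', 'blue', 'green', 'blue', 'red'])
# Convert to categorical dtype; categories are inferred from the unique values
cats = colors.astype('category')
# Integer code of each element (its position in the categories)
codes = cats.cat.codes
# Index holding the category labels
labels = cats.cat.categories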
This code creates a categorical series from color names, then extracts integer codes and category labels.
Execution Table
| Step | Action | Data/Variable | Result/Value |
|------|--------|---------------|--------------|
| 1 | Create Series with colors | colors | 0:red, 1:blue, 2:green, 3:blue, 4:red |
| 2 | Convert to categorical | cats | Categories: ['blue', 'green', 'red'] with codes [2,0,1,0,2] |
| 3 | Access category codes | codes | [2,0,1,0,2] |
| 4 | Access category labels | labels | Index(['blue', 'green', 'red']) |
| 5 | Use codes for indexing or storage | codes | Integer array representing categories |
| 6 | Map codes back to labels | codes + labels | 2->red, 0->blue, 1->green |
| 7 | End | | All categories and codes assigned correctly |
💡 All categorical codes and labels extracted and mapped successfully
Variable Tracker
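| Variable | Start | After Step 2 | After Step 3 | After Step 4 | Final |
|----------|-------|--------------|--------------|--------------|-------|
| colors | Empty | ['red','blue','green','blue','red'] | Same | Same | Same |
| cats | N/A | Categorical with categories ['blue','green','red'] | Same | Same | Same |
| codes | N/A | N/A | [2,0,1,0,2] | Same | Same |
| labels | N/A | N/A | N/A | ['blue','green','red'] | Same |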
Key Moments - 2 Insights
Why do the category codes not match the original order of colors?
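Codes follow the order of the categories, not the order in which values first appear. When pandas infers the categories from sortable values, it sorts the unique values (alphabetically for strings). That is why step 2 of the Execution Table lists the categories as ['blue', 'green', 'red'], and why 'red' gets code 2 even though it appears first in the data.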
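To make the ordering rule concrete, here is a minimal sketch comparing inferred categories with an explicitly ordered list. The variable names `default_cats` and `explicit` are illustrative only and do not come from the lesson.

```python
import pandas as pd

values = ['red', 'blue', 'green', 'blue', 'red']

# Default: categories are inferred from the data and sorted alphabetically.
default_cats = pd.Series(values).astype('category')
print(list(default_cats.cat.categories))  # ['blue', 'green', 'red']
print(default_cats.cat.codes.tolist())    # [2, 0, 1, 0, 2]

# Explicit categories: each code is the position in the list we pass in.
explicit = pd.Series(pd.Categorical(values, categories=['red', 'blue', 'green']))
print(list(explicit.cat.categories))  # ['red', 'blue', 'green']
print(explicit.cat.codes.tolist())    # [0, 1, 2, 1, 0]
```

Passing `categories=` is worth considering when the codes need to follow a specific order rather than the alphabetical one.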
How can I get the original color names back from the codes?
Treat each code as a position in the category labels. For example, code 0 corresponds to 'blue', as shown in step 6 of the Execution Table.
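One possible way to do this mapping in code is sketched below. It reuses the variables from the Execution Sample; `recovered` and `rebuilt` are names introduced here for illustration. The sketch assumes there are no missing values, since pandas marks those with code -1, which a plain positional lookup would not handle correctly.

```python
import pandas as pd

colors = pd.Series(['red', 'blue', 'green', 'blue', 'red'])
cats = colors.astype('category')
codes = cats.cat.codes
labels = cats.cat.categories

# Each code is a position in `labels`, so a positional lookup recovers the names.
# Assumption: no code is -1 (no missing values in the data).
recovered = labels.take(codes.to_numpy()).tolist()
print(recovered)  # ['red', 'blue', 'green', 'blue', 'red']

# Categorical.from_codes rebuilds a categorical from the codes and the labels.
rebuilt = pd.Categorical.from_codes(codes.to_numpy(), categories=labels)
print(list(rebuilt))  # ['red', 'blue', 'green', 'blue', 'red']
```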
Visual Quiz - 3 Questions
Test your understanding
Looking at step 3 of the Execution Table, what is the code for the color 'green'?
A. 1
B. 2
C. 0
D. 3
💡 Hint
Check the codes array in step 3 and the category order in step 2.
At which step do we convert the original Series into categorical data?
A. Step 3
B. Step 1
C. Step 2
D. Step 4
💡 Hint
Look for the action 'Convert to categorical' in the Execution Table.
If a new color 'yellow' is added as a category, what happens to the category codes?
A. All codes shift by one
B. Codes remain the same for existing colors; 'yellow' gets a new code
C. Codes become strings instead of integers
D. Codes reset to zero for all colors
💡 Hint
Think about what happens when a category is appended to the end of the existing category list: do the positions of the existing categories change?
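To check the last question yourself, the sketch below adds 'yellow' with `cat.add_categories`, which appends the new label after the existing ones. The second half is an extra illustration not covered in the lesson: if the categories are re-inferred from data containing a value that sorts earlier (the made-up value 'amber'), the existing codes do change.

```python
import pandas as pd

cats = pd.Series(['red', 'blue', 'green', 'blue', 'red']).astype('category')
before = cats.cat.codes.tolist()

# add_categories appends the new label at the end of the category list.
extended = cats.cat.add_categories(['yellow'])
after = extended.cat.codes.tolist()

print(list(extended.cat.categories))  # ['blue', 'green', 'red', 'yellow']
print(before == after)                # True: existing codes are unchanged

# Re-inferring categories from new data sorts them again,
# so a value that sorts first ('amber') shifts the other codes.
reinferred = pd.Series(['red', 'blue', 'green', 'blue', 'red', 'amber']).astype('category')
print(reinferred.cat.codes.tolist())  # [3, 1, 2, 1, 3, 0]
```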
Concept Snapshot
Category codes and labels in pandas:
- Convert a Series to 'category' dtype.
- Categories are the unique values, sorted by default when pandas infers them.
- Codes are integer positions into the categories, starting at 0.
- Use .cat.codes for codes, .cat.categories for labels.
- Codes help efficient storage and analysis.
Full Transcript
We start by creating a pandas Series with color names. Then, we convert this Series to a categorical type, which organizes unique colors as categories sorted alphabetically. Each category gets an integer code starting from zero. We extract these codes using .cat.codes and the category labels using .cat.categories. These codes represent the original data efficiently and can be mapped back to the labels when needed. This process helps in saving memory and speeding up analysis when working with repeated categorical data.
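The memory claim can be checked with a rough comparison such as the one below. The repetition factor of 10,000 is an arbitrary choice for illustration, and the exact byte counts depend on the pandas version and string storage, so no specific numbers are shown.

```python
import pandas as pd

# Repeat the five colors many times to mimic a column with few distinct values.
colors = pd.Series(['red', 'blue', 'green', 'blue', 'red'] * 10_000)
cats = colors.astype('category')

# deep=True also counts the memory held by the string values themselves.
plain_bytes = colors.memory_usage(deep=True)
cat_bytes = cats.memory_usage(deep=True)
print(plain_bytes, cat_bytes)

# With only three categories, the codes fit in a small integer type.
print(cats.cat.codes.dtype)  # int8
```

The saving comes from storing one small integer per row plus a single copy of each label, so it grows with the amount of repetition in the data.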