Ordered categories in Pandas - Step-by-Step Execution

Concept Flow - Ordered categories
Create categorical data
Define category order
Assign ordered categories
Use categories in analysis
Compare or sort using order
We start with data, define an order for categories, assign this order, then use it to compare or sort data.
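The last step of the flow, sorting by the defined order, can be sketched as follows. This is a minimal illustration that assumes the same 'low' < 'medium' < 'high' order; the names levels, s, sorted_values, and alphabetical are chosen here for the example and do not come from the original lesson.

```python
import pandas as pd

# Ordered categorical with the order low < medium < high
levels = pd.Categorical(
    ["low", "medium", "high", "medium"],
    categories=["low", "medium", "high"],
    ordered=True,
)
s = pd.Series(levels)

# sort_values follows the category order, not alphabetical order
sorted_values = s.sort_values().tolist()

# Plain string sorting, shown for contrast
alphabetical = sorted(s.tolist())

print(sorted_values)   # ['low', 'medium', 'medium', 'high']
print(alphabetical)    # ['high', 'low', 'medium', 'medium']
```

Sorting the plain strings puts 'high' first, which is why the category order matters for data like this.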
Execution Sample
Pandas
import pandas as pd
cats = pd.Categorical(['low', 'medium', 'high', 'medium'],
                      categories=['low', 'medium', 'high'],
                      ordered=True)
print(cats > 'low')
This code creates an ordered categorical variable and compares each value to 'low'.
Execution Table
Step | Action | Input/Condition | Result/Output
1 | Create categorical data | ['low', 'medium', 'high', 'medium'] | Categorical object with values ['low', 'medium', 'high', 'medium']
2 | Define categories and order | categories=['low', 'medium', 'high'], ordered=True | Categories set with order: low < medium < high
3 | Assign ordered categories | Assign to variable 'cats' | cats is ordered categorical with given categories
4 | Compare cats > 'low' | Compare each element to 'low' | [False, True, True, True]
5 | Print result | Output comparison array | [False, True, True, True]
💡 The comparison is done element-wise using the defined order. Execution ends after the print.
Variable Tracker
Variable | Start | After 1 | After 2 | After 3 | After 4 | Final
cats | None | ['low', 'medium', 'high', 'medium'] | Ordered categories set | Assigned to cats variable | Compared to 'low' | Ordered categorical with given categories
Key Moments - 3 Insights
Why does comparing categories like 'cats > "low"' work?
Because 'cats' is an ordered categorical with the defined order low < medium < high, pandas can compare each element to 'low' using this order (see step 4 of the Execution Table).
What happens if the categories are not ordered?
Without ordering, order comparisons like 'cats > "low"' raise a TypeError because pandas cannot determine which category is greater. Equality checks (== and !=) still work. This case is not shown in the Execution Table, but it is important to know.
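A minimal sketch of that failure case, assuming the same data but with ordered=True left out. The names unordered, raised, and equal_mask are illustrative only.

```python
import pandas as pd

# Same values and categories, but no ordered=True
unordered = pd.Categorical(
    ["low", "medium", "high", "medium"],
    categories=["low", "medium", "high"],
)

try:
    unordered > "low"          # order comparison on an unordered categorical
    raised = False
except TypeError as exc:
    raised = True
    print(f"TypeError: {exc}")

# Equality does not depend on order, so it still works
equal_mask = (unordered == "medium").tolist()
print(equal_mask)  # [False, True, False, True]
```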
Why do we need to specify categories explicitly?
Specifying categories and their order ensures pandas knows the full set and the order, even if some categories don't appear in the data (see step 2 of the Execution Table).
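As a small illustration of this point, the sketch below uses hypothetical data in which 'medium' never occurs. The category is still part of the ordered set, and it shows up with a count of zero. The names ratings, codes, and counts are assumptions made for this example.

```python
import pandas as pd

# 'medium' is declared as a category but does not appear in the data
ratings = pd.Categorical(
    ["low", "high", "low"],
    categories=["low", "medium", "high"],
    ordered=True,
)

# The full category set is kept, in the declared order
print(list(ratings.categories))  # ['low', 'medium', 'high']

# Each value is stored as the position of its category
codes = ratings.codes.tolist()
print(codes)  # [0, 2, 0]

# Unobserved categories are still counted, with a count of 0
counts = pd.Series(ratings).value_counts().to_dict()
print(counts)
```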
Visual Quiz - 3 Questions
Test your understanding
Look at step 4 of the Execution Table. What is the result of 'cats > "low"' for the first element?
A. False
B. True
C. Error
D. None
💡 Hint
Check the 'Result/Output' column in step 4 of the Execution Table.
At which step in the Execution Table is the order of categories defined?
A. Step 1
B. Step 3
C. Step 2
D. Step 4
💡 Hint
Look for where 'Categories set with order' is mentioned in the Execution Table.
If we remove 'ordered=True' from the code, what will happen when comparing 'cats > "low"'?
A. Comparison returns all True
B. Raises an error
C. Comparison returns all False
D. Comparison ignores order and works
💡 Hint
Recall the key moment about comparisons without ordering.
Concept Snapshot
Ordered categories in pandas:
- Use pd.Categorical(data, categories=[...], ordered=True)
- Defines a fixed order for categories
- Enables order comparisons (<, >), min/max, and sorting by category order
- Without ordered=True, order comparisons raise a TypeError (== and != still work)
- Useful for data with natural order like ratings
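For data that lives in a DataFrame column, the same idea can be sketched with CategoricalDtype from pandas.api.types. The DataFrame, its column names, and the variable names below are hypothetical and only serve the illustration.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Reusable ordered dtype: low < medium < high
rating_type = CategoricalDtype(categories=["low", "medium", "high"], ordered=True)

# Hypothetical data with a rating column stored as plain strings
df = pd.DataFrame({
    "item": ["a", "b", "c", "d"],
    "rating": ["low", "medium", "high", "medium"],
})
df["rating"] = df["rating"].astype(rating_type)

# Filter rows using an order comparison
above_low = df[df["rating"] > "low"]["item"].tolist()

# min and max follow the category order
lowest = df["rating"].min()
highest = df["rating"].max()

print(above_low)        # ['b', 'c', 'd']
print(lowest, highest)  # low high
```

Defining the dtype once makes it easy to apply the same order to several columns or DataFrames.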
Full Transcript
This example shows how to create ordered categorical data in pandas. We start by creating a categorical variable with values like 'low', 'medium', and 'high'. We explicitly define the categories and set ordered=True to tell pandas the order low < medium < high. This allows us to compare each value against a category, for example checking which values are greater than 'low'. The comparison returns a boolean array that is True for 'medium' and 'high' and False for 'low'. This ordering is important because without it, pandas raises a TypeError on order comparisons such as > or <. Specifying categories also ensures pandas knows all possible categories and their order, even if some are missing from the data. This technique is useful for data science tasks where categories have a natural order, like ratings or sizes.