Adding and removing categories in Pandas - Step-by-Step Execution

Concept Flow - Adding and removing categories
Start with categorical data
  -> Add new category
  -> Check: categories updated?
       No: error or retry
       Yes: continue
  -> Remove category
  -> Check: categories updated?
       No: error or retry
       Yes: continue
  -> End
Start with categorical data, add new categories, verify update, then remove categories and verify again.
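The "error" branch of the flow can be sketched as follows. This is a minimal illustration, assuming a reasonably recent pandas version in which add_categories rejects a category that already exists and remove_categories rejects one that is not present, both with a ValueError. The names add_error and remove_error and the category 'z' are made up for the example.

```python
import pandas as pd

cat = pd.Categorical(['a', 'b', 'a'], categories=['a', 'b'])

# Adding a category that already exists is rejected.
try:
    cat.add_categories(['a'])
    add_error = None
except ValueError as exc:
    add_error = exc

# Removing a category that is not in the list is rejected too
# ('z' is a hypothetical category used only for this example).
try:
    cat.remove_categories(['z'])
    remove_error = None
except ValueError as exc:
    remove_error = exc

print(type(add_error).__name__, type(remove_error).__name__)
print(list(cat.categories))  # the failed calls left the categories as they were
```

Because both calls fail before returning anything, the original categorical is left unchanged and the operation can be retried with a corrected list.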
Execution Sample
Pandas
import pandas as pd
cat = pd.Categorical(['a', 'b', 'a'], categories=['a', 'b'])
cat = cat.add_categories(['c'])
cat = cat.remove_categories(['b'])
print(cat.categories)
Create a categorical variable, add a new category 'c', remove category 'b', then print remaining categories.
Execution Table
Step | Action | Categories Before | Categories After | Notes
1 | Create categorical with categories ['a', 'b'] | [] | ['a', 'b'] | Initial categories set
2 | Add category 'c' | ['a', 'b'] | ['a', 'b', 'c'] | New category 'c' added
3 | Remove category 'b' | ['a', 'b', 'c'] | ['a', 'c'] | Category 'b' removed
4 | Print categories | ['a', 'c'] | ['a', 'c'] | Final categories displayed
5 | End | ['a', 'c'] | ['a', 'c'] | No more steps
💡 All categories updated correctly; execution ends.
Variable Tracker
Variable | Start | After Step 1 | After Step 2 | After Step 3 | Final
cat.categories | [] | ['a', 'b'] | ['a', 'b', 'c'] | ['a', 'c'] | ['a', 'c']
Key Moments - 2 Insights
Why does the category list change after adding a new category?
Because add_categories returns a new Categorical with the new category appended to the existing list. The existing data values are not changed, as shown in step 2 of the execution_table.
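A small sketch of this, using the same data as the execution sample. The variable name added is introduced here only for illustration.

```python
import pandas as pd

cat = pd.Categorical(['a', 'b', 'a'], categories=['a', 'b'])
added = cat.add_categories(['c'])  # returns a new Categorical

print(list(cat.categories))    # the original object still has ['a', 'b']
print(list(added.categories))  # the new object has ['a', 'b', 'c']
print(list(added))             # the values are still ['a', 'b', 'a']
print(added.codes.tolist())    # the integer codes are still [0, 1, 0]
```

The new category exists in the list but no value uses it yet, which is why the sample reassigns the result back to cat.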
What happens to data points with a removed category?
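Data points with the removed category become NaN (missing), because their category no longer exists. The execution_table tracks only the category list, so this is not shown there, but it happens at step 3: the value 'b' in the sample data becomes NaN.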
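The effect on the data can be checked directly. This sketch starts from the state after step 2 of the execution_table; the variable name removed is introduced here only for illustration.

```python
import pandas as pd

cat = pd.Categorical(['a', 'b', 'a'], categories=['a', 'b', 'c'])
removed = cat.remove_categories(['b'])

print(list(removed.categories))   # ['a', 'c']
print(removed.codes.tolist())     # [0, -1, 0]; -1 marks a missing value
print(pd.isna(removed).tolist())  # [False, True, False]
```

The middle value, which used to be 'b', is now missing, while the two 'a' values are unaffected.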
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table, what are the categories after step 2?
A. ['a', 'c']
B. ['a', 'b']
C. ['a', 'b', 'c']
D. ['b', 'c']
💡 Hint
Check the 'Categories After' column for step 2 in the execution_table.
At which step is the category 'b' removed?
A. Step 1
B. Step 3
C. Step 2
D. Step 4
💡 Hint
Look at the 'Action' column in the execution_table to find when 'b' is removed.
If we add category 'd' after step 3, what will be the categories?
A. ['a', 'c', 'd']
B. ['a', 'b', 'c', 'd']
C. ['a', 'b', 'd']
D. ['a', 'c']
💡 Hint
After step 3, categories are ['a', 'c'], adding 'd' appends it.
Concept Snapshot
pandas Categorical categories can be changed.
Use add_categories(['new_cat']) to add.
Use remove_categories(['old_cat']) to remove.
Removing a category sets the data points that used it to NaN.
Always check categories after changes.
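The snapshot above uses a standalone Categorical. As a hedged sketch, the same steps can be applied to a Series with a categorical dtype through the .cat accessor. The Series and the variable name s are assumptions added for illustration and are not part of the original sample.

```python
import pandas as pd

# A Series with categorical dtype; categories are inferred as ['a', 'b'].
s = pd.Series(['a', 'b', 'a'], dtype='category')

s = s.cat.add_categories(['c'])     # categories: ['a', 'b', 'c']
s = s.cat.remove_categories(['b'])  # categories: ['a', 'c']

# Check the categories and the data after the changes.
print(list(s.cat.categories))  # ['a', 'c']
print(s.isna().tolist())       # [False, True, False]
```

Checking both the category list and the missing values after each change catches the case where a removal silently turns data into NaN.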
Full Transcript
We start with a pandas Categorical variable with categories 'a' and 'b'. Then we add a new category 'c' using add_categories, which updates the categories list to include 'c'. Next, we remove the category 'b' using remove_categories, which removes 'b' from the categories list. Finally, we print the categories to see the updated list, which is now ['a', 'c']. This process shows how to add and remove categories step-by-step in pandas.