
Converting to categorical in Pandas - Cheat Sheet & Quick Revision

Recall & Review
beginner
What does converting a column to categorical type in pandas do?
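It changes the column's dtype to 'category': pandas stores each distinct value once and represents every row as a small integer code that points to it. When values repeat often, this uses less memory and can speed up operations such as grouping.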
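To make the idea of "categories instead of raw values" concrete, here is a minimal sketch that looks inside a categorical Series. The sample data and variable names are illustrative assumptions, not part of the original card.

```python
import pandas as pd

# Illustrative data: a short column with repeated labels.
colors = pd.Series(["red", "blue", "red", "green", "blue", "red"])
cat = colors.astype("category")

# The distinct labels are stored once (sorted when the values are sortable).
print(cat.cat.categories.tolist())  # ['blue', 'green', 'red']

# Each row holds a small integer code that indexes into the categories.
print(cat.cat.codes.tolist())       # [2, 0, 2, 1, 0, 2]
```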
beginner
How do you convert a pandas DataFrame column named 'color' to categorical?
Use df['color'] = df['color'].astype('category') to convert the 'color' column to categorical type.
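A short runnable version of that one-liner might look like the following. The DataFrame contents (and the extra 'qty' column) are made up for illustration.

```python
import pandas as pd

# Hypothetical example frame; only the 'color' column is converted.
df = pd.DataFrame({
    "color": ["red", "blue", "green", "blue"],
    "qty": [1, 2, 3, 4],
})

df["color"] = df["color"].astype("category")

print(df["color"].dtype)     # category
print(df["color"].tolist())  # the visible values are unchanged
```

The conversion changes only how the column is stored; the values you see when printing the DataFrame stay the same.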
beginner
Why might you want to convert a text column with repeated values to categorical?
Because it saves memory and can make filtering and grouping faster, just like organizing similar items into labeled boxes instead of keeping them loose.
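One way to check the memory claim on your own data is to compare memory_usage(deep=True) before and after the conversion. The sketch below assumes a made-up column of three city names repeated many times; exact byte counts depend on the pandas version, so none are shown.

```python
import pandas as pd

# Illustrative column: 3 distinct labels repeated across 30,000 rows.
cities = pd.Series(["London", "Paris", "Tokyo"] * 10_000)

as_text = cities.memory_usage(deep=True)
as_cat = cities.astype("category").memory_usage(deep=True)

print(f"text column:        {as_text:,} bytes")
print(f"categorical column: {as_cat:,} bytes")
```

The saving comes from the repetition. If almost every value in a column is unique, the categorical version can end up no smaller, or even larger, than the original.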
intermediate
What is the difference between ordered and unordered categorical data?
Ordered categorical data has a meaningful order (like small < medium < large), while unordered categorical data has no order (like red, blue, green).
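The practical difference shows up in comparisons. In this sketch (sample values are assumptions for illustration), the ordered sizes support < and >, while the unordered colors only support equality checks.

```python
import pandas as pd

order = ["small", "medium", "large"]
sizes = pd.Series(pd.Categorical(["small", "large", "medium"],
                                 categories=order, ordered=True))
colors = pd.Series(pd.Categorical(["red", "blue", "green"]))  # unordered

# Ordered: comparisons follow the declared order, not alphabetical order.
print((sizes > "small").tolist())  # [False, True, True]
print(sizes.min(), sizes.max())    # small large

# Unordered: only == and != are meaningful; > raises a TypeError.
try:
    colors > "blue"
except TypeError as exc:
    print("unordered comparison failed:", exc)
```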
intermediate
How can you create an ordered categorical column in pandas?
Use df['size'] = pd.Categorical(df['size'], categories=['small', 'medium', 'large'], ordered=True) to create an ordered categorical column.
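Here is a small sketch of that call in context, using a made-up 'size' column. It also shows two things worth knowing: sorting follows the category order you declared, and any value missing from the categories list becomes NaN.

```python
import pandas as pd

order = ["small", "medium", "large"]

# Hypothetical example frame.
df = pd.DataFrame({"size": ["large", "small", "medium", "small"]})
df["size"] = pd.Categorical(df["size"], categories=order, ordered=True)

# Sorting uses the declared order rather than alphabetical order.
print(df.sort_values("size")["size"].tolist())
# ['small', 'small', 'medium', 'large']

# A value that is not listed in `categories` ('xl' here) turns into NaN.
with_unknown = pd.Series(pd.Categorical(["small", "xl"],
                                        categories=order, ordered=True))
print(with_unknown.isna().tolist())  # [False, True]
```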
What is the main benefit of converting a column to categorical in pandas?
A. Saves memory and speeds up operations
B. Changes numbers to strings
C. Deletes duplicate rows
D. Sorts the data automatically
Which pandas method converts a column to categorical type?
A. df['col'].fillna()
B. df['col'].to_string()
C. df['col'].sort_values()
D. df['col'].astype('category')
What does ordered=True do when creating a categorical column?
A. Sorts the DataFrame
B. Defines a meaningful order for categories
C. Converts numbers to strings
D. Removes duplicates
Which of these is NOT a reason to use categorical data?
A. Automatically fix missing data
B. Reduce memory usage
C. Speed up filtering and grouping
D. Represent repeated text values efficiently
If you have a column with values 'red', 'blue', 'green', what type is best to convert it to?
A. Float
B. Integer
C. Categorical
D. Boolean
Explain why and how you would convert a text column with repeated values to categorical in pandas.
Think about how repeated labels can be stored more efficiently.
Describe the difference between ordered and unordered categorical data and give an example of each.
Consider sizes for ordered and colors for unordered.