Data Analysis Python · ~30 mins

Encoding categorical variables in Data Analysis Python - Mini Project: Build & Apply

Encoding categorical variables
📖 Scenario: You work in a small online store. You have a list of products with their categories. You want to prepare this data for a computer program that only understands numbers, not words.
🎯 Goal: You will convert the product categories from words into numbers using encoding. This helps computers understand and analyze the data better.
📋 What You'll Learn
Create a dictionary with product names as keys and their categories as values
Create a list of unique categories
Create a dictionary that assigns a unique number to each category
Create a new dictionary with products and their encoded category numbers
Print the new encoded dictionary
💡 Why This Matters
🌍 Real World
Stores and other businesses often record product categories as text labels. Encoding those labels as numbers is a common preparation step for recommendation systems and sales analysis.
💼 Career
Data scientists and analysts frequently encode categorical data to prepare datasets for machine learning models.
1
Create the product data dictionary
Create a dictionary called products with these exact entries: 'T-shirt': 'Clothing', 'Jeans': 'Clothing', 'Coffee Mug': 'Kitchen', 'Notebook': 'Stationery', 'Pen': 'Stationery'.
Hint

Use curly braces {} to create a dictionary. Each entry has a product name as key and category as value.
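
A minimal sketch of this step, using exactly the entries listed above. The final print line is only there to check the result and is not part of the task.

```python
# Product name -> category, using the exact entries from the step description
products = {
    'T-shirt': 'Clothing',
    'Jeans': 'Clothing',
    'Coffee Mug': 'Kitchen',
    'Notebook': 'Stationery',
    'Pen': 'Stationery',
}

print(len(products))  # 5
```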

2
Create a list of unique categories
Create a list called categories that contains the unique category names from the products dictionary values.
Hint

Use products.values() to get all categories, pass them to set() to keep only the unique ones, and convert the result back to a list with list().
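
One way this step might look is sketched below. A set has no guaranteed order, so the order of categories can differ between runs. The sorted_categories variable is an extra added here for illustration, not part of the task, and shows how sorted() could give a reproducible order.

```python
products = {
    'T-shirt': 'Clothing',
    'Jeans': 'Clothing',
    'Coffee Mug': 'Kitchen',
    'Notebook': 'Stationery',
    'Pen': 'Stationery',
}

# set() drops the duplicate category names; list() turns the set back into a list
categories = list(set(products.values()))

# Optional extra (not required by the task): a sorted copy with a stable order
sorted_categories = sorted(set(products.values()))
print(sorted_categories)  # ['Clothing', 'Kitchen', 'Stationery']
```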

3
Create the encoding dictionary
Create a dictionary called category_encoding that assigns a unique number to each category in the categories list. Use a for loop over enumerate(categories) with the loop variables index and category.
Hint

Use enumerate() to get each index and category, then store index as the value for category in category_encoding.
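
A short sketch of the loop described in this step. To keep the output predictable, it assumes categories is already in a fixed order. With list(set(...)) the numbers could be assigned differently on each run.

```python
# Assumed fixed order, for illustration only
categories = ['Clothing', 'Kitchen', 'Stationery']

category_encoding = {}
for index, category in enumerate(categories):
    # enumerate() yields (0, 'Clothing'), (1, 'Kitchen'), (2, 'Stationery')
    category_encoding[category] = index

print(category_encoding)  # {'Clothing': 0, 'Kitchen': 1, 'Stationery': 2}
```

Mapping each category to an arbitrary integer like this is usually called label encoding. The numbers are identifiers only and carry no ranking.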

4
Create and print the encoded product categories
Create a new dictionary called encoded_products where each product name from products is a key and its value is the encoded number from category_encoding. Use a for loop with variables product and category iterating over products.items(). Then print encoded_products.
Hint

Use a for loop over products.items() to build encoded_products. Then print it.
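
Putting the final step together might look like the sketch below. The category_encoding values are assumed here so the example runs on its own. In the project they come from step 3 and may be numbered differently.

```python
products = {
    'T-shirt': 'Clothing',
    'Jeans': 'Clothing',
    'Coffee Mug': 'Kitchen',
    'Notebook': 'Stationery',
    'Pen': 'Stationery',
}

# Assumed encoding, for illustration only
category_encoding = {'Clothing': 0, 'Kitchen': 1, 'Stationery': 2}

encoded_products = {}
for product, category in products.items():
    # Look up the number assigned to this product's category
    encoded_products[product] = category_encoding[category]

print(encoded_products)
# {'T-shirt': 0, 'Jeans': 0, 'Coffee Mug': 1, 'Notebook': 2, 'Pen': 2}
```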