Data Analysis Python · data · ~15 mins

Label encoding in Data Analysis Python - Mini Project: Build & Apply

Label Encoding for Categorical Data

📖 Scenario: You work at a company that collects customer feedback. One column records customer satisfaction as the words 'Low', 'Medium', or 'High'. You want to convert these words into numbers so that analysis tools and machine learning libraries can process them.

🎯 Goal: Learn how to convert categorical text data into numbers using label encoding in Python. You will create a list of satisfaction levels, set up a label encoder, apply it to the data, and print the encoded results.

📋 What You'll Learn

Create a list of categorical data with exact values

Create a label encoder object

Use the label encoder to transform the list

Print the encoded numeric values

💡 Why This Matters

🌍 Real World

Label encoding is used in data science to convert words or categories into numbers so computers can analyze data like customer feedback, survey answers, or product categories.

💼 Career

Data analysts and data scientists often need to prepare data by encoding categories before building machine learning models or performing statistical analysis.
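To make the idea concrete, here is a minimal sketch of what label encoding does, written in plain Python with no libraries. The answers list is hypothetical illustration data, and assigning integers by sorted order is an assumption chosen to mirror how scikit-learn's LabelEncoder orders its classes.

```python
# Hypothetical survey answers, used only for illustration
answers = ['Yes', 'No', 'Maybe', 'No']

# Assign each distinct category an integer based on its sorted position
categories = sorted(set(answers))  # ['Maybe', 'No', 'Yes']
mapping = {category: index for index, category in enumerate(categories)}

# Replace every answer with its integer code
encoded = [mapping[answer] for answer in answers]

print(mapping)  # {'Maybe': 0, 'No': 1, 'Yes': 2}
print(encoded)  # [2, 1, 0, 1]
```

Each distinct word gets exactly one integer, and repeated words share the same code.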

Create the list of satisfaction levels

Create a list called satisfaction_levels with these exact values, in this order: 'Low', 'Medium', 'High', 'Medium', 'Low'.

# Create the list satisfaction_levels with the exact values
# Your code here

Hint

Use square brackets to create a list and separate values with commas.
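As a small illustration of the list syntax, the sketch below builds a different list and checks its contents. The sizes list is hypothetical and is not the exercise answer.

```python
from collections import Counter

# Hypothetical example data (not the exercise answer)
sizes = ['S', 'M', 'L', 'M']

print(len(sizes))           # 4 items in total
print(sorted(set(sizes)))   # ['L', 'M', 'S'] distinct categories
print(Counter(sizes)['M'])  # 2 occurrences of 'M'
```

Checking the distinct values before encoding is a quick way to catch typos such as 'medium' versus 'Medium', which would otherwise become separate categories.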

Import and create the label encoder

Import LabelEncoder from sklearn.preprocessing and create a variable called encoder that holds a new LabelEncoder() instance.

satisfaction_levels = ['Low', 'Medium', 'High', 'Medium', 'Low']
# Import LabelEncoder and create encoder
# Your code here

Hint

Use from sklearn.preprocessing import LabelEncoder to import the class.

Apply label encoding to the list

Use the encoder variable to fit and transform the satisfaction_levels list into numeric labels. Store the result in a variable called encoded_labels.

from sklearn.preprocessing import LabelEncoder

satisfaction_levels = ['Low', 'Medium', 'High', 'Medium', 'Low']
encoder = LabelEncoder()
# Encode the satisfaction_levels list and save to encoded_labels
# Your code here

Hint

Use the fit_transform() method on the encoder with the list as input.
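The sketch below shows how fit_transform() relates to calling fit() and transform() separately. The colors list is hypothetical illustration data, not the exercise answer, and the example assumes scikit-learn is installed.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical example data (not the exercise answer)
colors = ['red', 'green', 'blue', 'green']

# One step: learn the categories and encode them in a single call
one_step = LabelEncoder().fit_transform(colors)

# Two steps: fit() learns the categories, transform() applies the codes
two_step_encoder = LabelEncoder()
two_step_encoder.fit(colors)
two_step = two_step_encoder.transform(colors)

print(one_step.tolist())  # [2, 1, 0, 1]
print(two_step.tolist())  # [2, 1, 0, 1]
```

Both approaches return a NumPy array of integers. The two-step form is useful when the same fitted encoder has to encode new data later.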

Print the encoded numeric labels

Print the variable encoded_labels to display the numeric values of the satisfaction levels.

from sklearn.preprocessing import LabelEncoder

satisfaction_levels = ['Low', 'Medium', 'High', 'Medium', 'Low']
encoder = LabelEncoder()
encoded_labels = encoder.fit_transform(satisfaction_levels)
# Print the encoded_labels
# Your code here

Hint

Use print(encoded_labels) to show the result.
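Once the steps are complete, it can help to inspect which number was given to which word. The sketch below assumes scikit-learn is installed and uses the encoder's classes_ attribute and inverse_transform() method. Note that LabelEncoder assigns codes in sorted (alphabetical) order, so 'High' becomes 0 rather than the largest value. The order dictionary at the end is a hypothetical alternative for cases where the Low < Medium < High ranking needs to be preserved; it is not part of the exercise.

```python
from sklearn.preprocessing import LabelEncoder

satisfaction_levels = ['Low', 'Medium', 'High', 'Medium', 'Low']
encoder = LabelEncoder()
encoded_labels = encoder.fit_transform(satisfaction_levels)

print(encoded_labels)             # [1 2 0 2 1]
print(encoder.classes_.tolist())  # ['High', 'Low', 'Medium'] (index = code)

# Convert the numbers back into the original words
decoded = encoder.inverse_transform(encoded_labels).tolist()
print(decoded)  # ['Low', 'Medium', 'High', 'Medium', 'Low']

# Hypothetical alternative: an explicit mapping that keeps the ranking
order = {'Low': 0, 'Medium': 1, 'High': 2}
ordinal_labels = [order[level] for level in satisfaction_levels]
print(ordinal_labels)  # [0, 1, 2, 1, 0]
```

If the numbers will feed a model or statistic that treats larger values as "more", the explicit mapping is usually the safer choice for ranked categories like these.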