Pandas · data · ~30 mins

Dropping columns and rows in Pandas - Mini Project: Build & Apply

Dropping columns and rows
📖 Scenario: You work as a data analyst for a small online store. You have a table of sales data, but some columns and rows are not needed for your analysis.
🎯 Goal: You will learn how to remove unwanted columns and rows from a table using pandas. This helps clean your data and focus on what matters.
📋 What You'll Learn
Create a pandas DataFrame with given sales data
Create a list of columns to drop
Drop specified columns from the DataFrame
Drop specified rows from the DataFrame
Print the final cleaned DataFrame
💡 Why This Matters
🌍 Real World
Cleaning data by removing unnecessary columns and rows is a common step before analysis or visualization.
💼 Career
Data analysts and data scientists often need to clean datasets to focus on relevant information and improve model accuracy.
Step 1: Create the sales data DataFrame
Create a pandas DataFrame called sales with these exact columns and rows:
'OrderID': [101, 102, 103, 104, 105],
'Product': ['Shirt', 'Pants', 'Hat', 'Shoes', 'Socks'],
'Quantity': [2, 1, 4, 1, 3],
'Price': [20, 40, 15, 60, 5],
'Discount': [0, 5, 0, 10, 0]
Hint: Use pd.DataFrame with a dictionary that maps each column name to a list of values.
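
One way Step 1 might look, as a minimal sketch that assumes pandas is installed and imported under the usual alias pd:

```python
import pandas as pd

# Each dictionary key becomes a column name; each list holds that column's values.
sales = pd.DataFrame({
    'OrderID': [101, 102, 103, 104, 105],
    'Product': ['Shirt', 'Pants', 'Hat', 'Shoes', 'Socks'],
    'Quantity': [2, 1, 4, 1, 3],
    'Price': [20, 40, 15, 60, 5],
    'Discount': [0, 5, 0, 10, 0],
})
```

Because no index is passed, pandas assigns the default row labels 0 through 4.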

Step 2: Create a list of columns to drop
Create a list called drop_columns containing the column names 'Discount' and 'Price' that you want to remove from the DataFrame.
Hint: Use a list with the exact column names to drop.
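
A possible sketch of Step 2. The sanity check is optional and not part of the exercise; the names sales_columns and missing are illustrative only, and the column names are copied from Step 1:

```python
# Names must match the DataFrame's column labels exactly (they are case-sensitive).
drop_columns = ['Discount', 'Price']

# Optional check (illustrative): confirm every name exists among the Step 1 columns.
sales_columns = ['OrderID', 'Product', 'Quantity', 'Price', 'Discount']
missing = [name for name in drop_columns if name not in sales_columns]
print(missing)  # prints []
```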

Step 3: Drop the specified columns and rows
Create a new DataFrame called cleaned_sales by dropping the columns in drop_columns from sales. Then drop the rows with index labels 1 and 3 from cleaned_sales.
Hint: Use drop(columns=...) to remove columns and drop(index=[...]) to remove rows.
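
Step 3 could be sketched as follows. The block rebuilds sales and drop_columns from the earlier steps so that it runs on its own:

```python
import pandas as pd

sales = pd.DataFrame({
    'OrderID': [101, 102, 103, 104, 105],
    'Product': ['Shirt', 'Pants', 'Hat', 'Shoes', 'Socks'],
    'Quantity': [2, 1, 4, 1, 3],
    'Price': [20, 40, 15, 60, 5],
    'Discount': [0, 5, 0, 10, 0],
})
drop_columns = ['Discount', 'Price']

# drop() returns a new DataFrame by default, so sales itself is left unchanged.
cleaned_sales = sales.drop(columns=drop_columns)

# index=[1, 3] refers to row labels; here they are the rows for orders 102 and 104.
cleaned_sales = cleaned_sales.drop(index=[1, 3])
```

Note that drop works on labels, not positions: the remaining rows keep their original labels 0, 2, and 4.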

Step 4: Print the cleaned DataFrame
Print the cleaned_sales DataFrame to see the final table after dropping columns and rows.
Hint: Use print(cleaned_sales) to display the DataFrame.
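
Putting the four steps together, the whole project might look like this sketch. It chains the two drop calls for brevity, which should be equivalent to doing them one after the other:

```python
import pandas as pd

sales = pd.DataFrame({
    'OrderID': [101, 102, 103, 104, 105],
    'Product': ['Shirt', 'Pants', 'Hat', 'Shoes', 'Socks'],
    'Quantity': [2, 1, 4, 1, 3],
    'Price': [20, 40, 15, 60, 5],
    'Discount': [0, 5, 0, 10, 0],
})
drop_columns = ['Discount', 'Price']

# Remove the unwanted columns first, then the rows labelled 1 and 3.
cleaned_sales = sales.drop(columns=drop_columns).drop(index=[1, 3])

# Displays the three remaining rows with the OrderID, Product, and Quantity columns.
print(cleaned_sales)
```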