
describe() for statistics in Data Analysis Python - Mini Project: Build & Apply

Using describe() for Statistics
📖 Scenario: You work in a small bakery. You have sales data for different types of bread sold each day. You want to understand the basic statistics of daily sales to plan better.
🎯 Goal: Learn how to use the describe() method in pandas to get quick statistics for your sales data: count, mean, standard deviation, min, max, and quartiles (the median appears as the 50% row).
📋 What You'll Learn
Create a pandas DataFrame with daily sales data
Use the describe() method to get summary statistics
Print the summary statistics
💡 Why This Matters
🌍 Real World
Businesses often need to understand their sales or production data quickly to make decisions. Calling describe() gives an overview of the key statistics in one step.
💼 Career
Data analysts and scientists use describe() to summarize datasets before deeper analysis or reporting.
1
Create the sales data DataFrame
Import pandas as pd and create a DataFrame called sales_data with these exact columns and values:
'Day': ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'],
'Bread': [30, 45, 50, 40, 60],
'Pastry': [20, 25, 30, 22, 28]
Need a hint?

Use pd.DataFrame() with a dictionary where keys are column names and values are lists of data.
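
One way this step could look, as a minimal sketch that uses the column names and values listed above. The shape check at the end is an extra sanity check, not something the step requires.

```python
import pandas as pd

# Dictionary keys become column names; each list holds one column's values.
sales_data = pd.DataFrame({
    'Day': ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'],
    'Bread': [30, 45, 50, 40, 60],
    'Pastry': [20, 25, 30, 22, 28],
})

# Optional sanity check: 5 rows (days) and 3 columns.
print(sales_data.shape)
```

All three lists must have the same length, otherwise pd.DataFrame() raises a ValueError.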

2
Set the Day column as index
Create a new DataFrame called sales_indexed by setting the 'Day' column as the index of sales_data using the set_index() method.
Need a hint?

Use set_index('Day') on the DataFrame to make 'Day' the index.
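
A short sketch of this step, with the DataFrame from step 1 rebuilt so the snippet runs on its own. By default set_index() returns a new DataFrame and leaves the original unchanged, which is why the result is assigned to a new name.

```python
import pandas as pd

# Same data as in step 1, repeated so this snippet is self-contained.
sales_data = pd.DataFrame({
    'Day': ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'],
    'Bread': [30, 45, 50, 40, 60],
    'Pastry': [20, 25, 30, 22, 28],
})

# Move 'Day' out of the columns and use it as the row labels.
sales_indexed = sales_data.set_index('Day')

print(sales_indexed.index.tolist())    # the five day names
print(sales_indexed.columns.tolist())  # only the numeric columns remain
```

With the days as the index, only the numeric columns Bread and Pastry remain as data columns, and rows can be looked up by day name, for example sales_indexed.loc['Friday'].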

3
Get summary statistics with describe()
Use the describe() method on sales_indexed and save the result to a variable called summary_stats.
Need a hint?

Call describe() on the DataFrame to get statistics like mean, min, max, and quartiles.
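
A minimal sketch of this step, again rebuilding the data so it runs standalone. For numeric columns, describe() returns a DataFrame whose rows are count, mean, std, min, 25%, 50%, 75%, and max; the std row is the sample standard deviation.

```python
import pandas as pd

# Data from steps 1 and 2, repeated so this snippet is self-contained.
sales_data = pd.DataFrame({
    'Day': ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'],
    'Bread': [30, 45, 50, 40, 60],
    'Pastry': [20, 25, 30, 22, 28],
})
sales_indexed = sales_data.set_index('Day')

# One column of statistics per numeric column in sales_indexed.
summary_stats = sales_indexed.describe()

# The row labels name each statistic; '50%' is the median.
print(summary_stats.index.tolist())
```

For this data, the Bread column has a mean of 45 and a median of 45, and the Pastry column has a mean of 25 and a median of 25.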

4
Print the summary statistics
Print the variable summary_stats to display the statistics of the sales data.
Need a hint?

Use print(summary_stats) to show the statistics.
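
Putting the four steps together, the whole mini project could be sketched as follows. The rounding and the single-value lookup at the end are optional extras beyond what the step asks for, shown only to illustrate that summary_stats is an ordinary DataFrame.

```python
import pandas as pd

# Step 1: create the sales data.
sales_data = pd.DataFrame({
    'Day': ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'],
    'Bread': [30, 45, 50, 40, 60],
    'Pastry': [20, 25, 30, 22, 28],
})

# Step 2: use the day names as row labels.
sales_indexed = sales_data.set_index('Day')

# Step 3: compute the summary statistics.
summary_stats = sales_indexed.describe()

# Step 4: print the full table of statistics.
print(summary_stats)

# Optional: round for readability, or read one statistic by row and column.
print(summary_stats.round(2))
best_bread_day_sales = summary_stats.loc['max', 'Bread']
print(best_bread_day_sales)
```

Because the result is a DataFrame, any statistic can be pulled out with .loc using the statistic name as the row label and the product name as the column label.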