
Box plots in Pandas - Step-by-Step Execution

Concept Flow - Box plots
Start with DataFrame
Select numeric column
Calculate quartiles and median
Calculate whiskers (min/max within 1.5*IQR)
Identify outliers
Draw box (Q1 to Q3) and median line
Draw whiskers and plot outliers
Display box plot visualization
The flow starts from data, calculates key statistics, identifies outliers, and then draws the box plot step-by-step.
Execution Sample
import pandas as pd
import matplotlib.pyplot as plt

data = {'Scores': [55, 67, 45, 70, 90, 85, 100, 40, 60, 75]}
df = pd.DataFrame(data)
df.boxplot(column='Scores')
plt.show()
This code creates a box plot for the 'Scores' column in the DataFrame.
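To inspect the numbers behind the plot without opening a window, one option is `matplotlib.cbook.boxplot_stats`, the helper matplotlib uses to compute box plot statistics. The sketch below assumes matplotlib is installed and reuses the same scores; the loop over selected keys is only for illustration.

```python
from matplotlib import cbook

scores = [55, 67, 45, 70, 90, 85, 100, 40, 60, 75]

# boxplot_stats returns one dict per dataset; whis=1.5 is the default whisker rule
stats = cbook.boxplot_stats(scores, whis=1.5)[0]

# Print the values that determine the box, whiskers, and outliers
for key in ("q1", "med", "q3", "iqr", "whislo", "whishi", "fliers"):
    print(key, stats[key])
```

Here `whislo` and `whishi` are the whisker ends, and `fliers` holds any points that would be drawn as outliers.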
Execution Table
| Step | Action | Calculation/Condition | Result/Output |
|---|---|---|---|
| 1 | Start with DataFrame | DataFrame with 'Scores' column | [55, 67, 45, 70, 90, 85, 100, 40, 60, 75] |
| 2 | Calculate Q1 (25th percentile) | Q1 = 56.25 | 56.25 |
| 3 | Calculate Median (50th percentile) | Median = (67 + 70) / 2 = 68.5 | 68.5 |
| 4 | Calculate Q3 (75th percentile) | Q3 = 82.5 | 82.5 |
| 5 | Calculate IQR | IQR = Q3 - Q1 = 26.25 | 26.25 |
| 6 | Calculate lower whisker limit | Lower limit = Q1 - 1.5*IQR = 56.25 - 39.375 = 16.875 | 16.875 |
| 7 | Calculate upper whisker limit | Upper limit = Q3 + 1.5*IQR = 82.5 + 39.375 = 121.875 | 121.875 |
| 8 | Identify whiskers | Min data >= 16.875 is 40, Max data <= 121.875 is 100 | Whiskers at 40 and 100 |
| 9 | Identify outliers | Values < 16.875 or > 121.875? None | No outliers |
| 10 | Draw box from Q1 to Q3 | Box from 56.25 to 82.5 | Box drawn |
| 11 | Draw median line | Median at 68.5 | Median line drawn |
| 12 | Draw whiskers | Whiskers at 40 and 100 | Whiskers drawn |
| 13 | Plot outliers | No outliers | No outliers plotted |
| 14 | Display plot | Show box plot | Box plot displayed |
💡 All steps completed, box plot displayed successfully.
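The calculation steps can also be carried out by hand in pandas as a check. This is a minimal sketch, assuming linear interpolation between data points for the quartiles (the default of `Series.quantile`); the variable names are illustrative and not part of the original example.

```python
import pandas as pd

scores = pd.Series([55, 67, 45, 70, 90, 85, 100, 40, 60, 75], name="Scores")

# Quartiles and median (linear interpolation is the pandas default)
q1 = scores.quantile(0.25)
median = scores.quantile(0.50)
q3 = scores.quantile(0.75)
iqr = q3 - q1

# Whisker limits from the 1.5*IQR rule
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Whiskers stop at the most extreme data points inside the limits
inside = scores[(scores >= lower_limit) & (scores <= upper_limit)]
lower_whisker = inside.min()
upper_whisker = inside.max()

# Anything outside the limits would be drawn as an outlier
outliers = scores[(scores < lower_limit) | (scores > upper_limit)].tolist()

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"limits=({lower_limit}, {upper_limit})")
print(f"whiskers=({lower_whisker}, {upper_whisker}), outliers={outliers}")
```

With ten values sorted as 40, 45, 55, 60, 67, 70, 75, 85, 90, 100, the median is the average of the fifth and sixth values, (67 + 70) / 2 = 68.5.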
Variable Tracker
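| Variable | Start | After Step 2 | After Step 3 | After Step 4 | After Step 5 | After Step 6 | After Step 7 | After Step 8 | After Step 9 | Final |
|---|---|---|---|---|---|---|---|---|---|---|
| Q1 | N/A | 56.25 | 56.25 | 56.25 | 56.25 | 56.25 | 56.25 | 56.25 | 56.25 | 56.25 |
| Median | N/A | N/A | 68.5 | 68.5 | 68.5 | 68.5 | 68.5 | 68.5 | 68.5 | 68.5 |
| Q3 | N/A | N/A | N/A | 82.5 | 82.5 | 82.5 | 82.5 | 82.5 | 82.5 | 82.5 |
| IQR | N/A | N/A | N/A | N/A | 26.25 | 26.25 | 26.25 | 26.25 | 26.25 | 26.25 |
| Lower whisker limit | N/A | N/A | N/A | N/A | N/A | 16.875 | 16.875 | 16.875 | 16.875 | 16.875 |
| Upper whisker limit | N/A | N/A | N/A | N/A | N/A | N/A | 121.875 | 121.875 | 121.875 | 121.875 |
| Lower whisker | N/A | N/A | N/A | N/A | N/A | N/A | N/A | 40 | 40 | 40 |
| Upper whisker | N/A | N/A | N/A | N/A | N/A | N/A | N/A | 100 | 100 | 100 |
| Outliers | N/A | N/A | N/A | N/A | N/A | N/A | N/A | None | None | None |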
Key Moments - 3 Insights
Why is the lower whisker at 40 and not at the lower limit 16.875?
The lower whisker is the smallest data point greater than or equal to the lower limit (Q1 - 1.5*IQR). Since 40 is the minimum and it is above 16.875, the whisker is at 40 (see execution_table step 8).
Why are there no outliers plotted even though 40 seems far from Q1?
Outliers are points outside the whisker limits. Since 40 is above the lower limit 16.875, it is not an outlier (see execution_table step 9).
What does the box represent in the box plot?
The box shows the middle 50% of data between Q1 and Q3, with a line at the median (see execution_table steps 10 and 11).
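To see what changes when an outlier is present, the sketch below applies the same 1.5*IQR rule to the original scores and to a copy with one extra score of 150. Both the extra value and the helper `whiskers_and_outliers` are hypothetical, added only for illustration.

```python
import pandas as pd

def whiskers_and_outliers(values, whis=1.5):
    """Return (lower_whisker, upper_whisker, outliers) using the whis*IQR rule."""
    s = pd.Series(values, dtype="float64")
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    lower_limit, upper_limit = q1 - whis * iqr, q3 + whis * iqr
    # Whiskers end at the most extreme points that are still inside the limits
    inside = s[(s >= lower_limit) & (s <= upper_limit)]
    # Points beyond the limits are reported as outliers
    outliers = s[(s < lower_limit) | (s > upper_limit)].tolist()
    return float(inside.min()), float(inside.max()), outliers

original = [55, 67, 45, 70, 90, 85, 100, 40, 60, 75]

print(whiskers_and_outliers(original))          # (40.0, 100.0, [])
print(whiskers_and_outliers(original + [150]))  # (40.0, 100.0, [150.0])
```

In the second call the upper whisker stays at 100, because 150 lies beyond the recomputed upper limit (87.5 + 1.5*30 = 132.5) and is reported as an outlier instead.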
Visual Quiz - 3 Questions
Test your understanding
Look at the execution table, what is the value of the median at step 3?
A. 82.5
B. 52.5
C. 68.5
D. 70
💡 Hint
Check the 'Median' row at step 3 in the execution_table.
At which step does the condition for identifying outliers get checked?
A. Step 7
B. Step 9
C. Step 5
D. Step 12
💡 Hint
Look for the step mentioning 'Identify outliers' in the execution_table.
If the maximum score was 110 instead of 100, what would happen to the upper whisker?
A. It would move to 110
B. It would stay at 100
C. It would move to 121.875
D. There would be no upper whisker
💡 Hint
Recall that the upper whisker is the largest data point at or below the upper limit (121.875); see execution_table step 8.
Concept Snapshot
Box plots show data spread using quartiles.
Box covers Q1 to Q3, with a line at median.
Whiskers extend to the most extreme data points within 1.5*IQR of the quartiles.
Points outside the whiskers are outliers.
Use pandas.DataFrame.boxplot() to create one.
Full Transcript
Box plots visualize data distribution by showing quartiles and outliers. We start with a DataFrame, select a numeric column, and calculate Q1, median, and Q3. The interquartile range (IQR) is Q3 minus Q1. Whiskers extend to the smallest and largest data points within 1.5 times the IQR from the quartiles. Points outside these whiskers are outliers. The box plot draws a box from Q1 to Q3 with a line at the median, whiskers, and marks outliers. In the example, the 'Scores' column is used to create a box plot with no outliers. This step-by-step process helps understand how box plots summarize data spread and detect unusual values.