Dropping missing values (dropna) in Data Analysis Python - Step-by-Step Execution

Concept Flow - Dropping missing values (dropna)
Start with DataFrame
Check each row/column for missing values
Decide axis: rows or columns
Remove rows/columns with missing values
Return cleaned DataFrame
End
The process checks for missing values in rows or columns, removes those containing missing data, and returns a cleaned DataFrame.
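The flow above can be approximated by hand with a boolean mask. This is only a sketch to make the steps concrete; the names missing, row_has_missing and manual_clean are illustrative and not part of pandas.

```python
import pandas as pd

# Same example data as in the Execution Sample
df = pd.DataFrame({
    'A': [1, 2, None],
    'B': [4, None, 6]
})

# Check each cell: True where a value is missing
missing = df.isna()

# Axis decision: working on rows, so a row is incomplete if any cell in it is missing
row_has_missing = missing.any(axis=1)

# Remove the incomplete rows by keeping only the complete ones
manual_clean = df[~row_has_missing]

# The hand-built result matches dropna() with default arguments
print(manual_clean.equals(df.dropna()))  # True
```

In practice dropna() is the shorter way to write this; the mask version is mainly useful for seeing which rows would be affected before removing them.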
Execution Sample
import pandas as pd

df = pd.DataFrame({
  'A': [1, 2, None],
  'B': [4, None, 6]
})

clean_df = df.dropna()
This code creates a DataFrame with missing values and removes rows that contain any missing values.
Execution Table
Step | DataFrame State | Action | Resulting DataFrame
1 | {'A': [1, 2, None], 'B': [4, None, 6]} | Initial DataFrame with missing values | Same as input
2 | Check row 0: A=1, B=4 | No missing values in row 0 | Row 0 kept
3 | Check row 1: A=2, B=None | Missing value in B at row 1 | Row 1 dropped
4 | Check row 2: A=None, B=6 | Missing value in A at row 2 | Row 2 dropped
5 | Final DataFrame after dropna() | Rows with missing values removed | {'A': [1.0], 'B': [4.0]}
💡 All rows with any missing values are removed, leaving only complete rows.
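The row-by-row walk in the table is a teaching device. As a rough sketch, it can be mimicked with an explicit loop; this is not how dropna() is implemented internally, and the names kept_labels and traced are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None],
    'B': [4, None, 6]
})

kept_labels = []
for label, row in df.iterrows():
    # Column names whose value is missing in this row
    missing_cols = row[row.isna()].index.tolist()
    if missing_cols:
        print(f"row {label}: missing in {missing_cols} -> dropped")
    else:
        print(f"row {label}: no missing values -> kept")
        kept_labels.append(label)

# Build the result from the rows that survived the walk
traced = df.loc[kept_labels]
print(traced.equals(df.dropna()))  # True
```

The loop reports row 0 as kept and rows 1 and 2 as dropped, matching steps 2 to 4 of the table.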
Variable Tracker
Variable | Start | After Step 2 | After Step 3 | After Step 4 | Final
df | {'A': [1, 2, None], 'B': [4, None, 6]} | Same | Same | Same | Same
clean_df | Undefined | Undefined | Undefined | Undefined | {'A': [1.0], 'B': [4.0]}
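The tracker's two rows can be checked directly: df stays as it was, and clean_df is a separate, smaller DataFrame. A minimal sketch using the same data as the Execution Sample:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None],
    'B': [4, None, 6]
})

clean_df = df.dropna()

print(df.shape)                  # (3, 2): the original still has all three rows
print(clean_df.shape)            # (1, 2): only the complete row remains
print(clean_df.to_dict('list'))  # {'A': [1.0], 'B': [4.0]}
print(list(clean_df.index))      # [0]: the surviving row keeps its original label
```

The values appear as 1.0 and 4.0 because pandas stores a numeric column containing a missing value as floating point.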
Key Moments - 3 Insights
Why does dropna() remove entire rows instead of just the missing values?
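dropna() works on whole rows or columns because a DataFrame is a rectangular table: a single cell cannot be removed without leaving a gap. Any row (or column) that contains at least one missing value is therefore dropped in full, as in steps 3 and 4 of the Execution Table, where rows 1 and 2 are removed entirely.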
What happens if no axis is specified in dropna()?
By default, dropna() removes rows with missing values (axis=0), as demonstrated in the example where rows with missing data are dropped.
Can dropna() remove columns instead of rows?
Yes. Setting axis=1 makes dropna() remove columns that contain missing values, following the same rule as row removal. The Execution Table above does not cover this case.
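The two axis answers above can be tried on the example data. This is a small sketch; the extra column C and the name df_c are hypothetical additions for illustration and do not appear in the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None],
    'B': [4, None, 6]
})

# Omitting axis is the same as axis=0: incomplete rows are dropped
print(df.dropna().equals(df.dropna(axis=0)))  # True

# axis=1 drops incomplete columns; here both A and B contain a missing value
print(df.dropna(axis=1).shape)  # (3, 0)

# Hypothetical extra column C with no missing values, added for illustration
df_c = df.assign(C=[7, 8, 9])
print(list(df_c.dropna(axis=1).columns))  # ['C']
```

Because every column in the original example has a missing value, axis=1 leaves a DataFrame with three rows and no columns; only a fully populated column such as C survives.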
Visual Quiz - 3 Questions
Test your understanding
Look at the Execution Table. Which rows are still in the DataFrame after step 3?
A. Only row 0 remains
B. Rows 0 and 1 remain
C. Rows 0 and 2 remain
D. All rows remain
💡 Hint
Refer to step 3 of the Execution Table, where row 1 is dropped because of its missing value in column B.
At which step does the DataFrame lose row 2?
A. Step 2
B. Step 4
C. Step 3
D. Step 5
💡 Hint
Check step 4 of the Execution Table, where row 2 is dropped because of its missing value in column A.
In general, what would happen to a DataFrame if dropna(axis=1) were used instead?
A. Rows with missing values would be dropped
B. Nothing would change
C. Columns with missing values would be dropped
D. All data would be removed
💡 Hint
dropna with axis=1 removes columns containing missing values, not rows.
Concept Snapshot
dropna() removes rows or columns that contain missing data from a DataFrame.
By default, it drops every row with at least one missing value.
Use axis=1 to drop columns instead.
Returns a new DataFrame without missing data; the original is unchanged.
Helps clean data for analysis.
Full Transcript
This visual execution shows how dropna() works in pandas. We start with a DataFrame containing missing values. Step by step, each row is checked for missing data. Rows with any missing values are removed. The final DataFrame contains only complete rows. Variables track the original and cleaned DataFrames. Key moments clarify why entire rows are dropped and how axis affects behavior. The quiz tests understanding of these steps and outcomes.