
Missing data strategies decision in Pandas - Step-by-Step Execution

Concept Flow - Missing data strategies decision
Start with DataFrame
Check for missing values
Decide strategy: drop, fill, or leave
Resulting DataFrame
End
Start with a DataFrame, check for missing values, decide to drop, fill, or leave them, then get the final DataFrame.
Execution Sample
Pandas
import pandas as pd

df = pd.DataFrame({'A': [1, None, 3], 'B': [4, 5, None]})

# Strategy: fill missing with 0
df_filled = df.fillna(0)
print(df_filled)
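This code creates a DataFrame with two missing values and fills them with zero. pandas stores each None in these numeric columns as NaN, so both columns have dtype float64 and the printed values appear as 1.0, 0.0, and so on.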
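The sample goes straight to filling, but the flow starts by checking for missing values. A minimal sketch of that check on the same example DataFrame, using `isna()` as named in the Concept Snapshot; the variable names `mask`, `missing_per_column`, and `total_missing` are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, None, 3], 'B': [4, 5, None]})

# Boolean mask: True wherever a value is missing
mask = df.isna()

# Summing the mask counts missing values per column
missing_per_column = mask.sum()

# Summing again gives the total across the whole DataFrame
total_missing = int(missing_per_column.sum())

print(missing_per_column)
print("Total missing:", total_missing)
```

Each column here has one missing value, which gives the total of 2 used in the Execution Table.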
Execution Table
Step | Action | DataFrame State | Missing Count | Result
1 | Create DataFrame | {'A': [1, None, 3], 'B': [4, 5, None]} | 2 | Initial data with 2 missing values
2 | Check missing | Same as step 1 | 2 | Found 2 missing values
3 | Fill missing with 0 | {'A': [1, 0, 3], 'B': [4, 5, 0]} | 0 | Missing values replaced by 0
4 | Print filled DataFrame | Same as step 3 | 0 | Output shows no missing values
5 | End | Final DataFrame | 0 | Process complete
💡 All missing values were filled with zero, so none remain.
Variable Tracker
Variable | Start | After fillna | Final
df | {'A': [1, None, 3], 'B': [4, 5, None]} | Same | Same
df_filled | N/A | {'A': [1, 0, 3], 'B': [4, 5, 0]} | Same
Key Moments - 2 Insights
Why do missing values still appear in 'df' after filling in 'df_filled'?
Because 'fillna' returns a new DataFrame and does not change 'df' unless assigned back. See execution_table step 3 where 'df_filled' is created separately.
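A short sketch of this behavior, assuming the same example data. It counts missing values in both objects after the fill, then shows the assign-back pattern; the names `original_missing` and `filled_missing` are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, None, 3], 'B': [4, 5, None]})

# fillna returns a new DataFrame; df itself is not modified
df_filled = df.fillna(0)

original_missing = int(df.isna().sum().sum())
filled_missing = int(df_filled.isna().sum().sum())
print("df:", original_missing, "df_filled:", filled_missing)

# To keep the change under the original name, assign the result back
df = df.fillna(0)
print("df after assigning back:", int(df.isna().sum().sum()))
```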
What happens if we drop missing values instead of filling?
Rows with missing values are removed, which reduces the number of rows. This strategy is not shown in the sample, but on this data it would leave only the first row, since it is the only row with no missing values.
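The drop strategy can be sketched on the same example data as follows. The `subset` variant is an extra illustration that is not part of the original walkthrough; it restricts the check to the listed columns:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, None, 3], 'B': [4, 5, None]})

# Default: drop every row that contains at least one missing value
df_dropped = df.dropna()

# Variant: only drop rows where column 'A' is missing
df_dropped_a = df.dropna(subset=['A'])

print("original shape:", df.shape)
print("after dropna():", df_dropped.shape)
print("after dropna(subset=['A']):", df_dropped_a.shape)
```

As with `fillna`, `dropna` returns a new DataFrame and leaves `df` unchanged unless the result is assigned back.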
Visual Quiz - 3 Questions
Test your understanding
Look at the execution table, how many missing values are in the DataFrame after step 3?
A. 0
B. 1
C. 2
D. 3
💡 Hint
Check the 'Missing Count' column at step 3 in the execution_table.
According to variable_tracker, what is the value of 'df_filled' after fillna?
A. {'A': [1, None, 3], 'B': [4, 5, None]}
B. {'A': [1, 0, 3], 'B': [4, 5, 0]}
C. N/A
D. Empty DataFrame
💡 Hint
Look at the 'After fillna' column for 'df_filled' in variable_tracker.
If we changed the strategy to drop missing values, what would happen to the DataFrame size?
A. It would increase
B. It would stay the same
C. It would decrease
D. It would become empty
💡 Hint
Dropping missing values removes rows with missing data, reducing size.
Concept Snapshot
Missing data strategies:
- Check missing values with isna()
- Decide: drop rows, fill values, or leave
- fillna() fills missing values
- dropna() removes rows that contain missing values
- Choose strategy based on data and goal
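The three options in the snapshot can be compared side by side with a small sketch. The helper `handle_missing` and its `strategy` and `fill_value` parameters are hypothetical names introduced for illustration, not part of pandas:

```python
import pandas as pd

def handle_missing(df, strategy="leave", fill_value=0):
    """Hypothetical helper: apply one strategy and return a new DataFrame."""
    if strategy == "drop":
        return df.dropna()           # remove rows containing missing values
    if strategy == "fill":
        return df.fillna(fill_value) # replace missing values
    if strategy == "leave":
        return df.copy()             # keep missing values as they are
    raise ValueError(f"Unknown strategy: {strategy!r}")

df = pd.DataFrame({'A': [1, None, 3], 'B': [4, 5, None]})

# Apply each strategy to the same data and compare shape and missing count
results = {name: handle_missing(df, name) for name in ("drop", "fill", "leave")}
for name, result in results.items():
    print(name, result.shape, int(result.isna().sum().sum()))
```

Which result is appropriate depends on the data and the goal: dropping loses rows, filling changes values, and leaving keeps the gaps for later steps to handle.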
Full Transcript
We start with a DataFrame that has missing values. We check how many missing values exist. Then we decide how to handle them. One way is to fill missing values with a number like zero. This creates a new DataFrame with no missing values. The original DataFrame stays the same unless we overwrite it. Another way is to drop rows with missing values, which reduces the data size. Choosing the right strategy depends on the data and what you want to do next.