
Dropping missing values (dropna) in Data Analysis Python

Introduction

Sometimes data has empty spots called missing values, which pandas shows as NaN or None. Dropping the rows or columns that contain them cleans the data so that calculations and models are not thrown off.

Use dropna in situations like these:
When you want to remove rows with missing data before analysis.
When missing values in columns make calculations incorrect.
When preparing data for machine learning models that can't handle missing values.
When you want to quickly clean a dataset without filling missing spots.
When missing data is rare and dropping it won't lose much information.
Syntax
DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)

axis=0 (the default) drops rows; axis=1 drops columns.

how='any' (the default) drops a row or column if it contains at least one missing value; how='all' drops it only if every value in it is missing.

Examples
Drop all rows that have any missing values.
df.dropna()
Drop columns that have any missing values.
df.dropna(axis=1)
Drop rows only if all values are missing.
df.dropna(how='all')
Drop rows where 'Age' or 'Salary' columns have missing values.
df.dropna(subset=['Age', 'Salary'])
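To compare the four calls side by side, here is a minimal sketch on a small made-up DataFrame. The column names and values are illustrative assumptions, not data from this lesson.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 is complete, row 1 lacks Age,
# row 2 lacks City, and row 3 is entirely empty
df = pd.DataFrame({
    'Age': [25, np.nan, 30, np.nan],
    'Salary': [50000, 60000, 52000, np.nan],
    'City': ['Paris', 'Rome', None, None],
})

rows_any = df.dropna()                              # keeps only the complete row
cols_any = df.dropna(axis=1)                        # every column has a gap, so none remain
rows_all = df.dropna(how='all')                     # drops only the entirely empty row
rows_subset = df.dropna(subset=['Age', 'Salary'])   # ignores gaps in City

print(rows_any.index.tolist())
print(cols_any.columns.tolist())
print(rows_all.index.tolist())
print(rows_subset.index.tolist())
```

Note that subset only controls which columns are checked for missing values; the rows that survive still keep all of their columns.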
Sample Program

This code creates a table with some missing ages and salaries. Then it removes any row that has a missing value. Finally, it shows the cleaned table.

import pandas as pd

# Create a sample DataFrame with missing values
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, None, 30, None],
    'Salary': [50000, 60000, None, 45000]
}
df = pd.DataFrame(data)

print('Original DataFrame:')
print(df)

# Drop rows with any missing values
clean_df = df.dropna()

print('\nDataFrame after dropping rows with missing values:')
print(clean_df)
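Output
Only the row for Alice remains, because it is the only row with a value in every column. Age and Salary print as floats (25.0 and 50000.0) because pandas stores the missing entries as NaN, which is a float.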
Important Notes

Using inplace=True modifies the original DataFrame directly. In that case dropna() returns None, so do not assign the result back to a variable.
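The difference between the default behavior and inplace=True can be sketched as follows. The one-column DataFrame is a made-up example.

```python
import pandas as pd

# Hypothetical data with one missing age
df = pd.DataFrame({'Age': [25, None, 30]})

# Default: df is left untouched and a new DataFrame is returned
cleaned = df.dropna()
rows_before = len(df)       # df still has all 3 rows

# inplace=True: df itself is modified and the call returns None
result = df.dropna(inplace=True)
rows_after = len(df)        # df now has 2 rows

print(len(cleaned), rows_before, rows_after, result)
```

A common mistake is writing df = df.dropna(inplace=True), which replaces df with None.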

Be careful dropping too many rows; you might lose important data.
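One way to judge the cost before dropping is to count the missing values first. This is a minimal sketch that reuses the sample data from this lesson; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, None, 30, None],
    'Salary': [50000, 60000, None, 45000],
})

# Number of missing values in each column
missing_per_column = df.isna().sum()

# Number of rows that contain at least one missing value
rows_with_gaps = int(df.isna().any(axis=1).sum())
share_lost = rows_with_gaps / len(df)

print(missing_per_column)
print(f'dropna() would remove {rows_with_gaps} of {len(df)} rows ({share_lost:.0%})')
```

Here three of the four rows would be removed, which suggests that dropping only on selected columns with subset, or filling the gaps instead, may be a better fit for this data.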

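Use thresh to keep only the rows that have at least a given number of non-missing values. For example, thresh=2 keeps rows with two or more non-missing values. Recent pandas versions do not allow thresh and how to be set in the same call.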
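A short sketch of thresh on made-up data (the columns and values are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 has 3 values, row 1 has 2, row 2 has none
df = pd.DataFrame({
    'Age': [25, np.nan, np.nan],
    'Salary': [50000, 60000, np.nan],
    'City': ['Paris', 'Rome', None],
})

# Keep rows that have at least 2 non-missing values
kept = df.dropna(thresh=2)

print(kept.index.tolist())
```

Row 1 survives even though its Age is missing, because it still has two non-missing values.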

Summary

Use dropna() to remove missing data from your dataset.

You can drop rows or columns depending on your needs.

Dropping missing values helps keep your data clean for analysis.