What is Interpolation for missing values in Pandas?




Introduction

Interpolation fills in missing data points by estimating them from the known values around them, so the dataset stays complete and usable for analysis. It is a good fit when:

You have a time series with some missing days and want to estimate the missing values.

Your sensor data has gaps and you want to smooth out the missing readings.

You want to fill missing values in a dataset before running calculations or machine learning.

You want to create a continuous line in a plot even if some data points are missing.

Syntax

Pandas

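DataFrame.interpolate(method='linear', *, axis=0, limit=None, inplace=False, limit_direction=None, limit_area=None, **kwargs)

method chooses how to estimate missing values. The default, 'linear', treats the values as equally spaced and ignores the index.

limit_direction defaults to None, which behaves like 'forward' for most methods, including 'linear'.

inplace=True modifies the original data; otherwise the call returns a new DataFrame and leaves the original unchanged.

The older downcast argument is deprecated in recent pandas releases (2.1 and later) and is omitted here.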
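As a minimal sketch of the default (non-inplace) behavior, the snippet below interpolates a one-column DataFrame and checks that the original is left untouched. The column name `reading` and the values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical one-column frame with a single gap in the middle.
df = pd.DataFrame({"reading": [1.0, np.nan, 3.0]})

# Default call: returns a new DataFrame with the gap filled linearly.
filled = df.interpolate()

print(filled["reading"].tolist())     # [1.0, 2.0, 3.0]
print(df["reading"].isna().tolist())  # [False, True, False], the original keeps its gap
```

Assigning the result back (`df = df.interpolate()`) is usually easier to follow than `inplace=True`.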

Examples

Fill missing values using linear interpolation down each column (axis=0, the default).

Pandas

df.interpolate()
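To make the direction concrete, here is a small sketch comparing the default `axis=0` with `axis=1` on the same data. The frame and its column names `a`, `b`, `c` are invented for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical all-numeric frame with one missing cell at row 1, column "b".
df = pd.DataFrame({
    "a": [0.0, 1.0, 4.0],
    "b": [10.0, np.nan, 8.0],
    "c": [20.0, 9.0, 12.0],
})

# axis=0 (default): estimate from the values above and below in column "b".
down_columns = df.interpolate()

# axis=1: estimate from the values to the left and right in row 1.
across_rows = df.interpolate(axis=1)

print(down_columns.loc[1, "b"])  # 9.0, halfway between 10.0 and 8.0
print(across_rows.loc[1, "b"])   # 5.0, halfway between 1.0 and 9.0
```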

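Use time-based interpolation when the index is a DatetimeIndex. Gaps are then filled in proportion to the time elapsed between observations rather than by row position.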

Pandas

df.interpolate(method='time')
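The difference from the default only shows up when timestamps are unevenly spaced. The sketch below uses made-up dates and values: the missing reading sits one day after the first observation and two days before the last.

```python
import numpy as np
import pandas as pd

# Hypothetical readings on unevenly spaced dates (Jan 3 is absent from the index).
idx = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-04"])
s = pd.Series([10.0, np.nan, 40.0], index=idx)

# 'linear' ignores the index and takes the midpoint of the neighbors.
by_position = s.interpolate()

# 'time' weights by elapsed time: Jan 2 is one third of the way from Jan 1 to Jan 4.
by_time = s.interpolate(method="time")

print(by_position.iloc[1])  # 25.0
print(by_time.iloc[1])      # approximately 20.0
```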

Use polynomial interpolation of order 2 for smoother estimates. This method requires SciPy to be installed, and the order argument must be given.

Pandas

df.interpolate(method='polynomial', order=2)

Fill at most 2 consecutive missing values in each gap, working backward from the next known value.

Pandas

df.interpolate(limit=2, limit_direction='backward')
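The following sketch shows what `limit` and `limit_direction` do to a gap that is longer than the limit. The Series is a made-up example with three consecutive missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical series with a gap of three missing values.
s = pd.Series([1.0, np.nan, np.nan, np.nan, 5.0])

# Forward (default direction): fill the first two NaNs after the last known value.
forward = s.interpolate(limit=2)

# Backward: fill the two NaNs closest to the next known value.
backward = s.interpolate(limit=2, limit_direction="backward")

print(forward.tolist())   # [1.0, 2.0, 3.0, nan, 5.0]
print(backward.tolist())  # [1.0, nan, 3.0, 4.0, 5.0]
```

The filled values are the same linear estimates in both cases; only the choice of which positions remain NaN changes.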

Sample Program

This code creates a small table with missing temperatures on days 2 and 3. Then it fills those missing values by estimating numbers between the known temperatures on days 1 and 4 using linear interpolation.

Pandas

import pandas as pd
import numpy as np

# Create sample data with missing values
data = {'Day': [1, 2, 3, 4, 5], 'Temperature': [22.0, np.nan, np.nan, 28.0, 30.0]}
df = pd.DataFrame(data)

print('Original DataFrame:')
print(df)

# Interpolate missing values linearly
df_interpolated = df['Temperature'].interpolate()

df['Temperature'] = df_interpolated

print('\nDataFrame after interpolation:')
print(df)

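Output

After interpolation, the Temperature column reads 22.0, 24.0, 26.0, 28.0, 30.0. The two missing values become 24.0 and 26.0, evenly spaced between 22.0 and 28.0.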

Important Notes

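By default, linear interpolation fills gaps between known values and also fills missing values at the end with the last known value; missing values at the start stay NaN. Pass limit_area='inside' to fill only the gaps between known values, or limit_direction='both' to fill leading values as well.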

Choosing the right method depends on your data type and pattern.

Always check the results to make sure interpolation makes sense for your data.
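To see how missing values at the start and end of the data are treated, the sketch below runs three variants on a made-up Series that has a NaN at each edge and one in the middle.

```python
import numpy as np
import pandas as pd

# Hypothetical series: NaN at the start, in the middle, and at the end.
s = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan])

# Default: the interior gap is interpolated, the trailing NaN takes the last
# known value, and the leading NaN is left alone.
default_fill = s.interpolate()

# limit_area='inside': only NaNs surrounded by known values are filled.
inside_only = s.interpolate(limit_area="inside")

# limit_direction='both': leading and trailing NaNs are filled as well.
both_ways = s.interpolate(limit_direction="both")

print(default_fill.tolist())  # [nan, 1.0, 2.0, 3.0, 3.0]
print(inside_only.tolist())   # [nan, 1.0, 2.0, 3.0, nan]
print(both_ways.tolist())     # [1.0, 1.0, 2.0, 3.0, 3.0]
```

Values filled at the edges are copies of the nearest known value, not true estimates, so `limit_area='inside'` is the safer choice when that distinction matters.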

Summary

Interpolation estimates missing values using nearby known data.

Use DataFrame.interpolate() (or Series.interpolate() for a single column) with a method such as 'linear' or 'time'.

It helps keep data complete for better analysis and visualization.