
Interpolation for missing values in Pandas

Introduction

Interpolation helps fill in missing data points by estimating values between known data. This keeps your data complete and useful for analysis.

Typical situations where it helps:
You have a time series with some missing days and want to estimate the missing values.
Your sensor data has gaps and you want to fill in the missing readings.
You want to fill missing values in a dataset before running calculations or machine learning.
You want to draw a continuous line in a plot even if some data points are missing.
Syntax
Pandas
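DataFrame.interpolate(method='linear', axis=0, limit=None, inplace=False, limit_direction=None, limit_area=None, downcast=None, **kwargs)

limit_direction defaults to None, which behaves like 'forward' for most methods. downcast is deprecated in recent pandas releases and is best left unset.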

method chooses how to estimate missing values. The default, 'linear', ignores the index and treats the values as equally spaced.

inplace=True changes the original data, otherwise it returns a new DataFrame.
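The difference between the default call and inplace=True can be seen on a tiny frame. This is a minimal sketch: the column names a and b and their values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame with one gap per column.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [10.0, np.nan, 40.0]})

# Default (inplace=False): a new DataFrame comes back, df keeps its NaNs.
filled = df.interpolate(method="linear")
print(filled)
print("gaps left in original:", int(df.isna().sum().sum()))

# inplace=True: the call returns None and modifies the object itself.
df_copy = df.copy()
result = df_copy.interpolate(method="linear", inplace=True)
print(result)
print(df_copy)
```

Assigning the returned DataFrame to a name is usually easier to follow than inplace=True, because the original stays available for comparison.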

Examples
Fill missing values using linear interpolation down each column (axis=0, the default).
Pandas
df.interpolate()
Use time-based interpolation when the index is a DatetimeIndex, so that estimates are weighted by the actual time gaps between rows.
Pandas
df.interpolate(method='time')
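The two methods only differ when the timestamps are unevenly spaced. The sketch below uses made-up dates and values to show this on a single gap.

```python
import numpy as np
import pandas as pd

# Hypothetical readings: the gap sits 1 day after the first point
# and 3 days before the last one.
idx = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-05"])
s = pd.Series([10.0, np.nan, 50.0], index=idx)

# 'linear' ignores the index, so the gap lands halfway: 30.0
linear = s.interpolate(method="linear")

# 'time' weights by elapsed time: 1 of 4 days has passed, so about 20.0
timed = s.interpolate(method="time")

print(pd.DataFrame({"linear": linear, "time": timed}))
```

With evenly spaced timestamps both methods give the same result, so 'time' matters mainly for irregular sampling.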
Use polynomial interpolation of order 2 for smoother estimates. This method requires SciPy to be installed and an explicit order.
Pandas
df.interpolate(method='polynomial', order=2)
Fill at most 2 consecutive missing values per gap, working backward from the next known value.
Pandas
df.interpolate(limit=2, limit_direction='backward')
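To see how limit and limit_direction interact, here is a minimal sketch on a made-up Series with a three-value gap. The limit only controls which positions get filled; the estimated values themselves still come from the full linear fit.

```python
import numpy as np
import pandas as pd

# Hypothetical series with a gap of three consecutive NaNs.
s = pd.Series([1.0, np.nan, np.nan, np.nan, 5.0])

# Forward: fill the first 2 NaNs after the last known value.
forward = s.interpolate(limit=2, limit_direction="forward")

# Backward: fill the 2 NaNs just before the next known value.
backward = s.interpolate(limit=2, limit_direction="backward")

print(pd.DataFrame({"original": s, "forward": forward, "backward": backward}))
```

In the forward result position 3 stays NaN, and in the backward result position 1 stays NaN.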
Sample Program

This code creates a small table with missing temperatures on days 2 and 3. Then it fills those missing values by estimating numbers between the known temperatures on days 1 and 4 using linear interpolation.

Pandas
import pandas as pd
import numpy as np

# Create sample data with missing values
data = {'Day': [1, 2, 3, 4, 5], 'Temperature': [22.0, np.nan, np.nan, 28.0, 30.0]}
df = pd.DataFrame(data)

print('Original DataFrame:')
print(df)

# Interpolate missing values linearly and write them back to the column
df['Temperature'] = df['Temperature'].interpolate()

print('\nDataFrame after interpolation:')
print(df)
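Output

After interpolation the Temperature column reads 22.0, 24.0, 26.0, 28.0, 30.0: the gaps on days 2 and 3 are filled with 24.0 and 26.0, evenly spaced between 22.0 and 28.0.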
Important Notes

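With the default settings, missing values at the start of the data stay missing because there is no earlier point to interpolate from, while missing values at the end are filled with the last known value. Pass limit_area='inside' to fill only the gaps that lie between existing data points.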
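Edge behaviour is easy to check on a small Series. This sketch uses made-up values with a missing entry at each end and compares three settings.

```python
import numpy as np
import pandas as pd

# Hypothetical series: NaN at the start, in the middle, and at the end.
s = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan])

# Default: leading NaN stays, trailing NaN repeats the last known value.
default = s.interpolate()

# limit_area='inside': only NaNs surrounded by valid values are filled.
inside = s.interpolate(limit_area="inside")

# limit_direction='both': both edges are filled with the nearest known value.
both = s.interpolate(limit_direction="both")

print(pd.DataFrame({"default": default, "inside": inside, "both": both}))
```

Note that filling the edges this way repeats a known value rather than extending the trend.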

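Choose the method to match your data: 'linear' suits evenly spaced observations, 'time' suits irregularly spaced timestamps, and 'polynomial' or 'spline' suit smoothly curving data.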

Always check the results to make sure interpolation makes sense for your data.
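One simple way to check the results is to record which positions were missing before filling, so estimated values stay distinguishable from measured ones. This is a sketch with made-up temperatures; the column names value and interpolated are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical temperature readings with two gaps.
s = pd.Series([22.0, np.nan, np.nan, 28.0, 30.0])

# Remember where the gaps were before filling them.
was_missing = s.isna()
filled = s.interpolate()

# Side-by-side report: filled value plus a flag for estimated entries.
report = pd.DataFrame({"value": filled, "interpolated": was_missing})
print(report)

# Any NaN still present after interpolation deserves a closer look.
print("remaining gaps:", int(filled.isna().sum()))
```

Keeping the flag column makes it possible to exclude or down-weight estimated values in later analysis.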

Summary

Interpolation estimates missing values using nearby known data.

Use DataFrame.interpolate() (or Series.interpolate() for a single column) with a method such as 'linear' or 'time'.

It helps keep data complete for better analysis and visualization.