
Shift and lag operations in Data Analysis Python

Introduction

Shift and lag operations move the values of a column up or down by a fixed number of rows, so that each row can be compared with an earlier or later row in the sequence. This is useful for measuring changes over time or between steps.

Typical uses:
Comparing today's sales with yesterday's sales in a store.
Finding the difference in temperature between consecutive days.
Analyzing stock prices by comparing the current price with the previous day's price.
Detecting trends or changes in time series data, such as daily website visits.
Syntax
Data Analysis Python
df['new_column'] = df['existing_column'].shift(n)

n is the number of rows to move and defaults to 1. A positive n shifts values down (lag); a negative n shifts values up (lead).

Rows left without a source value after the shift (the first n rows for a lag, the last n rows for a lead) are filled with missing values (NaN).
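
As a small illustration of the missing-value behavior, the sketch below shifts a made-up integer column. The column names and numbers are assumptions chosen for the example. It also uses the `fill_value` argument of `shift` to supply a replacement instead of NaN.

```python
import pandas as pd

# Hypothetical data, chosen only for illustration.
df = pd.DataFrame({'value': [10, 20, 30, 40]})

# Default behavior: the first row has no previous value, so it becomes NaN.
# Because NaN is a float, the shifted integer column is converted to float64.
df['lag_1'] = df['value'].shift(1)

# fill_value replaces the gap, so the shifted column can stay integer.
df['lag_1_filled'] = df['value'].shift(1, fill_value=0)

print(df)
print(df.dtypes)
```

Whether 0 is a sensible replacement depends on the data; for a quantity like sales, a filled 0 could be mistaken for a real observation, so keeping NaN is often the safer choice.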

Examples
This creates a new column with values shifted down by 1, showing the previous row's value.
Data Analysis Python
df['lag_1'] = df['value'].shift(1)
This shifts values up by 1, showing the next row's value.
Data Analysis Python
df['lead_1'] = df['value'].shift(-1)
This shifts values down by 2 rows, showing the value two steps before.
Data Analysis Python
df['lag_2'] = df['value'].shift(2)
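
To make the three examples above concrete, here is a minimal sketch that applies them to a small made-up `value` column. The data is an assumption for illustration only.

```python
import pandas as pd

# Hypothetical values, used only to make the results visible.
df = pd.DataFrame({'value': [10, 20, 30, 40]})

df['lag_1'] = df['value'].shift(1)    # previous row: NaN, 10, 20, 30
df['lead_1'] = df['value'].shift(-1)  # next row: 20, 30, 40, NaN
df['lag_2'] = df['value'].shift(2)    # two rows back: NaN, NaN, 10, 20

print(df)
```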
Sample Program

This code creates a table of sales over 5 days. It adds two new columns: one with the previous day's sales and one with the next day's sales, using shift operations.

Data Analysis Python
import pandas as pd

data = {'day': [1, 2, 3, 4, 5], 'sales': [100, 150, 130, 170, 160]}
df = pd.DataFrame(data)

df['previous_day_sales'] = df['sales'].shift(1)
df['next_day_sales'] = df['sales'].shift(-1)

print(df)
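
Running the program should give NaN in the first row of `previous_day_sales` and in the last row of `next_day_sales`, since those rows have no neighbor to take a value from. The sketch below repeats the program and, as an illustrative extension, adds a day-over-day change column. The name `change_from_previous` is assumed here and is not part of the original program.

```python
import pandas as pd

data = {'day': [1, 2, 3, 4, 5], 'sales': [100, 150, 130, 170, 160]}
df = pd.DataFrame(data)

df['previous_day_sales'] = df['sales'].shift(1)
df['next_day_sales'] = df['sales'].shift(-1)

# Illustrative extension: day-over-day change in sales.
# 'change_from_previous' is a name assumed for this example.
df['change_from_previous'] = df['sales'] - df['previous_day_sales']

print(df)
# previous_day_sales:   NaN, 100, 150, 130, 170
# next_day_sales:       150, 130, 170, 160, NaN
# change_from_previous: NaN,  50, -20,  40, -10
```

For this particular calculation, `df['sales'].diff()` gives the same result in one step; the explicit `shift` version is shown because it generalizes to other comparisons, such as ratios.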
Important Notes

Shift does not change the order of rows; it only moves values within a column, while the index stays in place.

Missing values (NaN) appear where there is no data to fill after shifting. Because NaN is a float, shifting an integer column converts it to float.

You can use shift on a column of any type to compare rows, including numeric, text, categorical, and datetime columns.
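
As a sketch of shifting a non-numeric column, the example below uses a made-up `status` column of strings and flags the rows where the status differs from the previous row. The column names and values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical status log; the column names and values are assumptions.
df = pd.DataFrame({'status': ['ok', 'ok', 'error', 'error', 'ok']})

# Previous row's status; the first row has none, so it is missing.
df['previous_status'] = df['status'].shift(1)

# True where the status differs from the previous row.
# The first row is excluded explicitly, since it has nothing to compare with.
df['changed'] = df['previous_status'].notna() & (df['status'] != df['previous_status'])

print(df)
```

Excluding the first row is a design choice: without the `notna()` check, the comparison against the missing value could mark the first row as a change.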

Summary

Shift moves data up or down to compare rows in sequence.

Positive shift values create a lag (values from previous rows); negative shift values create a lead (values from next rows).

Useful for time series and sequential data analysis.