
Outer join behavior in Pandas

Introduction

An outer join combines two tables and keeps every row from both, even when a row has no match in the other table.

You want to see all customers and their orders, even if some customers have no orders.
You have two lists of employees from different departments and want to see everyone, showing missing info where needed.
You want to combine two datasets but keep all rows from both, filling the gaps with NaN.
You want to compare two lists and find which items appear in only one list and which appear in both.
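
As a rough illustration of the customers-and-orders case, here is a minimal sketch. The customers and orders tables, their column names, and their values are made up for this example.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration
customers = pd.DataFrame({
    'customer_id': [1, 2, 3],
    'name': ['Ana', 'Ben', 'Cara']
})
orders = pd.DataFrame({
    'customer_id': [1, 1, 4],
    'item': ['book', 'pen', 'lamp']
})

# Keep every customer and every order, matched or not
combined = pd.merge(customers, orders, how='outer', on='customer_id')
print(combined)
```

Customers 2 and 3 have no orders, so their item is NaN. The order for customer 4 has no matching customer, so its name is NaN.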
Syntax
Pandas
pd.merge(left_df, right_df, how='outer', on='key_column')

left_df and right_df are the two tables (dataframes) you want to join.

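how='outer' means keep all rows from both tables.

on='key_column' is the name of the column, present in both tables, whose values are used to match rows.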

Examples
Join two dataframes on the 'id' column, keeping all rows from both.
Pandas
pd.merge(df1, df2, how='outer', on='id')
Join using different column names from each dataframe, keeping all rows.
Pandas
pd.merge(df1, df2, how='outer', left_on='emp_id', right_on='id')
Join on all common columns, keeping all rows from both dataframes.
Pandas
pd.merge(df1, df2, how='outer')
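
The last two forms can be tried on small tables. This is a sketch only: the employees and badges dataframes and their values are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration
employees = pd.DataFrame({'emp_id': [1, 2], 'name': ['Dana', 'Eli']})
badges = pd.DataFrame({'id': [2, 3], 'badge': ['B-2', 'B-3']})

# Different key names: both key columns are kept in the result
merged = pd.merge(employees, badges, how='outer',
                  left_on='emp_id', right_on='id')
print(merged.columns.tolist())  # ['emp_id', 'name', 'id', 'badge']

# No 'on' given: pandas joins on every column name the two tables share.
# After renaming, the only shared column is 'id'.
renamed = employees.rename(columns={'emp_id': 'id'})
common = pd.merge(renamed, badges, how='outer')
print(common.columns.tolist())  # ['id', 'name', 'badge']
```

With left_on and right_on, a row that exists on only one side has NaN in the other side's key column, so you may want to drop or combine the two key columns afterwards.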
Sample Program

This example joins two tables on the 'id' column. It keeps all rows from both tables. Missing values are shown as NaN.

Pandas
import pandas as pd

# Create first dataframe
df1 = pd.DataFrame({
    'id': [1, 2, 3],
    'name': ['Alice', 'Bob', 'Charlie']
})

# Create second dataframe
df2 = pd.DataFrame({
    'id': [2, 3, 4],
    'age': [30, 25, 40]
})

# Outer join on 'id'
result = pd.merge(df1, df2, how='outer', on='id')
print(result)
Output
   id     name   age
0   1    Alice   NaN
1   2      Bob  30.0
2   3  Charlie  25.0
3   4      NaN  40.0
Important Notes

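Missing values appear as NaN after an outer join. Because NaN is a float, an integer column that gains missing values, such as age in the sample program, is converted to float.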

Outer join is useful to find unmatched rows in either table.
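
One way to find the unmatched rows is the indicator parameter of pd.merge(), which adds a _merge column saying where each row came from. The sketch below reuses the dataframes from the sample program. The names flagged, only_left and only_right are chosen just for this example.

```python
import pandas as pd

df1 = pd.DataFrame({'id': [1, 2, 3], 'name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'id': [2, 3, 4], 'age': [30, 25, 40]})

# indicator=True adds a '_merge' column with the values
# 'left_only', 'right_only' or 'both'
flagged = pd.merge(df1, df2, how='outer', on='id', indicator=True)

# Rows found in only one of the two tables
only_left = flagged[flagged['_merge'] == 'left_only']
only_right = flagged[flagged['_merge'] == 'right_only']

print(only_left['id'].tolist())   # [1]
print(only_right['id'].tolist())  # [4]
```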

Summary

Outer join keeps all rows from both tables.

Missing data is filled with NaN.

Use how='outer' in pd.merge() to do an outer join.