
Outer join in Data Analysis Python

Introduction
An outer join combines two tables and keeps all the rows from one or both of them, even when some rows have no match in the other table. It is useful when:
You want to see all customers and their orders, even if some customers have no orders.
You need a full list of employees and projects, showing employees without projects and projects without employees.
You want to merge two lists of products from different stores, keeping all products from both stores.
You want to find unmatched records between two datasets, like students registered and students who attended.
Syntax
Data Analysis Python
pd.merge(left_df, right_df, how='outer', on='key_column')
Use how='outer' to keep all rows from both tables.
The 'on' parameter specifies the column to join on.
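To make the effect of the how parameter concrete, the sketch below runs the same merge with each common option and prints which keys survive. The two small tables and their column names (left_val, right_val) are invented for this illustration.

```python
import pandas as pd

# Hypothetical sample tables, invented for this illustration
left_df = pd.DataFrame({'key_column': ['a', 'b', 'c'], 'left_val': [1, 2, 3]})
right_df = pd.DataFrame({'key_column': ['b', 'c', 'd'], 'right_val': [20, 30, 40]})

# Run the same merge with each 'how' option and record which keys survive
keys_by_how = {}
for how in ['inner', 'left', 'right', 'outer']:
    merged = pd.merge(left_df, right_df, how=how, on='key_column')
    keys_by_how[how] = merged['key_column'].tolist()
    print(how, keys_by_how[how])
```

Only 'outer' keeps every key from both sides: 'a' exists only in the left table and 'd' only in the right, and both appear in the outer result.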
Examples
Combine df1 and df2 keeping all rows from both, matching on 'id'.
Data Analysis Python
pd.merge(df1, df2, how='outer', on='id')
Show all customers and orders, even if some customers have no orders or some orders have no customer.
Data Analysis Python
pd.merge(customers, orders, how='outer', on='customer_id')
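A minimal sketch of the customers and orders case, assuming made-up data in which one customer has no orders and one order refers to an unknown customer. It also uses the indicator=True option of pd.merge, which adds a '_merge' column showing where each row came from, so unmatched records are easy to pick out.

```python
import pandas as pd

# Hypothetical data: customer 3 has no orders, and order 103 points to an unknown customer 4
customers = pd.DataFrame({'customer_id': [1, 2, 3], 'name': ['Ana', 'Ben', 'Cy']})
orders = pd.DataFrame({'order_id': [101, 102, 103], 'customer_id': [1, 2, 4]})

# indicator=True adds a '_merge' column: 'both', 'left_only', or 'right_only'
merged = pd.merge(customers, orders, how='outer', on='customer_id', indicator=True)

# Rows that found no partner on the other side
no_orders = merged[merged['_merge'] == 'left_only']
no_customer = merged[merged['_merge'] == 'right_only']

print(merged)
print('Customers without orders:', no_orders['customer_id'].tolist())
print('Orders without a customer:', no_customer['customer_id'].tolist())
```

The same filter works for any pair of datasets, for example students who registered versus students who attended.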
Sample Program
This code joins two tables on 'id'. It keeps all rows from both tables. Rows without matches get NaN in missing columns.
Data Analysis Python
import pandas as pd

# Create first table
left = pd.DataFrame({'id': [1, 2, 3], 'name': ['Alice', 'Bob', 'Charlie']})

# Create second table
right = pd.DataFrame({'id': [2, 3, 4], 'score': [85, 90, 75]})

# Outer join on 'id'
result = pd.merge(left, right, how='outer', on='id')
print(result)
Output
   id     name  score
0   1    Alice    NaN
1   2      Bob   85.0
2   3  Charlie   90.0
3   4      NaN   75.0
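One side effect of the sample program is worth checking: the scores were created as whole numbers, but once the join introduces NaN the column is stored as floating point. The sketch below repeats the join, counts the gaps per column, and shows one possible way to keep whole numbers by converting to the nullable Int64 dtype. Whether that conversion is wanted is an assumption about your use case.

```python
import pandas as pd

left = pd.DataFrame({'id': [1, 2, 3], 'name': ['Alice', 'Bob', 'Charlie']})
right = pd.DataFrame({'id': [2, 3, 4], 'score': [85, 90, 75]})
result = pd.merge(left, right, how='outer', on='id')

# 'score' started as integers but now holds NaN, so pandas stores it as float
dtype_after_join = result['score'].dtype
print(dtype_after_join)

# Count the gaps introduced by the join in each column
missing_counts = result.isna().sum()
print(missing_counts)

# One option (an assumption about what you want): the nullable Int64 dtype
# keeps scores as whole numbers and shows gaps as <NA>
result['score'] = result['score'].astype('Int64')
print(result)
```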
Important Notes
Outer join keeps all data from both tables, filling missing parts with NaN.
If you want only matching rows, use 'inner' join instead.
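Make sure the column named in 'on' exists in both tables; otherwise pd.merge raises a KeyError. If the key has a different name in each table, use 'left_on' and 'right_on' instead.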
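The note about the join column can be sketched as follows. The students and attendance tables, and the column names student_id and sid, are hypothetical. The code first shows the KeyError raised when the 'on' column is missing from one table, then joins with left_on and right_on, which are real pd.merge parameters for keys that are named differently on each side.

```python
import pandas as pd

# Hypothetical tables whose key columns have different names
students = pd.DataFrame({'student_id': [1, 2, 3], 'name': ['Alice', 'Bob', 'Charlie']})
attendance = pd.DataFrame({'sid': [2, 3, 4], 'present': [True, True, True]})

# 'student_id' is missing from attendance, so on='student_id' fails
missing_key_error = None
try:
    pd.merge(students, attendance, how='outer', on='student_id')
except KeyError as err:
    missing_key_error = err
    print('KeyError:', err)

# Name the key on each side instead; both key columns appear in the result
result = pd.merge(students, attendance, how='outer',
                  left_on='student_id', right_on='sid')
print(result)
```

With left_on and right_on, the result keeps both key columns, so a row that exists on only one side has NaN in the other side's key.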
Summary
Outer join combines two tables keeping all rows from both.
Rows with no match get NaN in the columns that come from the other table.
Use it to see full data coverage from both sources.