We use merge() to combine two tables based on common columns, just like joining tables in a database. This helps us see related information together.
0
0
merge() for SQL-style joins in Data Analysis Python
Introduction
You have two lists of customer data and want to see all details in one place.
You want to combine sales data with product details using product IDs.
You need to find matching records between two datasets, like students and their grades.
You want to keep all records from one table and add matching info from another.
You want to find records that appear in both tables or only in one.
Syntax
Data Analysis Python
pd.merge(left, right, how='inner', on=None, left_on=None, right_on=None)
left and right are the two tables (DataFrames) to join.
how decides the type of join: 'inner', 'left', 'right', or 'outer'.
Examples
Join two tables on the column 'id' keeping only matching rows (inner join).
Data Analysis Python
pd.merge(df1, df2, on='id')Keep all rows from
df1 and add matching rows from df2.Data Analysis Python
pd.merge(df1, df2, how='left', on='id')
Keep all rows from both tables, filling missing matches with
NaN.Data Analysis Python
pd.merge(df1, df2, how='outer', on='id')
Join on different column names from each table.
Data Analysis Python
pd.merge(df1, df2, left_on='key1', right_on='key2')
Sample Program
This code shows three types of joins between customers and their orders using merge(). It helps see which customers have orders and which don't.
Data Analysis Python
import pandas as pd # Create first table with customer info df_customers = pd.DataFrame({ 'customer_id': [1, 2, 3, 4], 'name': ['Alice', 'Bob', 'Charlie', 'David'] }) # Create second table with orders info df_orders = pd.DataFrame({ 'order_id': [101, 102, 103, 104], 'customer_id': [2, 2, 3, 5], 'product': ['Book', 'Pen', 'Notebook', 'Pencil'] }) # Inner join: only customers with orders inner_join = pd.merge(df_customers, df_orders, on='customer_id', how='inner') print('Inner Join Result:') print(inner_join) # Left join: all customers, with orders if any left_join = pd.merge(df_customers, df_orders, on='customer_id', how='left') print('\nLeft Join Result:') print(left_join) # Outer join: all customers and all orders outer_join = pd.merge(df_customers, df_orders, on='customer_id', how='outer') print('\nOuter Join Result:') print(outer_join)
OutputSuccess
Important Notes
If columns have different names, use left_on and right_on to specify them.
Missing matches appear as NaN in the result.
Inner join keeps only rows with matching keys in both tables.
Summary
merge() combines tables like database joins.
Use how to choose join type: inner, left, right, or outer.
Specify columns to join on with on, or left_on and right_on.