Inner join behavior in Pandas

Introduction
An inner join combines two tables and keeps only the rows whose key values appear in both. It is useful for finding the information that two datasets have in common.
You want to find customers who made orders by combining customer and order lists.
You need to match employee details with their department info where both exist.
You want to see products that have sales records by joining product and sales tables.
You want to compare two lists and keep only the common items.
Syntax
Pandas
pd.merge(left_dataframe, right_dataframe, how='inner', on='common_column')
The 'how' parameter set to 'inner' keeps only rows with matching keys in both tables.
The 'on' parameter specifies the column(s) to join on, which must exist in both dataframes.
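As a minimal sketch of the syntax, the snippet below runs the call on two tiny dataframes. The names `left`, `right`, `key`, `left_val`, and `right_val` are made up for illustration. It also shows the method form, `DataFrame.merge`, which should give the same result.

```python
import pandas as pd

# Hypothetical frames for illustration; 'key' is the shared column.
left = pd.DataFrame({'key': [1, 2, 3], 'left_val': ['a', 'b', 'c']})
right = pd.DataFrame({'key': [2, 3, 4], 'right_val': ['x', 'y', 'z']})

# Function form, matching the syntax line above.
result = pd.merge(left, right, how='inner', on='key')

# Method form, called on the left dataframe.
result_method = left.merge(right, how='inner', on='key')

# Only keys 2 and 3 exist in both frames, so only those rows remain.
print(result)
```

Key 1 exists only in `left` and key 4 only in `right`, so neither appears in the result.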
Examples
Joins df1 and df2 on the 'id' column, keeping only rows where 'id' exists in both.
Pandas
pd.merge(df1, df2, how='inner', on='id')
Joins on multiple columns 'id' and 'date', keeping rows matching both columns.
Pandas
pd.merge(df1, df2, how='inner', on=['id', 'date'])
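The multi-column case can be sketched with small made-up data. The `visits` and `sales` columns and all values here are assumptions for illustration only; the point is that a row survives only when both `id` and `date` match.

```python
import pandas as pd

# Hypothetical data: each row is identified by an (id, date) pair.
df1 = pd.DataFrame({
    'id': [1, 1, 2],
    'date': ['2024-01-01', '2024-01-02', '2024-01-01'],
    'visits': [10, 20, 30],
})
df2 = pd.DataFrame({
    'id': [1, 2, 2],
    'date': ['2024-01-01', '2024-01-01', '2024-01-02'],
    'sales': [5, 7, 9],
})

# A row is kept only if BOTH 'id' and 'date' match a row in the other frame.
result = pd.merge(df1, df2, how='inner', on=['id', 'date'])
print(result)
```

The pair (1, '2024-01-02') exists only in `df1` and (2, '2024-01-02') only in `df2`, so both are dropped even though their `id` values appear in both frames.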
Sample Program
This code joins customers and orders on 'customer_id'. Only customers with orders appear in the result.
Pandas
import pandas as pd

# Create first dataframe
customers = pd.DataFrame({
    'customer_id': [1, 2, 3, 4],
    'name': ['Alice', 'Bob', 'Charlie', 'David']
})

# Create second dataframe
orders = pd.DataFrame({
    'order_id': [101, 102, 103],
    'customer_id': [2, 4, 5],
    'product': ['Book', 'Pen', 'Notebook']
})

# Perform inner join on 'customer_id'
result = pd.merge(customers, orders, how='inner', on='customer_id')
print(result)
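Output
The result contains two rows: customer_id 2 (Bob, order 101, Book) and customer_id 4 (David, order 102, Pen). Customers 1 and 3 have no orders, and order 103 refers to customer_id 5, which is not in the customers table, so those rows are dropped.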
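One way to check which rows the inner join in the sample program leaves out is to run an outer join with `indicator=True`, which adds a `_merge` column marking each row as `both`, `left_only`, or `right_only`. This is a sketch for inspection, not part of the original program; the names `audit`, `kept`, and `dropped` are illustrative.

```python
import pandas as pd

customers = pd.DataFrame({
    'customer_id': [1, 2, 3, 4],
    'name': ['Alice', 'Bob', 'Charlie', 'David'],
})
orders = pd.DataFrame({
    'order_id': [101, 102, 103],
    'customer_id': [2, 4, 5],
    'product': ['Book', 'Pen', 'Notebook'],
})

# Outer join keeps every row; '_merge' records where each row came from.
audit = pd.merge(customers, orders, how='outer', on='customer_id', indicator=True)

# Rows present in both frames are the ones an inner join keeps.
kept = audit[audit['_merge'] == 'both']
# Everything else is what an inner join drops.
dropped = audit[audit['_merge'] != 'both']

print(sorted(kept['customer_id']))
print(sorted(dropped['customer_id']))
```

Comparing `kept` with the inner join result is a quick sanity check when a join returns fewer rows than expected.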
Important Notes
Inner join drops rows that do not have matching keys in both dataframes.
If a key value appears more than once in either dataframe, the result contains every pairing of the matching rows, so it can have more rows than either input.
pd.merge() already uses how='inner' by default; pass how='inner' explicitly to make the join type obvious to readers.
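The notes on duplicate keys and the default join type can be checked with a short sketch. The data and the names `left`, `right`, `key`, `left_val`, and `right_val` are hypothetical.

```python
import pandas as pd

# Hypothetical data: key 1 appears twice on the left and three times on the right.
left = pd.DataFrame({'key': [1, 1, 2], 'left_val': ['a', 'b', 'c']})
right = pd.DataFrame({'key': [1, 1, 1, 3], 'right_val': ['w', 'x', 'y', 'z']})

# Omitting 'how' falls back to the default join type.
default_result = pd.merge(left, right, on='key')
inner_result = pd.merge(left, right, how='inner', on='key')

# Key 1: 2 left rows x 3 right rows = 6 result rows.
# Keys 2 and 3 each exist in only one frame, so they are dropped.
print(len(inner_result))
print(default_result.equals(inner_result))
```

If a join unexpectedly multiplies rows, duplicated key values in one or both inputs are a likely cause.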
Summary
Inner join keeps only rows with matching keys in both tables.
Use pd.merge() with how='inner' and specify the join column(s).
It helps find common data between two datasets.