
How to Use Inner Merge in pandas for DataFrames

Use pandas.merge() with how='inner' to combine two DataFrames while keeping only the rows whose keys appear in both. This is an inner join: the result is the intersection of the two DataFrames on the specified key columns.

Syntax

The basic syntax for an inner merge in pandas is:

python
pd.merge(left, right, on='key_column', how='inner')

  • left: The first DataFrame.
  • right: The second DataFrame.
  • on: Column name(s) to join on. Each must be present in both DataFrames.
  • how='inner': Keeps only rows whose key appears in both DataFrames. This is the default for pd.merge().
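
Because on accepts a list as well as a single name, the same call can join on several columns at once. The following is a minimal sketch under that assumption; the orders and prices DataFrames and their values are made up purely for illustration.

```python
import pandas as pd

# Hypothetical data: both frames share the 'store' and 'item' columns.
orders = pd.DataFrame({
    'store': ['N', 'N', 'S'],
    'item': ['pen', 'ink', 'pen'],
    'qty': [10, 5, 7],
})
prices = pd.DataFrame({
    'store': ['N', 'S', 'S'],
    'item': ['pen', 'pen', 'ink'],
    'price': [1.5, 1.7, 4.0],
})

# A row survives only if its (store, item) pair exists in both frames.
multi = pd.merge(orders, prices, on=['store', 'item'], how='inner')
print(multi)
```

Here only the (N, pen) and (S, pen) pairs appear on both sides, so the other rows are dropped.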

Example

This example shows how to merge two DataFrames on a common column using an inner join. Only rows with matching keys in both DataFrames are kept.

python
import pandas as pd

# Create first DataFrame
left = pd.DataFrame({
    'key': ['A', 'B', 'C', 'D'],
    'value_left': [1, 2, 3, 4]
})

# Create second DataFrame
right = pd.DataFrame({
    'key': ['B', 'C', 'E', 'F'],
    'value_right': [5, 6, 7, 8]
})

# Perform inner merge on 'key'
result = pd.merge(left, right, on='key', how='inner')
print(result)
Output
  key  value_left  value_right
0   B           2            5
1   C           3            6
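
One way to confirm that an inner merge is the intersection of the key columns is to compare the merged keys with a plain Python set intersection. This is only a sketch: it rebuilds the two DataFrames from the example, the name common_keys is illustrative, and it uses DataFrame.merge, the method form of pd.merge.

```python
import pandas as pd

left = pd.DataFrame({'key': ['A', 'B', 'C', 'D'], 'value_left': [1, 2, 3, 4]})
right = pd.DataFrame({'key': ['B', 'C', 'E', 'F'], 'value_right': [5, 6, 7, 8]})

# Keys present in both DataFrames, computed without any merge logic.
common_keys = set(left['key']) & set(right['key'])

# DataFrame.merge is the method form of pd.merge.
result = left.merge(right, on='key', how='inner')

# The merged keys should be exactly the shared keys.
print(sorted(common_keys))
print(set(result['key']) == common_keys)
```

The row count matches the number of shared keys here because each key is unique; with duplicate keys, an inner merge returns one row per matching pair.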

Common Pitfalls

Common mistakes when using inner merge include:

  • Omitting the on parameter: pandas then joins on every column name the two DataFrames share, which can produce unexpected matches or an empty result.
  • Using on when the join columns have different names in each DataFrame, instead of specifying left_on and right_on.
  • Assuming inner merge keeps all rows; it only keeps rows with keys present in both DataFrames.
python
import pandas as pd

# Wrong: columns have different names but 'on' is used
left = pd.DataFrame({'key1': ['A', 'B'], 'val': [1, 2]})
right = pd.DataFrame({'key2': ['B', 'C'], 'val': [3, 4]})

# This raises a KeyError because 'key1' is not a column in right
# pd.merge(left, right, on='key1', how='inner')  # KeyError: 'key1'

# Correct way: specify left_on and right_on
correct_merge = pd.merge(left, right, left_on='key1', right_on='key2', how='inner')
print(correct_merge)
Output
  key1  val_x key2  val_y
0    B      2    B      3
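
The other two pitfalls can be checked in a similar way. The sketch below rebuilds the same two DataFrames; the names accidental and audit are made up for illustration. Leaving out on makes pandas join on the only shared column name, val, and an outer merge with indicator=True shows which rows an inner merge would discard.

```python
import pandas as pd

left = pd.DataFrame({'key1': ['A', 'B'], 'val': [1, 2]})
right = pd.DataFrame({'key2': ['B', 'C'], 'val': [3, 4]})

# No 'on': pandas falls back to the shared column name 'val'.
# No 'val' value appears in both frames, so nothing matches.
accidental = pd.merge(left, right, how='inner')
print(len(accidental))

# An outer merge with indicator=True adds a '_merge' column that
# labels each row as 'left_only', 'right_only', or 'both'.
audit = pd.merge(left, right, left_on='key1', right_on='key2',
                 how='outer', indicator=True)

# Rows labelled 'both' are the ones an inner merge keeps.
print(audit[['key1', 'key2', '_merge']])
```

Comparing the number of rows labelled both with the size of each input is a quick check before relying on an inner merge.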

Quick Reference

Parameter | Description | Example
left | First DataFrame to merge | pd.merge(left, right, ...)
right | Second DataFrame to merge | pd.merge(left, right, ...)
on | Column(s) to join on (must exist in both) | 'key'
left_on | Column(s) from left DataFrame if names differ | 'key1'
right_on | Column(s) from right DataFrame if names differ | 'key2'
how | Type of merge: 'inner' keeps only matching rows | 'inner'

Key Takeaways

  • Use pd.merge() with how='inner' to keep only rows with matching keys in both DataFrames.
  • Specify the on parameter to define the join column(s) when names match in both DataFrames.
  • Use left_on and right_on if join columns have different names in each DataFrame.
  • Inner merge returns the intersection of data, excluding non-matching rows.
  • Always check column names and data types to avoid merge errors.
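
The point about data types can be made concrete. In the sketch below, two hypothetical DataFrames store the same ids as integers on one side and as text on the other. Merging them as-is typically fails with a ValueError in recent pandas versions, so the key is cast to a common type first. All frame and column names are illustrative.

```python
import pandas as pd

# Hypothetical frames: 'id' is int64 on the left but text on the right.
users = pd.DataFrame({'id': [1, 2, 3], 'name': ['Ann', 'Bo', 'Cy']})
scores = pd.DataFrame({'id': ['2', '3', '4'], 'score': [80, 90, 70]})

# Inspect the key dtypes before merging.
print(users['id'].dtype, scores['id'].dtype)

# Cast the text key to integers so both sides compare like with like.
scores_fixed = scores.astype({'id': 'int64'})

merged = pd.merge(users, scores_fixed, on='id', how='inner')
print(merged)
```

Casting the integer side to strings instead would work just as well; what matters is that both key columns share one dtype before the merge.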