
How to Merge DataFrames in pandas: Syntax and Examples

Use the pandas.merge() function to combine two DataFrames based on common columns or indices. Use the on parameter to name the key column(s) and the how parameter to choose the join type: 'inner', 'left', 'right', or 'outer'.

Syntax

The basic syntax of pandas.merge() is:

python
pandas.merge(left, right, how='inner', on=None, left_on=None, right_on=None)

  • left: The first DataFrame.
  • right: The second DataFrame.
  • how: Type of merge: 'inner' (default), 'left', 'right', or 'outer'.
  • on: Column name(s) to join on. Must be present in both DataFrames. If omitted, pandas joins on the columns the two DataFrames have in common.
  • left_on and right_on: Use these instead of on if the key columns have different names.
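The introduction mentions merging on indices, which the parameters listed here do not cover. pandas.merge() also accepts left_index and right_index for that case. The following is a minimal sketch; the data and the variable name merged_on_index are made up for illustration.

```python
import pandas as pd

# Hypothetical example data: the join key lives in the index, not in a column.
left = pd.DataFrame({'value_left': [1, 2, 3]}, index=['A', 'B', 'C'])
right = pd.DataFrame({'value_right': [5, 6]}, index=['B', 'C'])

# left_index=True and right_index=True tell merge() to match on the index labels.
merged_on_index = pd.merge(left, right, left_index=True, right_index=True, how='inner')
print(merged_on_index)
```

Only the labels B and C appear in both indices, so the inner join keeps those two rows.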

Example

This example shows how to merge two DataFrames on a common column using an inner join, which returns only matching rows.

python
import pandas as pd

# Create first DataFrame
left = pd.DataFrame({
    'key': ['A', 'B', 'C', 'D'],
    'value_left': [1, 2, 3, 4]
})

# Create second DataFrame
right = pd.DataFrame({
    'key': ['B', 'D', 'E', 'F'],
    'value_right': [5, 6, 7, 8]
})

# Merge on 'key' column with inner join
merged = pd.merge(left, right, on='key', how='inner')
print(merged)
Output
  key  value_left  value_right
0   B           2            5
1   D           4            6
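To see how the how parameter changes the result, the same two DataFrames can be merged with each join type. This is a sketch for comparison only; keys_by_how and outer are illustrative names, and indicator=True is an optional merge() argument that adds a _merge column showing where each row came from.

```python
import pandas as pd

left = pd.DataFrame({'key': ['A', 'B', 'C', 'D'], 'value_left': [1, 2, 3, 4]})
right = pd.DataFrame({'key': ['B', 'D', 'E', 'F'], 'value_right': [5, 6, 7, 8]})

# Record which keys survive each join type.
keys_by_how = {}
for how in ['inner', 'left', 'right', 'outer']:
    result = pd.merge(left, right, on='key', how=how)
    keys_by_how[how] = result['key'].tolist()
    print(how, keys_by_how[how])

# indicator=True labels each row as 'left_only', 'right_only', or 'both'.
outer = pd.merge(left, right, on='key', how='outer', indicator=True)
print(outer)
```

The inner join keeps only B and D, the left join keeps every key from left, the right join keeps every key from right, and the outer join keeps all six keys. Values with no match on the other side are filled with NaN.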

Common Pitfalls

Common mistakes when merging DataFrames include:

  • Passing on when the key columns have different names; use left_on and right_on instead.
  • Choosing the wrong how value, which either drops unmatched rows ('inner') or adds rows filled with NaN ('left', 'right', 'outer').
  • Forgetting that merge() returns a new DataFrame and does not modify either input in place.

Example of a wrong merge and the correct way:

python
import pandas as pd

# The key columns have different names in the two DataFrames
left = pd.DataFrame({'key1': ['A', 'B'], 'val1': [1, 2]})
right = pd.DataFrame({'key2': ['A', 'B'], 'val2': [3, 4]})

# Wrong: 'key1' does not exist in right, so this raises a KeyError
# merged_wrong = pd.merge(left, right, on='key1')

# Correct way specifying left_on and right_on
merged_correct = pd.merge(left, right, left_on='key1', right_on='key2')
print(merged_correct)
Output
  key1  val1 key2  val2
0    A     1    A     3
1    B     2    B     4
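The third pitfall, that merge() does not work in place, can be checked directly. The sketch below reuses the DataFrames from the example above. It also drops key2 after merging, since left_on/right_on keeps both key columns; that cleanup is an optional extra and not something the merge requires.

```python
import pandas as pd

left = pd.DataFrame({'key1': ['A', 'B'], 'val1': [1, 2]})
right = pd.DataFrame({'key2': ['A', 'B'], 'val2': [3, 4]})

# Calling merge() without assigning the result leaves `left` unchanged.
pd.merge(left, right, left_on='key1', right_on='key2')
print(list(left.columns))

# Assign the result to keep it, then drop the redundant second key column.
merged = pd.merge(left, right, left_on='key1', right_on='key2')
merged = merged.drop(columns='key2')
print(merged)
```

After the first call, left still has only the key1 and val1 columns; the merged data exists only in the variable the result is assigned to.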

Quick Reference

| Parameter | Description | Example Values |
| --- | --- | --- |
| left | First DataFrame to merge | df1 |
| right | Second DataFrame to merge | df2 |
| on | Column(s) to join on (same name in both) | 'key' |
| left_on | Column(s) in left DataFrame to join on | 'key1' |
| right_on | Column(s) in right DataFrame to join on | 'key2' |
| how | Type of merge | 'inner', 'left', 'right', 'outer' |

Key Takeaways

  • Use pandas.merge() to combine DataFrames on common columns or indices.
  • Specify the on parameter when the key columns share a name, or left_on and right_on when the names differ.
  • Choose the how parameter to control which rows appear in the result.
  • Merging never changes the original DataFrames; assign the returned DataFrame to a variable to keep the result.
  • Check column names carefully to avoid a KeyError during the merge.