How to Merge DataFrames in pandas: Syntax and Examples
Use the pandas.merge() function to combine two DataFrames based on common columns or indices. Specify the on parameter for the key column(s) and the how parameter to choose the type of merge, such as 'inner', 'left', 'right', or 'outer'.
Syntax
The basic syntax of pandas.merge(), showing only the most commonly used parameters, is:
python
pandas.merge(left, right, how='inner', on=None, left_on=None, right_on=None)
- left: The first DataFrame.
- right: The second DataFrame.
- how: Type of merge: 'inner' (default), 'left', 'right', or 'outer'.
- on: Column name(s) to join on. Must be present in both DataFrames.
- left_on and right_on: Use these instead of on if the key columns have different names.
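Because on accepts column name(s), a list of names can serve as a composite key. The following is a minimal sketch of that case; the sales and targets frames and their column names are hypothetical sample data, not part of the original example.

```python
import pandas as pd

# Hypothetical sample data: a row is identified by (store, year)
sales = pd.DataFrame({
    'store': ['N', 'N', 'S'],
    'year': [2023, 2024, 2024],
    'sales': [100, 120, 90],
})
targets = pd.DataFrame({
    'store': ['N', 'S', 'S'],
    'year': [2024, 2023, 2024],
    'target': [110, 80, 95],
})

# Rows match only when both 'store' and 'year' agree
combined = pd.merge(sales, targets, on=['store', 'year'], how='inner')
print(combined)
```

Only the (N, 2024) and (S, 2024) pairs appear in both frames, so the result should contain those two rows.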
Example
This example shows how to merge two DataFrames on a common column using an inner join, which returns only matching rows.
python
import pandas as pd

# Create first DataFrame
left = pd.DataFrame({
    'key': ['A', 'B', 'C', 'D'],
    'value_left': [1, 2, 3, 4]
})

# Create second DataFrame
right = pd.DataFrame({
    'key': ['B', 'D', 'E', 'F'],
    'value_right': [5, 6, 7, 8]
})

# Merge on 'key' column with inner join
merged = pd.merge(left, right, on='key', how='inner')
print(merged)
Output
  key  value_left  value_right
0   B           2            5
1   D           4            6
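To see how the how parameter changes the result, the sketch below reuses the same two DataFrames from the example above and runs all four merge types. The results dictionary is only an illustrative way to collect the outputs.

```python
import pandas as pd

left = pd.DataFrame({
    'key': ['A', 'B', 'C', 'D'],
    'value_left': [1, 2, 3, 4]
})
right = pd.DataFrame({
    'key': ['B', 'D', 'E', 'F'],
    'value_right': [5, 6, 7, 8]
})

# Run the same merge with each join type and compare the keys that survive
results = {
    how: pd.merge(left, right, on='key', how=how)
    for how in ('inner', 'left', 'right', 'outer')
}

for how, frame in results.items():
    print(how, frame['key'].tolist())
```

An inner join keeps only the keys found in both frames, left and right keep every key from one side, and outer keeps the union. Rows without a match on the other side get NaN in the missing columns.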
Common Pitfalls
Common mistakes when merging DataFrames include:
- Using the on parameter (instead of left_on and right_on) when the key columns have different names.
- Using the wrong how parameter, leading to unexpected rows in the result.
- Forgetting that merge() does not modify DataFrames in place; it returns a new DataFrame.
Example of a wrong merge and the correct way:
python
import pandas as pd

# Wrong: keys have different names, no left_on/right_on specified
left = pd.DataFrame({'key1': ['A', 'B'], 'val1': [1, 2]})
right = pd.DataFrame({'key2': ['A', 'B'], 'val2': [3, 4]})

# This will raise a KeyError
# merged_wrong = pd.merge(left, right, on='key1')

# Correct way specifying left_on and right_on
merged_correct = pd.merge(left, right, left_on='key1', right_on='key2')
print(merged_correct)
Output
  key1  val1 key2  val2
0    A     1    A     3
1    B     2    B     4
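The other two pitfalls can be checked in a similar way. The sketch below uses hypothetical sample frames: it first shows that calling merge without assigning the result leaves the original untouched, then uses the indicator=True option of pandas.merge, which adds a _merge column recording where each row came from. That column can help explain unexpected rows after choosing a how value.

```python
import pandas as pd

# Hypothetical sample data with partially overlapping keys
left = pd.DataFrame({'key': ['A', 'B', 'C'], 'val1': [1, 2, 3]})
right = pd.DataFrame({'key': ['B', 'C', 'D'], 'val2': [4, 5, 6]})

# The result is discarded here, and left keeps its original columns
left.merge(right, on='key')
print(left.columns.tolist())

# Assign the result to keep it; indicator=True adds a '_merge' column
# with the values 'left_only', 'right_only', or 'both'
checked = pd.merge(left, right, on='key', how='outer', indicator=True)
print(checked[['key', '_merge']])
```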
Quick Reference
| Parameter | Description | Example Values |
|---|---|---|
| left | First DataFrame to merge | df1 |
| right | Second DataFrame to merge | df2 |
| on | Column(s) to join on (same name in both) | 'key' |
| left_on | Column(s) in left DataFrame to join on | 'key1' |
| right_on | Column(s) in right DataFrame to join on | 'key2' |
| how | Type of merge | 'inner', 'left', 'right', 'outer' |
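The table above covers column-based keys only. For merging on indices, pandas.merge also accepts the left_index and right_index flags, which are not listed in the table. A minimal sketch, assuming two hypothetical frames whose row labels hold the keys:

```python
import pandas as pd

# Hypothetical frames where the index, not a column, holds the key
left = pd.DataFrame({'value_left': [1, 2, 3]}, index=['A', 'B', 'C'])
right = pd.DataFrame({'value_right': [4, 5]}, index=['B', 'C'])

# Match rows by index label on both sides
by_index = pd.merge(left, right, left_index=True, right_index=True, how='inner')
print(by_index)
```

Only the labels B and C exist in both indices, so the inner join should keep those two rows.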
Key Takeaways
Use pandas.merge() to combine DataFrames on common columns or indices.
Specify the 'on' parameter for matching column names or 'left_on' and 'right_on' for different names.
Choose the 'how' parameter to control which rows appear in the result.
merge() returns a new DataFrame and leaves the originals unchanged, so assign the result to keep it.
Check column names carefully to avoid KeyErrors during merge.