0
0
Pandasdata~10 mins

Merging on different column names in Pandas - Step-by-Step Execution

Choose your learning style9 modes available
Concept Flow - Merging on different column names
Start with two DataFrames
Identify columns to merge
Specify left_on and right_on
Perform merge operation
Get merged DataFrame with combined data
End
We start with two tables having different column names for the same data. We tell pandas which columns to match, then merge them into one combined table.
Execution Sample
Pandas
import pandas as pd

left = pd.DataFrame({'key1': ['A', 'B'], 'val': [1, 2]})
right = pd.DataFrame({'key2': ['A', 'B'], 'val2': [3, 4]})
merged = pd.merge(left, right, left_on='key1', right_on='key2')
print(merged)
This code merges two DataFrames on columns with different names: 'key1' in left and 'key2' in right.
Execution Table
StepActionleft DataFrameright DataFrameMerge ColumnsResulting DataFrame
1Create left DataFrame{'key1': ['A', 'B'], 'val': [1, 2]}N/AN/AN/A
2Create right DataFrameN/A{'key2': ['A', 'B'], 'val2': [3, 4]}N/AN/A
3Specify merge columnsleft_on='key1'right_on='key2'key1 and key2N/A
4Perform mergeleftrightkey1=key2{'key1': ['A', 'B'], 'val': [1, 2], 'key2': ['A', 'B'], 'val2': [3, 4]}
5Print merged DataFrameN/AN/AN/A key1 val key2 val2 0 A 1 A 3 1 B 2 B 4
💡 Merge completed by matching 'key1' from left with 'key2' from right DataFrame.
Variable Tracker
VariableStartAfter Step 1After Step 2After Step 4Final
leftundefined{'key1': ['A', 'B'], 'val': [1, 2]}{'key1': ['A', 'B'], 'val': [1, 2]}{'key1': ['A', 'B'], 'val': [1, 2]}{'key1': ['A', 'B'], 'val': [1, 2]}
rightundefinedundefined{'key2': ['A', 'B'], 'val2': [3, 4]}{'key2': ['A', 'B'], 'val2': [3, 4]}{'key2': ['A', 'B'], 'val2': [3, 4]}
mergedundefinedundefinedundefined{'key1': ['A', 'B'], 'val': [1, 2], 'key2': ['A', 'B'], 'val2': [3, 4]}{'key1': ['A', 'B'], 'val': [1, 2], 'key2': ['A', 'B'], 'val2': [3, 4]}
Key Moments - 2 Insights
Why do we need to use left_on and right_on instead of just on?
Because the columns to join have different names in each DataFrame, we must specify left_on and right_on to tell pandas which columns to match. See execution_table step 3.
What happens if the values in the merge columns don't match?
Rows without matching keys will be excluded by default in an inner merge. Here, all keys match, so all rows appear. This is shown in execution_table step 4.
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table at step 4, what columns does the merged DataFrame contain?
Akey1, val
Bkey1, val, key2, val2
Ckey2, val2
Dval, val2
💡 Hint
Check the 'Resulting DataFrame' column at step 4 in execution_table.
At which step do we specify which columns to merge on?
AStep 2
BStep 4
CStep 3
DStep 5
💡 Hint
Look at the 'Action' column in execution_table for when merge columns are specified.
If the left DataFrame had a column named 'key2' instead of 'key1', how would the merge call change?
AUse left_on='key2', right_on='key2'
BUse left_on='key1', right_on='key2'
CUse on='key2'
DNo change needed
💡 Hint
Refer to variable_tracker and think about matching column names for merge.
Concept Snapshot
Merging on different column names:
Use pd.merge(left, right, left_on='left_col', right_on='right_col')
This joins two DataFrames on columns with different names.
Result includes all columns from both DataFrames.
Default merge type is inner join (only matching rows).
Full Transcript
We start with two DataFrames that have columns with different names but represent the same data to join on. We specify which columns to match using left_on and right_on parameters in pandas merge function. The merge combines rows where these columns have matching values, producing a new DataFrame with columns from both inputs. This method is useful when column names differ but data is related. The example shows merging on 'key1' from left and 'key2' from right, resulting in a combined table with all columns.