merge() for SQL-style joins in pandas - Step-by-Step Execution

Concept Flow - merge() for SQL-style joins

Start with two DataFrames

↓

Choose join key(s)

↓

Select join type: inner, left, right, outer

↓

Match rows based on keys

↓

Combine matched rows

↓

Handle unmatched rows per join type

↓

Return merged DataFrame

The merge() function takes two tables, matches rows by keys, combines them based on join type, and returns the joined table.

Execution Sample

import pandas as pd

left = pd.DataFrame({'key': ['A', 'B', 'C'], 'left_val': [1, 2, 3]})
right = pd.DataFrame({'key': ['B', 'C', 'D'], 'right_val': [4, 5, 6]})

result = pd.merge(left, right, on='key', how='inner')
print(result)

This code merges two DataFrames on the 'key' column using an inner join, keeping only the rows whose key appears in both DataFrames.

Execution Table

Step	Action	Left DataFrame Rows	Right DataFrame Rows	Matching Keys	Result Rows
1	Start with left and right DataFrames	[A:1, B:2, C:3]	[B:4, C:5, D:6]	-	-
2	Choose join key 'key'	-	-	-	-
3	Select join type 'inner'	-	-	-	-
4	Find matching keys in both: B, C	-	-	[B, C]	-
5	Combine rows with keys B and C	-	-	-	[B:2,4; C:3,5]
6	Return merged DataFrame with 2 rows	-	-	-	[B:2,4; C:3,5]

💡 Inner join keeps only keys present in both DataFrames (B and C).
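The steps in the Execution Table can be traced by hand in plain Python. The following is a minimal sketch, not how pandas implements merge() internally. The names right_keys, left_lookup, right_lookup, manual_rows, and merged_rows are introduced here for illustration, and the lookup approach assumes keys are unique on each side, as they are in this example.

```python
import pandas as pd

left = pd.DataFrame({'key': ['A', 'B', 'C'], 'left_val': [1, 2, 3]})
right = pd.DataFrame({'key': ['B', 'C', 'D'], 'right_val': [4, 5, 6]})

# Step 4: find keys present in both DataFrames, keeping the left order
right_keys = set(right['key'])
matching_keys = [k for k in left['key'] if k in right_keys]

# Step 5: combine the rows that share each matching key
# (illustrative only; assumes each key appears once per DataFrame)
left_lookup = dict(zip(left['key'], left['left_val']))
right_lookup = dict(zip(right['key'], right['right_val']))
manual_rows = [(k, left_lookup[k], right_lookup[k]) for k in matching_keys]

# Step 6: pd.merge performs the same matching in a single call
result = pd.merge(left, right, on='key', how='inner')
merged_rows = list(result.itertuples(index=False, name=None))

print(matching_keys)               # ['B', 'C']
print(manual_rows == merged_rows)  # True
```

The hand-built rows and the merge() result agree because an inner join keeps only the keys found on both sides and preserves the order of the left keys.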

Variable Tracker

Variable	Start	After Step 4	After Step 5	Final
left	[A:1, B:2, C:3]	[A:1, B:2, C:3]	[A:1, B:2, C:3]	[A:1, B:2, C:3]
right	[B:4, C:5, D:6]	[B:4, C:5, D:6]	[B:4, C:5, D:6]	[B:4, C:5, D:6]
matching_keys	None	[B, C]	[B, C]	[B, C]
result	None	None	[B:2,4; C:3,5]	[B:2,4; C:3,5]

Key Moments - 2 Insights

Why does the merged result only have keys B and C, not A or D?

What happens if we change 'how' to 'left'?
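The second question can be checked directly. This short sketch reuses the left and right DataFrames from the Execution Sample and changes only the how argument; the variable name left_result is an assumption made for this example.

```python
import pandas as pd

left = pd.DataFrame({'key': ['A', 'B', 'C'], 'left_val': [1, 2, 3]})
right = pd.DataFrame({'key': ['B', 'C', 'D'], 'right_val': [4, 5, 6]})

# A left join keeps every row of `left`, matched or not
left_result = pd.merge(left, right, on='key', how='left')
print(left_result)

# Key 'A' has no partner in `right`, so its right_val is missing (NaN)
print(left_result['key'].tolist())                 # ['A', 'B', 'C']
print(left_result['right_val'].isna().tolist())    # [True, False, False]
```

Key D does not appear in the result because it exists only in the right DataFrame. Note also that right_val becomes a float column, since an integer column cannot hold NaN.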

Visual Quiz - 3 Questions

Test your understanding

Look at the Execution Table at step 5. How many rows does the result have?

A. 1 row

B. 3 rows

C. 2 rows

D. 4 rows

Concept Snapshot

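pd.merge(left, right, on=key, how=join_type)
- Joins two DataFrames on key column(s)
- how options: inner (default), left, right, outer
- inner: keep only keys found in both
- left: keep all left rows; unmatched right columns become NaN
- right: keep all right rows; unmatched left columns become NaN
- outer: keep all rows from both; unmatched columns become NaN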
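To see the four how options side by side, the sketch below runs the same merge once per join type and records which keys survive. It reuses the example DataFrames; keys_by_join is a name chosen for this illustration.

```python
import pandas as pd

left = pd.DataFrame({'key': ['A', 'B', 'C'], 'left_val': [1, 2, 3]})
right = pd.DataFrame({'key': ['B', 'C', 'D'], 'right_val': [4, 5, 6]})

# Run the same merge with each join type and collect the resulting keys
keys_by_join = {}
for how in ['inner', 'left', 'right', 'outer']:
    merged = pd.merge(left, right, on='key', how=how)
    keys_by_join[how] = merged['key'].tolist()
    print(f"{how:>5}: {len(merged)} rows, keys {keys_by_join[how]}")
```

With this data, the inner join yields 2 rows (B, C), the left and right joins 3 rows each, and the outer join 4 rows covering A through D.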

Full Transcript

The merge() function in pandas combines two DataFrames by matching rows on specified keys. You choose the key(s) to join on and the join type: inner, left, right, or outer. An inner join keeps only rows whose keys appear in both tables. A left join keeps every row from the left table and adds matching data from the right; a right join does the reverse. An outer join keeps all rows from both tables. In left, right, and outer joins, columns with no match are filled with NaN. The example merges two small tables on the 'key' column with an inner join, so the result contains only the keys present in both tables (B and C). The step-by-step trace shows how keys are matched and rows combined.
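The NaN filling described for the outer join can be inspected with the indicator parameter of pd.merge, which adds a _merge column naming the side each row came from. This is a small sketch on the same example data; the names outer and origin are assumptions for illustration.

```python
import pandas as pd

left = pd.DataFrame({'key': ['A', 'B', 'C'], 'left_val': [1, 2, 3]})
right = pd.DataFrame({'key': ['B', 'C', 'D'], 'right_val': [4, 5, 6]})

# indicator=True adds a '_merge' column: left_only, right_only, or both
outer = pd.merge(left, right, on='key', how='outer', indicator=True)
print(outer)

# Map each key to the side(s) it was found on
origin = dict(zip(outer['key'], outer['_merge'].astype(str)))
print(origin)

# Rows found on only one side carry NaN in the other side's column
print(int(outer['left_val'].isna().sum()), int(outer['right_val'].isna().sum()))
```

Here A is left_only (its right_val is NaN), D is right_only (its left_val is NaN), and B and C are marked both.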