Data Analysis Pythondata~10 mins

Inner join in Data Analysis Python - Step-by-Step Execution

Choose your learning style9 modes available

Learn Why Deep Visual Try Challenge Project Recall Time

Concept Flow - Inner join

Start with Table A

↓

Find matching keys in Table B

↓

Combine rows where keys match

↓

Output rows with matched data only

↓

End

Inner join takes two tables and combines rows where the key values match in both tables, leaving out rows without matches.

Execution Sample

Data Analysis Python

import pandas as pd

df1 = pd.DataFrame({'id': [1, 2, 3], 'name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'id': [2, 3, 4], 'age': [25, 30, 40]})

result = pd.merge(df1, df2, on='id', how='inner')
print(result)

This code merges two dataframes on the 'id' column, keeping only rows with matching 'id's in both.

Execution Table

Step	Action	df1 Rows Considered	df2 Rows Considered	Matching Keys Found	Result Rows Added
1	Start with df1 and df2	[{1, Alice}, {2, Bob}, {3, Charlie}]	[{2, 25}, {3, 30}, {4, 40}]	N/A	N/A
2	Check id=1 in df1 against df2	{1, Alice}	[{2, 25}, {3, 30}, {4, 40}]	No match	No row added
3	Check id=2 in df1 against df2	{2, Bob}	[{2, 25}, {3, 30}, {4, 40}]	Match found	Add {2, Bob, 25}
4	Check id=3 in df1 against df2	{3, Charlie}	[{2, 25}, {3, 30}, {4, 40}]	Match found	Add {3, Charlie, 30}
5	No more rows in df1	N/A	N/A	N/A	End with 2 rows in result

💡 All rows in df1 checked; only rows with matching ids in df2 included in result.

Variable Tracker

Variable	Start	After Step 2	After Step 3	After Step 4	Final
result	empty	empty	[{2, Bob, 25}]	[{2, Bob, 25}, {3, Charlie, 30}]	[{2, Bob, 25}, {3, Charlie, 30}]

Key Moments - 2 Insights

Why is the row with id=1 from df1 not in the result?

Does the row with id=4 from df2 appear in the result?

Visual Quiz - 3 Questions

Test your understanding

Look at the execution_table, what rows are added to the result after Step 3?

A[{2, Bob, 25}]

B[{1, Alice, 25}]

C[{3, Charlie, 30}]

DNo rows added

Concept Snapshot

Inner join combines two tables by matching rows on a key.
Only rows with keys present in both tables appear in the result.
Syntax in pandas: pd.merge(df1, df2, on='key', how='inner')
Useful to find common data between tables.
Rows without matches are excluded.

Full Transcript

Inner join is a way to combine two tables by matching rows that share the same key value. We start with two tables, look for keys that appear in both, and then combine those rows into one result table. Rows that don't have a matching key in the other table are left out. For example, if we have a table of people with ids and names, and another table with ids and ages, an inner join on the id will give us only the people who have both a name and an age recorded. The code example uses pandas merge with how='inner' to do this. Step by step, we check each id in the first table against the second, add matching rows to the result, and skip those without matches. This helps us focus on data that is common to both tables.