Given two DataFrames df1 and df2 indexed by id, what is the result of merging them with how='inner' on their indexes?
import pandas as pd

df1 = pd.DataFrame({'value1': [10, 20, 30]}, index=['a', 'b', 'c'])
df2 = pd.DataFrame({'value2': [100, 200, 300]}, index=['b', 'c', 'd'])

result = df1.merge(df2, left_index=True, right_index=True, how='inner')
print(result)
Inner join keeps only the indexes present in both DataFrames.
Only the index labels 'b' and 'c' appear in both DataFrames, so the merged DataFrame has exactly two rows, 'b' and 'c', each carrying value1 from df1 and value2 from df2. Labels 'a' and 'd' are dropped.
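The explanation above can be checked with a minimal sketch. It reuses the question's DataFrames and compares the merged index with the intersection of the two indexes. The name `common` is introduced here for illustration only.

```python
import pandas as pd

df1 = pd.DataFrame({'value1': [10, 20, 30]}, index=['a', 'b', 'c'])
df2 = pd.DataFrame({'value2': [100, 200, 300]}, index=['b', 'c', 'd'])

# Inner join on the indexes: only labels found in both frames survive
result = df1.merge(df2, left_index=True, right_index=True, how='inner')

# Labels shared by both indexes, computed independently of merge
# ("common" is an illustrative name, not part of the question)
common = df1.index.intersection(df2.index)

print(result)
#    value1  value2
# b      20     100
# c      30     200
print(list(result.index) == list(common))  # True
```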
What is the output of merging df1 and df2 on their indexes using how='left'?
import pandas as pd

df1 = pd.DataFrame({'value1': [1, 2, 3]}, index=['x', 'y', 'z'])
df2 = pd.DataFrame({'value2': [10, 20]}, index=['y', 'z'])

result = df1.merge(df2, left_index=True, right_index=True, how='left')
print(result)
Left join keeps all indexes from the left DataFrame and fills missing values from the right with NaN.
Index 'x' is only in df1, so value2 is NaN there. Indexes 'y' and 'z' have matching rows in both DataFrames.
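As a hedged illustration of the left join described above, the sketch below reruns the question's merge and inspects which rows received NaN. It also shows a side effect worth knowing: because NaN is a float, the integer column value2 is stored as float64 in the result.

```python
import pandas as pd

df1 = pd.DataFrame({'value1': [1, 2, 3]}, index=['x', 'y', 'z'])
df2 = pd.DataFrame({'value2': [10, 20]}, index=['y', 'z'])

# Left join: every row of df1 is kept, matched or not
result = df1.merge(df2, left_index=True, right_index=True, how='left')

# Rows of df1 with no partner in df2 have NaN in value2
missing = result['value2'].isna()

print(result)
print(missing.tolist())         # [True, False, False]
print(result['value2'].dtype)   # float64, since NaN forces a float column
```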
Which option contains a syntax error when merging two DataFrames on their indexes?
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2]}, index=['a', 'b'])
df2 = pd.DataFrame({'B': [3, 4]}, index=['a', 'b'])
Check for missing commas between arguments.
Option B is missing a comma between left_index=True and right_index=True, causing a syntax error.
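The answer options themselves are not reproduced here, so the sketch below uses a hypothetical reconstruction of the faulty call (the string `bad_call`) based only on the explanation above. It compiles that string to show that Python rejects it before pandas is ever involved, then runs the corrected call.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2]}, index=['a', 'b'])
df2 = pd.DataFrame({'B': [3, 4]}, index=['a', 'b'])

# Hypothetical reconstruction of the faulty option: no comma between
# left_index=True and right_index=True
bad_call = "df1.merge(df2, left_index=True right_index=True)"

try:
    compile(bad_call, "<option>", "eval")  # parse only, nothing is executed
    raised = False
except SyntaxError:
    raised = True

print(raised)  # True: the parser rejects the missing comma

# Corrected call with the comma restored
good = df1.merge(df2, left_index=True, right_index=True)
print(good)
```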
You have two large DataFrames indexed by the same column. Which approach is fastest to merge them on their indexes?
Merging directly on indexes is usually faster than resetting indexes.
Option C merges directly on indexes, which is optimized in pandas. Resetting indexes or concatenating may add overhead.
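The answer options are not reproduced here, so the following is only a rough sketch of how the claim could be measured. The frame size, the index name `id`, and the helper names `direct` and `via_reset` are all assumptions made for illustration. Timings depend on the machine, the pandas version, and the data, so no particular numbers are asserted.

```python
import timeit

import numpy as np
import pandas as pd

n = 50_000  # assumed size, chosen only to keep the run short
idx = pd.Index(np.arange(n), name='id')
left = pd.DataFrame({'A': np.arange(n)}, index=idx)
right = pd.DataFrame({'B': np.arange(n) * 2}, index=idx)

def direct():
    # Merge on the existing indexes
    return left.merge(right, left_index=True, right_index=True)

def via_reset():
    # Move the index into a column, merge on it, then restore the index
    return (left.reset_index()
                .merge(right.reset_index(), on='id')
                .set_index('id'))

# Both routes should produce the same data
same = direct().equals(via_reset())
print(same)

t_direct = timeit.timeit(direct, number=5)
t_reset = timeit.timeit(via_reset, number=5)
print(f"direct: {t_direct:.4f}s  via reset_index: {t_reset:.4f}s")
```

Measuring on your own data is the safest way to confirm which route is faster in a given setting.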
Consider two DataFrames with duplicate indexes. What happens when you merge them on their indexes?
import pandas as pd

df1 = pd.DataFrame({'val1': [1, 2, 3]}, index=['a', 'a', 'b'])
df2 = pd.DataFrame({'val2': [10, 20]}, index=['a', 'b'])

result = df1.merge(df2, left_index=True, right_index=True, how='inner')
print(result)
Think about how pandas handles many-to-one merges on indexes.
Pandas performs a many-to-one merge: each row of df1 with the duplicated label 'a' is matched with the single 'a' row in df2. The result therefore has three rows, two for 'a' (both with val2 equal to 10) and one for 'b'.
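A minimal sketch of the many-to-one behaviour, reusing the question's DataFrames. It also uses the `validate` argument of `merge` as an optional guard: `'many_to_one'` passes here, while `'one_to_one'` raises `pandas.errors.MergeError` because the left index contains a duplicate. The flag name `one_to_one_rejected` is illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({'val1': [1, 2, 3]}, index=['a', 'a', 'b'])
df2 = pd.DataFrame({'val2': [10, 20]}, index=['a', 'b'])

# validate='many_to_one' checks that the right-hand keys are unique
result = df1.merge(df2, left_index=True, right_index=True,
                   how='inner', validate='many_to_one')
print(result)
print(len(result))  # 3: two rows for 'a', one for 'b'

# Demanding a one-to-one relationship fails because 'a' repeats in df1
try:
    df1.merge(df2, left_index=True, right_index=True, validate='one_to_one')
    one_to_one_rejected = False
except pd.errors.MergeError:
    one_to_one_rejected = True

print(one_to_one_rejected)  # True
```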