Given two dataframes df1 and df2:
import pandas as pd
df1 = pd.DataFrame({"id": [1, 2, 3], "value1": [10, 20, 30]})
df2 = pd.DataFrame({"id": [2, 3, 4], "value2": [200, 300, 400]})
result = pd.merge(df1, df2, on="id", how="left")
print(result)
What will be printed?
Remember, a left join keeps all rows from the left dataframe and matches rows from the right dataframe where possible.
The left join keeps all rows from df1. For id=1, there is no matching id in df2, so value2 is NaN. For id=2 and id=3, matching rows exist, so values from df2 are included.
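The explanation can be checked with a short sketch that reuses the question's df1 and df2. The `unmatched` variable is an illustrative addition, not part of the original question. Note that value2 is printed as a float column because NaN cannot be stored in an integer column.

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2, 3], "value1": [10, 20, 30]})
df2 = pd.DataFrame({"id": [2, 3, 4], "value2": [200, 300, 400]})

# Left join: every row of df1 survives; df2 only contributes where ids match.
result = pd.merge(df1, df2, on="id", how="left")
print(result)
#    id  value1  value2
# 0   1      10     NaN
# 1   2      20   200.0
# 2   3      30   300.0

# ids from df1 that found no partner in df2 (id=4 exists only in df2,
# so it never appears in a left join).
unmatched = result[result["value2"].isna()]["id"].tolist()
print(unmatched)  # [1]
```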
Given two dataframes df1 and df2:
import pandas as pd
df1 = pd.DataFrame({"key": ["A", "B", "C"], "val1": [1, 2, 3]})
df2 = pd.DataFrame({"key": ["B", "C", "D"], "val2": [20, 30, 40]})
result = pd.merge(df1, df2, on="key", how="outer")
print(result)
What will be printed?
A full outer join keeps all rows from both dataframes, filling missing values with NaN.
The full outer join includes all keys from both df1 and df2. For key='A', only df1 has a value, so val2 is NaN. For key='D', only df2 has a value, so val1 is NaN. Keys B and C appear in both.
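One way to see which side each row came from is the `indicator=True` argument of `pd.merge`, which adds a `_merge` column. The sketch below reuses the question's dataframes; the `origin` dictionary is an illustrative helper and not part of the original question.

```python
import pandas as pd

df1 = pd.DataFrame({"key": ["A", "B", "C"], "val1": [1, 2, 3]})
df2 = pd.DataFrame({"key": ["B", "C", "D"], "val2": [20, 30, 40]})

# indicator=True adds a _merge column recording where each row came from.
result = pd.merge(df1, df2, on="key", how="outer", indicator=True)
print(result)

# Map each key to its origin: left_only, right_only, or both.
origin = dict(zip(result["key"], result["_merge"].astype(str)))
print(origin)
# {'A': 'left_only', 'B': 'both', 'C': 'both', 'D': 'right_only'}
```

Rows marked left_only have NaN in val2, and rows marked right_only have NaN in val1, which matches the explanation above.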
Consider the following code snippet to perform an outer join in pandas:
import pandas as pd
pd.merge(df1, df2, on='id', how='outer')
Which of the following options will cause a syntax error?
Check for missing commas between arguments.
Option C is missing a comma between on='id' and how='outer', causing a syntax error. The other options have correct syntax.
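A missing comma can be confirmed without running the merge at all, because Python rejects the text at parse time. The sketch below uses the built-in `compile` function. The "missing comma" string is a reconstruction based on the explanation above, not the original option text, and `has_syntax_error` is a hypothetical helper name.

```python
# Candidate call strings. The second is a reconstruction of the option
# described above (comma missing between on='id' and how='outer').
candidates = {
    "with comma": "pd.merge(df1, df2, on='id', how='outer')",
    "missing comma": "pd.merge(df1, df2, on='id' how='outer')",
}


def has_syntax_error(source: str) -> bool:
    """Return True if Python cannot parse the source text."""
    try:
        # compile() only parses; it never runs the code, so df1/df2/pd
        # do not need to exist.
        compile(source, "<quiz-option>", "eval")
    except SyntaxError:
        return True
    return False


results = {name: has_syntax_error(src) for name, src in candidates.items()}
for name, failed in results.items():
    print(f"{name}: {'SyntaxError' if failed else 'parses fine'}")
```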
You have two large dataframes df1 and df2 with millions of rows. You want to perform a full outer join on column key. Which option is the most efficient?
Sorting during merge can slow down performance on large data.
The correct option disables sorting during the merge, which avoids extra work on large dataframes. The other options either sort during the merge, which slows it down; sort after the merge, which adds extra cost; or use an inner join, which is not the requested full outer join.
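A minimal sketch of the two points in this explanation, using small made-up dataframes rather than millions of rows. `sort` is a real `pd.merge` parameter (its default is already False). Whether turning it off gives a measurable speedup depends on the pandas version and the data, so it is worth timing on your own dataframes.

```python
import pandas as pd

# Small stand-ins for the large dataframes in the question (made-up values).
df1 = pd.DataFrame({"key": ["C", "A", "B"], "val1": [3, 1, 2]})
df2 = pd.DataFrame({"key": ["D", "B", "C"], "val2": [40, 20, 30]})

# Full outer join without requesting a sort of the join keys.
outer = pd.merge(df1, df2, on="key", how="outer", sort=False)

# An inner join drops A and D, so it answers a different question.
inner = pd.merge(df1, df2, on="key", how="inner")

print(len(outer), sorted(outer["key"]))  # 4 ['A', 'B', 'C', 'D']
print(len(inner), sorted(inner["key"]))  # 2 ['B', 'C']
```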
Given two tables T1 and T2 with unique keys, the number of rows in a full outer join on the key column is:
Think about how full outer join combines all unique keys from both tables.
A full outer join includes all rows from both tables. Rows with matching keys are combined into a single row, so the total equals the number of rows in T1 plus the number of rows in T2 minus the number of keys common to both.
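The counting rule can be checked on a small example. The tables `t1` and `t2` below are made-up illustrations with unique keys, as the question assumes; with duplicate keys the formula would not hold.

```python
import pandas as pd

# Hypothetical tables with unique keys.
t1 = pd.DataFrame({"key": [1, 2, 3, 4], "a": ["w", "x", "y", "z"]})
t2 = pd.DataFrame({"key": [3, 4, 5], "b": ["p", "q", "r"]})

# Keys present in both tables (here 3 and 4).
common = len(set(t1["key"]) & set(t2["key"]))

# rows(T1) + rows(T2) - common keys = 4 + 3 - 2 = 5
expected_rows = len(t1) + len(t2) - common

joined = pd.merge(t1, t2, on="key", how="outer")
print(expected_rows, len(joined))  # 5 5
```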