Challenge - 5 Problems

Master of Merging Data Frames

❓ Predict Output

intermediate

Output of inner join with duplicate keys

What is the output of the following R code that merges two data frames with duplicate keys using merge()?

R Programming

df1 <- data.frame(id = c(1, 2, 2, 3), value1 = c('A', 'B', 'C', 'D'))
df2 <- data.frame(id = c(2, 2, 3, 4), value2 = c('X', 'Y', 'Z', 'W'))
result <- merge(df1, df2, by = 'id')
print(result)

  id value1 value2
1  1      A   <NA>
2  2      B      X
3  2      C      Y
4  3      D      Z
5  4   <NA>      W

  id value1 value2
1  2      B      X
2  2      B      Y
3  2      C      X
4  2      C      Y
5  3      D      Z
6  4   <NA>      W

  id value1 value2
1  2      B      X
2  2      C      Y
3  3      D      Z

  id value1 value2
1  2      B      X
2  2      B      Y
3  2      C      X
4  2      C      Y
5  3      D      Z
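
Hint: when a key value repeats on both sides, merge() pairs every matching left row with every matching right row. The sketch below illustrates this on hypothetical data (the names `left`, `right`, `key`, `l`, and `r` are made up for illustration and are not the challenge data).

```r
# Hypothetical data, deliberately different from the challenge above.
left  <- data.frame(key = c("a", "a", "b"), l = 1:3)
right <- data.frame(key = c("a", "a", "a", "c"), r = 1:4)

# Inner join: the default when neither all, all.x nor all.y is set.
joined <- merge(left, right, by = "key")

# Key "a" appears 2 times on the left and 3 times on the right,
# so it contributes 2 * 3 = 6 rows. Keys "b" and "c" have no
# partner on the other side and drop out of an inner join.
print(nrow(joined))
print(unique(joined$key))
```

The same counting rule applies per key value, so the total row count is the sum of these products over the keys present in both data frames.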

❓ Predict Output

intermediate

Result of left join with missing keys

What will be the output of this R code that performs a left join using merge()?

R Programming

df1 <- data.frame(id = c(1, 2, 3), value1 = c('A', 'B', 'C'))
df2 <- data.frame(id = c(2, 3, 4), value2 = c('X', 'Y', 'Z'))
result <- merge(df1, df2, by = 'id', all.x = TRUE)
print(result)

  id value1 value2
1  1      A   <NA>
2  2      B      X
3  3      C      Y

  id value1 value2
1  1      A      X
2  2      B      X
3  3      C      Y
4  4   <NA>      Z

  id value1 value2
1  2      B      X
2  3      C      Y

  id value1 value2
1  1      A      Z
2  2      B      X
3  3      C      Y
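
Hint: `all.x = TRUE` keeps every row of the first data frame and fills columns from the second with NA where no match exists. A minimal sketch on hypothetical data (all names below are illustrative, not taken from the challenge):

```r
# Hypothetical data, deliberately different from the challenge above.
left  <- data.frame(key = c(10, 20, 30), l = c("p", "q", "r"))
right <- data.frame(key = c(20, 40), r = c("u", "v"))

# Left join: keep all rows of `left`, matched or not.
joined <- merge(left, right, by = "key", all.x = TRUE)

# Every key from the left side survives the join.
print(all(left$key %in% joined$key))

# A key that exists only on the right side is not carried over.
print(40 %in% joined$key)

# Left rows without a partner get NA in the right-hand column.
print(is.na(joined$r[joined$key == 10]))
```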

🔧 Debug

advanced

Identify the error in merge syntax

What error does this R code produce?

R Programming

df1 <- data.frame(id = 1:3, val = c('A', 'B', 'C'))
df2 <- data.frame(id = 2:4, val = c('X', 'Y', 'Z'))
result <- merge(df1, df2, by.x = 'id', by.y = 'ID')
print(result)

A. Error: by.x and by.y must be the same length

B. Error: object 'ID' not found

C. No error, prints merged data frame

D. Error: duplicate column names in result
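
Hint: column names in R are case-sensitive, and merge() checks the `by` columns before doing any joining. One way to inspect what R reports, rather than guessing, is to capture the condition with tryCatch(). The sketch below uses hypothetical data and a hypothetical misspelled column name `KEY`; the exact message text may differ between R versions, so it is printed rather than assumed.

```r
# Hypothetical data, deliberately different from the challenge above.
left  <- data.frame(key = 1:2, v = c("a", "b"))
right <- data.frame(key = 2:3, w = c("c", "d"))

# "KEY" does not exist in `right`; only the lowercase "key" does.
outcome <- tryCatch(
  merge(left, right, by.x = "key", by.y = "KEY"),
  error = function(e) conditionMessage(e)  # return the message as text
)

# If merge() failed, `outcome` is a character string with the message;
# if it had succeeded, `outcome` would be a data frame instead.
print(outcome)
```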

❓ Predict Output

advanced

Output of full outer join with NA values

What is the output of this R code performing a full outer join?

R Programming

df1 <- data.frame(id = c(1, 2, 3), val1 = c('A', 'B', 'C'))
df2 <- data.frame(id = c(2, 3, 4), val2 = c('X', 'Y', 'Z'))
result <- merge(df1, df2, by = 'id', all = TRUE)
print(result)

  id val1 val2
1  1    A    X
2  2    B    X
3  3    C    Y

  id val1 val2
1  2    B    X
2  3    C    Y
3  4 <NA>    Z

  id val1 val2
1  1    A <NA>
2  2    B    X
3  3    C    Y
4  4 <NA>    Z

  id val1 val2
1  1    A <NA>
2  2    B <NA>
3  3    C <NA>
4  4 <NA>    Z
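
Hint: with `all = TRUE`, the keys in the result are the union of the keys on both sides, and each side's columns are NA for keys it does not contain. A minimal sketch on hypothetical data (names are illustrative only):

```r
# Hypothetical data, deliberately different from the challenge above.
left  <- data.frame(key = c("a", "b"), l = c(1, 2))
right <- data.frame(key = c("b", "c"), r = c(3, 4))

# Full outer join: keep unmatched rows from both sides.
joined <- merge(left, right, by = "key", all = TRUE)

# The result covers exactly the union of the two key sets.
print(setequal(joined$key, union(left$key, right$key)))

# Count the NA cells per column: one unmatched key on each side
# leaves one NA in `l` and one NA in `r`.
print(colSums(is.na(joined)))
```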

🧠 Conceptual

expert

Number of rows after merging with multiple keys

Given these data frames, how many rows will the result have after merging by id and group?

df1 <- data.frame(id = c(1,1,2), group = c('A','B','A'), val1 = c(10,20,30))
df2 <- data.frame(id = c(1,1,2,2), group = c('A','A','A','B'), val2 = c(100,200,300,400))
result <- merge(df1, df2, by = c('id', 'group'))
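
Hint: with several `by` columns, rows match only when every key column agrees, and each matching key combination contributes (left count) × (right count) rows. The sketch below predicts the row count of an inner join that way and compares it with merge(). The data and the names `k1`, `k2`, `left_n`, and `right_n` are hypothetical, and pasting the keys together with a space assumes the key values themselves contain no spaces.

```r
# Hypothetical data, deliberately different from the challenge above.
left  <- data.frame(k1 = c(1, 1, 2), k2 = c("x", "x", "y"), l = 1:3)
right <- data.frame(k1 = c(1, 2, 2, 3), k2 = c("x", "y", "y", "x"), r = 1:4)

# Count how often each (k1, k2) combination occurs on each side.
left_n  <- table(paste(left$k1, left$k2))
right_n <- table(paste(right$k1, right$k2))

# Only combinations present on both sides survive an inner join.
shared <- intersect(names(left_n), names(right_n))

# Each shared combination yields left count * right count rows.
predicted <- sum(left_n[shared] * right_n[shared])

actual <- nrow(merge(left, right, by = c("k1", "k2")))
print(c(predicted = predicted, actual = actual))
```

Here "1 x" occurs twice on the left and once on the right, and "2 y" occurs once on the left and twice on the right, so both the prediction and the actual join give 2 + 2 = 4 rows.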
