
Merging data frames in R Programming

Introduction

Merging data frames combines information from two tables into one by matching rows on shared key values. Typical situations:

You have two lists of people with different details and want one complete list.
You want to add sales data to a list of products by matching product IDs.
You need to combine survey answers from two groups based on participant IDs.
You want to join customer info with their purchase history.
Syntax
R Programming
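merge(x, y, by = intersect(names(x), names(y)), by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all, ...)

x and y are the two data frames to merge.

by specifies the column(s) to match on. If omitted, it defaults to all column names the two data frames share. Passing by = NULL explicitly is different: it returns every combination of rows (a Cartesian product).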

Examples
Merges on all common columns between df1 and df2.
R Programming
merge(df1, df2)
Merges using the column named 'id' in both data frames.
R Programming
merge(df1, df2, by = "id")
Merges where df1's 'id1' matches df2's 'id2'.
R Programming
merge(df1, df2, by.x = "id1", by.y = "id2")
Performs a full outer join, keeping all rows from both data frames.
R Programming
merge(df1, df2, all = TRUE)
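
The calls above assume df1 and df2 already exist. As a minimal, self-contained sketch of the by.x / by.y case, the snippet below uses two made-up data frames (people and scores, with hypothetical key columns id1 and id2) and checks what the merged result looks like.

```r
# Hypothetical data: the key column has a different name in each table.
people <- data.frame(id1 = c(1, 2, 3), name = c("Anna", "Ben", "Cara"))
scores <- data.frame(id2 = c(2, 3, 4), score = c(88, 92, 75))

# Default inner join: only key values present in both tables are kept.
matched <- merge(people, scores, by.x = "id1", by.y = "id2")

# The key column keeps the name from the first data frame (id1).
print(names(matched))
print(nrow(matched))
```

Only ids 2 and 3 appear in both tables, so the result has two rows, and the key column is named after the column in the first data frame.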
Sample Program

This program merges two data frames by the 'id' column. It keeps all rows from both tables, filling missing values with NA.

R Programming
df1 <- data.frame(id = c(1, 2, 3), name = c("Anna", "Ben", "Cara"))
df2 <- data.frame(id = c(2, 3, 4), score = c(88, 92, 75))
merged_df <- merge(df1, df2, by = "id", all = TRUE)
print(merged_df)
Output
  id name score
1  1 Anna    NA
2  2  Ben    88
3  3 Cara    92
4  4 <NA>    75
Important Notes

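The default (all = FALSE) is an inner join, which keeps only rows that match in both data frames. Use all = TRUE for a full outer join, all.x = TRUE for a left join, and all.y = TRUE for a right join.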

If columns have different names, use by.x and by.y to specify them.

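When unmatched rows are kept (with all, all.x, or all.y), the columns that come from the other data frame are filled with NA. In an inner join, unmatched rows are dropped instead.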
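
To make the join types concrete, here is a small sketch that reuses the two data frames from the sample program and compares which ids survive each kind of merge. The variable names inner, left, right, and full are only illustrative.

```r
df1 <- data.frame(id = c(1, 2, 3), name = c("Anna", "Ben", "Cara"))
df2 <- data.frame(id = c(2, 3, 4), score = c(88, 92, 75))

inner <- merge(df1, df2, by = "id")                # rows matched in both
left  <- merge(df1, df2, by = "id", all.x = TRUE)  # every row of df1
right <- merge(df1, df2, by = "id", all.y = TRUE)  # every row of df2
full  <- merge(df1, df2, by = "id", all = TRUE)    # every row of either

# Show which ids each join keeps.
print(inner$id)
print(left$id)
print(right$id)
print(full$id)

# Rows without a partner get NA in the columns from the other table.
print(is.na(left$score))
print(is.na(right$name))
```

Here id 1 has no score and id 4 has no name, so those are the cells that come back as NA in the left, right, and full joins.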

Summary

Merging combines two data frames by matching columns.

You can control which columns to match and which rows to keep.

Rows kept without a match get NA in the columns from the other data frame.