
Join functions (left_join, inner_join) in R Programming

Introduction

Join functions help you combine two tables by matching rows based on common columns. This lets you bring related information together easily.

Typical situations where joins are useful:
You have two lists of people and want to combine their details based on their ID.
You want to keep all records from one table and add matching info from another.
You only want rows that appear in both tables with matching keys.
You are cleaning data and need to merge information from different sources.
You want to compare two datasets and find common or unmatched entries.
Syntax
R Programming
left_join(x, y, by = NULL)
inner_join(x, y, by = NULL)

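x and y are the data frames you want to join; x is the left table and y is the right table. Both functions come from the dplyr package, so load it with library(dplyr) first.

by specifies the column(s) to match on. If by is NULL (the default), the join uses every column name that x and y share and prints a message naming the columns it picked.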

Examples
Keep all rows from df1 and add matching rows from df2 based on the id column.
R Programming
left_join(df1, df2, by = "id")
Keep only rows where both id and date match in df1 and df2.
R Programming
inner_join(df1, df2, by = c("id", "date"))
Omit by to join on every column name that df1 and df2 have in common.
R Programming
left_join(df1, df2)
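
The three calls above assume that df1 and df2 already exist. As a minimal sketch, the data frames below are hypothetical (the columns visits and sales and all values are made up for illustration) and show how the choice of by changes the result.

```r
library(dplyr)

# Hypothetical data: ids 1 and 2 appear in both tables, id 3 only in df1,
# id 4 only in df2. The dates agree for id 1 but differ for id 2.
df1 <- data.frame(
  id     = c(1, 2, 3),
  date   = c("2024-01-01", "2024-01-01", "2024-01-02"),
  visits = c(5, 8, 2)
)
df2 <- data.frame(
  id    = c(1, 2, 4),
  date  = c("2024-01-01", "2024-01-05", "2024-01-02"),
  sales = c(100, 250, 300)
)

# Match on id only: the non-key date columns are kept as date.x and date.y
by_id <- left_join(df1, df2, by = "id")

# Match on id and date together: only id 1 agrees on both columns
by_both <- inner_join(df1, df2, by = c("id", "date"))

# No by given: dplyr joins on the shared names (id and date) and says so
natural <- left_join(df1, df2)

print(by_id)
print(by_both)
print(natural)
```

Because date is not a key in the first call, dplyr keeps both copies and adds the default suffixes .x and .y to tell them apart.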
Sample Program

This program shows how left_join keeps all students and adds scores where possible, while inner_join keeps only students who have scores. The score with id 5 has no matching student, so it appears in neither result.

R Programming
library(dplyr)

# Create first data frame
students <- data.frame(
  id = c(1, 2, 3, 4),
  name = c("Alice", "Bob", "Carol", "David")
)

# Create second data frame
scores <- data.frame(
  id = c(2, 3, 5),
  score = c(88, 92, 75)
)

# Left join: keep all students, add scores if available
left_result <- left_join(students, scores, by = "id")
print("Left Join Result:")
print(left_result)

# Inner join: keep only students with scores
inner_result <- inner_join(students, scores, by = "id")
print("Inner Join Result:")
print(inner_result)
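
As a check on what the sample program should produce, the sketch below rebuilds the same two data frames and asserts the expected contents of each result with stopifnot. The assertions are an addition for illustration and are not part of the original program.

```r
library(dplyr)

# Same data as the sample program
students <- data.frame(
  id   = c(1, 2, 3, 4),
  name = c("Alice", "Bob", "Carol", "David")
)
scores <- data.frame(
  id    = c(2, 3, 5),
  score = c(88, 92, 75)
)

left_result  <- left_join(students, scores, by = "id")
inner_result <- inner_join(students, scores, by = "id")

# left_join keeps all four students; Alice and David have no score, so NA
stopifnot(nrow(left_result) == 4)
stopifnot(is.na(left_result$score[left_result$name %in% c("Alice", "David")]))

# inner_join keeps only ids 2 and 3; id 5 has no student and is dropped
stopifnot(identical(inner_result$name, c("Bob", "Carol")))
stopifnot(all.equal(inner_result$score, c(88, 92)))
```

If every stopifnot call passes silently, the results have the shape described above.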
Important Notes

If a row in x has no matching row in y, left_join still keeps it and fills the columns that come from y with NA.

inner_join only keeps rows with matches in both tables.

You can join on multiple columns by passing a character vector to by, for example by = c("id", "date").
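
The NA fill also gives a simple way to find rows of x that had no match in y. The sketch below uses hypothetical employees and budgets tables (all names and values are made up) and assumes the budget column contains no NA values of its own, since those would be indistinguishable from unmatched rows.

```r
library(dplyr)

# Hypothetical data for illustration
employees <- data.frame(
  emp_id = c(101, 102, 103),
  dept   = c("sales", "ops", "legal")
)
budgets <- data.frame(
  dept   = c("sales", "ops"),
  budget = c(5000, 7000)
)

# Every employee is kept; "legal" has no budget row, so budget is NA there
joined <- left_join(employees, budgets, by = "dept")

# Rows that found no match are the ones where the y column is NA
unmatched <- dplyr::filter(joined, is.na(budget))
print(unmatched)
```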

Summary

left_join keeps all rows from the first table and adds matching data from the second.

inner_join keeps only rows that match in both tables.

Use join functions to combine related data easily and clearly.