
separate and unite in R Programming

Introduction

We use separate, from the tidyr package, to split one column into several columns, and unite to join several columns into one. Both make messy data easier to organize. Typical situations:

You have a column with full names and want to split it into first and last names.
You want to combine separate date columns (year, month, day) into one date column.
You need to clean data by breaking apart or joining columns for easier analysis.
Syntax
R Programming
separate(data, col, into, sep = "[^[:alnum:]]+", remove = TRUE, convert = FALSE)
unite(data, col, ..., sep = "_", remove = TRUE)

data is your data frame.

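col is the column to split (for separate) or the name of the new column (for unite).

into is a character vector with the names of the new columns that separate creates.

sep is the separator: the text or regular expression that separate splits on, or the text that unite places between the joined values.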
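The convert argument in the signature is not used in the examples on this page, so here is a minimal sketch of what it does. The events data frame and its date column are made up for illustration. With convert = TRUE, separate tries to turn each new column into a more specific type (here integers) instead of leaving it as text.

```r
library(tidyr)

# Hypothetical data: one text column holding dates as "year-month-day"
events <- data.frame(date = c("2023-01-15", "2024-12-03"))

# Split on "-" and let convert = TRUE turn the pieces into integers
events_parts <- separate(events, date, into = c("year", "month", "day"),
                         sep = "-", convert = TRUE)

# Inspect the column types of the result
str(events_parts)
```

Without convert = TRUE the three new columns would stay as character strings such as "01" and "03".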

Examples
This splits the 'name' column into 'first' and 'last' using underscore as separator.
R Programming
library(tidyr)
data <- data.frame(name = c("John_Doe", "Jane_Smith"))
data_separated <- separate(data, name, into = c("first", "last"), sep = "_")
This joins 'first' and 'last' columns into one 'full_name' column separated by space.
R Programming
library(tidyr)
data <- data.frame(first = c("John", "Jane"), last = c("Doe", "Smith"))
data_united <- unite(data, full_name, first, last, sep = " ")
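The introduction also mentions combining year, month and day columns into one date column. A rough sketch of that case follows. The readings data frame is invented for illustration, and the final as.Date step is one possible way to get a real date, since unite itself only produces text.

```r
library(tidyr)

# Hypothetical data with the date spread over three columns
readings <- data.frame(year = c(2023L, 2024L),
                       month = c(1L, 12L),
                       day = c(15L, 3L))

# Join the three columns into one text column, with "-" between the parts
readings_united <- unite(readings, date, year, month, day, sep = "-")

# unite() returns text, so convert it to a Date as a separate step
readings_united$date <- as.Date(readings_united$date, format = "%Y-%m-%d")

print(readings_united)
```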
Sample Program

This program first splits the 'name' column into two columns, then joins them back with a space.

R Programming
library(tidyr)
# Original data with full names
people <- data.frame(name = c("Alice_Wonderland", "Bob_Builder"))

# Separate 'name' into 'first' and 'last'
people_sep <- separate(people, name, into = c("first", "last"), sep = "_")

# Unite 'first' and 'last' back into 'full_name'
people_united <- unite(people_sep, full_name, first, last, sep = " ")

print(people_sep)
print(people_united)
Output
  first       last
1 Alice Wonderland
2   Bob    Builder
         full_name
1 Alice Wonderland
2      Bob Builder
Important Notes

By default, separate removes the original column after splitting.

You can keep the original column by setting remove = FALSE.

unite also removes the original columns by default after joining.
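The notes above can be tried out with a short sketch. It reuses the people data from the sample program. The names people_kept and people_both are only illustrative.

```r
library(tidyr)

people <- data.frame(name = c("Alice_Wonderland", "Bob_Builder"))

# remove = FALSE keeps 'name' alongside the new 'first' and 'last' columns
people_kept <- separate(people, name, into = c("first", "last"),
                        sep = "_", remove = FALSE)
names(people_kept)

# The same argument keeps 'first' and 'last' alongside the new 'full_name'
people_both <- unite(people_kept, full_name, first, last,
                     sep = " ", remove = FALSE)
names(people_both)
```

Keeping the original columns is handy while checking that a split or join produced what you expected. You can drop them later once the result looks right.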

Summary

separate splits one column into many.

unite joins many columns into one.

Both help tidy your data for easier work.