
How to Use pivot_longer in tidyr for Data Reshaping

Use pivot_longer() from the tidyr package to convert wide data into a longer format by specifying columns to gather and naming the new key and value columns. It simplifies reshaping data for easier analysis and visualization.

Syntax

pivot_longer() takes the following main arguments. The call after the list shows them with their default column names:

  • data: Your data frame.
  • cols: Columns to gather into longer format.
  • names_to: Name of the new column that will hold the original column names.
  • values_to: Name of the new column that will hold the values from the gathered columns.
r
pivot_longer(data, cols, names_to = "name", values_to = "value")
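
To illustrate how the two naming arguments behave, here is a minimal sketch that runs the same reshape twice: once relying on the defaults and once with explicit names. The `scores` tibble and its column names are made up for this illustration.

```r
library(tidyr)

# Hypothetical wide data: one row per student, one column per test
scores <- tibble(
  student = c("Ann", "Ben"),
  test1 = c(90, 75),
  test2 = c(85, 80)
)

# names_to and values_to are omitted, so the defaults "name" and "value" apply
long_default <- pivot_longer(scores, cols = c(test1, test2))

# The same reshape with descriptive column names
long_named <- pivot_longer(
  scores,
  cols = c(test1, test2),
  names_to = "test",
  values_to = "score"
)

names(long_default)  # "student" "name" "value"
names(long_named)    # "student" "test" "score"
```

Both results have four rows (two students times two tests). Only the column names differ.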

Example

This example shows how to convert a wide data frame with separate columns for years into a long format with year and value columns.

r
library(tidyr)
library(dplyr)

# Sample wide data frame
wide_data <- tibble(
  country = c("USA", "Canada"),
  `2019` = c(100, 80),
  `2020` = c(110, 90),
  `2021` = c(120, 95)
)

# Convert to long format
long_data <- pivot_longer(
  wide_data,
  cols = c(`2019`, `2020`, `2021`),
  names_to = "year",
  values_to = "value"
)

print(long_data)
Output
# A tibble: 6 × 3
  country year  value
  <chr>   <chr> <dbl>
1 USA     2019    100
2 USA     2020    110
3 USA     2021    120
4 Canada  2019     80
5 Canada  2020     90
6 Canada  2021     95
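
The year column comes out as character (<chr>) because it is built from column names, which are always strings. If numeric years are needed, one option is the names_transform argument. The sketch below assumes tidyr 1.1.0 or later; `long_int` is a name chosen here for illustration.

```r
library(tidyr)

# Same wide data as in the example above
wide_data <- tibble(
  country = c("USA", "Canada"),
  `2019` = c(100, 80),
  `2020` = c(110, 90),
  `2021` = c(120, 95)
)

# names_transform applies as.integer to the new "year" column
long_int <- pivot_longer(
  wide_data,
  cols = c(`2019`, `2020`, `2021`),
  names_to = "year",
  values_to = "value",
  names_transform = list(year = as.integer)
)

class(long_int$year)  # "integer"
```

Converting afterwards with dplyr::mutate(year = as.integer(year)) should give the same result. Doing it inside pivot_longer() just keeps the reshape in one step.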

Common Pitfalls

Common mistakes when using pivot_longer() include:

  • Not specifying the correct columns in cols, which can cause unexpected columns to be gathered.
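  • Forgetting to set names_to and values_to, which leaves the new columns with the generic default names name and value.
  • Referring to non-syntactic column names (such as 2019) without backticks. Bare numbers in cols are read as column positions, not column names, so the selection fails or picks the wrong columns.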
r
library(tidyr)

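# Uses wide_data from the Example section.
# Wrong: bare numbers are read as column positions, not column names
# pivot_longer(wide_data, cols = 2019:2021) # Error: positions 2019 to 2021 don't exist

# Correct: backticks refer to the columns by name
pivot_longer(wide_data, cols = c(`2019`, `2020`, `2021`))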
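
Listing every column in backticks gets tedious for wide tables. As a hedged sketch, the cols argument also accepts tidyselect expressions, so the same columns can be picked by exclusion, by pattern, or from a character vector. The `by_*` variable names are illustrative only.

```r
library(tidyr)

wide_data <- tibble(
  country = c("USA", "Canada"),
  `2019` = c(100, 80),
  `2020` = c(110, 90),
  `2021` = c(120, 95)
)

# Explicit backticked names, as in the example above
by_backticks <- pivot_longer(wide_data, cols = c(`2019`, `2020`, `2021`))

# Everything except the identifier column
by_exclusion <- pivot_longer(wide_data, cols = !country)

# Columns whose names are exactly four digits
by_pattern <- pivot_longer(wide_data, cols = matches("^[0-9]{4}$"))

# Column names supplied as strings, so no backticks are needed
year_cols <- c("2019", "2020", "2021")
by_strings <- pivot_longer(wide_data, cols = all_of(year_cols))
```

All four calls select the same three columns here. Exclusion is the shortest, but it will also pivot any column added to the data later, so the pattern or explicit forms are safer when the input may change.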

Quick Reference

| Argument | Description |
|---|---|
| data | The data frame to reshape |
| cols | Columns to pivot into longer format |
| names_to | Name of the new column that holds the original column names |
| values_to | Name of the new column that holds the values |
| names_prefix | Optional prefix (a regular expression) to strip from the start of each name |
| values_drop_na | If TRUE, drop rows whose value is NA |
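
The last two arguments are easiest to see in action. The sketch below uses a made-up `sales` tibble whose column names share the prefix "sales_" and which contains one missing value. The data and names are assumptions for illustration.

```r
library(tidyr)

# Hypothetical data: yearly columns with a common prefix and one NA
sales <- tibble(
  store = c("A", "B"),
  sales_2020 = c(10, NA),
  sales_2021 = c(12, 7)
)

long_sales <- pivot_longer(
  sales,
  cols = starts_with("sales_"),
  names_to = "year",
  names_prefix = "sales_",   # strip the prefix, leaving "2020" and "2021"
  values_to = "amount",
  values_drop_na = TRUE      # drop the row for store B in 2020
)

long_sales
```

The result has three rows instead of four, and the year column holds "2020" and "2021" without the prefix.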

Key Takeaways

  • Use pivot_longer() to reshape wide data into long format by specifying the columns to gather.
  • Set names_to and values_to explicitly so the new columns get descriptive names instead of the defaults name and value.
  • Wrap non-syntactic column names such as 2019 in backticks; bare numbers are treated as column positions.
  • Check that cols targets only the columns you want to pivot.
  • Long-format data from pivot_longer() is usually easier to filter, summarize, and plot.