
How to Use pivot_wider in tidyr for Data Reshaping

Use pivot_wider() from the tidyr package to convert data from long to wide format: names_from picks the column whose values become the new column names, and values_from picks the column whose values fill them. Each distinct value in the names_from column becomes its own column, so the result has more columns and fewer rows than the input.

Syntax

The basic syntax of pivot_wider() includes:

  • data: Your input data frame or tibble.
  • names_from: The column whose values will become new column names.
  • values_from: The column whose values will fill the new columns.
  • values_fill (optional): A value to use instead of NA for cells that have no matching row in the input.
r
pivot_wider(
  data,
  names_from = column_to_use_for_names,
  values_from = column_to_use_for_values,
  values_fill = list(column_to_use_for_values = fill_value)
)

Example

This example shows how to convert a long data frame of fruit sales into a wide format where each fruit becomes a column with its sales values.

r
library(tidyr)
library(dplyr)

# Sample long data frame
sales <- tibble(
  store = c("A", "A", "B", "B"),
  fruit = c("apple", "banana", "apple", "banana"),
  sales = c(10, 5, 8, 7)
)

# Use pivot_wider to spread fruit types into columns
sales_wide <- sales %>%
  pivot_wider(names_from = fruit, values_from = sales)

print(sales_wide)
Output
# A tibble: 2 × 3
  store apple banana
  <chr> <dbl>  <dbl>
1 A        10      5
2 B         8      7
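The values_fill argument listed under Syntax only matters when some combination is absent from the long data. The following is a minimal sketch of that case, assuming tidyr 1.1.0 or later (where values_fill accepts a single scalar); sales_gaps is a hypothetical variant of the data above in which store B has no banana row.

```r
library(tidyr)

# Hypothetical variant of the sales data: store B has no banana row
sales_gaps <- tibble(
  store = c("A", "A", "B"),
  fruit = c("apple", "banana", "apple"),
  sales = c(10, 5, 8)
)

# Default behaviour: the missing store/fruit combination becomes NA
wide_na <- pivot_wider(sales_gaps, names_from = fruit, values_from = sales)

# values_fill = 0 puts 0 into the cell that has no matching input row
wide_zero <- pivot_wider(
  sales_gaps,
  names_from = fruit,
  values_from = sales,
  values_fill = 0
)

print(wide_na)
print(wide_zero)
```

Whether 0 is a sensible fill depends on the data: for sales counts it may mean "nothing sold", while for measurements NA is often the more honest choice.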

Common Pitfalls

Common mistakes when using pivot_wider() include:

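  • Omitting names_from or values_from when the data has no columns called name and value (the defaults), which causes an error.
  • Having duplicate combinations of the id_cols and names_from columns, so that one cell receives multiple values. tidyr then warns and returns list-columns instead of plain values.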
  • Missing values after widening, which can be handled with values_fill.
r
library(tidyr)
library(dplyr)

# Example of duplicate keys: id = 1 and key = "A" appear twice
long_data <- tibble(
  id = c(1, 1, 1),
  key = c("A", "A", "B"),
  value = c(10, 20, 30)
)

# With duplicates, tidyr 1.0.0 and later does not stop: it warns that the
# values are not uniquely identified and returns list-columns
# pivot_wider(long_data, names_from = key, values_from = value)

# Correct approach: summarize or choose one value per id/key pair before pivoting
long_data_unique <- long_data %>%
  group_by(id, key) %>%
  summarize(value = mean(value), .groups = "drop")

pivot_wider(long_data_unique, names_from = key, values_from = value)
Output
# A tibble: 1 × 3
     id     A     B
  <dbl> <dbl> <dbl>
1     1    15    30
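On a larger data set it may not be obvious which id/key pairs are duplicated. The sketch below shows one way to find them with dplyr's count(), and an alternative fix that aggregates during the pivot through the values_fn argument. It assumes tidyr 1.1.0 or later, where values_fn accepts a single function; the choice of mean() is only an illustration.

```r
library(tidyr)
library(dplyr)

long_data <- tibble(
  id = c(1, 1, 1),
  key = c("A", "A", "B"),
  value = c(10, 20, 30)
)

# List every id/key combination that occurs more than once
dupes <- long_data %>%
  count(id, key) %>%
  filter(n > 1)
print(dupes)

# Alternative to summarize(): aggregate the duplicates while pivoting
wide_mean <- pivot_wider(
  long_data,
  names_from = key,
  values_from = value,
  values_fn = mean
)
print(wide_mean)
```

Inspecting the duplicates first is usually worthwhile, since averaging them is only appropriate when they represent repeated measurements rather than data entry mistakes.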

Quick Reference

Argument | Description
data | Input data frame or tibble
names_from | Column to use for new column names
values_from | Column to use for filling new columns
values_fill | Value to replace missing cells after widening (optional)
id_cols | Columns to keep as identifiers (optional, usually inferred)
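The id_cols argument becomes relevant when the data has extra columns that should not identify rows. Here is a small sketch under that assumption; sales_clerk and its clerk column are hypothetical and exist only to show the difference between the inferred identifiers and an explicit id_cols.

```r
library(tidyr)

# Hypothetical data: clerk is an extra column, neither a name nor a value
sales_clerk <- tibble(
  store = c("A", "A", "B", "B"),
  clerk = c("Ann", "Bob", "Cy", "Dee"),
  fruit = c("apple", "banana", "apple", "banana"),
  sales = c(10, 5, 8, 7)
)

# Inferred id_cols: store and clerk together identify a row,
# so the result has four rows with NA gaps
wide_default <- pivot_wider(sales_clerk, names_from = fruit, values_from = sales)

# Explicit id_cols: clerk is dropped and each store gets one row
wide_store <- pivot_wider(
  sales_clerk,
  id_cols = store,
  names_from = fruit,
  values_from = sales
)

print(wide_default)
print(wide_store)
```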

Key Takeaways

  • Use pivot_wider() to reshape data from long to wide by specifying names_from and values_from.
  • Ensure that each combination of id and names_from columns is unique, otherwise the result contains list-columns and a warning.
  • Use values_fill to handle missing values after widening.
  • Summarize or clean duplicates before pivoting to prevent conflicts.
  • pivot_wider() is part of the tidyr package and works well with dplyr pipelines.