How to Use pivot_longer in tidyr for Data Reshaping
Use pivot_longer() from the tidyr package to convert wide data into a longer format by specifying columns to gather and naming the new key and value columns. It simplifies reshaping data for easier analysis and visualization.
Syntax
The basic syntax of pivot_longer() is:
```r
pivot_longer(data, cols, names_to = "name", values_to = "value")
```
- data: Your data frame.
- cols: Columns to gather into longer format.
- names_to: Name of the new column that will hold the original column names.
- values_to: Name of the new column that will hold the values from the gathered columns.
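The cols argument accepts tidyselect syntax, so columns can be chosen with a helper or by exclusion instead of being listed one by one. The following is a minimal sketch, assuming a hypothetical data frame named scores with an id column and two test columns (these names are illustrative and not part of the example in this article):

```r
library(tidyr)

# Hypothetical data: one identifier column and two measurement columns
scores <- tibble(
  id = c(1, 2),
  test_a = c(90, 85),
  test_b = c(70, 75)
)

# Select by helper: every column whose name starts with "test_"
by_helper <- pivot_longer(
  scores,
  cols = starts_with("test_"),
  names_to = "test",
  values_to = "score"
)

# Select by exclusion: every column except id
by_exclusion <- pivot_longer(
  scores,
  cols = !id,
  names_to = "test",
  values_to = "score"
)

# Both selections target the same two columns
names(by_helper)
names(by_exclusion)
```

Selecting by exclusion is convenient when there are only a few identifier columns, but it will also gather any column added to the data later, so an explicit helper or list may be safer.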
Example
This example shows how to convert a wide data frame with separate columns for years into a long format with year and value columns.
```r
library(tidyr)
library(dplyr)

# Sample wide data frame
wide_data <- tibble(
  country = c("USA", "Canada"),
  `2019` = c(100, 80),
  `2020` = c(110, 90),
  `2021` = c(120, 95)
)

# Convert to long format
long_data <- pivot_longer(
  wide_data,
  cols = c(`2019`, `2020`, `2021`),
  names_to = "year",
  values_to = "value"
)

print(long_data)
```
Output
```
# A tibble: 6 × 3
  country year  value
  <chr>   <chr> <dbl>
1 USA     2019    100
2 USA     2020    110
3 USA     2021    120
4 Canada  2019     80
5 Canada  2020     90
6 Canada  2021     95
```
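In the output, year is a character column (<chr>) because its values come from column names. If numeric years are needed, one option is the names_transform argument. The following is a sketch, assuming a tidyr version that supports names_transform (1.1.0 or later) and repeating the sample data so the block runs on its own:

```r
library(tidyr)

# Same sample data as in the example
wide_data <- tibble(
  country = c("USA", "Canada"),
  `2019` = c(100, 80),
  `2020` = c(110, 90),
  `2021` = c(120, 95)
)

# Convert the new year column from character to integer while pivoting
long_int <- pivot_longer(
  wide_data,
  cols = c(`2019`, `2020`, `2021`),
  names_to = "year",
  names_transform = list(year = as.integer),
  values_to = "value"
)

# Inspect the type of the year column
class(long_int$year)
```

An alternative with the same effect is to pivot first and then convert the column afterwards, for example with as.integer() on the year column.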
Common Pitfalls
Common mistakes when using pivot_longer() include:
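- Not specifying the correct columns in cols, which can cause unexpected columns to be gathered.
- Forgetting to set names_to and values_to, which leaves the default column names name and value, and these may be unclear.
- Referring to non-standard column names, such as numbers, without backticks. For example, 2019:2021 is interpreted as column positions rather than column names, which causes an error.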
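```r
library(tidyr)

# Wrong: without backticks, 2019:2021 is read as column positions 2019 to 2021,
# which do not exist in this data frame, so the call fails with an error
# pivot_longer(wide_data, cols = 2019:2021)

# Correct: backticks refer to the columns by name
# (wide_data is the data frame from the Example section)
pivot_longer(wide_data, cols = c(`2019`, `2020`, `2021`))
```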
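To illustrate the pitfall about omitting names_to and values_to, the sketch below compares the default column names with explicit ones. It assumes a hypothetical data frame named sales; the column names store, q1, and q2 are made up for illustration:

```r
library(tidyr)

# Hypothetical data: one row per store, one column per quarter
sales <- tibble(
  store = c("A", "B"),
  q1 = c(10, 20),
  q2 = c(15, 25)
)

# Without names_to and values_to, the new columns get the defaults
default_names <- pivot_longer(sales, cols = c(q1, q2))
names(default_names)  # "store" "name" "value"

# With explicit names, the result documents itself
named <- pivot_longer(
  sales,
  cols = c(q1, q2),
  names_to = "quarter",
  values_to = "units"
)
names(named)  # "store" "quarter" "units"
```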
Quick Reference
| Argument | Description |
|---|---|
| data | The data frame to reshape |
| cols | Columns to pivot into longer format |
| names_to | Name of the new column for original column names |
| values_to | Name of the new column for values |
| names_prefix | Optional prefix to remove from names |
| values_drop_na | Remove rows with NA values if TRUE |
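The last two arguments in the table can be combined in one call. The following is a minimal sketch, assuming a hypothetical data frame named wide_prefixed whose year columns carry a yr_ prefix and which contains one missing value:

```r
library(tidyr)

# Hypothetical data: prefixed year columns and one missing value
wide_prefixed <- tibble(
  country = c("USA", "Canada"),
  yr_2019 = c(100, NA),
  yr_2020 = c(110, 90)
)

long_clean <- pivot_longer(
  wide_prefixed,
  cols = starts_with("yr_"),
  names_to = "year",
  names_prefix = "yr_",    # strip the prefix from the new year values
  values_to = "value",
  values_drop_na = TRUE    # drop the Canada 2019 row, whose value is NA
)

long_clean
```

With this data, the result should have three rows, because the row for Canada in 2019 is dropped.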
Key Takeaways
- Use pivot_longer() to reshape wide data into long format by specifying the columns to gather.
- Always specify names_to and values_to so the new columns have clear names.
- Use backticks around non-standard column names, such as names that are numbers.
- Check that the cols argument targets only the columns you want to pivot.
- pivot_longer() helps prepare data for easier analysis and plotting.