
How to Use across() in dplyr for Multiple Column Operations

In dplyr, use across() inside verbs like mutate() or summarise() to apply a function to multiple columns at once. It lets you select columns and apply one or more functions cleanly without repeating code.

Syntax

The basic syntax of across() is:

  • across(.cols, .fns, ...)

Where:

  • .cols selects columns to operate on (e.g., starts_with("x"), c(col1, col2)).
  • .fns is the function or list of functions to apply (e.g., mean, ~ .x + 1).
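  • ... passes additional arguments on to .fns. Since dplyr 1.1.0 this is deprecated; wrap the call in an anonymous function instead (e.g., ~ mean(.x, na.rm = TRUE)).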
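
Extra arguments such as na.rm = TRUE can be supplied by wrapping the call in a lambda. The following is a minimal sketch; the readings tibble and its column names are made up for illustration.

```r
library(dplyr)

# Hypothetical data with missing values in both columns
readings <- tibble(
  temp     = c(20, NA, 24),
  humidity = c(50, 60, NA)
)

# The lambda forwards na.rm = TRUE to mean() for every selected column
means <- readings %>%
  summarise(across(everything(), ~ mean(.x, na.rm = TRUE)))

print(means)
```

Writing the argument inside the lambda keeps it next to the function it belongs to, which is easier to read when several functions are applied.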

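You typically use across() inside mutate() or summarise(). For filter(), current dplyr versions recommend the companion helpers if_any() and if_all(), which take the same .cols and .fns arguments.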

r
mutate(data, across(.cols, .fns, ...))
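
For row filtering over several columns, dplyr provides if_any() and if_all(), which accept column selections and functions in the same way as across(). A short sketch, assuming a made-up measurements tibble:

```r
library(dplyr)

# Hypothetical data: two measurement columns with some missing values
measurements <- tibble(
  id = 1:4,
  x1 = c(1, NA, 3, 4),
  x2 = c(5, 6, NA, 8)
)

# Keep rows where at least one x column is missing
any_missing <- measurements %>%
  filter(if_any(starts_with("x"), is.na))

# Keep rows where every x column is greater than 2
# (rows with an NA comparison are dropped by filter())
all_large <- measurements %>%
  filter(if_all(starts_with("x"), ~ .x > 2))

print(any_missing)
print(all_large)
```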

Example

This example shows how to add 1 to all numeric columns in a data frame using mutate() and across().

r
library(dplyr)

data <- tibble(
  a = 1:3,
  b = 4:6,
  c = letters[1:3]
)

result <- data %>%
  mutate(across(where(is.numeric), ~ .x + 1))

print(result)
Output
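# A tibble: 3 × 3
      a     b c    
  <dbl> <dbl> <chr>
1     2     5 a    
2     3     6 b    
3     4     7 c    

Columns a and b are shown as <dbl> because adding the double 1 to an integer column produces doubles; use 1L to keep them as integers.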
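
The same pattern extends to several functions at once: pass a named list as .fns, and optionally a .names template to control the output column names. This sketch uses a hypothetical scores tibble.

```r
library(dplyr)

# Hypothetical data: two numeric columns and one character column
scores <- tibble(
  math    = c(80, 90, 100),
  reading = c(70, 75, 80),
  name    = c("a", "b", "c")
)

# Apply mean() and max() to every numeric column;
# .names builds each output name from the column and function names
summary_tbl <- scores %>%
  summarise(across(
    where(is.numeric),
    list(mean = mean, max = max),
    .names = "{.col}_{.fn}"
  ))

print(summary_tbl)
```

The result has one column per column-function pair: math_mean, math_max, reading_mean, and reading_max.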

Common Pitfalls

Common mistakes when using across() include:

  • Not using across() inside a dplyr verb like mutate() or summarise().
  • Forgetting to select columns properly, which can cause errors or unexpected results.
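  • Using a function whose result has the wrong length for the verb: inside mutate() it should return one value per row (or a single value that is recycled), while inside summarise() it normally returns a single value per group.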

Example of wrong and right usage:

r
library(dplyr)

data <- tibble(x = 1:3, y = 4:6)

# Wrong: using across() alone
# across(where(is.numeric), mean) # Error: must be inside mutate or summarise

# Right: inside summarise
result <- data %>% summarise(across(where(is.numeric), mean))
print(result)
Output
# A tibble: 1 × 2
      x     y
  <dbl> <dbl>
1     2     5
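
The column-selection pitfall can be illustrated in a similar way. In this sketch (the orders tibble is hypothetical), where(is.numeric) also matches an id column that should not be transformed; combining the selection with !id narrows it.

```r
library(dplyr)

# Hypothetical data: id is numeric but is not a measurement
orders <- tibble(
  id    = 1:3,
  price = c(10, 20, 30),
  qty   = c(1, 2, 3)
)

# Too broad: where(is.numeric) also doubles the id column
too_broad <- orders %>%
  mutate(across(where(is.numeric), ~ .x * 2))

# Narrower: numeric columns except id
scoped <- orders %>%
  mutate(across(where(is.numeric) & !id, ~ .x * 2))

print(too_broad)
print(scoped)
```

Checking the selection on a small sample before applying it to real data is a cheap way to catch this kind of mistake.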

Quick Reference

Argument        | Description                                                                 | Example
.cols           | Columns to apply the function(s) to                                         | where(is.numeric), starts_with('a')
.fns            | Function(s) to apply                                                        | mean, ~ .x + 1
Additional args | Extra parameters for the functions; since dplyr 1.1.0, pass them in a lambda | ~ mean(.x, na.rm = TRUE)
Usage           | Used inside dplyr verbs                                                     | mutate(across(...)), summarise(across(...))

Key Takeaways

  • Use across() inside dplyr verbs like mutate() or summarise() to apply functions to multiple columns.
  • Select columns clearly with helpers like where(), starts_with(), or column names.
  • You can apply one or multiple functions with across() cleanly and efficiently.
  • Avoid using across() outside of dplyr verbs to prevent errors.
  • across() helps write concise and readable code for column-wise operations.