
How to Use fill() in tidyr to Fill Missing Values in R

Use fill() from the tidyr package to fill missing values in selected columns by carrying the previous non-missing value down or the next non-missing value up. Pass the columns to fill and set the direction with the .direction argument, for example fill(data, column, .direction = "down").

Syntax

The fill() function has this basic syntax:

r
fill(data, ..., .direction = "down")

  • data: Your data frame or tibble.
  • ... (columns): One or more columns to fill, given as unquoted names or tidyselect helpers.
  • .direction: Direction to fill missing values, one of "down" (default), "up", "downup" (down first, then up), or "updown" (up first, then down).
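
To make the four .direction values concrete, here is a minimal sketch on a toy column. The tibble df and its column x are hypothetical, chosen so that missing values sit at the start, middle, and end.

```r
library(tidyr)

# Hypothetical toy data: NAs at the start, in the middle, and at the end
df <- tibble(x = c(NA, 1, NA, NA, 2, NA))

# "down" carries the previous value forward; a leading NA stays NA
down <- fill(df, x, .direction = "down")$x      # NA  1  1  1  2  2

# "up" carries the next value backward; a trailing NA stays NA
up <- fill(df, x, .direction = "up")$x          #  1  1  2  2  2 NA

# "downup" fills down first, then fills what is left upward
downup <- fill(df, x, .direction = "downup")$x  #  1  1  1  1  2  2

# "updown" fills up first, then fills what is left downward
updown <- fill(df, x, .direction = "updown")$x  #  1  1  2  2  2  2
```

The two combined directions differ only in which neighbor wins for interior gaps: "downup" prefers the value above, "updown" the value below.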

Example

This example shows how to fill missing values in a column by carrying the last observed value downwards.

r
library(tidyr)
library(dplyr)

data <- tibble(
  group = c("A", NA, NA, "B", NA, "C", NA),
  value = c(1, NA, 3, NA, 5, NA, 7)
)

filled_data <- data %>%
  fill(group, .direction = "down")

print(filled_data)
Output
# A tibble: 7 × 2
  group value
  <chr> <dbl>
1 A         1
2 A        NA
3 A         3
4 B        NA
5 B         5
6 C        NA
7 C         7
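
Carrying values down can also be kept inside groups: when the data is grouped with dplyr::group_by(), fill() does not copy values across group boundaries. The following is a small sketch of that behavior; the tibble scores and its columns id and score are hypothetical.

```r
library(tidyr)
library(dplyr)

# Hypothetical data: two ids, with a missing score on each side of the boundary
scores <- tibble(
  id    = c("a", "a", "b", "b"),
  score = c(5, NA, NA, 8)
)

# Ungrouped: the 5 from id "a" is carried into the first row of id "b"
ungrouped <- fill(scores, score)$score   # 5 5 5 8

# Grouped: filling restarts within each id, so "b" keeps its leading NA
grouped <- scores %>%
  group_by(id) %>%
  fill(score) %>%
  ungroup() %>%
  pull(score)                            # 5 5 NA 8
```

Grouping first is usually the safer choice when each group represents a separate entity, since a value from one entity is rarely a valid stand-in for another.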

Common Pitfalls

Common mistakes when using fill() include:

  • Not specifying columns. With no columns selected, fill() returns the data unchanged; name the columns you want, or pass everything() to fill all of them.
  • Using the wrong .direction, which copies values from the wrong neighboring row.
  • Expecting fill() to interpolate across gaps. It only copies existing values.
r
library(tidyr)

# Wrong: no columns selected, so nothing is filled
filled_wrong <- fill(tibble(x = c(1, NA), y = c(NA, 2)))

# Right: specify the columns to fill
filled_right <- fill(tibble(x = c(1, NA), y = c(NA, 2)), x)

print(filled_wrong)
print(filled_right)
Output
# A tibble: 2 × 2
      x     y
  <dbl> <dbl>
1     1    NA
2    NA     2
# A tibble: 2 × 2
      x     y
  <dbl> <dbl>
1     1    NA
2     1     2
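
The difference between copying and interpolating is easiest to see side by side. The sketch below is only an illustration: the tibble readings and its columns t and value are hypothetical, and base R's approx() stands in for whichever interpolation method actually suits your data.

```r
library(tidyr)

# Hypothetical readings with a two-row gap between 10 and 40
readings <- tibble(t = 1:4, value = c(10, NA, NA, 40))

# fill() repeats the last observed value across the gap
carried <- fill(readings, value)$value   # 10 10 10 40

# Linear interpolation with base R's approx(), shown only for contrast
known <- !is.na(readings$value)
interpolated <- approx(
  x    = readings$t[known],
  y    = readings$value[known],
  xout = readings$t
)$y                                      # 10 20 30 40
```

If the gap should be bridged with computed intermediate values, fill() is the wrong tool; it is meant for values that genuinely repeat, such as a label written only on the first row of a block.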

Quick Reference

| Argument | Description |
| --- | --- |
| data | Data frame or tibble to fill missing values in |
| ... | Columns to fill (unquoted names) |
| .direction | Direction to fill: "down" (default), "up", "downup", or "updown" |

Key Takeaways

  • Use fill() to replace missing values by copying nearby non-missing values in specified columns.
  • Always name the columns to fill; with no columns selected, fill() leaves the data unchanged.
  • Choose the correct .direction to control whether values fill downward, upward, or both.
  • fill() copies existing values; it does not interpolate or calculate new values.
  • fill() is useful for cleaning data with missing entries in grouped or ordered datasets.