How to Use fill() in tidyr to Fill Missing Values in R
Use fill() from the tidyr package to fill missing values in selected columns by carrying the previous or next non-missing value down or up. Name the columns to fill and set the direction with the .direction argument, for example fill(data, column, .direction = "down").
Syntax
The fill() function has this basic syntax:
```r
fill(data, ..., .direction = "down")
```
- data: Your data frame or tibble.
- ... (columns): One or more columns to fill missing values in.
- .direction: Direction to fill missing values, either "down" (default), "up", "downup", or "updown".
Example
This example shows how to fill missing values in a column by carrying the last observed value downwards.
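```r
library(tidyr)
library(dplyr)

# Sample data with gaps in both columns
data <- tibble(
  group = c("A", NA, NA, "B", NA, "C", NA),
  value = c(1, NA, 3, NA, 5, NA, 7)
)

# Fill only the group column, carrying the last observed value down
filled_data <- data %>%
  fill(group, .direction = "down")

print(filled_data)
```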
Output
```
# A tibble: 7 × 2
  group value
  <chr> <dbl>
1 A         1
2 A        NA
3 A         3
4 B        NA
5 B         5
6 C        NA
7 C         7
```
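The example above only fills downward. As a rough sketch of how the four .direction values differ, the snippet below applies each one to the same column. The data frame df and its values are made up for illustration.

```r
library(tidyr)

# Hypothetical data: gaps at the start, in the middle, and at the end
df <- tibble(x = c(NA, 1, NA, NA, 2, NA))

down   <- fill(df, x, .direction = "down")$x    # NA  1  1  1  2  2
up     <- fill(df, x, .direction = "up")$x      #  1  1  2  2  2 NA
downup <- fill(df, x, .direction = "downup")$x  #  1  1  1  1  2  2
updown <- fill(df, x, .direction = "updown")$x  #  1  1  2  2  2  2
```

"down" leaves a leading NA in place and "up" leaves a trailing NA in place, because there is no value to copy from. "downup" and "updown" run both passes in the stated order, so they cover both ends.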
Common Pitfalls
Common mistakes when using fill() include:
- Not specifying columns. With no columns selected, fill() fills nothing and returns the data unchanged.
- Using the wrong .direction, which carries values the wrong way.
- Expecting fill() to interpolate across gaps. It only copies existing values.
```r
library(tidyr)

df <- tibble(x = c(1, NA), y = c(NA, 2))

# Wrong: no columns are selected, so nothing is filled
filled_wrong <- fill(df)

# Right: name the columns to fill
filled_right <- fill(df, x)

print(filled_wrong)
print(filled_right)
```
Output
```
# A tibble: 2 × 2
      x     y
  <dbl> <dbl>
1     1    NA
2    NA     2
# A tibble: 2 × 2
      x     y
  <dbl> <dbl>
1     1    NA
2     1     2
```
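To make the difference between copying and interpolating concrete, the sketch below compares fill() with linear interpolation on the same column. It uses approx() from base R's stats package as one possible interpolation tool. The data frame df and its columns t and value are hypothetical.

```r
library(tidyr)

# Hypothetical series with a two-row gap
df <- tibble(t = 1:5, value = c(10, NA, NA, 40, 50))

# fill() repeats the last observed value across the gap
carried <- fill(df, value)$value                               # 10 10 10 40 50

# approx() drops the NA points and interpolates linearly between the rest
interpolated <- approx(x = df$t, y = df$value, xout = df$t)$y  # 10 20 30 40 50
```

If the gap should hold intermediate values rather than a repeated one, fill() is not the right tool for the job.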
Quick Reference
| Argument | Description |
|---|---|
| data | Data frame or tibble to fill missing values in |
| ... | Columns to fill (unquoted names) |
| .direction | Direction to fill: "down" (default), "up", "downup", or "updown" |
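The ... argument uses tidyselect, so columns can be listed by name or chosen with a selection helper. A minimal sketch, assuming a made-up data frame df with columns id, x, and y:

```r
library(tidyr)
library(dplyr)

# Hypothetical data with gaps in two columns
df <- tibble(
  id = 1:3,
  x  = c(1, NA, NA),
  y  = c("a", NA, "c")
)

# List several columns by name
by_name <- fill(df, x, y)

# Or select every column with a tidyselect helper
all_cols <- fill(df, everything())

by_name$x  # 1 1 1
by_name$y  # "a" "a" "c"
```

Both calls give the same result here, because id has no missing values to fill.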
Key Takeaways
- Use fill() to replace missing values by copying nearby non-missing values in specified columns.
- Name the columns to fill explicitly. Calling fill() with no columns leaves the data unchanged.
- Choose the correct .direction to control whether values fill downward, upward, or both.
- fill() copies existing values; it does not interpolate or calculate new values.
- fill() is useful for cleaning data with missing entries in grouped or ordered datasets.
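For grouped data, fill() is applied within each group of a data frame grouped with dplyr::group_by(), so values do not leak from one group into the next. The sketch below uses a hypothetical sales table with made-up store and price columns.

```r
library(tidyr)
library(dplyr)

# Hypothetical data: store B starts with a missing price
sales <- tibble(
  store = c("A", "A", "B", "B"),
  price = c(10, NA, NA, 20)
)

# Ungrouped: the first row of store B inherits store A's price
ungrouped <- sales %>%
  fill(price)

# Grouped: filling restarts in each store, so that row stays NA
grouped <- sales %>%
  group_by(store) %>%
  fill(price) %>%
  ungroup()

ungrouped$price  # 10 10 10 20
grouped$price    # 10 10 NA 20
```

For ordered data, sort the rows first (for example with dplyr::arrange()), because fill() follows the current row order.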