
Handling missing values (drop_na, fill) in R Programming

Introduction

Sometimes data has empty spots called missing values, which R stores as NA. We need to fix or remove them to keep our data clean and useful.

When you want to remove rows that have missing data before analysis.
When you want to fill missing spots with the last known value, or with a specific value like zero (which needs replace_na() rather than fill()).
When preparing data for charts or models that can't handle missing values.
When cleaning survey or experiment data that has incomplete answers.
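
A minimal sketch of the fixed-value case from the list above, using tidyr's replace_na(). The data frame `sales` and its columns are hypothetical, made up for illustration.

```r
library(tidyr)

# Hypothetical data: 'units' has two missing entries
sales <- data.frame(
  day = 1:4,
  units = c(5, NA, 3, NA)
)

# Replace each NA in 'units' with 0; other columns are left alone
sales_zero <- replace_na(sales, list(units = 0))
print(sales_zero)
```

The replacement is given as a named list, so different columns can receive different fill values in one call.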
Syntax
R Programming
library(tidyr)
drop_na(data, columns)

library(tidyr)
fill(data, columns, .direction = c("down", "up", "downup", "updown"))

drop_na() removes rows with missing values in the specified columns. If no columns are given, it checks every column.

fill() fills missing values in the specified columns by carrying the previous non-missing value ("down", the default) or the next one ("up").
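
To make the .direction argument concrete, here is a small sketch that runs all four values on the same column. The data frame `d` is hypothetical; it starts and ends with a missing value so the differences are visible.

```r
library(tidyr)

# Hypothetical data: 'x' starts and ends with a missing value
d <- data.frame(x = c(NA, 1, NA, 2, NA))

down   <- fill(d, x, .direction = "down")    # carry previous value forward
up     <- fill(d, x, .direction = "up")      # carry next value backward
downup <- fill(d, x, .direction = "downup")  # down first, then up for what is left
updown <- fill(d, x, .direction = "updown")  # up first, then down for what is left

print(cbind(original = d$x, down = down$x, up = up$x,
            downup = downup$x, updown = updown$x))
```

With "down" the leading NA stays, and with "up" the trailing NA stays. The two combined directions leave no gaps as long as the column has at least one known value.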

Examples
Remove rows where column1 has missing values.
R Programming
drop_na(df, column1)
Remove rows where any column has missing values.
R Programming
drop_na(df)
Fill missing values in column2 by copying the last known value downwards.
R Programming
fill(df, column2, .direction = "down")
Fill missing values in column3 by copying the next known value upwards.
R Programming
fill(df, column3, .direction = "up")
Sample Program

This program shows how to remove rows with missing score values and how to fill missing grade values by copying the last known grade downwards.

R Programming
library(tidyr)  # drop_na() and fill() both come from tidyr

# Create example data frame with missing values
df <- data.frame(
  id = 1:5,
  score = c(10, NA, 15, NA, 20),
  grade = c("A", NA, "B", NA, "C")
)

# cat() interprets "\n" as a line break; print() would show it literally
cat("Original data frame:\n")
print(df)

# Remove rows with missing values in 'score'
df_no_na <- drop_na(df, score)
cat("\nData frame after drop_na on 'score':\n")
print(df_no_na)

# Fill missing values in 'grade' by carrying last observation forward
df_filled <- fill(df, grade, .direction = "down")
cat("\nData frame after fill on 'grade' (down):\n")
print(df_filled)
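
As a quick check on the sample program, the sketch below rebuilds the same data frame and asserts what each call should return. The stopifnot() checks are an addition for illustration, not part of the original program.

```r
library(tidyr)

df <- data.frame(
  id = 1:5,
  score = c(10, NA, 15, NA, 20),
  grade = c("A", NA, "B", NA, "C")
)

# drop_na() on 'score' keeps only the rows with id 1, 3 and 5
df_no_na <- drop_na(df, score)
stopifnot(identical(df_no_na$id, c(1L, 3L, 5L)))

# fill() completes 'grade' and leaves 'score' untouched
df_filled <- fill(df, grade, .direction = "down")
stopifnot(identical(df_filled$grade, c("A", "A", "B", "B", "C")))
stopifnot(sum(is.na(df_filled$score)) == 2)
```

If every check passes, the script finishes silently. A failed check stops with an error naming the condition that was false.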
Important Notes

Remember to load the tidyr package before using drop_na() and fill(). Neither function needs dplyr.

drop_na() removes whole rows, including the non-missing values in them, so check how many rows you would lose before using it.
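
One way to judge the cost of drop_na() is to count the gaps first. This is a sketch on a hypothetical `survey` data frame; colSums() and is.na() are base R.

```r
library(tidyr)

# Hypothetical survey data with gaps in two columns
survey <- data.frame(
  age = c(25, NA, 40, 31),
  income = c(NA, 52000, 61000, NA)
)

# Missing values per column
na_per_column <- colSums(is.na(survey))
print(na_per_column)

# Rows that drop_na(survey) would remove (any column missing)
rows_lost <- nrow(survey) - nrow(drop_na(survey))

# Rows lost when only 'age' must be complete
rows_lost_age <- nrow(survey) - nrow(drop_na(survey, age))

cat("Lost with all columns:", rows_lost, "\n")
cat("Lost with 'age' only:", rows_lost_age, "\n")
```

Naming only the columns the analysis depends on usually keeps more rows than calling drop_na() with no columns.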

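fill() only changes the columns you name, and it only copies values that already exist in the column. A missing value with no known value before it (for "down") or after it (for "up") stays missing.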
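
The sketch below illustrates both points about fill() on a hypothetical data frame `d`: a column that is not named keeps its gaps, and a leading NA has nothing to copy from when filling down.

```r
library(tidyr)

# Hypothetical data: 'a' and 'b' both have gaps; 'a' starts with one
d <- data.frame(
  a = c(NA, "x", NA),
  b = c(1, NA, 3)
)

# Only 'a' is named, so 'b' keeps its NA
filled <- fill(d, a, .direction = "down")
print(filled)

# The first 'a' is still NA: there is no earlier value to copy
print(is.na(filled$a[1]))
```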

Summary

drop_na() removes rows with missing values in chosen columns.

fill() fills missing values by copying nearby known values.

Use these to clean data and prepare it for analysis or visualization.