Sometimes data has empty spots called missing values, which R shows as NA. We need to fix or remove them to keep our data clean and useful.
Handling missing values (drop_na, fill) in R Programming
Introduction
When you want to remove rows that have missing data before analysis.
When you want to fill missing spots with the last known value, or with a fixed value like zero (fill() handles the first case; a fixed value needs tidyr's replace_na()).
When preparing data for charts or models that can't handle missing values.
When cleaning survey or experiment data that has incomplete answers.
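Filling with a constant such as zero is not something fill() does, since it only carries existing values up or down. A minimal sketch of the constant case, assuming a made-up survey data frame (the names survey, respondent and visits are illustrative only), uses replace_na() from tidyr:

```r
library(tidyr)

# Hypothetical survey data; the column names are made up for illustration
survey <- data.frame(
  respondent = 1:4,
  visits = c(2, NA, 5, NA)
)

# Replace every NA in 'visits' with 0; other columns are left alone
survey_zero <- replace_na(survey, list(visits = 0))
print(survey_zero)
```

Whether zero is a sensible stand-in depends on the data: it suits counts where a blank really means "none", but it can distort averages when the blank means "unknown".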
Syntax
R Programming
library(tidyr)

drop_na(data, columns)
fill(data, columns, .direction = c("down", "up"))
drop_na() removes rows with missing values in specified columns.
fill() fills missing values by carrying the previous or next non-missing value.
Examples
Remove rows where column1 has missing values.
R Programming
drop_na(df, column1)
Remove rows where any column has missing values.
R Programming
drop_na(df)
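To make the difference between the two calls concrete, here is a small sketch on made-up data (the values in column1 and column2 are assumptions for illustration). Naming a column limits the check to that column, while calling drop_na() with no columns checks every column:

```r
library(tidyr)

# Made-up data: 'column1' and 'column2' each have one NA, in different rows
df <- data.frame(
  column1 = c(1, NA, 3, 4),
  column2 = c("a", "b", NA, "d")
)

# Only the row where column1 is NA is dropped (row 2)
by_column <- drop_na(df, column1)

# Rows with an NA in any column are dropped (rows 2 and 3)
all_columns <- drop_na(df)

nrow(by_column)    # 3
nrow(all_columns)  # 2
```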
Fill missing values in column2 by copying the last known value downwards.
R Programming
fill(df, column2, .direction = "down")
Fill missing values in column3 by copying the next known value upwards.
R Programming
fill(df, column3, .direction = "up")
Sample Program
This program shows how to remove rows with missing score values and how to fill missing grade values by copying the last known grade downwards.
R Programming
library(tidyr)

# Create example data frame with missing values
df <- data.frame(
  id = 1:5,
  score = c(10, NA, 15, NA, 20),
  grade = c("A", NA, "B", NA, "C")
)

# cat() is used for headings so that "\n" prints as a real line break
cat("Original data frame:\n")
print(df)

# Remove rows with missing values in 'score'
df_no_na <- drop_na(df, score)
cat("\nData frame after drop_na on 'score':\n")
print(df_no_na)

# Fill missing values in 'grade' by carrying the last observation forward
df_filled <- fill(df, grade, .direction = "down")
cat("\nData frame after fill on 'grade' (down):\n")
print(df_filled)
Important Notes
Remember to load the tidyr package before using drop_na() and fill(). Both functions come from tidyr, so dplyr is not required for them.
drop_na() removes whole rows, including the non-missing values in their other columns, so use it carefully if you want to keep that data.
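fill() only copies values that already exist in the column, up or down. Columns without missing values are left unchanged, and a missing value with no known value in the chosen direction stays missing.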
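The last note is easy to miss when a column starts with a missing value. The sketch below uses made-up data and assumes a tidyr version that supports the "downup" direction (1.0.0 or later, as far as I know):

```r
library(tidyr)

# Made-up data where the first value is missing
df <- data.frame(
  id = 1:4,
  grade = c(NA, "A", NA, "B")
)

# "down" has nothing to carry into row 1, so that NA stays
down <- fill(df, grade, .direction = "down")
down$grade    # NA "A" "A" "B"

# "downup" fills down first, then up, which also covers the leading NA
downup <- fill(df, grade, .direction = "downup")
downup$grade  # "A" "A" "A" "B"
```

Checking anyNA() on the column after fill() is a cheap way to confirm that nothing was left unfilled.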
Summary
drop_na() removes rows with missing values in chosen columns.
fill() fills missing values by copying nearby known values.
Use these to clean data and prepare it for analysis or visualization.