
Handling missing values (drop_na, fill) in R Programming - Step-by-Step Execution


Concept Flow - Handling missing values (drop_na, fill)

Start with data frame
↓
Check for missing values
↓
drop_na() to remove rows, or fill() to fill values
↓
Cleaned data frame

Start with a data frame, check for missing values, then either remove rows with missing data using drop_na() or fill missing values using fill(), resulting in a cleaned data frame.
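
The "check for missing values" step can be sketched with base R before any cleaning is done. This is a minimal sketch, not part of the original example: it reuses the same data frame, and the variable names has_na, na_per_column and complete_rows are only illustrative.

```r
# Same example data frame as in the execution sample
data <- data.frame(x = c(1, NA, 3), y = c(NA, 2, 3))

has_na <- anyNA(data)                   # is there at least one NA anywhere?
na_per_column <- colSums(is.na(data))   # count of NAs in each column
complete_rows <- complete.cases(data)   # TRUE for rows without any NA

print(has_na)
print(na_per_column)
print(complete_rows)
```

Here complete_rows marks the rows that drop_na() would keep, which is a quick way to preview how much data a removal would cost.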

Execution Sample

R Programming

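library(tidyr)
# Example data: x is missing in row 2, y is missing in row 1
data <- data.frame(x = c(1, NA, 3), y = c(NA, 2, 3))
# Keep only rows with no NA in any column
data_clean <- drop_na(data)
# Fill NA in x with the previous non-missing value; y is left as is
data_filled <- fill(data, x, .direction = "down")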

This code creates a data frame with missing values, then removes rows with any NA using drop_na(), and fills missing x values downward using fill().

Execution Table

Step	Data Frame State	Action	Resulting Data Frame
1	x: 1, NA, 3; y: NA, 2, 3	Initial data frame with missing values	x: 1, NA, 3; y: NA, 2, 3
2	x: 1, NA, 3; y: NA, 2, 3	Apply drop_na() to remove rows with any NA	x: 3; y: 3 (only row 3 remains)
3	x: 1, NA, 3; y: NA, 2, 3	Apply fill(x, .direction = "down") to fill NA in x	x: 1, 1, 3; y: NA, 2, 3
4	data_clean: x: 3; y: 3. data_filled: x: 1, 1, 3; y: NA, 2, 3	No further action	Both results available; data_filled still has an NA in y

💡 drop_na() removed every row containing an NA. fill() filled only column x, so data_filled still has an NA in y.
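
The results in the table can be checked directly. The following is a small sketch that reruns the sample and inspects both outputs; the names rows_left, clean_has_na and y_missing are illustrative only.

```r
library(tidyr)

data <- data.frame(x = c(1, NA, 3), y = c(NA, 2, 3))
data_clean  <- drop_na(data)
data_filled <- fill(data, x, .direction = "down")

rows_left    <- nrow(data_clean)       # rows surviving drop_na()
clean_has_na <- anyNA(data_clean)      # drop_na() should leave no NA behind
y_missing    <- is.na(data_filled$y)   # fill() was only asked to fill x

print(rows_left)
print(clean_has_na)
print(y_missing)
```

Because fill() was given only x, the NA in the first row of y is still present in data_filled, while data_clean contains no NA at all.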

Variable Tracker

Variable	Start	After drop_na	After fill	Final
data	x: 1, NA, 3; y: NA, 2, 3	Unchanged	Unchanged	Unchanged
data_clean	Not yet created	x: 3; y: 3	Unchanged	x: 3; y: 3
data_filled	Not yet created	Not yet created	x: 1, 1, 3; y: NA, 2, 3	x: 1, 1, 3; y: NA, 2, 3

Key Moments - 2 Insights

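Why does drop_na() remove the first and second rows but keep the third?

Each of the first two rows has an NA in one column (y in row 1, x in row 2), and drop_na() removes any row with at least one NA. Only the third row is complete.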

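How does fill() decide what value to use to replace NA in the x column?

With .direction = "down", fill() copies the most recent non-missing value above the NA in that column. Here the NA in row 2 takes the value 1 from row 1.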
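
To see how the direction changes the replacement value, here is a short sketch on a hypothetical single-column data frame (df is not part of the original example) with NAs at the start, in the middle and at the end.

```r
library(tidyr)

# Hypothetical data: NA at the start, in the middle, and at the end
df <- data.frame(x = c(NA, 1, NA, 3, NA))

down   <- fill(df, x, .direction = "down")$x    # carry earlier values forward
up     <- fill(df, x, .direction = "up")$x      # carry later values backward
downup <- fill(df, x, .direction = "downup")$x  # fill down first, then up

print(down)
print(up)
print(downup)
```

With "down" the leading NA has nothing above it and stays NA; with "up" the trailing NA stays NA for the same reason. The "downup" direction covers both ends.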

Visual Quiz - 3 Questions

Test your understanding

Look at step 2 of the Execution Table. Which rows remain after drop_na() is applied?

A. First and third rows
B. Only the third row
C. Only the first row
D. All rows

Concept Snapshot

Handling missing values in R with tidyr:
- drop_na(data): removes rows that have an NA in any column
- fill(data, column, .direction): replaces NA in the chosen columns with the nearest non-missing value in the given direction
- .direction accepts "down" (the default), "up", "downup" or "updown"
- Neither function changes the original data frame; assign the result to keep it
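
drop_na() can also be limited to specific columns by naming them after the data frame. A minimal sketch on the same example data, with data_x_complete as an illustrative name:

```r
library(tidyr)

data <- data.frame(x = c(1, NA, 3), y = c(NA, 2, 3))

# Only check column x: rows where x is NA are dropped, NAs in y are tolerated
data_x_complete <- drop_na(data, x)
print(data_x_complete)
```

This keeps rows 1 and 3, so it loses less data than drop_na(data) when only some columns need to be complete.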

Full Transcript

We start with a data frame containing missing values (NA). We use drop_na() to remove every row that has a missing value in any column, leaving only complete rows. Alternatively, we use fill() to replace missing values in a specific column by carrying the previous non-missing value forward ("down") or the next one backward ("up"). In the example, drop_na() removes the first and second rows because each contains an NA, and keeps only the third. fill() replaces the NA in the second row's x column with the value from the first row and leaves column y untouched, so the NA in y remains. Either way, the original data frame is unchanged and the result is stored in a new variable, ready for further use.