How to Use complete.cases in R for Missing Data Handling

In R, use complete.cases() to find the rows of a data frame or matrix (or the elements of a vector) that have no missing values (NA). It returns a logical vector marking which rows are complete, which you can then use to filter out incomplete data.

Syntax

The basic syntax of complete.cases() is:

  • complete.cases(x)

Where x can be a vector, matrix, or data frame. It returns a logical vector with one value per row of x (or per element, if x is a vector): TRUE where the row contains no NA values and FALSE otherwise.

r
complete.cases(x)
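
To make the return shape concrete, here is a minimal sketch of the call on a vector and on a matrix. The names v and m and their values are made up for illustration.

```r
# A plain vector: one logical value per element
v <- c(1, NA, 3)
vec_ok <- complete.cases(v)
print(vec_ok)

# A matrix: one logical value per row
# (filled column by column, so the NA lands in row 3)
m <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 3)
mat_ok <- complete.cases(m)
print(mat_ok)
```

With these inputs, the vector call flags the second element as incomplete and the matrix call flags the third row.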

Example

This example shows how to use complete.cases() to filter out rows with missing values from a data frame.

r
data <- data.frame(
  name = c("Alice", "Bob", "Carol", "David"),
  age = c(25, NA, 30, 22),
  score = c(88, 92, NA, 75)
)

# Show original data
print(data)

# Use complete.cases to find rows without NA
complete_rows <- complete.cases(data)
print(complete_rows)

# Filter data to keep only complete rows
clean_data <- data[complete_rows, ]
print(clean_data)
Output
   name age score
1 Alice  25    88
2   Bob  NA    92
3 Carol  30    NA
4 David  22    75
[1]  TRUE FALSE FALSE  TRUE
   name age score
1 Alice  25    88
4 David  22    75

Common Pitfalls

One common mistake is to assume complete.cases() removes rows automatically. It only returns a logical vector; you must subset your data explicitly.

Another pitfall is calling it on a vector and expecting a data frame back. The result is always a logical vector, and subsetting a vector with it returns a vector.

r
data <- data.frame(x = c(1, NA, 3), y = c(NA, 2, 3))

# Wrong: just calling complete.cases does not remove rows
complete.cases(data)

# Right: subset data using complete.cases
clean_data <- data[complete.cases(data), ]
print(clean_data)
Output
[1] FALSE FALSE  TRUE
  x y
3 3 3
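
The vector case can be sketched the same way. The vector x below is made up for illustration. Subsetting it with the result of complete.cases() returns a shorter vector, not a data frame.

```r
x <- c(4, NA, 7, NA, 9)

keep <- complete.cases(x)   # logical vector, one value per element
clean_x <- x[keep]          # subsetting a vector gives back a vector
print(clean_x)

# For a single vector, the flags agree with the !is.na() idiom
same <- all(keep == !is.na(x))
print(same)
```

For a single vector, !is.na(x) is the more common way to write this. complete.cases() is more useful when a row spans several columns.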

Quick Reference

Tips for using complete.cases():

  • Use it to identify rows without any NA values.
  • Subset your data frame with it to remove incomplete rows.
  • Works on vectors, matrices, and data frames.
  • Returns a logical vector matching the number of rows or elements.

Key Takeaways

  • Use complete.cases(x) to get a logical vector marking rows without missing values.
  • Subset your data frame with complete.cases() to remove rows containing NA.
  • complete.cases() works on vectors, matrices, and data frames alike.
  • It does not remove rows by itself; you must subset your data explicitly.
  • Check the logical output before subsetting to understand which rows are complete.
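
One way to act on the last takeaway is to count and inspect the incomplete rows before dropping them. This is a small sketch that reuses the data frame from the example above. The names ok, n_complete, and incomplete_idx are chosen here for illustration.

```r
data <- data.frame(
  name = c("Alice", "Bob", "Carol", "David"),
  age = c(25, NA, 30, 22),
  score = c(88, 92, NA, 75)
)

ok <- complete.cases(data)

n_complete <- sum(ok)          # how many rows are complete
incomplete_idx <- which(!ok)   # positions of the rows that would be dropped
print(n_complete)
print(incomplete_idx)

# Look at the incomplete rows themselves before removing them
print(data[!ok, ])
```

Here two of the four rows are complete, and rows 2 and 3 are the ones that subsetting would remove.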