
How to Use filter in dplyr: Syntax and Examples

Use filter() from the dplyr package to select rows in a data frame that meet specific conditions. Inside filter(), write logical expressions to keep only the rows where the condition is TRUE.

Syntax

The basic syntax of filter() is:

  • data %>% filter(condition): Select rows where condition is TRUE.
  • data: Your data frame or tibble.
  • condition: A logical test using column names.
r
library(dplyr)
data %>% filter(condition)
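As a minimal sketch of how these placeholders map onto real objects, the snippet below uses a small made-up data frame. The name scores and its columns are illustrative only and do not come from any dataset. It also shows that the piped form is equivalent to passing the data frame as the first argument of filter().

```r
library(dplyr)

# Hypothetical data frame, used only for illustration
scores <- data.frame(
  name  = c("Ana", "Ben", "Cho"),
  score = c(72, 91, 85)
)

# Piped form: data %>% filter(condition)
piped <- scores %>% filter(score > 80)

# Equivalent direct call: filter(data, condition)
direct <- filter(scores, score > 80)

# Both calls keep the same two rows
identical(piped, direct)
```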

Example

This example shows how to filter rows where the mpg column is greater than 20 in the built-in mtcars dataset.

r
library(dplyr)
filtered_data <- mtcars %>% filter(mpg > 20)
print(filtered_data)
Output
                mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4      21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710     22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Merc 240D      24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
Merc 230       22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
Honda Civic    30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
Toyota Corona  21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
Fiat X1-9      27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
Porsche 914-2  26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
Volvo 142E     21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

Common Pitfalls

Common mistakes when using filter() include:

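  • Using = instead of == for comparison. A single = is treated as a named argument, so filter() stops with an error; == tests equality.
  • Forgetting library(dplyr). Without it, filter() resolves to stats::filter(), a time-series function from base R, and the call fails with a confusing error.
  • Calling filter() on something that is not a data frame or tibble, such as a plain vector.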
  • Confusing filter() with select() (filter chooses rows, select chooses columns).

Example of wrong and right usage:

r
# Wrong: = is argument assignment, not comparison
# mtcars %>% filter(mpg = 21)  # This causes an error

# Right: == tests equality
mtcars %>% filter(mpg == 21)
Output
              mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4      21   6  160 110  3.9 2.620 16.46  0  1    4    4
Mazda RX4 Wag  21   6  160 110  3.9 2.875 17.02  0  1    4    4
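The loading and object-type pitfalls can be sketched in the same way. The snippet below is an illustration, not the only way to handle them. It calls filter() with an explicit dplyr:: prefix so the call does not depend on library(dplyr) and cannot be mistaken for stats::filter(), then wraps a call on a plain vector in tryCatch() to show that it fails. The fallback message string is made up for this example.

```r
# Explicit namespace: works even if library(dplyr) was never called
high_mpg <- dplyr::filter(mtcars, mpg > 20)
nrow(high_mpg)

# filter() expects a data frame or tibble; a plain vector raises an error.
# tryCatch() turns that error into a readable message (illustrative text).
result <- tryCatch(
  dplyr::filter(c(10, 25, 30), TRUE),
  error = function(e) "filter() needs a data frame or tibble"
)
result
```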

Quick Reference

Tips for using filter() effectively:

  • Use multiple conditions separated by commas (AND logic).
  • Use | for OR conditions inside filter().
  • Use parentheses to group conditions.
  • Combine with other dplyr verbs using the pipe %>%.
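The tips above can be sketched on mtcars as follows. The variable names are illustrative, and select() and arrange() stand in for "other dplyr verbs"; any verb could take their place.

```r
library(dplyr)

# AND: every comma-separated condition must be TRUE
and_rows <- mtcars %>% filter(mpg > 20, cyl == 4)

# OR: a row is kept if either condition is TRUE
or_rows <- mtcars %>% filter(cyl == 4 | cyl == 6)

# Parentheses group an OR inside an AND
grouped <- mtcars %>% filter(mpg > 20, (am == 1 | gear == 5))

# Chain filter() with other verbs through the pipe
summary_tbl <- mtcars %>%
  filter(mpg > 20) %>%
  select(mpg, cyl, hp) %>%
  arrange(desc(mpg))

head(summary_tbl, 3)
```

Writing the conditions with commas keeps each test on its own argument, which tends to be easier to read than one long expression joined with &.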

Key Takeaways

Use filter() to keep rows where conditions are TRUE in your data frame.
Write logical expressions inside filter() using column names directly.
Load dplyr with library(dplyr) before using filter().
Use commas for AND and | for OR conditions inside filter().
filter() selects rows, not columns; use select() for columns.
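To make the rows-versus-columns distinction concrete, here is a short sketch on mtcars. The variable names are illustrative.

```r
library(dplyr)

# filter() keeps a subset of rows and leaves all columns in place
rows_only <- mtcars %>% filter(cyl == 6)

# select() keeps a subset of columns and leaves all rows in place
cols_only <- mtcars %>% select(mpg, cyl)

dim(rows_only)  # 7 rows, 11 columns
dim(cols_only)  # 32 rows, 2 columns
```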