How to Use filter in dplyr: Syntax and Examples
Use filter() from the dplyr package to select rows in a data frame that meet specific conditions. Inside filter(), write logical expressions to keep only the rows where the condition is TRUE.
Syntax
The basic syntax of filter() is:
- data %>% filter(condition): Select rows where condition is TRUE.
- data: Your data frame or tibble.
- condition: A logical test using column names.
r
library(dplyr)
data %>% filter(condition)
Example
This example shows how to filter rows where the mpg column is greater than 20 in the built-in mtcars dataset.
r
library(dplyr)
filtered_data <- mtcars %>% filter(mpg > 20)
print(filtered_data)
Output
                mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4      21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710     22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Merc 240D      24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
Merc 230       22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
Honda Civic    30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
Toyota Corona  21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
Fiat X1-9      27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
Porsche 914-2  26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
Volvo 142E     21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
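A row is kept only when the condition is TRUE, so rows where it evaluates to NA are dropped as well. The following is a minimal sketch of that behaviour; the scores data frame and its column names are hypothetical and exist only for illustration.

```r
library(dplyr)

# Hypothetical toy data: one score is missing
scores <- data.frame(name = c("a", "b", "c"), score = c(10, NA, 30))

# NA > 5 evaluates to NA, not TRUE, so row "b" is dropped
kept <- scores %>% filter(score > 5)
print(kept)

# To keep rows with missing values, test for them explicitly
kept_with_na <- scores %>% filter(score > 5 | is.na(score))
print(kept_with_na)
```

Whether missing values should be kept depends on the analysis; the explicit is.na() test simply makes that choice visible in the code.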
Common Pitfalls
Common mistakes when using filter() include:
- Using = instead of == inside filter() (use == for comparison).
- Not loading dplyr with library(dplyr); without it, filter() refers to stats::filter(), which does something different.
- Using filter() on non-data frame objects.
- Confusing filter() with select() (filter chooses rows, select chooses columns).
Example of wrong and right usage:
r
library(dplyr)
# Wrong: assignment instead of comparison
# mtcars %>% filter(mpg = 20)  # This causes an error
# Right: comparison
mtcars %>% filter(mpg == 20)
Output
 [1] mpg  cyl  disp hp   drat wt   qsec vs   am   gear carb
<0 rows> (or 0-length row.names)
No car in mtcars has an mpg of exactly 20, so the corrected call runs without error but returns zero rows.
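The difference between filter() and select() is easiest to see by comparing the dimensions of their results. This is a small sketch on mtcars; the variable names four_cyl and two_cols are chosen here for illustration.

```r
library(dplyr)

# filter() keeps every column but only the rows that match
four_cyl <- mtcars %>% filter(cyl == 4)
dim(four_cyl)   # 11 rows, 11 columns

# select() keeps every row but only the named columns
two_cols <- mtcars %>% select(mpg, cyl)
dim(two_cols)   # 32 rows, 2 columns
```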
Quick Reference
Tips for using filter() effectively:
- Use multiple conditions separated by commas (AND logic).
- Use | for OR conditions inside filter().
- Use parentheses to group conditions.
- Combine with other dplyr verbs using the pipe %>%.
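The tips above can be sketched on mtcars as follows. The thresholds and variable names are arbitrary choices made for illustration, not recommendations.

```r
library(dplyr)

# AND: conditions separated by commas must all be TRUE
and_rows <- mtcars %>% filter(mpg > 20, cyl == 4)

# OR: a row is kept if either condition is TRUE
or_rows <- mtcars %>% filter(mpg > 30 | hp > 200)

# Parentheses group the OR before it is combined with the AND
grouped_rows <- mtcars %>% filter((cyl == 4 | cyl == 6), am == 1)

# Chaining with other verbs: filter rows, pick columns, then sort
piped <- mtcars %>%
  filter(mpg > 20) %>%
  select(mpg, cyl, hp) %>%
  arrange(desc(mpg))

print(head(piped, 3))
```

Separating conditions with commas gives the same rows as joining them with &; commas tend to be easier to read when there are several conditions.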
Key Takeaways
Use filter() to keep rows where conditions are TRUE in your data frame.
Write logical expressions inside filter() using column names directly.
Load dplyr with library(dplyr) before using filter().
Use commas for AND and | for OR conditions inside filter().
filter() selects rows, not columns—use select() for columns.