Challenge - 5 Problems
Filter Mastery in R
❓ Predict Output
Difficulty: intermediate
Output of filter() with multiple conditions
What is the output of the following R code using filter() from dplyr?
    library(dplyr)
    df <- tibble(x = 1:5, y = c(10, 20, 30, 40, 50))
    result <- filter(df, x > 2, y < 50)
    print(result)
💡 Hint
Remember that filter() keeps only the rows where all conditions are TRUE.
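filter() combines comma-separated conditions with AND, so it keeps only rows where x > 2 and y < 50. Two rows satisfy both conditions: (x = 3, y = 30) and (x = 4, y = 40). The row with x = 5 is dropped because its y value is 50, which is not less than 50.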
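To check the AND behaviour described above, the comma-separated call can be compared with a single condition joined by &. This is a minimal sketch that assumes dplyr is installed; the names with_commas and with_and are illustrative and not part of the challenge.

```r
library(dplyr)

df <- tibble(x = 1:5, y = c(10, 20, 30, 40, 50))

# Comma-separated conditions are combined with AND
with_commas <- filter(df, x > 2, y < 50)

# The same test written as one condition joined by &
with_and <- filter(df, x > 2 & y < 50)

print(with_commas)   # rows with x = 3 and x = 4

# Compare the two results column by column
identical(with_commas$x, with_and$x)
identical(with_commas$y, with_and$y)
```

Both forms select the same rows, so the choice between commas and & is mostly a matter of readability.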
❓ Predict Output
Difficulty: intermediate
filter() with logical OR condition
What will be the output of this R code using filter()?
    library(dplyr)
    df <- tibble(a = c('red', 'blue', 'green', 'red'), b = 1:4)
    result <- filter(df, a == 'red' | b == 3)
    print(result)
💡 Hint
The | operator means OR, so rows matching either condition are kept.
Rows where a is 'red' or b is 3 are kept: row 1 (red, 1), row 3 (green, 3), and row 4 (red, 4). Row 2 (blue, 2) satisfies neither condition and is dropped.
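One way to see why those three rows survive is to evaluate each condition on its own and then combine them. The sketch below assumes dplyr is installed; is_red, is_three, and keep are illustrative helper names, not part of the challenge.

```r
library(dplyr)

df <- tibble(a = c('red', 'blue', 'green', 'red'), b = 1:4)

# Evaluate each condition separately to see which rows match
is_red   <- df$a == 'red'   # TRUE FALSE FALSE TRUE
is_three <- df$b == 3       # FALSE FALSE TRUE FALSE

# | is the element-wise OR: a row is kept if either vector is TRUE
keep <- is_red | is_three   # TRUE FALSE TRUE TRUE

result <- filter(df, a == 'red' | b == 3)
print(result)               # rows (red, 1), (green, 3), (red, 4)
```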
🔧 Debug
Difficulty: advanced
Identify the error in filter() usage
What error will this R code produce when run?
    library(dplyr)
    df <- tibble(x = 1:3, y = c(5, 6, 7))
    filter(df, x > 1 & y < 7, z == 2)
💡 Hint
Check whether every variable used in filter() exists in the data frame.
The column z does not exist in df, and no object named z is defined in the calling environment, so filter() stops with an error reporting that object 'z' was not found.
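The failure can be observed without stopping a script by capturing it with tryCatch(). This is a minimal sketch that assumes dplyr is installed and that no object named z exists in the session; err is an illustrative name, and the column-name check at the end is one possible guard rather than a required step.

```r
library(dplyr)

df <- tibble(x = 1:3, y = c(5, 6, 7))

# Capture the error instead of letting it abort the script
err <- tryCatch(
  filter(df, x > 1 & y < 7, z == 2),
  error = function(e) e
)

# TRUE as long as z is neither a column of df nor an object in scope
inherits(err, "error")

# Checking the column names first is one way to avoid the failure
"z" %in% names(df)   # FALSE
```

Because filter() also looks up names in the calling environment, the same call would run without error if a variable z happened to be defined there, which can hide a misspelled column name.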
🧠 Conceptual
Difficulty: advanced
Understanding filter() with NA values
Given this data frame, what will filter(df, x > 2) return?
    library(dplyr)
    df <- tibble(x = c(1, 2, NA, 4))
    result <- filter(df, x > 2)
    print(result)
💡 Hint
Remember how comparisons with NA behave in R.
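A comparison with NA returns NA rather than TRUE or FALSE, and filter() drops rows where the condition is NA just as it drops rows where it is FALSE. Here only x = 4 satisfies x > 2, so the result is a single row with x = 4.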
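The sketch below shows the NA comparison on its own and one common way to keep missing values when they are wanted. It assumes dplyr is installed; with_na is an illustrative name that does not appear in the challenge.

```r
library(dplyr)

df <- tibble(x = c(1, 2, NA, 4))

NA > 2                      # NA: the comparison is neither TRUE nor FALSE

# filter() drops rows where the condition is NA as well as FALSE
result <- filter(df, x > 2)
print(result)               # one row: x = 4

# To keep missing values, ask for them explicitly
with_na <- filter(df, x > 2 | is.na(x))
print(with_na)              # two rows: x = NA and x = 4
```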
❓ Predict Output
Difficulty: expert
Complex filter() with grouped data
What is the output of this R code using filter() on grouped data?
    library(dplyr)
    df <- tibble(group = c('A', 'A', 'B', 'B'), value = c(10, 20, 30, 40))
    result <- df %>%
      group_by(group) %>%
      filter(value == max(value))
    print(result)
💡 Hint
When filter() is applied after group_by(), the condition is evaluated separately within each group.
For each group, filter() keeps the rows where value equals the maximum value in that group. The result has two rows, (A, 20) and (B, 40), and is still grouped by group.
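Running the same condition with and without grouping makes the per-group evaluation visible. This is a minimal sketch that assumes dplyr is installed; per_group and overall are illustrative names, and the ungroup() call is an optional addition that the challenge code does not include.

```r
library(dplyr)

df <- tibble(group = c('A', 'A', 'B', 'B'), value = c(10, 20, 30, 40))

# Grouped: max(value) is computed within each group
per_group <- df %>%
  group_by(group) %>%
  filter(value == max(value)) %>%
  ungroup()                 # drop the grouping once it is no longer needed
print(per_group)            # rows (A, 20) and (B, 40)

# Ungrouped: max(value) is the overall maximum, so only one row survives
overall <- filter(df, value == max(value))
print(overall)              # row (B, 40)
```

If several rows in a group share the maximum, this condition keeps all of them, since each tied row compares equal to max(value).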