Challenge - 5 Problems
Summary Statistics Master
❓ Predict Output
intermediate (time limit: 2:00)
Output of summary statistics with NA values
What is the output of the following R code that calculates summary statistics on a vector with missing values?
R Programming
x <- c(5, 7, NA, 3, 9, NA, 4)
summary(x)
💡 Hint
Remember that summary() reports the count of NA values separately.
✗ Incorrect
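For a numeric vector, summary() computes its statistics on the non-missing values only and reports the number of NA values in a separate NA's entry. Here the five non-missing values (3, 4, 5, 7, 9) give Min. 3, 1st Qu. 4, Median 5, Mean 5.6, 3rd Qu. 7, Max. 9, and NA's 2.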
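A minimal sketch for checking the NA handling described above, using the vector from the question. The variable name s is introduced here only for illustration.

```r
x <- c(5, 7, NA, 3, 9, NA, 4)

# summary() drops the NAs before computing the statistics
# and appends an "NA's" entry holding their count.
s <- summary(x)
print(s)

# Most other functions propagate NA unless told to remove it.
mean(x)                # NA
mean(x, na.rm = TRUE)  # 5.6
sum(is.na(x))          # 2
```

The contrast with mean() is the point: summary() removes missing values on its own, while mean(), median(), and similar functions need na.rm = TRUE.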
❓ data_output
intermediate (time limit: 1:30)
Number of rows after filtering by summary statistics
Given a data frame df with a numeric column 'score', how many rows remain after filtering out rows where 'score' is below the median?
R Programming
df <- data.frame(score = c(10, 20, 15, 30, 25))
median_score <- median(df$score)
# drop = FALSE keeps the one-column result a data frame, so nrow() works
df_filtered <- df[df$score >= median_score, , drop = FALSE]
nrow(df_filtered)
💡 Hint
Calculate the median and count how many values are equal to or above it.
✗ Incorrect
Median is 20; values >= 20 are 20, 30, 25, so 3 rows remain.
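One subtlety is worth checking in a sketch: when a data frame has a single column, row subsetting of the form df[rows, ] drops to a plain vector by default, and nrow() of a vector is NULL. The version below passes drop = FALSE so the result stays a data frame. The name keep is introduced here only for illustration.

```r
df <- data.frame(score = c(10, 20, 15, 30, 25))

median_score <- median(df$score)   # 20

# Logical mask: TRUE where the score is at or above the median.
keep <- df$score >= median_score

# Default behaviour on a one-column data frame: the result is a
# plain numeric vector, so nrow() returns NULL.
nrow(df[keep, ])                   # NULL

# drop = FALSE keeps the data frame structure.
df_filtered <- df[keep, , drop = FALSE]
nrow(df_filtered)                  # 3
```

If only the count is needed, sum(keep) gives the same answer without building the filtered data frame.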
❓ visualization
advanced (time limit: 2:00)
Interpreting boxplot summary statistics
Which statement correctly describes the boxplot summary statistics shown for a numeric vector in R?
R Programming
x <- c(2, 4, 5, 6, 8, 9, 10, 11, 12)
boxplot(x)
💡 Hint
Recall that the median is the middle value and the IQR is Q3 minus Q1.
✗ Incorrect
The median is the middle value, 8. R draws the box from the hinges, which here are Q1 = 5 and Q3 = 10, so IQR = 10 - 5 = 5.
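The numbers behind the box can be checked directly rather than read off the plot. The sketch below uses only base R functions on the vector from the question.

```r
x <- c(2, 4, 5, 6, 8, 9, 10, 11, 12)

# boxplot() draws the box from Tukey's five-number summary:
# minimum, lower hinge, median, upper hinge, maximum.
fivenum(x)                  # 2 5 8 10 12

# boxplot.stats() returns the values the plot actually uses
# (whisker ends, hinges, and median).
boxplot.stats(x)$stats

# quantile() and IQR() interpolate (type 7 by default); for this
# vector the result agrees with the hinges.
quantile(x, c(0.25, 0.75))
IQR(x)                      # 5
```

Hinges and interpolated quartiles can differ for other vector lengths, so fivenum() is the safer check when interpreting a boxplot.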
🧠 Conceptual
advanced (time limit: 1:30)
Understanding skewness from summary statistics
If a numeric dataset has mean > median > mode, what can you infer about its skewness?
💡 Hint
Think about how mean, median, and mode relate in skewed data.
✗ Incorrect
When mean > median > mode, the tail is longer on the right side, indicating positive skew.
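The ordering can be illustrated on a small made-up sample. Base R has no function for the statistical mode or for skewness, so stat_mode() and skewness() below are hypothetical helpers written for this sketch, and the data are invented for illustration.

```r
# Small right-skewed sample: most values are low, a few are large.
x <- c(1, 2, 2, 2, 3, 3, 4, 5, 9, 15)

# Hypothetical helper: the most frequent value
# (the first one is returned if there are ties).
stat_mode <- function(v) {
  counts <- table(v)
  as.numeric(names(counts)[which.max(counts)])
}

# Hypothetical helper: moment-based skewness, no bias correction.
skewness <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

mean(x)          # 4.6
median(x)        # 3
stat_mode(x)     # 2
skewness(x) > 0  # TRUE: the long tail is on the right
```

The few large values pull the mean upward while leaving the median and mode near the bulk of the data, which is what produces the mean > median > mode ordering.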
🔧 Debug
expert (time limit: 2:00)
Identify the error in summary statistics calculation
What problem does this R code run into when it calls mean() on an entire data frame instead of a single column?
R Programming
df <- data.frame(a = c(1, 2, 3), b = c('x', 'y', 'z'))
mean(df)
💡 Hint
mean() expects a numeric vector, not a data frame.
✗ Incorrect
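mean() cannot operate on a data frame directly; it requires a numeric vector. In current versions of R the call does not stop with an error: it returns NA and issues the warning "argument is not numeric or logical: returning NA". Pass a numeric column such as df$a instead.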
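A minimal sketch of the failure and two possible fixes, using the data frame from the question. The names res, num_cols, and col_means are introduced here only for illustration.

```r
df <- data.frame(a = c(1, 2, 3), b = c("x", "y", "z"))

# Passing the whole data frame yields NA (with a warning),
# because a data frame is a list of columns, not a numeric vector.
res <- suppressWarnings(mean(df))
res                      # NA

# Fix 1: take the mean of a single numeric column.
mean(df$a)               # 2

# Fix 2: average every numeric column at once.
num_cols <- sapply(df, is.numeric)
col_means <- colMeans(df[, num_cols, drop = FALSE])
col_means
```

Selecting the numeric columns first matters because colMeans() itself stops with an error if any column is non-numeric.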