summarise() with group_by() in R Programming - Cheat Sheet & Quick Revision

Recall & Review
beginner
What does the group_by() function do in R's dplyr package?

group_by() marks the rows of a data frame as belonging to groups defined by one or more variables. The data itself is not changed; later operations such as summarise() are then carried out on each group separately.
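
A minimal sketch of what grouping looks like in practice, assuming a small made-up data frame df with columns team and score (the data is illustrative only). It suggests that group_by() records grouping information rather than changing the rows:

```r
library(dplyr)

# Hypothetical example data (not from the original card)
df <- tibble(
  team  = c("A", "A", "B", "B", "B"),
  score = c(10, 20, 30, 40, 50)
)

grouped <- df %>% group_by(team)

# The rows are unchanged; only grouping metadata is added
group_vars(grouped)         # name(s) of the grouping variable(s)
n_groups(grouped)           # number of distinct groups
nrow(grouped) == nrow(df)   # same number of rows as before
```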

beginner
What is the purpose of summarise() in combination with group_by()?

summarise() collapses each group created by group_by() into a single row of summary values, such as the mean or the total for that group.

beginner
How do you calculate the average of the score column for each team in a data frame df?
df %>%
  group_by(team) %>%
  summarise(avg_score = mean(score))

This groups the data by team and calculates the average score for each team.
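
To try this end to end, here is a small runnable sketch. The data frame is invented for illustration, and the na.rm = TRUE argument is an addition that assumes missing scores should be ignored rather than turning the group average into NA:

```r
library(dplyr)

# Hypothetical example data; one score is missing on purpose
df <- tibble(
  team  = c("A", "A", "B", "B"),
  score = c(10, 20, 30, NA)
)

result <- df %>%
  group_by(team) %>%
  summarise(avg_score = mean(score, na.rm = TRUE))  # drop NA before averaging

result  # one row per team
```

Without na.rm = TRUE, mean() returns NA for any team that has a missing score.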

beginner
What happens if you use summarise() without group_by()?

summarise() calculates the summary statistic over the entire dataset and returns a single row, instead of one row per group.
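
The difference can be seen side by side in a short sketch, again assuming a made-up df with team and score columns:

```r
library(dplyr)

# Hypothetical example data
df <- tibble(
  team  = c("A", "A", "B", "B", "B"),
  score = c(10, 20, 30, 40, 50)
)

# No group_by(): one row describing the whole data frame
overall <- df %>%
  summarise(avg_score = mean(score))

# With group_by(): one row per team
per_team <- df %>%
  group_by(team) %>%
  summarise(avg_score = mean(score))

nrow(overall)   # a single row
nrow(per_team)  # one row for each team
```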

intermediate
Can you use multiple summary functions inside summarise() after grouping?

Yes, you can calculate many summaries at once. For example:

df %>%
  group_by(team) %>%
  summarise(avg_score = mean(score), max_score = max(score))
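
A runnable variant of the same idea, assuming an invented df, which also adds a row count with n() as a third summary:

```r
library(dplyr)

# Hypothetical example data
df <- tibble(
  team  = c("A", "A", "B", "B", "B"),
  score = c(10, 20, 30, 40, 50)
)

stats <- df %>%
  group_by(team) %>%
  summarise(
    avg_score = mean(score),  # average per team
    max_score = max(score),   # highest score per team
    n_rows    = n()           # number of rows per team
  )

stats  # one row per team, one column per summary
```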
What does group_by() do before using summarise()?
A. Splits data into groups based on variables
B. Deletes rows from the data
C. Sorts the data alphabetically
D. Changes data types of columns
What will summarise() do if used without group_by()?
A. Throw an error
B. Calculate summary for each row
C. Calculate summary for the whole dataset
D. Create new groups automatically
Which of these is a valid way to calculate the total sales per region using dplyr?
A. df %>% filter(region) %>% summarise(total_sales = sum(sales))
B. df %>% summarise(total_sales = sum(sales)) %>% group_by(region)
C. df %>% group_by(sales) %>% summarise(total_region = sum(region))
D. df %>% group_by(region) %>% summarise(total_sales = sum(sales))
Can you use multiple summary calculations inside one summarise() call?
A. Yes, you can calculate many summaries at once
B. No, only one summary is allowed
C. Only if you use group_by() twice
D. Only with special packages
What does this code do?
df %>% group_by(category) %>% summarise(count = n())
A. Calculates mean of category
B. Counts rows in each category group
C. Filters rows with category
D. Creates new categories
Explain how group_by() and summarise() work together to summarize data.
Think about how you might count or average scores for each team separately.
Describe a real-life example where you would use group_by() with summarise().
Imagine you have sales data and want to know total sales per store.
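
One possible way to sketch the sales scenario is shown here. The sales data frame, its store and amount columns, and all values are hypothetical, and sorting with arrange() is an optional extra step:

```r
library(dplyr)

# Hypothetical sales records; store names and amounts are made up
sales <- tibble(
  store  = c("North", "North", "South", "South", "East"),
  amount = c(120, 80, 200, 50, 75)
)

totals <- sales %>%
  group_by(store) %>%
  summarise(
    total_sales = sum(amount),  # total sales per store
    n_sales     = n()           # number of sales records per store
  ) %>%
  arrange(desc(total_sales))    # largest total first

totals
```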