We use group_by() with summarise() to quickly find summary information for different groups in our data. It helps us see patterns or totals for each group separately.
summarise() with group_by() in R Programming
Introduction
You want to find the average score for each class in a school.
You need to count how many sales happened in each region.
You want to find the total hours worked by each employee.
You want to see the maximum temperature recorded each day.
Syntax
R Programming
data %>%
  group_by(group_column) %>%
  summarise(
    summary_name = summary_function(column)
  )
group_by() splits the data into groups based on one or more columns.
summarise() creates a new smaller table with summary values for each group.
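The syntax above can be tried on a small table. The following is a minimal sketch, not a definitive recipe: the `scores` data frame and its column names are made up for illustration, and `n()` is dplyr's helper for counting the rows in each group. It also shows that one summarise() call can create several summary columns.

```r
library(dplyr)

# Hypothetical data (an assumption for this sketch): two classes with a few scores each
scores <- data.frame(
  Class = c("A", "A", "B", "B", "B"),
  Score = c(80, 90, 70, 75, 95)
)

# One summarise() call can create several summary columns at once
class_summary <- scores %>%
  group_by(Class) %>%
  summarise(
    Students = n(),             # number of rows in each group
    AverageScore = mean(Score)  # mean of Score within each group
  )

print(class_summary)
```

The result has one row per class, so the five input rows collapse to two.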
Examples
This groups data by Category and sums the Value column for each group.
R Programming
library(dplyr)
data %>%
  group_by(Category) %>%
  summarise(Total = sum(Value))
This finds the average Score for each Team.
R Programming
data %>% group_by(Team) %>% summarise(AverageScore = mean(Score))
This finds the highest temperature recorded on each Date.
R Programming
data %>%
  group_by(Date) %>%
  summarise(MaxTemp = max(Temperature))
Sample Program
This program groups the sales data by Region and calculates the total sales for each region. Then it prints the summary table.
R Programming
library(dplyr)

# Sample data frame
sales <- data.frame(
  Region = c("North", "South", "North", "East", "South", "East"),
  Sales = c(100, 150, 200, 130, 170, 120)
)

# Group by Region and sum Sales
summary <- sales %>%
  group_by(Region) %>%
  summarise(TotalSales = sum(Sales))

print(summary)
Output
The printed table has one row per region with its total sales: East 250, North 300, South 320.
Important Notes
Load the dplyr package with library(dplyr) before using group_by() and summarise().
You can group by multiple columns by adding them inside group_by(), like group_by(Col1, Col2).
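By default, summarise() drops the last grouping column from the result. To keep the original grouping, use summarise(.groups = 'keep'); to get an ungrouped table, use summarise(.groups = 'drop').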
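The notes on multiple grouping columns and on .groups can be checked with a short sketch. The `sales` table here, with its Quarter column, is invented for illustration; `group_vars()` is dplyr's function for listing the grouping columns still attached to a result.

```r
library(dplyr)

# Hypothetical data (an assumption for this sketch): sales per region and quarter
sales <- data.frame(
  Region  = c("North", "North", "North", "South", "South"),
  Quarter = c("Q1", "Q1", "Q2", "Q1", "Q2"),
  Sales   = c(100, 50, 200, 150, 170)
)

# Group by two columns: one output row per Region/Quarter combination.
# "drop_last" removes only the last grouping column (Quarter).
by_default <- sales %>%
  group_by(Region, Quarter) %>%
  summarise(TotalSales = sum(Sales), .groups = "drop_last")

# "keep" preserves both grouping columns on the result
kept <- sales %>%
  group_by(Region, Quarter) %>%
  summarise(TotalSales = sum(Sales), .groups = "keep")

# "drop" returns an ordinary ungrouped table
dropped <- sales %>%
  group_by(Region, Quarter) %>%
  summarise(TotalSales = sum(Sales), .groups = "drop")

print(group_vars(by_default))  # grouping left: Region
print(group_vars(kept))        # grouping left: Region and Quarter
print(group_vars(dropped))     # no grouping left
```

The three results hold the same four rows of totals. They differ only in the grouping they carry, which matters when more dplyr verbs follow in the pipeline.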
Summary
group_by() splits data into groups based on columns.
summarise() calculates summary values for each group.
Use them together to get quick summaries for different parts of your data.