How to Use count in dplyr for Data Summarization
In dplyr, use count() to count the number of rows for each group in a data frame. It groups the data by one or more columns and returns a new data frame with one row per group and a count column named n. The simplest call is count(data, group_column).
Syntax
The basic syntax of count() in dplyr is:
r
count(data, ..., wt = NULL, sort = FALSE)
data: Your data frame or tibble. It is passed as the first argument, which dplyr's own documentation names x.
...: One or more columns to group by (unquoted).
wt: Optional, a column to weight counts instead of counting rows.
sort: Optional, if TRUE, sorts the result by count descending.
Example
This example shows how to count the number of cars by the number of cylinders in the built-in mtcars dataset.
r
library(dplyr)
# Count cars by cylinders
mtcars %>% count(cyl)
Output
cyl n
1 4 11
2 6 7
3 8 14
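To make the grouping step more concrete, the sketch below compares count() with the longer group_by() plus summarise() form that it roughly stands in for, and then groups by two columns at once. The variable names by_cyl_short, by_cyl_long, and by_combo are illustrative only.

```r
library(dplyr)

# count() is shorthand for grouping and then summarising with n()
by_cyl_short <- mtcars %>% count(cyl)
by_cyl_long  <- mtcars %>% group_by(cyl) %>% summarise(n = n())

# Both approaches give the same counts per cylinder group
identical(by_cyl_short$n, by_cyl_long$n)

# Grouping by more than one column counts each combination of values,
# here cylinders (cyl) and transmission type (am)
by_combo <- mtcars %>% count(cyl, am)
by_combo
```

Each row of by_combo is one cyl/am combination, and the n column still adds up to the 32 rows in mtcars.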
Common Pitfalls
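One common mistake is forgetting to load dplyr before calling count(), which makes R report that the function cannot be found. Another is passing column names as strings instead of unquoted names: a quoted name is treated as a constant value rather than a column, so all rows land in a single group. Also, calling count() without any grouping columns simply returns the total number of rows.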
Wrong usage example:
count(mtcars, "cyl") # Incorrect: column name as string
Correct usage:
count(mtcars, cyl) # Correct: unquoted column name
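Sometimes the column name really is only available as a string, for example inside a function. One possible approach, sketched here, is the .data pronoun that dplyr provides for looking up a column by name. The variable group_col is a hypothetical name used only for this illustration. The last call shows the no-grouping case mentioned above.

```r
library(dplyr)

# Hypothetical variable holding a column name as a string
group_col <- "cyl"

# .data[[...]] looks the column up by name, so the string
# is treated as a column rather than as a constant value
by_string <- mtcars %>% count(.data[[group_col]])
by_string

# With no grouping columns, count() returns a single row
# whose n is the total number of rows (32 for mtcars)
total <- mtcars %>% count()
total
```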
Quick Reference
| Argument | Description |
|---|---|
| data | Data frame or tibble to count rows from |
| ... | One or more unquoted columns to group by |
| wt | Optional column to weight counts instead of counting rows |
| sort | Logical; if TRUE, sorts the output by count in descending order |
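The wt and sort arguments are easier to follow on a tiny data set. The sketch below uses a made-up sales data frame (the names sales, region, and units are assumptions for illustration, not part of dplyr) to contrast plain row counts with weighted, sorted counts.

```r
library(dplyr)

# Made-up example data: one row per sale, with units sold
sales <- data.frame(
  region = c("north", "south", "north", "east", "south", "north"),
  units  = c(5, 3, 2, 4, 2, 6),
  stringsAsFactors = FALSE
)

# Plain count: number of rows (sales) per region
plain <- sales %>% count(region)
plain

# Weighted count: n is the sum of units per region instead of the row count,
# and sort = TRUE puts the largest total first
weighted <- sales %>% count(region, wt = units, sort = TRUE)
weighted
# north sums to 13, south to 5, east to 4
```

With wt supplied, n no longer means "number of rows" but "sum of the weight column within each group", which is worth keeping in mind when reading the output.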
Key Takeaways
Use count() to quickly group and count rows by one or more columns in dplyr.
Pass column names unquoted inside count(), not as strings.
Load dplyr with library(dplyr) before calling count() to avoid a "could not find function" error.
Use the sort = TRUE argument to get counts sorted from highest to lowest.
You can weight counts with the wt argument instead of simple row counts.