How to Use Count in GroupBy in pandas: Simple Guide
Use groupby() on your DataFrame to group data by one or more columns, then apply count() to count the non-null values in each group. For example, df.groupby('column').count() returns the number of non-null entries per group for every non-grouping column.
Syntax
The basic syntax to count values in groups is:
- df.groupby('column').count(): Groups data by column and counts non-null values in each group for all other columns.
- df.groupby(['col1', 'col2']).count(): Groups by multiple columns and counts non-null values.
The count() function counts only non-null values, so missing data is ignored.
python
df.groupby('column_name').count()
Example
This example shows how to group a DataFrame by a column and count non-null values in each group.
python
import pandas as pd

data = {
    'Team': ['A', 'A', 'B', 'B', 'B', 'C', 'C'],
    'Player': ['John', 'Mike', 'Anna', None, 'Tom', 'Sara', 'Bob'],
    'Score': [10, 15, 10, 20, None, 5, 7]
}
df = pd.DataFrame(data)

# Group by 'Team' and count non-null values in each group
count_df = df.groupby('Team').count()
print(count_df)
Output
Player Score
Team
A 2 2
B 2 2
C 2 2
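The multi-column form listed under Syntax works the same way. The following is a minimal sketch, assuming a hypothetical 'Region' column added purely for illustration; the data is made up and not taken from the example above.

```python
import pandas as pd

# Hypothetical data: 'Region', 'Team', and 'Score' are illustrative names
df_regions = pd.DataFrame({
    'Region': ['East', 'East', 'East', 'West', 'West'],
    'Team': ['A', 'A', 'B', 'A', 'A'],
    'Score': [10, None, 7, 3, 4],
})

# Group by two columns; count() reports non-null values
# for each (Region, Team) pair that appears in the data
counts = df_regions.groupby(['Region', 'Team']).count()
print(counts)
```

The result is indexed by both grouping columns, so a single group can be read with counts.loc[('East', 'A'), 'Score'].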
Common Pitfalls
Common mistakes when using count() with groupby() include:
- Expecting count() to count all rows, including nulls. It only counts non-null values.
- Using size() when you want non-null counts. size() counts every row in the group, including rows with nulls.
- Not selecting the column you want to count. Without a column selection, count() returns a count for every non-grouping column.
Example of wrong vs right usage:
python
# Wrong: expecting count() to count nulls
wrong_count = df.groupby('Team')['Score'].count()

# Right: use size() to count all rows including nulls
right_size = df.groupby('Team').size()

print('Count (non-null):\n', wrong_count)
print('\nSize (all rows):\n', right_size)
Output
Count (non-null):
Team
A 2
B 2
C 2
Name: Score, dtype: int64
Size (all rows):
Team
A 2
B 3
C 2
dtype: int64
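To see both numbers side by side, the two results can be combined into one table. This is a rough sketch that reuses the 'Team' and 'Score' data from the example; the column names 'non_null', 'rows', and 'missing' are illustrative choices, not pandas conventions.

```python
import pandas as pd

df_scores = pd.DataFrame({
    'Team': ['A', 'A', 'B', 'B', 'B', 'C', 'C'],
    'Score': [10, 15, 10, 20, None, 5, 7],
})

grouped = df_scores.groupby('Team')['Score']

# count() skips nulls, size() does not; both are indexed by 'Team'
summary = pd.DataFrame({
    'non_null': grouped.count(),
    'rows': grouped.size(),
})

# The difference is the number of missing scores in each group
summary['missing'] = summary['rows'] - summary['non_null']
print(summary)
```

A non-zero value in 'missing' marks a group where count() and size() disagree.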
Quick Reference
| Method | Description | Counts Nulls? |
|---|---|---|
| groupby().count() | Counts non-null values per group | No |
| groupby().size() | Counts total rows per group including nulls | Yes |
| groupby()['col'].count() | Counts non-null values in specific column per group | No |
| groupby()['col'].size() | Counts total rows per group; the selected column does not change the result | Yes |
Key Takeaways
Use groupby().count() to count non-null values in each group.
count() ignores missing (null) values, so counts may be less than total rows.
Use groupby().size() to count all rows including nulls if needed.
Select the column after groupby(), as in groupby('Team')['Score'].count(), if you want counts for a specific column.
Always check if your data has nulls to choose count() or size() correctly.
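One way to check for nulls before choosing between count() and size() is sketched below. It assumes the same 'Team', 'Player', and 'Score' data as the example; the variable names are illustrative.

```python
import pandas as pd

df_check = pd.DataFrame({
    'Team': ['A', 'A', 'B', 'B', 'B', 'C', 'C'],
    'Player': ['John', 'Mike', 'Anna', None, 'Tom', 'Sara', 'Bob'],
    'Score': [10, 15, 10, 20, None, 5, 7],
})

# Total number of nulls in each column
null_totals = df_check.isna().sum()
print(null_totals)

# Nulls per group: build a boolean mask for the value columns,
# then group the mask by the 'Team' column and sum the True values
nulls_per_team = df_check[['Player', 'Score']].isna().groupby(df_check['Team']).sum()
print(nulls_per_team)
```

If every entry is zero, count() and size() return the same numbers for those columns.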