Overview - COUNT(*) vs COUNT(column) difference

What is it?

COUNT(*) and COUNT(column) are two ways to count rows in a database table. COUNT(*) counts all rows, including those with NULL values in any column. COUNT(column) counts only rows where the specified column is not NULL. Both are used to find how many records meet certain conditions.

Why it matters

Knowing the difference helps you get accurate counts in your data. Without this, you might count rows incorrectly, leading to wrong decisions or reports. For example, counting customers with or without missing phone numbers requires different counts. This distinction ensures your data analysis matches reality.

Where it fits

Before this, you should understand basic SQL SELECT queries and what NULL means in databases. After this, you can learn about filtering with WHERE, grouping with GROUP BY, and aggregate functions like SUM and AVG.

Mental Model

Core Idea

COUNT(*) counts every row, while COUNT(column) counts only rows where that column has a value (not NULL).

Think of it like...

Imagine counting apples in a basket: COUNT(*) is like counting every apple regardless of condition, while COUNT(column) is like counting only apples that are not rotten.

┌───────────────┐
│   Table Rows  │
├───────────────┤
│ Row 1: col=5  │
│ Row 2: col=NULL│
│ Row 3: col=10 │
└───────────────┘

COUNT(*) = 3 (all rows)
COUNT(col) = 2 (only rows with col not NULL)

Build-Up - 6 Steps

1

FoundationUnderstanding COUNT(*) Basics

Concept: COUNT(*) counts all rows in a table or result set, including those with NULLs.

When you write SELECT COUNT(*) FROM table;, the database counts every row returned, no matter what values the columns have. NULL values do not affect this count because it counts rows, not column values.

Result

The output is the total number of rows in the table or query result.

Understanding that COUNT(*) counts rows regardless of content helps you get total row counts quickly and reliably.

2

FoundationUnderstanding COUNT(column) Basics

3

IntermediateEffect of NULL Values on Counts

4

IntermediateUsing COUNT with WHERE Clauses

5

AdvancedCOUNT(column) with Multiple Columns and Expressions

6

ExpertPerformance Differences Between COUNT(*) and COUNT(column)

Under the Hood

COUNT(*) counts rows by scanning the table or index entries without checking column values. COUNT(column) scans the specified column's data to check for NULLs and counts only non-NULL entries. Internally, NULL is a special marker meaning 'no value', so COUNT(column) excludes these during counting.

Why designed this way?

SQL was designed to distinguish between counting all rows and counting meaningful data entries. COUNT(*) provides total row counts quickly, while COUNT(column) allows ignoring missing data. This design supports flexible data analysis and accurate reporting.

┌───────────────┐       ┌───────────────┐
│   Table Rows  │       │  Column Data  │
├───────────────┤       ├───────────────┤
│ Row 1: col=5  │       │ 5             │
│ Row 2: col=NULL│─────▶│ NULL (ignored)│
│ Row 3: col=10 │       │ 10            │
└───────────────┘       └───────────────┘

COUNT(*) counts all rows → 3
COUNT(column) counts non-NULL → 2

Myth Busters - 3 Common Misconceptions

Quick: Does COUNT(column) count rows where the column is NULL? Commit to yes or no.

Common Belief:COUNT(column) counts all rows, just like COUNT(*).

Tap to reveal reality

Quick: Is COUNT(*) slower than COUNT(column) because it counts more? Commit to yes or no.

Common Belief:COUNT(*) is always slower because it counts every row.

Tap to reveal reality

Quick: Does COUNT(column1, column2) count rows where both columns are non-NULL? Commit to yes or no.

Common Belief:You can use COUNT with multiple columns to count rows with non-NULL values in all those columns.

Tap to reveal reality

Expert Zone

1

COUNT(*) can use index-only scans in some databases, returning counts without accessing full row data.

2

COUNT(column) behavior changes if the column is an expression or function, counting non-NULL results of that expression.

3

In some SQL dialects, COUNT(1) behaves like COUNT(*), but this is a convention, not a standard guarantee.

When NOT to use

Avoid COUNT(column) when you want total row counts regardless of NULLs; use COUNT(*) instead. For counting distinct values, use COUNT(DISTINCT column). When performance is critical, test which COUNT variant is faster on your database.

Production Patterns

In real systems, COUNT(*) is used for total record counts like total users. COUNT(column) is used to count valid entries, like number of users with email addresses. Combining COUNT with WHERE filters and GROUP BY clauses helps generate detailed reports.

Connections

NULL Handling in SQL

COUNT(column) behavior depends on how NULL values are treated in SQL.

Understanding NULLs is essential to correctly interpret COUNT(column) results and avoid counting missing data.

Indexing in Databases

COUNT(*) performance can be improved by indexes that allow counting rows without full scans.

Knowing how indexes work helps optimize COUNT queries for large datasets.

Set Theory in Mathematics

COUNT(*) corresponds to counting all elements in a set, while COUNT(column) counts elements with a property (non-NULL).

This connection clarifies why COUNT(column) excludes NULLs, similar to filtering elements in a set.

Common Pitfalls

#1Counting rows with COUNT(column) expecting total rows including NULLs.

Wrong approach:SELECT COUNT(phone_number) FROM customers;

Correct approach:SELECT COUNT(*) FROM customers;

Root cause:Misunderstanding that COUNT(column) excludes NULL values, leading to undercounting.

#2Using COUNT with multiple columns directly.

Wrong approach:SELECT COUNT(first_name, last_name) FROM employees;

Correct approach:SELECT COUNT(*) FROM employees WHERE first_name IS NOT NULL AND last_name IS NOT NULL;

Root cause:Incorrect syntax and misunderstanding of COUNT's single-argument requirement.

#3Assuming COUNT(*) is always slower than COUNT(column).

Wrong approach:Avoid using COUNT(*) for performance reasons without testing.

Correct approach:Test both COUNT(*) and COUNT(column) on your database to choose the faster option.

Root cause:Lack of knowledge about database optimizations for COUNT(*) queries.

Key Takeaways

COUNT(*) counts every row in the result, including those with NULL values in any column.

COUNT(column) counts only rows where the specified column is not NULL, ignoring NULLs.

NULL values affect COUNT(column) but not COUNT(*), so their results can differ.

Performance of COUNT(*) and COUNT(column) depends on database optimizations and indexing.

Understanding these differences helps write accurate and efficient SQL queries.