0
0
SQLquery~15 mins

Why NULL is not a value in SQL - Why It Works This Way

Choose your learning style9 modes available
Overview - Why NULL is not a value
What is it?
NULL in databases means 'unknown' or 'missing' information, not a regular value like a number or text. It shows that we don't know what the data is, rather than it being zero or empty. NULL is a special marker used to represent this unknown state. It helps databases handle incomplete or missing data safely.
Why it matters
Without NULL, databases would have to guess or use fake values for missing data, which can cause wrong answers and confusion. For example, treating missing ages as zero would make average age calculations incorrect. NULL lets us clearly separate 'no data' from real data, making queries and reports more accurate and trustworthy.
Where it fits
Before learning about NULL, you should understand basic data types and how databases store data. After this, you can learn about how SQL handles NULL in comparisons, functions, and joins, and how to write queries that correctly manage missing information.
Mental Model
Core Idea
NULL is not a value but a marker that means 'we don't know what this is yet'.
Think of it like...
Imagine a form where someone leaves a question blank because they don't know the answer. That blank spot is not zero or empty; it means 'unknown'. NULL in databases works the same way.
┌───────────────┐
│   Database    │
│   Table Row   │
│ ┌───────────┐ │
│ │ Column A  │ │
│ │ Value: 10 │ │
│ └───────────┘ │
│ ┌───────────┐ │
│ │ Column B  │ │
│ │ Value: NULL│ │
│ └───────────┘ │
└───────────────┘

NULL means 'unknown', not zero or empty.
Build-Up - 6 Steps
1
FoundationUnderstanding NULL as Unknown
🤔
Concept: NULL represents unknown or missing data, not a real value.
In a database, NULL means we do not know the value for a field. It is different from zero, empty string, or any other value. For example, if a person's birthdate is unknown, the database stores NULL instead of guessing or leaving it blank.
Result
NULL fields indicate missing or unknown data clearly.
Understanding NULL as 'unknown' prevents confusing it with actual data values like zero or empty.
2
FoundationNULL is Not Equal to Any Value
🤔
Concept: NULL does not behave like normal values in comparisons.
When you compare NULL to anything, even NULL itself, the result is not true or false but unknown. For example, 'NULL = NULL' does not return true because both are unknown, so we can't say they are equal.
Result
Comparisons with NULL return unknown, affecting query results.
Knowing NULL comparisons return unknown helps avoid wrong assumptions in queries.
3
IntermediateHow NULL Affects SQL Conditions
🤔Before reading on: Do you think 'WHERE column = NULL' finds rows with NULL? Commit to your answer.
Concept: SQL treats NULL specially in WHERE clauses and conditions.
Using 'WHERE column = NULL' does not find rows with NULL because NULL is not a value. Instead, SQL uses 'IS NULL' to check for unknown values. This distinction is important to get correct query results.
Result
'WHERE column IS NULL' correctly finds rows with unknown data.
Understanding the special syntax for NULL checks prevents missing or wrong data in queries.
4
IntermediateNULL in Aggregate Functions
🤔Before reading on: Does NULL count as zero in averages and sums? Commit to your answer.
Concept: Aggregate functions ignore NULL values instead of treating them as zero.
Functions like SUM, AVG, COUNT behave differently with NULL. SUM and AVG skip NULLs, so they don't affect totals or averages. COUNT(column) counts only non-NULL values, while COUNT(*) counts all rows.
Result
Aggregates give accurate results by ignoring unknown data.
Knowing how NULL affects aggregates helps write correct reports and summaries.
5
AdvancedThree-Valued Logic with NULL
🤔Before reading on: Is SQL logic with NULL just true or false? Commit to your answer.
Concept: SQL uses three-valued logic: true, false, and unknown (NULL).
Because NULL means unknown, SQL conditions can be true, false, or unknown. Unknown results cause rows to be excluded in WHERE clauses unless explicitly handled. This logic is called three-valued logic and is key to understanding NULL behavior.
Result
Queries behave differently because of the unknown truth value.
Understanding three-valued logic explains why NULL can cause unexpected query filtering.
6
ExpertNULL Storage and Indexing Internals
🤔Before reading on: Do you think NULL takes the same storage space as normal values? Commit to your answer.
Concept: Databases store NULL efficiently and handle indexing specially for NULLs.
Internally, NULL is stored as a special marker, often using a bitmap to track which columns are NULL. Indexes may treat NULLs differently, sometimes excluding them or storing them separately. This affects performance and query plans.
Result
NULL storage and indexing impact database efficiency and query speed.
Knowing NULL internals helps optimize database design and troubleshoot performance.
Under the Hood
NULL is stored as a special flag or marker in the database storage engine, not as a normal data value. When the database engine evaluates expressions, it uses three-valued logic (true, false, unknown) where NULL causes unknown results. This affects how comparisons, joins, and filters work internally. Indexes may store NULL presence separately to optimize lookups.
Why designed this way?
NULL was designed to represent missing or unknown data clearly, avoiding confusion with real values like zero or empty strings. Early database systems needed a way to handle incomplete data without corrupting results. Three-valued logic was introduced to handle the uncertainty NULL brings, balancing correctness and usability.
┌───────────────┐
│   Query Input │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Expression    │
│ Evaluation    │
│ (3-valued)    │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Result: True, │
│ False, Unknown│
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Filter Rows   │
│ (Exclude if   │
│ Unknown)      │
└───────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Does 'NULL = NULL' return true or false? Commit to your answer.
Common Belief:NULL equals NULL, so comparing two NULLs returns true.
Tap to reveal reality
Reality:NULL compared to NULL returns unknown, not true, because both are unknown values.
Why it matters:Assuming NULL equals NULL can cause wrong query results and logic errors.
Quick: Does 'WHERE column = NULL' find rows with NULL? Commit to your answer.
Common Belief:Using '=' with NULL works like with other values to find NULL rows.
Tap to reveal reality
Reality:You must use 'IS NULL' to find NULLs; '=' does not work with NULL.
Why it matters:Using '=' with NULL causes queries to miss rows with unknown data.
Quick: Does COUNT(column) count NULL values? Commit to your answer.
Common Belief:COUNT counts all rows, including those with NULL in the column.
Tap to reveal reality
Reality:COUNT(column) counts only non-NULL values; COUNT(*) counts all rows.
Why it matters:Misunderstanding COUNT leads to wrong row counts and reports.
Quick: Is NULL the same as an empty string or zero? Commit to your answer.
Common Belief:NULL is just another value like empty string or zero.
Tap to reveal reality
Reality:NULL means unknown or missing, not a value at all.
Why it matters:Treating NULL as a value causes incorrect calculations and logic.
Expert Zone
1
NULL propagation in expressions means any operation with NULL often results in NULL, which can silently hide data issues.
2
Some databases allow special functions or settings to treat NULLs differently, like treating NULL as zero in calculations, but this can cause subtle bugs.
3
Indexing strategies for NULLs vary; some indexes exclude NULLs, affecting query plans and performance in unexpected ways.
When NOT to use
Avoid relying on NULL for mandatory fields where data must always exist; use constraints instead. For boolean logic, consider using explicit tri-state enums or flags to represent unknown states. In some cases, use default values or sentinel values instead of NULL to simplify logic.
Production Patterns
In real systems, NULL is used to represent optional data, missing user input, or incomplete records. Developers write queries with careful NULL checks using IS NULL and COALESCE functions. Data validation layers often handle NULLs explicitly to avoid surprises. Indexes and query plans are tuned considering NULL distribution.
Connections
Three-Valued Logic
NULL behavior in SQL is a direct application of three-valued logic (true, false, unknown).
Understanding three-valued logic clarifies why NULL comparisons don't behave like normal true/false logic.
Optional Types in Programming
NULL in databases is similar to optional or nullable types in programming languages that represent missing values.
Knowing how programming languages handle optional values helps understand NULL's role and handling in databases.
Incomplete Information Theory
NULL represents incomplete information, a concept studied in information theory and decision making.
Recognizing NULL as incomplete information connects database design to broader theories about uncertainty and knowledge.
Common Pitfalls
#1Using '=' to check for NULL values.
Wrong approach:SELECT * FROM users WHERE age = NULL;
Correct approach:SELECT * FROM users WHERE age IS NULL;
Root cause:Misunderstanding that NULL is not a value and requires special syntax for comparison.
#2Assuming NULL equals NULL in conditions.
Wrong approach:SELECT * FROM orders WHERE discount = NULL OR discount = NULL;
Correct approach:SELECT * FROM orders WHERE discount IS NULL OR discount IS NULL;
Root cause:Confusing NULL as a value rather than an unknown marker.
#3Counting NULLs with COUNT(column).
Wrong approach:SELECT COUNT(age) FROM people;
Correct approach:SELECT COUNT(*) FROM people;
Root cause:Not knowing COUNT(column) excludes NULLs, leading to undercounting.
Key Takeaways
NULL is a special marker meaning 'unknown' or 'missing', not a real data value.
Comparisons with NULL do not return true or false but unknown, requiring special handling in SQL.
To check for NULL, use 'IS NULL' or 'IS NOT NULL' instead of '=' or '<>'.
Aggregate functions ignore NULLs to avoid skewing results, so COUNT(column) counts only known values.
Understanding NULL's three-valued logic and storage helps write correct queries and optimize database performance.