Overview - Keys (primary, candidate, foreign, super)

What is it?

Keys in a database are attributes, or sets of attributes, used to identify records and relate them to one another. A super key is any set of attributes that uniquely identifies each record in a table, possibly including attributes it does not need. A candidate key is a minimal super key: remove any attribute and it no longer identifies records uniquely. The primary key is the candidate key chosen as the table's main identifier. A foreign key links records between tables by referring to the primary key (or another unique key) of the referenced table.
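A minimal sketch of these definitions, using Python's built-in sqlite3 module with an in-memory database. The table and column names (students, enrollments, email, and so on) are illustrative assumptions, not part of the original text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on

conn.executescript("""
CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,      -- primary key: the chosen identifier
    email      TEXT NOT NULL UNIQUE,     -- another candidate key: also unique per student
    name       TEXT NOT NULL             -- not a key: names can repeat
);
CREATE TABLE enrollments (
    enrollment_id INTEGER PRIMARY KEY,
    student_id    INTEGER NOT NULL REFERENCES students(student_id),  -- foreign key
    course        TEXT NOT NULL
);
""")

conn.execute("INSERT INTO students VALUES (1, 'ana@example.com', 'Ana')")
conn.execute("INSERT INTO enrollments VALUES (10, 1, 'Databases')")

def try_insert(sql):
    """Run one INSERT and report whether the key constraints accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

duplicate_pk = try_insert("INSERT INTO students VALUES (1, 'bob@example.com', 'Bob')")
duplicate_email = try_insert("INSERT INTO students VALUES (2, 'ana@example.com', 'Ann')")
dangling_fk = try_insert("INSERT INTO enrollments VALUES (11, 99, 'Algebra')")
same_name = try_insert("INSERT INTO students VALUES (3, 'ana2@example.com', 'Ana')")
```

In this sketch the first three inserts are rejected (duplicate primary key, duplicate candidate key, reference to a missing student), while a repeated name is accepted. The pair (student_id, name) would be a super key but not a candidate key, because student_id alone already identifies the row.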

Why it matters

Without keys, databases would struggle to organize, find, and connect data correctly. Imagine a phone book without unique names or numbers; it would be chaotic to find someone. Keys ensure data integrity, prevent duplicates, and enable relationships between tables, making databases reliable and efficient.

Where it fits

Before learning about keys, you should understand basic database tables and attributes. After mastering keys, you can explore database normalization, relationships, and constraints to design efficient databases.

Mental Model

Core Idea

Keys are unique identifiers or links in a database that ensure each record is distinct and related correctly.

Think of it like...

Think of keys like ID cards and family ties: a primary key is like the one ID number chosen to identify a person, candidate keys are all the identifiers that could have served as that ID (a passport number, a national ID number), foreign keys are like references to a family member's ID, and super keys are like a fuller set of personal details that still uniquely identifies the person even though some of the details are unnecessary.

┌───────────────┐       ┌───────────────┐
│   Table A     │       │   Table B     │
│───────────────│       │───────────────│
│ PK: ID        │◄──────│ FK: ID        │
│ Name          │       │ Description   │
│ Age           │       └───────────────┘
└───────────────┘

PK = Primary Key
FK = Foreign Key

Candidate Keys: Possible PKs in Table A
Super Key: Any set of attributes that uniquely identifies rows (each candidate key, or a candidate key plus extra attributes)
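The distinction between super keys and candidate keys in the legend above can be checked mechanically on sample data. The sketch below is plain Python; the sample rows and the function names is_super_key and candidate_keys are hypothetical, chosen only for illustration.

```python
from itertools import combinations

# Hypothetical sample rows. A real analysis reasons about the rules of the
# data, not just one snapshot of it.
rows = [
    {"id": 1, "email": "ana@example.com", "name": "Ana", "age": 20},
    {"id": 2, "email": "bob@example.com", "name": "Bob", "age": 20},
    {"id": 3, "email": "ana2@example.com", "name": "Ana", "age": 31},
    {"id": 4, "email": "bob2@example.com", "name": "Bob", "age": 20},
]

def is_super_key(rows, columns):
    """True if no two rows share the same values for `columns`."""
    seen = {tuple(row[c] for c in columns) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Return the minimal super keys: those with no smaller key inside them."""
    all_columns = sorted(rows[0])
    found = []
    # Smaller column sets are examined first, so any key already in `found`
    # that is contained in `combo` proves `combo` is not minimal.
    for size in range(1, len(all_columns) + 1):
        for combo in combinations(all_columns, size):
            if is_super_key(rows, combo) and not any(set(k) <= set(combo) for k in found):
                found.append(combo)
    return found
```

On these rows, ("id", "name") is a super key, ("name", "age") is not, and the candidate keys are ("email",) and ("id",). A snapshot can only suggest keys; whether a column stays unique depends on the rules of the data.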

Build-Up - 7 Steps

1

Foundation - Understanding Database Tables and Records

Concept: Introduce what tables and records are in a database context.

A database stores data in tables, which look like spreadsheets with rows and columns. Each row is a record representing one item or entity, and each column is an attribute describing that item. For example, a table of students might have columns for student ID, name, and age, and each row is one student.

Result

You can see how data is organized in rows and columns, setting the stage for identifying unique records.

Understanding tables and records is essential because keys work by uniquely identifying these rows.
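As a small illustration of rows and columns, the sketch below builds the students table described above with Python's built-in sqlite3 module. The sample values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be read by column name

conn.execute("CREATE TABLE students (student_id INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "Ana", 20), (2, "Bob", 22), (3, "Ana", 20)],
)

records = conn.execute("SELECT * FROM students ORDER BY student_id").fetchall()
columns = records[0].keys()        # the attributes (column names)
first_record = dict(records[0])    # one row as attribute -> value

# Without student_id, the first and third rows would be indistinguishable.
without_id = [(r["name"], r["age"]) for r in records]
```

The first and third students share a name and an age, so only the student_id column tells them apart. That is the gap a key fills.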

2

Foundation - What Makes a Key in Databases

3

Intermediate - Candidate Keys and Choosing a Primary Key

4

Intermediate - Super Keys: More Than Needed

5

Intermediate - Foreign Keys and Table Relationships

6

Advanced - Enforcing Data Integrity with Keys

7

Expert - Complex Key Scenarios and Performance Impacts

Under the Hood

At the database engine level, keys are backed by indexes and constraints. Declaring a primary key typically creates a unique index, which both speeds up lookups by key and rejects duplicate values. A foreign key is a constraint: the engine checks it on insert, update, and delete operations so that every reference points to an existing row. Whether the foreign key column also gets an index depends on the engine, so it is often created explicitly. Internally, the database uses these indexes to optimize queries and the constraints to enforce data rules.
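One way to observe this is to ask the engine which indexes it created. The sketch below uses SQLite through Python's sqlite3 module and its PRAGMA index_list command; the table names and the helper index_origins are assumptions for illustration, and other engines expose this information differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_code TEXT PRIMARY KEY NOT NULL,
    email         TEXT UNIQUE
);
CREATE TABLE orders (
    order_id      INTEGER PRIMARY KEY,
    customer_code TEXT REFERENCES customers(customer_code)
);
""")

def index_origins(table):
    """Map each index on `table` to (is_unique, origin) using PRAGMA index_list."""
    # Row layout: (seq, name, unique, origin, partial); origin is 'pk', 'u' or 'c'.
    return {row[1]: (bool(row[2]), row[3])
            for row in conn.execute(f"PRAGMA index_list('{table}')")}

customer_indexes = index_origins("customers")
order_indexes = index_origins("orders")
```

In SQLite, customers gets two automatic unique indexes, one for the primary key and one for the UNIQUE column. The orders table gets none: an INTEGER PRIMARY KEY is the table's own row identifier, and SQLite does not index the foreign key column automatically. Engines differ here; MySQL's InnoDB, for example, does create an index on foreign key columns.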

Why designed this way?

Keys were designed to solve the problem of uniquely identifying data and maintaining relationships in large datasets. Early databases struggled with duplicates and inconsistent references. By enforcing keys and constraints, databases ensure data integrity and efficient access. Alternatives like no keys or loose references led to unreliable and slow systems, so keys became fundamental.

┌───────────────┐       ┌───────────────┐
│   Table A     │       │   Table B     │
│───────────────│       │───────────────│
│ PK: ID  ◄─────┤──────▶│ FK: ID        │
│ Index created │       │ Constraint    │
│ Unique check  │       │ Validates FK  │
└───────────────┘       └───────────────┘

Database engine uses indexes and constraints to enforce keys.

Myth Busters - 4 Common Misconceptions

Quick: Is a foreign key always unique in its table? Commit to yes or no.

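Common Belief: Foreign keys must be unique like primary keys.

Reality: A foreign key column usually contains repeated values. Many rows in the referencing table can point to the same row in the referenced table, as when one customer has several orders. It is unique only when the relationship is one-to-one.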

Quick: Can a primary key contain null values? Commit to yes or no.

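Common Belief: Primary keys can have null values if the data is missing.

Reality: A primary key cannot contain nulls. Every record needs a known value in each primary key column so that it can be identified.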

Quick: Does adding more columns to a key always make it better? Commit to yes or no.

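Common Belief: Adding more columns to a key always improves uniqueness and is good practice.

Reality: Adding columns to a key that is already unique only produces a non-minimal super key. It is no more unique than before, just larger than necessary; minimal keys are preferred for efficiency.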

Quick: Are candidate keys always obvious and easy to find? Commit to yes or no.

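Common Belief: Candidate keys are always clear and simple to identify.

Reality: Finding candidate keys can require understanding the rules of the data, not just inspecting sample rows. A column that happens to be unique today, such as a name, may not stay unique.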

Expert Zone

1

Composite keys can cause subtle bugs when one part of the key changes: unless the change is propagated to every referencing row or blocked by a constraint, relationships break silently.

2

Foreign key constraints can be checked immediately, after each statement, or deferred until commit in engines that support it; the choice affects transaction behavior and locking.

3

Surrogate keys (artificial keys such as generated IDs) are often used as the primary key instead of natural candidate keys, for performance and simplicity.
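The difference between immediate and deferred foreign key checks can be sketched with SQLite, which supports the DEFERRABLE INITIALLY DEFERRED clause. The parent and child tables and the helper function below are hypothetical; syntax and support vary across engines.

```python
import sqlite3

def insert_child_before_parent(fk_clause):
    """Insert a child row first, then its parent, inside one transaction."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # manual BEGIN/COMMIT
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        f"CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER {fk_clause})")
    conn.execute("BEGIN")
    try:
        conn.execute("INSERT INTO child VALUES (1, 42)")  # parent 42 does not exist yet
        conn.execute("INSERT INTO parent VALUES (42)")
        conn.execute("COMMIT")                            # deferred checks run here
        return "committed"
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        return "rejected"
    finally:
        conn.close()

immediate = insert_child_before_parent("REFERENCES parent(id)")
deferred = insert_child_before_parent(
    "REFERENCES parent(id) DEFERRABLE INITIALLY DEFERRED")
```

With the default immediate check, the child insert fails at once. With the deferred constraint, the same statements succeed because the reference is valid by the time the transaction commits.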

When NOT to use

Avoid using natural composite keys when a simple surrogate key (like an auto-increment ID) suffices, especially in large or distributed systems. Also, do not expect foreign keys to protect data integrity in NoSQL or denormalized databases: many of these systems do not enforce them, so the checks must live in application logic instead.
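A common compromise is to use a surrogate primary key while still declaring the natural composite key as UNIQUE. The sketch below shows this in SQLite via Python's sqlite3 module; the enrollments and grades tables and their columns are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE enrollments (
    enrollment_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    student_no    TEXT NOT NULL,
    course_code   TEXT NOT NULL,
    term          TEXT NOT NULL,
    UNIQUE (student_no, course_code, term)            -- natural key still enforced
);
CREATE TABLE grades (
    enrollment_id INTEGER NOT NULL REFERENCES enrollments(enrollment_id),
    grade         TEXT NOT NULL
);
""")

insert_sql = ("INSERT INTO enrollments (student_no, course_code, term) "
              "VALUES ('S1', 'DB101', '2024A')")

cur = conn.execute(insert_sql)
enrollment_id = cur.lastrowid  # generated surrogate value
conn.execute("INSERT INTO grades VALUES (?, 'A')", (enrollment_id,))

try:
    conn.execute(insert_sql)   # same natural key again
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False
```

Referencing tables such as grades carry a single integer column instead of three natural-key columns, while the UNIQUE constraint still rejects a second enrollment for the same student, course, and term.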

Production Patterns

In real systems, surrogate primary keys are common for simplicity. Foreign keys enforce relationships but are sometimes omitted in high-performance systems with eventual consistency. Indexes on keys are tuned for query patterns. Cascading updates/deletes are used carefully to avoid data loss.
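The caution about cascades can be made concrete by comparing two ON DELETE rules. This is a sketch in SQLite through Python's sqlite3 module; the customers and orders tables and the delete_customer helper are hypothetical.

```python
import sqlite3

def delete_customer(on_delete):
    """Delete a customer that has two orders, under the given ON DELETE rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
            REFERENCES customers(customer_id) ON DELETE {on_delete}
    );
    INSERT INTO customers VALUES (1);
    INSERT INTO orders VALUES (100, 1), (101, 1);
    """)
    try:
        conn.execute("DELETE FROM customers WHERE customer_id = 1")
        outcome = "deleted"
    except sqlite3.IntegrityError:
        outcome = "blocked"
    orders_left = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    conn.close()
    return outcome, orders_left

cascade_result = delete_customer("CASCADE")
restrict_result = delete_customer("RESTRICT")
```

With CASCADE, deleting the customer silently removes both orders. With RESTRICT, the delete is blocked and the orders remain. Which behavior is right depends on whether the child rows have value on their own.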

Connections

Normalization

Keys are fundamental to normalization rules that organize data efficiently.

Understanding keys helps grasp how normalization reduces redundancy and improves data integrity.

Object-Oriented Programming (OOP) References

Foreign keys in databases relate to object references or pointers in OOP.

Knowing foreign keys clarifies how objects link to each other in memory and databases.
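The analogy can be sketched with two small Python dataclasses. The class names and the to_row and from_row helpers are illustrative assumptions, not a description of how any particular object-relational mapper works.

```python
from dataclasses import dataclass

@dataclass
class Customer:
    customer_id: int      # plays the role of a primary key
    name: str

@dataclass
class Order:
    order_id: int
    customer: Customer    # object reference, the in-memory analogue of a foreign key

def to_row(order):
    """Flatten an Order into the tuple a table row would store."""
    return (order.order_id, order.customer.customer_id)

def from_row(row, customers_by_id):
    """Rebuild an Order by looking up the referenced Customer by key."""
    order_id, customer_id = row
    return Order(order_id, customers_by_id[customer_id])

ana = Customer(1, "Ana")
order = Order(100, ana)
row = to_row(order)
restored = from_row(row, {ana.customer_id: ana})
```

In memory the order holds a direct reference to its customer. In a table it stores only the customer's key, and the link is re-established by a lookup, which is what a join does.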

Supply Chain Management

Keys in databases mirror unique identifiers and relationships in tracking goods and orders.

Recognizing keys helps understand how complex systems track and relate items reliably.

Common Pitfalls

#1 Using a non-unique column as a primary key.

Wrong approach: CREATE TABLE Employees (Name VARCHAR(50) PRIMARY KEY, Age INT);

Correct approach: CREATE TABLE Employees (EmployeeID INT PRIMARY KEY, Name VARCHAR(50), Age INT);

Root cause: Not recognizing that primary keys must be unique, stable identifiers.

#2 Allowing null values in primary key columns.

Wrong approach: CREATE TABLE Products (ProductCode VARCHAR(20) PRIMARY KEY NULL, Description TEXT);

Correct approach: CREATE TABLE Products (ProductCode VARCHAR(20) PRIMARY KEY NOT NULL, Description TEXT);

Root cause: Not knowing that primary keys cannot contain nulls because they identify records uniquely.
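Most engines reject nulls in a primary key even without an explicit NOT NULL, but SQLite is a documented exception for non-integer primary keys in ordinary tables, which makes the explicit NOT NULL in the correct approach worthwhile. The sketch below demonstrates this with Python's sqlite3 module; the helper null_key_rows is hypothetical and the behavior shown is specific to SQLite.

```python
import sqlite3

def null_key_rows(column_definition):
    """Try to insert two rows with a NULL key; return how many were stored."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"CREATE TABLE products (product_code {column_definition}, description TEXT)")
    stored = 0
    for description in ("first", "second"):
        try:
            conn.execute("INSERT INTO products VALUES (NULL, ?)", (description,))
            stored += 1
        except sqlite3.IntegrityError:
            pass  # the engine rejected the NULL key
    conn.close()
    return stored

without_not_null = null_key_rows("VARCHAR(20) PRIMARY KEY")
with_not_null = null_key_rows("VARCHAR(20) PRIMARY KEY NOT NULL")
```

Without NOT NULL, SQLite stores both rows with a null key, leaving two records that cannot be identified. With NOT NULL, both inserts are rejected.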

#3 Not defining foreign key constraints, leading to orphan records.

Wrong approach: CREATE TABLE Orders (OrderID INT PRIMARY KEY, CustomerID INT);

Correct approach: CREATE TABLE Orders (OrderID INT PRIMARY KEY, CustomerID INT, FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID));

Root cause: Ignoring referential integrity and assuming application code will handle relationships.
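The orphan-record pitfall can be reproduced by running both versions of the Orders table against the same bad insert. This sketch uses SQLite through Python's sqlite3 module; the Customers table definition and the orphan_orders helper are assumptions added for the example, and SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

def orphan_orders(orders_ddl):
    """Insert an order for a customer that does not exist; count orphaned orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE Customers (CustomerID INT PRIMARY KEY)")
    conn.execute(orders_ddl)
    conn.execute("INSERT INTO Customers VALUES (1)")
    try:
        conn.execute("INSERT INTO Orders VALUES (500, 999)")  # customer 999 is missing
    except sqlite3.IntegrityError:
        pass  # the foreign key constraint rejected the row
    count = conn.execute("""
        SELECT COUNT(*) FROM Orders o
        LEFT JOIN Customers c ON c.CustomerID = o.CustomerID
        WHERE c.CustomerID IS NULL
    """).fetchone()[0]
    conn.close()
    return count

no_fk = orphan_orders("CREATE TABLE Orders (OrderID INT PRIMARY KEY, CustomerID INT)")
with_fk = orphan_orders(
    "CREATE TABLE Orders (OrderID INT PRIMARY KEY, CustomerID INT, "
    "FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID))")
```

Without the constraint, one orphaned order ends up in the table. With it, the insert is rejected and the orphan count stays at zero.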

Key Takeaways

Keys uniquely identify records and maintain relationships in databases, ensuring data is organized and reliable.

Primary keys are chosen from candidate keys and must be unique and not null to serve as stable identifiers.

Foreign keys link tables and enforce referential integrity, allowing databases to model real-world relationships.

Every candidate key is a super key, but a super key may also contain extra attributes; minimal keys are preferred for efficiency.

Understanding keys deeply helps design better databases, avoid common mistakes, and optimize performance.