0
0
PostgreSQLquery~15 mins

INSERT ON CONFLICT (upsert) in PostgreSQL - Deep Dive

Choose your learning style9 modes available
Overview - INSERT ON CONFLICT (upsert)
What is it?
INSERT ON CONFLICT, also known as upsert, is a way to insert a new row into a database table but update an existing row if a conflict occurs, such as a duplicate key. It helps avoid errors when trying to insert data that might already exist. This feature lets you combine insert and update actions into a single command.
Why it matters
Without upsert, you would need to write separate queries to check if a row exists before inserting or updating, which is slower and more error-prone. Upsert makes data handling simpler and more efficient, especially when multiple users or processes might change the data at the same time. It prevents conflicts and keeps your data consistent.
Where it fits
Before learning upsert, you should understand basic SQL INSERT and UPDATE commands and how unique constraints or primary keys work. After mastering upsert, you can explore advanced data integrity techniques, transaction control, and performance tuning in PostgreSQL.
Mental Model
Core Idea
Upsert is a single command that tries to insert data but updates existing data if a conflict happens.
Think of it like...
Imagine mailing a letter to a friend’s house. If they moved (conflict), instead of losing the letter, the post office updates the address and delivers it correctly.
┌───────────────┐
│ Try to Insert │
└──────┬────────┘
       │
       ▼
┌───────────────┐   Conflict?   ┌───────────────┐
│ Insert Success│─────────────▶│ Update Record │
└───────────────┘              └───────────────┘
Build-Up - 7 Steps
1
FoundationBasic INSERT Command
🤔
Concept: Learn how to add new rows to a table using INSERT.
The INSERT command adds new data rows to a table. For example, INSERT INTO users (id, name) VALUES (1, 'Alice'); adds a user with id 1 and name Alice.
Result
A new row is added to the table if the id 1 does not already exist.
Understanding how INSERT works is essential because upsert builds on this to handle conflicts.
2
FoundationUnique Constraints and Conflicts
🤔
Concept: Understand what causes conflicts during insertion.
Tables often have unique constraints like primary keys. If you try to insert a row with a key that already exists, the database raises a conflict error to prevent duplicates.
Result
An error occurs if you insert a duplicate key, stopping the operation.
Knowing what triggers conflicts helps you see why upsert is needed to handle these cases gracefully.
3
IntermediateINSERT ON CONFLICT Syntax
🤔Before reading on: do you think ON CONFLICT can only update existing rows or can it also ignore conflicts? Commit to your answer.
Concept: Learn the syntax to handle conflicts by updating or ignoring them.
PostgreSQL lets you write INSERT ... ON CONFLICT (column) DO UPDATE SET ... to update existing rows on conflict, or DO NOTHING to skip insertion if a conflict occurs.
Result
The command inserts a new row or updates the existing row without error.
Understanding the syntax unlocks the power to combine insert and update in one step.
4
IntermediateUsing DO UPDATE to Modify Rows
🤔Before reading on: do you think DO UPDATE can change all columns or only the conflicting key? Commit to your answer.
Concept: Learn how to specify which columns to update when a conflict happens.
You can set any columns to new values using DO UPDATE SET column = EXCLUDED.column, where EXCLUDED refers to the new data you tried to insert.
Result
The existing row is updated with new values from the insert attempt.
Knowing how to update specific columns lets you control data changes precisely during conflicts.
5
IntermediateUsing DO NOTHING to Skip Conflicts
🤔
Concept: Learn how to ignore conflicts and keep existing data unchanged.
Instead of updating, you can write ON CONFLICT (column) DO NOTHING to skip inserting if a conflict occurs, leaving the existing row as is.
Result
No error occurs, and the existing row remains unchanged.
This option is useful when you want to avoid duplicates but don't want to change existing data.
6
AdvancedHandling Multiple Conflicts and Constraints
🤔Before reading on: can ON CONFLICT handle conflicts on multiple columns at once? Commit to your answer.
Concept: Learn how to specify conflict targets and handle complex constraints.
You can specify one or more columns or constraint names in ON CONFLICT to target specific conflicts. For example, ON CONFLICT ON CONSTRAINT unique_email DO UPDATE ... handles conflicts on a named unique constraint.
Result
Conflicts on specified constraints are handled correctly with update or ignore.
Understanding conflict targets helps you write precise upsert commands for complex tables.
7
ExpertPerformance and Concurrency Considerations
🤔Before reading on: do you think upsert always performs better than separate insert and update? Commit to your answer.
Concept: Learn how upsert behaves under heavy load and concurrent access.
Upsert reduces race conditions by combining insert and update atomically, but it can cause contention on indexes. Proper indexing and understanding locking behavior are important for performance.
Result
Upsert ensures data consistency but may need tuning for high concurrency.
Knowing the internal locking and performance tradeoffs helps you use upsert effectively in production.
Under the Hood
When you run INSERT ON CONFLICT, PostgreSQL tries to insert the row. If a unique constraint violation occurs, it catches the conflict and either updates the existing row or skips insertion based on your command. This happens atomically inside a transaction to prevent race conditions.
Why designed this way?
Before upsert, developers had to write complex code with separate SELECT, INSERT, and UPDATE statements, which caused race conditions and inefficiency. PostgreSQL introduced ON CONFLICT to simplify this common pattern and ensure atomicity and consistency.
┌───────────────┐
│ Start Insert  │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Check Unique  │
│ Constraint    │
└──────┬────────┘
       │ No Conflict
       ▼
┌───────────────┐
│ Insert Row    │
└───────────────┘
       │
       ▼
     Done

If Conflict:
       │
       ▼
┌───────────────┐
│ ON CONFLICT   │
│ DO UPDATE or  │
│ DO NOTHING    │
└──────┬────────┘
       │
       ▼
     Done
Myth Busters - 4 Common Misconceptions
Quick: Does ON CONFLICT always update all columns of the existing row? Commit yes or no.
Common Belief:ON CONFLICT updates the entire existing row with the new data automatically.
Tap to reveal reality
Reality:ON CONFLICT only updates the columns you explicitly specify in the DO UPDATE SET clause.
Why it matters:Assuming all columns update can cause unexpected data loss or stale data if you forget to include some columns.
Quick: Can ON CONFLICT handle conflicts on any column without unique constraints? Commit yes or no.
Common Belief:You can use ON CONFLICT on any column, even if it is not unique or indexed.
Tap to reveal reality
Reality:ON CONFLICT requires a unique constraint or index to detect conflicts; it cannot handle conflicts on non-unique columns.
Why it matters:Trying to use ON CONFLICT without proper constraints will cause errors or unexpected behavior.
Quick: Does DO NOTHING mean the insert silently fails? Commit yes or no.
Common Belief:DO NOTHING causes the insert to fail silently and rollback the transaction.
Tap to reveal reality
Reality:DO NOTHING skips the conflicting insert but does not cause an error or rollback; the transaction continues.
Why it matters:Misunderstanding this can lead to missing data inserts without realizing it.
Quick: Is upsert always faster than separate insert and update? Commit yes or no.
Common Belief:Upsert is always faster and better than running separate insert and update queries.
Tap to reveal reality
Reality:Upsert is atomic and safer but can be slower under heavy concurrency due to locking and index contention.
Why it matters:Assuming upsert is always best can lead to performance issues in high-load systems.
Expert Zone
1
Upsert can cause deadlocks if multiple transactions try to upsert the same rows in different orders, requiring careful transaction design.
2
Using EXCLUDED in DO UPDATE lets you reference the proposed new row, enabling complex conditional updates.
3
Partial unique indexes can be targeted in ON CONFLICT, allowing fine-grained control over conflict detection.
When NOT to use
Avoid upsert when you need to perform complex conditional logic before deciding to insert or update; in such cases, separate SELECT and conditional statements may be better. Also, for bulk inserts where conflicts are rare, batch inserts with error handling might be more efficient.
Production Patterns
In real systems, upsert is used for syncing data from external sources, caching layers, and event processing where data may arrive out of order or duplicated. It is combined with transactions and proper indexing to maintain consistency and performance.
Connections
Transactions
Upsert relies on transactions to ensure atomicity and consistency during insert or update operations.
Understanding transactions helps grasp how upsert prevents race conditions and partial updates.
Unique Constraints
Upsert depends on unique constraints to detect conflicts and decide when to update existing rows.
Knowing how unique constraints work clarifies why upsert can only handle conflicts on constrained columns.
Conflict Resolution in Distributed Systems
Upsert is a form of conflict resolution similar to how distributed systems reconcile data changes from multiple sources.
Recognizing this connection helps understand upsert as a practical solution to data consistency challenges beyond databases.
Common Pitfalls
#1Trying to use ON CONFLICT without a unique constraint on the target column.
Wrong approach:INSERT INTO users (email, name) VALUES ('a@example.com', 'Alice') ON CONFLICT (email) DO UPDATE SET name = EXCLUDED.name;
Correct approach:CREATE UNIQUE INDEX unique_email ON users(email); INSERT INTO users (email, name) VALUES ('a@example.com', 'Alice') ON CONFLICT (email) DO UPDATE SET name = EXCLUDED.name;
Root cause:ON CONFLICT requires a unique constraint or index to detect conflicts; missing this causes errors.
#2Using DO UPDATE without specifying columns to update.
Wrong approach:INSERT INTO products (id, price) VALUES (1, 100) ON CONFLICT (id) DO UPDATE;
Correct approach:INSERT INTO products (id, price) VALUES (1, 100) ON CONFLICT (id) DO UPDATE SET price = EXCLUDED.price;
Root cause:DO UPDATE must specify how to update the existing row; omitting SET clause causes syntax errors.
#3Assuming DO NOTHING inserts the row anyway.
Wrong approach:INSERT INTO orders (id, status) VALUES (10, 'new') ON CONFLICT (id) DO NOTHING;
Correct approach:INSERT INTO orders (id, status) VALUES (10, 'new') ON CONFLICT (id) DO UPDATE SET status = EXCLUDED.status;
Root cause:DO NOTHING skips insertion on conflict; it does not insert or update the row.
Key Takeaways
INSERT ON CONFLICT (upsert) lets you insert new rows or update existing ones in a single atomic command.
It relies on unique constraints to detect conflicts and requires explicit instructions on how to update or ignore conflicts.
Upsert simplifies data handling, prevents race conditions, and improves efficiency compared to separate insert and update queries.
Understanding its syntax, behavior, and performance implications is essential for building reliable and scalable database applications.
Misusing upsert can cause data loss, errors, or performance problems, so careful design and testing are important.