====== Best Practices — Solving Transaction Anomalies with Default Isolation Level ====== > **Core idea:** Don't change the default isolation level globally. > Use `READ COMMITTED` (default) + the right lock technique per situation. > This gives you safety where needed without sacrificing performance everywhere. ---- ===== Table of Contents ===== - [[#1_golden_rules|1. Golden Rules]] - [[#2_dirty_read_already_solved_by_default|2. Dirty Read — Already Solved by Default]] - [[#3_non-repeatable_read_use_select_for_share|3. Non-repeatable Read — Use SELECT FOR SHARE]] - [[#4_phantom_read_use_select_for_share_range_lock|4. Phantom Read — Use SELECT FOR SHARE + Range Lock]] - [[#5_lost_update_use_atomic_update_or_select_for_update|5. Lost Update — Use Atomic Update or SELECT FOR UPDATE]] - [[#6_write_skew_use_select_for_update_on_all_involved_rows|6. Write Skew — Use SELECT FOR UPDATE on All Involved Rows]] - [[#7_quick_reference|7. Quick Reference]] ---- ===== 1. Golden Rules ===== Rule 1 — Keep READ COMMITTED as the default for everything. Rule 2 — Add SELECT FOR UPDATE only where race conditions can occur. Rule 3 — Use atomic updates (SET col = col + n) for simple counters. Rule 4 — Use UNIQUE constraints to enforce uniqueness at the database level. Rule 5 — Only raise to SERIALIZABLE for genuinely complex invariants. Rule 6 — Always lock rows in a consistent order to avoid deadlocks. Rule 7 — Always wrap transactions in try/catch and rollback on error. ---- ===== 2. Dirty Read — Already Solved by Default ===== ==== The Problem ==== Transaction A reads data that Transaction B modified but **has not yet committed**. If B rolls back, A was working with a value that never officially existed. balance = 100 Txn B: UPDATE balance = 200 ← not committed Txn A: SELECT balance → reads 200 ← DIRTY Txn B: ROLLBACK → balance back to 100 Txn A: already used 200 — WRONG ==== Why It Is Already Solved ==== Both **PostgreSQL** and **MySQL** default to an isolation level that prevents dirty reads: ^ Database ^ Default Level ^ Dirty Read ^ | PostgreSQL | READ COMMITTED | ✅ Prevented | | MySQL | REPEATABLE READ | ✅ Prevented | | SQL Server | READ COMMITTED | ✅ Prevented | At `READ COMMITTED`, Transaction A can **only see committed data**. B's uncommitted change stays invisible to everyone else until B commits. ==== Best Practice — Do Nothing Extra ==== -- PostgreSQL / SQL Server: already safe, no changes needed BEGIN; SELECT balance FROM accounts WHERE id = 1; -- Always returns the last committed value -- B's uncommitted changes are never visible here COMMIT; ==== The Only Rule ==== Never use READ UNCOMMITTED in production. It is the only isolation level where dirty reads can happen. -- NEVER do this in production SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; ---- ===== 3. Non-repeatable Read — Use SELECT FOR SHARE ===== ==== The Problem ==== Within the same transaction, reading **the same row twice** returns **different values** because another transaction updated and committed that row in between. balance = 1000 Txn A: SELECT balance → 1000 Txn B: UPDATE balance = 500, COMMIT Txn A: SELECT balance → 500 ← different result, same transaction! **Real scenario:** A generates an audit report — reads balance twice to cross-check. The two reads disagree because someone transferred money in between. ==== Why It Happens at READ COMMITTED ==== At `READ COMMITTED`, each `SELECT` statement gets a **fresh snapshot** of committed data. So the second `SELECT` sees B's committed update — result differs from the first read. ==== Best Practice — SELECT FOR SHARE ==== Lock the row on first read so no one can update it until your transaction finishes. BEGIN; -- Lock the row for reading — others can still read, but cannot update SELECT balance FROM accounts WHERE id = 1 FOR SHARE; -- ← key addition -- → 1000 -- Txn B tries UPDATE balance = 500 -- → B is BLOCKED here, must wait for A to finish -- ... do calculations, generate report ... SELECT balance FROM accounts WHERE id = 1 FOR SHARE; -- → still 1000 ✓ (B could not commit while A held the share lock) COMMIT; -- B is now unblocked and can proceed ==== FOR SHARE vs FOR UPDATE ==== ^ ^ FOR SHARE ^ FOR UPDATE ^ | Multiple readers? | ✅ Yes, many transactions can hold it simultaneously | ❌ No, only one at a time | | Blocks writers? | ✅ Yes | ✅ Yes | | Blocks other readers? | ❌ No | ✅ Yes | | Use when | You only need to read consistently | You plan to update the row | -- Use FOR SHARE when you only READ and want consistency SELECT balance FROM accounts WHERE id = 1 FOR SHARE; -- Use FOR UPDATE when you READ then WRITE SELECT balance FROM accounts WHERE id = 1 FOR UPDATE; UPDATE accounts SET balance = balance - 100 WHERE id = 1; ==== Real-world Example — Audit Report ==== BEGIN; -- Lock all accounts being audited SELECT id, balance FROM accounts WHERE department = 'engineering' FOR SHARE; -- Calculate total from first read -- ... application logic ... -- Read again to verify — guaranteed same values SELECT id, balance FROM accounts WHERE department = 'engineering' FOR SHARE; -- All balances identical to first read ✓ COMMIT; ---- ===== 4. Phantom Read — Use SELECT FOR SHARE + Range Lock ===== ==== The Problem ==== Within the same transaction, running **the same query** twice returns a **different number of rows** because another transaction inserted or deleted a matching row in between. orders table: 3 rows with amount > 100 Txn A: SELECT COUNT(*) WHERE amount > 100 → 3 Txn B: INSERT order amount=500, COMMIT Txn A: SELECT COUNT(*) WHERE amount > 100 → 4 ← phantom row appeared! **Real scenario:** A calculates the total value of large orders for a report. Between two reads, a new large order is inserted — the totals no longer match. ==== Why It Happens at READ COMMITTED ==== Each `SELECT` sees a fresh snapshot. A newly committed row from B becomes visible on A's second `SELECT` even though A's transaction started before B's insert. ==== Best Practice — FOR SHARE on the Range ==== Lock the range of rows matching your condition. No one can insert into that range while your transaction holds the lock. BEGIN; -- Lock all rows matching the condition SELECT COUNT(*), SUM(amount) FROM orders WHERE amount > 100 FOR SHARE; -- ← locks the range -- → 3 rows, total = 1500 -- Txn B tries to INSERT order amount=500 -- → B is BLOCKED, cannot insert into this range -- ... application logic ... SELECT COUNT(*), SUM(amount) FROM orders WHERE amount > 100 FOR SHARE; -- → still 3 rows, total = 1500 ✓ COMMIT; -- B is now unblocked and inserts successfully ==== Important — Index Required for Range Lock ==== -- Without an index, the database may lock the ENTIRE table -- instead of just the matching range — very bad for performance -- Always ensure an index exists on the column used in WHERE CREATE INDEX idx_orders_amount ON orders(amount); -- Now FOR SHARE only locks rows where amount > 100 -- Not the whole table ==== Real-world Example — Financial Report ==== BEGIN; -- Lock the exact range needed for the report SELECT category, COUNT(*) AS order_count, SUM(amount) AS total_amount FROM orders WHERE created_at BETWEEN '2026-01-01' AND '2026-03-31' AND status = 'completed' FOR SHARE; -- → consistent results, no phantom rows can appear COMMIT; ---- ===== 5. Lost Update — Use Atomic Update or SELECT FOR UPDATE ===== ==== The Problem ==== Two transactions both read the same value, both modify it, and **one overwrites the other**. The first update is silently lost. qty = 10 Txn A: read qty = 10 Txn B: read qty = 10 Txn B: write qty = 10 - 3 = 7, COMMIT Txn A: write qty = 10 - 2 = 8, COMMIT ← overwrites B's result! Final: qty = 8 (should be 5) ← B's deduction of 3 is LOST **Real scenarios:** Inventory deduction, bank transfers, like/view counters. ==== Solution A — Atomic Update (simplest, use when possible) ==== Combine read + calculate + write into **one SQL statement**. The database handles it atomically — no race condition possible. -- WRONG: read in application, calculate, then write back -- (race condition between read and write) current_qty = SELECT qty FROM products WHERE id = 1; -- 10 UPDATE products SET qty = current_qty - 2 WHERE id = 1; -- CORRECT: let the database do it atomically UPDATE products SET qty = qty - 2 WHERE id = 1; -- Database reads, subtracts, and writes in one atomic step -- No other transaction can interfere in between -- More examples of atomic updates -- Counter: increment page views UPDATE pages SET views = views + 1 WHERE id = 42; -- Balance: add interest UPDATE accounts SET balance = balance * 1.05 WHERE type = 'savings'; -- Stock: reserve items UPDATE products SET reserved = reserved + 1 WHERE id = 7; ==== When to use Atomic Update ==== * Simple arithmetic only (add, subtract, multiply) * No need to check the current value before writing ==== When NOT to use (use FOR UPDATE instead) ==== * Need to check balance before deducting (e.g. "is there enough stock?") * Need the current value in application logic before writing ==== Solution B — SELECT FOR UPDATE (when you need to check before writing) ==== Lock the row on read so no other transaction can touch it until you commit. BEGIN; -- Step 1: read AND lock the row SELECT qty FROM products WHERE id = 1 FOR UPDATE; -- ← row is now locked -- → qty = 10 -- Step 2: check business rule in application code IF qty < 2 THEN ROLLBACK; RAISE 'Insufficient stock'; END IF; -- Step 3: write safely — no one else could have changed qty UPDATE products SET qty = qty - 2 WHERE id = 1; COMMIT; -- Lock released — other transactions can now proceed ==== What happens to concurrent transactions ==== Txn A: SELECT qty FOR UPDATE → locks row, reads 10 Txn B: SELECT qty FOR UPDATE → BLOCKED, waiting for A Txn A: UPDATE qty = 8, COMMIT lock released Txn B: unblocked, reads qty = 8 (A's result) Txn B: UPDATE qty = 5, COMMIT ✓ correct result ==== Real-world Example — Bank Transfer ==== -- Transfer $100 from account 1 to account 2 BEGIN; -- Lock BOTH accounts before reading -- Always lock in ascending ID order to prevent deadlocks SELECT id, balance FROM accounts WHERE id IN (1, 2) ORDER BY id -- ← consistent lock order, prevents deadlock FOR UPDATE; -- Check sender has enough balance (in application code) -- IF balance_of_1 < 100 THEN ROLLBACK -- Deduct from sender UPDATE accounts SET balance = balance - 100 WHERE id = 1; -- Add to receiver UPDATE accounts SET balance = balance + 100 WHERE id = 2; COMMIT; ==== Deadlock prevention — always lock in consistent order ==== -- RISK OF DEADLOCK: inconsistent order -- Txn A locks account 1, then needs account 2 -- Txn B locks account 2, then needs account 1 -- → both wait forever -- SAFE: always lock by ascending ID SELECT id, balance FROM accounts WHERE id IN (1, 2) ORDER BY id FOR UPDATE; ==== Solution C — Optimistic Lock (when conflicts are rare) ==== No real lock — add a `version` column. When writing, check that version hasn't changed. If it has, someone else updated first — retry. -- Schema ALTER TABLE products ADD COLUMN version INT DEFAULT 0; -- Step 1: read with version SELECT qty, version FROM products WHERE id = 1; -- → qty=10, version=5 -- Step 2: write with version check UPDATE products SET qty = qty - 2, version = version + 1 WHERE id = 1 AND version = 5; -- Check rows affected: -- 1 row → success -- 0 rows → conflict, someone else updated first → retry from step 1 ==== When to use Optimistic Lock ==== * Conflicts are rare (most updates succeed on the first try) * High-concurrency reads, occasional writes * REST APIs, user profile updates, document editing ==== Choosing the Right Solution ==== Do you need to check the current value before writing? │ ├── No → Use ATOMIC UPDATE │ UPDATE table SET col = col + n WHERE id = x │ └── Yes → Are conflicts frequent (many users, same row)? │ ├── Yes → Use SELECT FOR UPDATE │ └── No → Use OPTIMISTIC LOCK (version column) ---- ===== 6. Write Skew — Use SELECT FOR UPDATE on All Involved Rows ===== ==== The Problem ==== Two transactions both read a **shared condition**, both decide it is safe to proceed, and both write to **different rows**. Rule: at least 1 doctor must be on call Alice on_call=true, Bob on_call=true → count = 2 Txn A (Alice): SELECT COUNT(*) on_call=true → 2 Txn B (Bob): SELECT COUNT(*) on_call=true → 2 Txn A: UPDATE Alice on_call=false Txn B: UPDATE Bob on_call=false Final: count = 0 ← RULE VIOLATED ==== Best Practice — SELECT FOR UPDATE on All Rows Involved in the Condition ==== BEGIN; SELECT id, name FROM doctors WHERE on_call = true FOR UPDATE; UPDATE doctors SET on_call = false WHERE id = 1; COMMIT; ==== UNIQUE Constraint Alternative ==== ALTER TABLE bookings ADD CONSTRAINT uq_room_slot UNIQUE (room_id, slot); ---- ===== 7. Quick Reference ===== ==== Solution per anomaly ==== ^ Anomaly ^ Default safe? ^ Best practice solution ^ | Dirty read | ✅ Yes | Nothing needed — default `READ COMMITTED` prevents it | | Non-repeatable read | ❌ No | `SELECT ... FOR SHARE` | | Phantom read | ❌ No | `SELECT ... FOR SHARE` on the range | | Lost update (simple) | ❌ No | Atomic update | | Lost update (with check) | ❌ No | `SELECT ... FOR UPDATE` | | Lost update (low conflict) | ❌ No | Optimistic lock | | Write skew | ❌ No | `SELECT ... FOR UPDATE` on all involved rows | ==== Lock cheat sheet ==== -- Read consistently SELECT ... FOR SHARE; -- Read then write SELECT ... FOR UPDATE; -- Atomic write UPDATE table SET col = col + n WHERE id = x; -- Prevent duplicate inserts ALTER TABLE t ADD CONSTRAINT uq_name UNIQUE (col1, col2); ==== Deadlock prevention checklist ==== ✓ Always lock multiple rows in the same order ✓ Keep transactions short ✓ Do not lock more rows than necessary ✓ Do not perform external calls inside a transaction ✓ Always retry on deadlock/serialization errors ==== Transaction template — production-ready ==== BEGIN; -- 1. Lock what you need SELECT col FROM table WHERE id = x FOR UPDATE; -- 2. Check business rules -- 3. Write changes UPDATE table SET col = new_value WHERE id = x; COMMIT; def run_transaction(conn, fn): MAX_RETRY = 3 for attempt in range(MAX_RETRY): try: with conn.cursor() as cur: cur.execute("BEGIN") result = fn(cur) cur.execute("COMMIT") return result except Exception as e: conn.rollback() if "deadlock" in str(e).lower() or "40001" in str(e): continue raise raise Exception("Transaction failed after max retries") async function runTransaction(pool, fn) { const MAX_RETRY = 3; for (let attempt = 0; attempt < MAX_RETRY; attempt++) { const client = await pool.connect(); try { await client.query('BEGIN'); const result = await fn(client); await client.query('COMMIT'); return result; } catch (err) { await client.query('ROLLBACK'); if (['40P01', '40001'].includes(err.code)) { continue; } throw err; } finally { client.release(); } } throw new Error('Transaction failed after max retries'); } ---- ''For background theory, see: SQL Transaction Anomalies, SQL Isolation Levels.''