⬡ Topic 2 of 18
OLTP Database Design
What Amazon, Uber, Swiggy, and Banking apps use. Online Transaction Processing is the backbone of every production backend — high volume, small fast queries, strict ACID guarantees.
3 sub-sections • 7 design steps • 4 real-world patterns
2.1
OLTP Principles
What makes a database "OLTP" and why it matters
OLTP vs OLAP in one sentence: OLTP is many users doing small, fast operations (place an order, update a cart). OLAP is a few analysts doing massive aggregations (total revenue by region this year). Most apps you build are OLTP.
The 5 Core OLTP Characteristics
High Transaction Volume
Thousands to millions of queries per second. Every row matters, every millisecond counts.
Small Queries
Queries touch a handful of rows — not millions.
WHERE user_id = 42, not GROUP BY country.Fast Inserts
New orders, messages, events — must write in under 5ms. Avoid heavy triggers on write paths.
Fast Updates
Update inventory, change order status, increment counters — single-row updates, always indexed.
ACID Transactions
Atomicity, Consistency, Isolation, Durability. Money never disappears; orders never half-create.
OLTP vs OLAP — Side by Side
| Property | OLTP | OLAP |
|---|---|---|
| Query type | Point lookups, small updates | Aggregations, full scans |
| Data volume per query | Few rows | Millions of rows |
| Concurrent users | Thousands | Few analysts |
| Schema style | Normalized (3NF) | Denormalized (star schema) |
| Primary concern | Latency, throughput | Query speed on big data |
| Examples | PostgreSQL, MySQL | BigQuery, Redshift, Snowflake |
2.2
Designing OLTP Systems
The 7-step process used by senior engineers
Follow these 7 steps every time you design a new OLTP system. We'll use an e-commerce app as our running example throughout.
Step 1
Entities
Entities
Step 2
Relationships
Relationships
Step 3
Normalize
Normalize
Step 4
Transactions
Transactions
Step 5
Indexes
Indexes
Step 6
Audit Cols
Audit Cols
Step 7
Soft Deletes
Soft Deletes
Step 1: Identify Business Entities — List every noun in the system. For e-commerce, that's users, products, orders, payments, etc.
User
→
Address
→
Product
→
Category
→
Inventory
Cart
→
Order
→
OrderItem
→
Payment
-- E-Commerce Entity List
-- Ask: "What are the things (nouns) in this domain?"
-- User management
users → people who buy
addresses → where to ship (user has many addresses)
-- Catalog
categories → Electronics, Clothing, Books…
products → items for sale
inventory → stock levels per product (maybe per warehouse)
-- Shopping
carts → temporary basket before checkout
cart_items → products in the cart
-- Fulfillment
orders → confirmed purchase intent
order_items → individual line items in an order
payments → money transaction records
Step 2: Define Relationships — Map how entities connect. Draw the chain from user to money.
-- Relationship tree (read top to bottom):
User (1)
├── has many → Addresses (N)
├── has many → Orders (N)
│ │
│ └── has many → OrderItems (N)
│ │
│ └── belongs to → Product (1)
│ │
│ └── has one → Inventory
└── has many → Payments (N)
│
└── belongs to → Order (1)
-- Key relationships:
-- users 1:N orders
-- orders 1:N order_items
-- products 1:N order_items (product appears in many orders)
-- orders 1:1 payments (each order has one payment record)
-- products M:N categories (via product_categories junction)
Step 3: Normalize to 3NF — Split data so each fact lives in exactly one place. Remove all redundancy.
-- ❌ BAD: storing customer info directly on the order
orders_bad
┌──────────┬───────────────┬───────────────────┬──────────────┐
│ order_id │ customer_name │ customer_email │ customer_city│
└──────────┴───────────────┴───────────────────┴──────────────┘
-- Problem: if email changes, every order row must be updated!
-- Problem: city depends on user, not the order (transitive dep)
-- ✅ GOOD: normalize — each fact lives in one table
CREATE TABLE users (
user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
email TEXT UNIQUE NOT NULL,
first_name TEXT NOT NULL,
last_name TEXT NOT NULL
);
CREATE TABLE orders (
order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(user_id),
status TEXT NOT NULL DEFAULT 'pending',
total NUMERIC(12,2) NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Customer info is fetched via JOIN — never duplicated
Step 4: Define Transaction Boundaries — Group related writes into a single atomic transaction so they all succeed or all fail together.
Example: Placing an order involves 4 operations. All must succeed or none should commit:
1
Create Order — insert into
orders table with status = 'pending'2
Create Order Items — insert each product line into
order_items3
Reduce Inventory — decrement stock in
inventory for each product4
Create Payment Record — insert into
payments with status = 'pending'
-- All 4 operations in a SINGLE transaction
BEGIN;
-- Step 1: Create the order
INSERT INTO orders (user_id, status, total)
VALUES (42, 'pending', 350.00)
RETURNING order_id; -- capture the new order_id
-- Step 2: Add order items
INSERT INTO order_items (order_id, product_id, quantity, price)
VALUES
(1001, 5, 2, 100.00),
(1001, 9, 1, 150.00);
-- Step 3: Reduce inventory (with a check!)
UPDATE inventory
SET stock = stock - 2
WHERE product_id = 5
AND stock >= 2; -- prevents negative stock
-- Step 4: Create payment record
INSERT INTO payments (order_id, amount, status, method)
VALUES (1001, 350.00, 'pending', 'card');
COMMIT; -- all or nothing!
-- If anything fails above, run ROLLBACK instead:
-- ROLLBACK; → undoes ALL 4 operations atomically
Never split related writes across multiple requests. If your server crashes between step 2 and step 3, you'd have an order with items but no inventory deduction. Always wrap in a transaction.
Step 5: Index Strategy — Add indexes on columns you filter or join on. But don't over-index — every index slows down writes.
-- ✅ Index columns you search by, filter on, or JOIN with
-- users: look up by email (login)
CREATE UNIQUE INDEX idx_users_email
ON users (email);
-- orders: look up all orders for a user (very common)
CREATE INDEX idx_orders_user_id
ON orders (user_id);
-- orders: look up recent orders (dashboard, listing)
CREATE INDEX idx_orders_created_at
ON orders (created_at DESC);
-- order_items: look up items for a given order (always needed)
CREATE INDEX idx_order_items_order_id
ON order_items (order_id);
-- payments: look up payment for an order
CREATE INDEX idx_payments_order_id
ON payments (order_id);
-- ❌ DON'T index columns you never filter on:
-- first_name, last_name, description, notes
-- ❌ DON'T index low-cardinality columns:
-- status (only 'pending','paid','cancelled') → not selective enough
-- ✅ EXCEPTION: partial index for low-cardinality but important subset
CREATE INDEX idx_orders_pending
ON orders (created_at)
WHERE status = 'pending'; -- only index pending orders
Step 6: Audit Columns — Every OLTP table should track who created/changed a row and when. This is essential for debugging, compliance, and data lineage.
-- Standard audit columns — add to EVERY table
CREATE TABLE orders (
order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(user_id),
total NUMERIC(12,2),
status TEXT DEFAULT 'pending',
-- ✅ Audit columns
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
created_by BIGINT REFERENCES users(user_id), -- who placed it
updated_by BIGINT REFERENCES users(user_id) -- who last touched it
);
-- Auto-update updated_at using a trigger:
CREATE OR REPLACE FUNCTION set_updated_at()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_orders_updated_at
BEFORE UPDATE ON orders
FOR EACH ROW EXECUTE FUNCTION set_updated_at();
-- Now every UPDATE automatically sets updated_at = NOW()
Pro tip: Create a reusable trigger function once and attach it to every table. Some teams use a base migration that adds these columns automatically to all new tables.
Step 7: Soft Deletes — Don't physically delete rows in OLTP systems. Mark them as deleted instead. This preserves history, enables undo, and prevents FK violation errors.
-- Add soft delete columns to every table
ALTER TABLE users
ADD COLUMN is_deleted BOOLEAN NOT NULL DEFAULT FALSE,
ADD COLUMN deleted_at TIMESTAMPTZ,
ADD COLUMN deleted_by BIGINT REFERENCES users(user_id);
-- ✅ Soft delete: mark as deleted, don't remove
UPDATE users
SET
is_deleted = TRUE,
deleted_at = NOW(),
deleted_by = 1 -- the admin who deleted
WHERE user_id = 42;
-- ❌ NEVER do this in OLTP:
-- DELETE FROM users WHERE user_id = 42;
-- This would cascade-delete all their orders, payments, etc.
-- Filter out deleted records in every query:
SELECT * FROM users WHERE is_deleted = FALSE;
-- ✅ Better: create a VIEW that hides deleted rows
CREATE VIEW active_users AS
SELECT * FROM users WHERE is_deleted = FALSE;
-- App queries active_users instead of users directly
SELECT * FROM active_users WHERE email = 'alice@example.com';
-- ✅ Partial index for performance (only index non-deleted rows)
CREATE INDEX idx_users_email_active
ON users (email)
WHERE is_deleted = FALSE;
2.3
Common OLTP Patterns
Real schemas used at scale: Auth, E-Commerce, Ride Sharing, AI SaaS
Click any pattern to see its full production schema with SQL:
🔐 User Management
Auth, roles, permissions
users
roles
permissions
user_roles
🛒 E-Commerce
Orders, products, payments
products
orders
order_items
payments
🚗 Ride Sharing
Drivers, trips, locations
drivers
riders
trips
locations
🤖 AI SaaS
Conversations, embeddings, RAG
conversations
messages
documents
embeddings
User Management Pattern — RBAC (Role-Based Access Control) used by virtually every production app. Users have roles, roles have permissions.
users
user_idPK BIGINT
emailUQ TEXT
password_hashTEXT
is_activeBOOL
created_atTIMESTAMPTZ
roles
role_idPK BIGINT
nameUQ TEXT
descriptionTEXT
permissions
permission_idPK BIGINT
nameUQ TEXT
resourceTEXT
actionTEXT
user_roles
user_idFK BIGINT
role_idFK BIGINT
granted_atTIMESTAMPTZ
granted_byFK BIGINT
role_permissions
role_idFK BIGINT
permission_idFK BIGINT
-- Full RBAC schema
CREATE TABLE users (
user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
email TEXT UNIQUE NOT NULL,
password_hash TEXT NOT NULL,
is_active BOOLEAN NOT NULL DEFAULT TRUE,
is_deleted BOOLEAN NOT NULL DEFAULT FALSE,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE roles (
role_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name TEXT UNIQUE NOT NULL, -- 'admin', 'editor', 'viewer'
description TEXT
);
CREATE TABLE permissions (
permission_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name TEXT UNIQUE NOT NULL, -- 'orders:read', 'users:delete'
resource TEXT NOT NULL, -- 'orders', 'users', 'products'
action TEXT NOT NULL -- 'read', 'write', 'delete'
);
-- Junction: users ↔ roles (M:N)
CREATE TABLE user_roles (
user_id BIGINT REFERENCES users(user_id) ON DELETE CASCADE,
role_id BIGINT REFERENCES roles(role_id) ON DELETE CASCADE,
granted_at TIMESTAMPTZ DEFAULT NOW(),
granted_by BIGINT REFERENCES users(user_id),
PRIMARY KEY (user_id, role_id)
);
-- Junction: roles ↔ permissions (M:N)
CREATE TABLE role_permissions (
role_id BIGINT REFERENCES roles(role_id) ON DELETE CASCADE,
permission_id BIGINT REFERENCES permissions(permission_id) ON DELETE CASCADE,
PRIMARY KEY (role_id, permission_id)
);
-- Check if a user has a specific permission:
SELECT COUNT(*) > 0 AS has_permission
FROM user_roles ur
JOIN role_permissions rp ON ur.role_id = rp.role_id
JOIN permissions p ON rp.permission_id = p.permission_id
WHERE ur.user_id = 42
AND p.name = 'orders:read';
E-Commerce Pattern — The canonical OLTP schema. Used by Flipkart, Amazon-style apps. Notice how inventory is separate from products to allow per-warehouse tracking.
products
product_idPK
nameTEXT
priceNUMERIC
skuUQ
inventory
product_idFK
stockINT
reservedINT
orders
order_idPK
user_idFK
statusTEXT
totalNUMERIC
order_items
order_idFK
product_idFK
quantityINT
priceNUMERIC
payments
payment_idPK
order_idFK
amountNUMERIC
statusTEXT
gateway_refTEXT
-- Complete E-Commerce OLTP Schema
CREATE TABLE categories (
category_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name TEXT NOT NULL,
parent_id BIGINT REFERENCES categories(category_id) -- self-ref for hierarchy
);
CREATE TABLE products (
product_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name TEXT NOT NULL,
description TEXT,
sku TEXT UNIQUE NOT NULL, -- Stock Keeping Unit
price NUMERIC(10,2) NOT NULL,
is_active BOOLEAN NOT NULL DEFAULT TRUE,
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE inventory (
product_id BIGINT PRIMARY KEY REFERENCES products(product_id),
stock INT NOT NULL DEFAULT 0 CHECK (stock >= 0),
reserved INT NOT NULL DEFAULT 0, -- in-progress orders
updated_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE orders (
order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(user_id),
shipping_address JSONB, -- snapshot of address at order time
status TEXT NOT NULL DEFAULT 'pending',
total NUMERIC(12,2) NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE order_items (
order_id BIGINT NOT NULL REFERENCES orders(order_id),
product_id BIGINT NOT NULL REFERENCES products(product_id),
quantity INT NOT NULL CHECK (quantity > 0),
price NUMERIC(10,2) NOT NULL, -- price at time of order (snapshot)
PRIMARY KEY (order_id, product_id)
);
CREATE TABLE payments (
payment_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
order_id BIGINT UNIQUE NOT NULL REFERENCES orders(order_id),
amount NUMERIC(12,2) NOT NULL,
status TEXT NOT NULL DEFAULT 'pending',
method TEXT, -- 'upi', 'card', 'netbanking', 'cod'
gateway_ref TEXT, -- Razorpay/Stripe transaction ID
paid_at TIMESTAMPTZ,
created_at TIMESTAMPTZ DEFAULT NOW()
);
Ride Sharing Pattern — Used by Uber/Ola. Key challenge: real-time location tracking and matching drivers to riders. Location history is append-only (never updated).
drivers
driver_idPK
user_idFK
vehicle_typeTEXT
is_availableBOOL
ratingNUMERIC
trips
trip_idPK
driver_idFK
rider_idFK
statusTEXT
fareNUMERIC
locations
location_idPK
driver_idFK
lat / lngNUMERIC
recorded_atTIMESTAMPTZ
payments
payment_idPK
trip_idFK
amountNUMERIC
methodTEXT
-- Ride Sharing OLTP Schema
CREATE TABLE drivers (
driver_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
user_id BIGINT UNIQUE NOT NULL REFERENCES users(user_id),
vehicle_type TEXT NOT NULL, -- 'auto', 'bike', 'mini', 'sedan'
license_num TEXT UNIQUE NOT NULL,
is_available BOOLEAN NOT NULL DEFAULT FALSE,
rating NUMERIC(3,2) DEFAULT 5.0, -- running average
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE trips (
trip_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
driver_id BIGINT NOT NULL REFERENCES drivers(driver_id),
rider_id BIGINT NOT NULL REFERENCES users(user_id),
status TEXT NOT NULL DEFAULT 'requested',
-- status: requested → accepted → in_progress → completed / cancelled
pickup_lat NUMERIC(9,6),
pickup_lng NUMERIC(9,6),
dropoff_lat NUMERIC(9,6),
dropoff_lng NUMERIC(9,6),
distance_km NUMERIC(8,2),
fare NUMERIC(8,2),
started_at TIMESTAMPTZ,
completed_at TIMESTAMPTZ,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Location pings from driver app (append-only, NEVER update)
CREATE TABLE driver_locations (
location_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
driver_id BIGINT NOT NULL REFERENCES drivers(driver_id),
lat NUMERIC(9,6) NOT NULL,
lng NUMERIC(9,6) NOT NULL,
recorded_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- This table grows FAST. Partition by day/week (covered in Topic 11)
-- Get latest driver location:
SELECT DISTINCT ON (driver_id)
driver_id, lat, lng, recorded_at
FROM driver_locations
WHERE driver_id = 7
ORDER BY driver_id, recorded_at DESC;
AI SaaS Pattern — Used for apps like ChatGPT, Notion AI, or your own FastAPI + AI agent backend. Multi-tenant with conversation history and vector embeddings for RAG.
organizations
org_idPK
nameTEXT
planTEXT
conversations
conv_idPK
user_idFK
titleTEXT
modelTEXT
messages
msg_idPK
conv_idFK
roleTEXT
contentTEXT
tokensINT
documents
doc_idPK
org_idFK
filenameTEXT
contentTEXT
embeddings
embed_idPK
doc_idFK
chunk_textTEXT
vectorVECTOR
-- AI SaaS Multi-Tenant Schema
-- Requires: pgvector extension for embeddings
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE organizations (
org_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name TEXT NOT NULL,
plan TEXT NOT NULL DEFAULT 'free', -- 'free','pro','enterprise'
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE users (
user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
org_id BIGINT NOT NULL REFERENCES organizations(org_id),
email TEXT UNIQUE NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE conversations (
conv_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id BIGINT NOT NULL REFERENCES users(user_id),
org_id BIGINT NOT NULL REFERENCES organizations(org_id),
title TEXT,
model TEXT NOT NULL DEFAULT 'gpt-4',
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE messages (
msg_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
conv_id UUID NOT NULL REFERENCES conversations(conv_id),
role TEXT NOT NULL, -- 'user', 'assistant', 'system'
content TEXT NOT NULL,
tokens INT, -- track usage for billing
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE documents (
doc_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
org_id BIGINT NOT NULL REFERENCES organizations(org_id),
filename TEXT NOT NULL,
mime_type TEXT,
content TEXT,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- RAG: store chunked text + vector embedding
CREATE TABLE embeddings (
embed_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
doc_id UUID NOT NULL REFERENCES documents(doc_id),
chunk_index INT NOT NULL,
chunk_text TEXT NOT NULL,
vector VECTOR(1536) NOT NULL, -- 1536 dims for OpenAI ada-002
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Semantic search: find most similar chunks to a query vector
SELECT chunk_text, 1 - (vector <=> '[0.1, 0.2, ...]'::VECTOR) AS similarity
FROM embeddings
WHERE doc_id = '...'
ORDER BY vector <=> '[0.1, 0.2, ...]'::VECTOR
LIMIT 5;
Multi-tenancy tip: Always store
org_id on every table and add it to every query. Use Row Level Security (RLS, covered in Topic 17) to enforce tenant isolation automatically at the database level.